Help with ingesting an html table using apoc.load.html

pdrangeid · November 4, 2019, 5:53pm

I would like to dynamically create some product information from scraping a version table.

I'd like to create (:Esxiversion) with .version .name .releasedate .buildnumber .installerbuild from
https://kb.vmware.com/s/article/2143832

I'm not exactly sure how to use the tags to consume the table data within the HTML. Any hints to get started would be helpful.


WITH "https://kb.vmware.com/s/article/2143832" AS url
CALL apoc.load.html(url) YIELD value
RETURN value

paulare · December 12, 2019, 8:04pm

Hi, @pdrangeid

I believe that you can't fetch information from that site, as it uses a Content Security Policy (CSP) that protects the site's content.

If you query

WITH "https://kb.vmware.com/s/article/2143832/" as url
CALL apoc.load.html(url,{target: 'meta'}) YIELD value
RETURN value

you will see the following in the response:

    "http-equiv": "Content-Security-Policy"

paulare · January 3, 2020, 11:31am

I consulted with andrea.larus on the neo4j-ninjas Slack channel, and he explained that the real issue seems to be that the page is generated by JavaScript at runtime.

If you put the following link in the Chrome URL bar, view-source:https://kb.vmware.com/s/article/2143832, you can see that there is no table tag in the page source at all.

I created an issue in the APOC repo: "apoc.load.html ability to read runtime structure of the page" (Issue #1372, neo4j-contrib/neo4j-apoc-procedures on GitHub).
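
The missing table tag can probably also be confirmed from Cypher, without opening the browser. This is a minimal sketch, assuming the second argument of apoc.load.html is a map from a result key to a CSS selector (the same form as the {target: 'meta'} query above). The key name "tables" is only an illustrative choice.

```cypher
// Count the <table> elements that apoc.load.html can see in the static HTML.
// "tables" is an arbitrary result key; 'table' is the CSS selector.
WITH "https://kb.vmware.com/s/article/2143832" AS url
CALL apoc.load.html(url, {tables: 'table'}) YIELD value
RETURN size(value.tables) AS tableCount
```

If the table is only added by JavaScript after the page loads, this would be expected to return 0, which matches what view-source shows.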

pdrangeid · January 3, 2020, 4:35pm

Paul,

Thanks for the follow-up. I was looking at the page content with a colleague and had determined that it was dynamically delivered content, but wasn't sure how that affected the ability to collect it with apoc.

Thanks for submitting the issue!
