Using Nested `LOAD CSV` with `apoc.periodic.iterate`

mckenzma · March 14, 2019, 3:13pm

SOLUTION: (explanation below)

LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/1sFcY7LFBCGXSFG336UPoOf72BBv3bmv_AVaYLxwiV4A/export?format=csv&id=1sFcY7LFBCGXSFG336UPoOf72BBv3bmv_AVaYLxwiV4A&gid=1318941318" AS row1
WITH CASE
	WHEN NOT row1.Year IS NULL THEN collect(row1.URL)
    END AS fileURLs
UNWIND fileURLs as fileURL
CALL apoc.periodic.iterate(
'
LOAD CSV WITH HEADERS FROM $url AS row RETURN row
//CALL apoc.load.csv($url) YIELD map AS row RETURN row
','
MERGE (state:State {id: row.STATE_CODE_001})
MERGE (state)<-[:OF_STATE]-(county:County {id: row.COUNTY_CODE_003})
MERGE (county)<-[:OF_COUNTY]-(place:Place {id: row.PLACE_CODE_004})
MERGE (place)<-[:OF_PLACE]-(bridge:Bridge {id: row.STRUCTURE_NUMBER_008})
ON CREATE SET bridge.name = row.STRUCTURE_NUMBER_008,
			  bridge.latitude = row.LAT_016,
			  bridge.longitude = row.LONG_017,
			  bridge.yearbuilt = row.YEAR_BUILT_027,
			  
			  bridge.deck_cond = row.DECK_COND_058,
			  bridge.superstructure_cond = row.SUPERSTRUCTURE_COND_059,
			  bridge.substructure_cond = row.SUBSTRUCTURE_COND_060,
			  bridge.channel_cond = row.CHANNEL_COND_061,
			  bridge.culvert_cond = row.CULVERT_COND_062,
			  bridge.structural_eval = row.STRUCTURAL_EVAL_067,
			  bridge.deck_geometry_eval = row.DECK_GEOMETRY_EVAL_068,
			  bridge.undclrence_eval = row.UNDCLRENCE_EVAL_069,
			  bridge.posting_eval = row.POSTING_EVAL_070,
			  bridge.waterway_eval = row.WATERWAY_EVAL_071,
			  bridge.appr_road_eval = row.APPR_ROAD_EVAL_072,
			  
			  place.name = row.PLACE_CODE_004,
			  county.name = row.COUNTY_CODE_003,
			  state.name = row.STATE_CODE_001
',
{batchSize:1000, parallel:false, params:{url:fileURL}}) YIELD batches, total
RETURN batches, total

You will notice that //CALL apoc.load.csv($url) YIELD map AS row RETURN row is commented out. While that is supposed to work. One of the files at the URL I was calling was throwing an error, so at this point I had to use LOAD CSV instead.

EXPLANATION

The first part of the code

LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/1sFcY7LFBCGXSFG336UPoOf72BBv3bmv_AVaYLxwiV4A/export?format=csv&id=1sFcY7LFBCGXSFG336UPoOf72BBv3bmv_AVaYLxwiV4A&gid=1318941318" AS row1
WITH CASE
	WHEN NOT row1.Year IS NULL THEN collect(row1.URL)
    END AS fileURLs
UNWIND fileURLs as fileURL

loads the file URLs from the shared Google Sheet. The WITH clause is included to "ignore" rows where the file URL has not been verified. All that satisfy are put into a collection and then using the UNWIND passed in to the next batch of code contained within the apoc.periodic.iterate function. The code within apoc.periodic.iterate load the data for each url and add it to the graph using MERGE.

Topic		Replies	Views
Using `UNWIND` with `apoc.periodic.iterate` to load data into a graph Procedures & APOC	1	2223	February 15, 2019
Apoc.load.csv fail (there's no apoc called apoc.load.csv) Neo4j Graph Platform migrated	3	468	January 11, 2023
Import csv or api data in python using apoc.periodic.iterate Neo4j Graph Platform migrated	1	245	January 31, 2023
Apoc.load.csv fail within CALL apoc.periodic.iterate Neo4j Graph Platform migrated	6	316	January 6, 2023
Help with `apoc.periodic.iterate` Procedures & APOC	7	4968	March 14, 2019

Get Certified in June!

Using Nested `LOAD CSV` with `apoc.periodic.iterate`

Related topics