SOLUTION: (explanation below)
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/1sFcY7LFBCGXSFG336UPoOf72BBv3bmv_AVaYLxwiV4A/export?format=csv&id=1sFcY7LFBCGXSFG336UPoOf72BBv3bmv_AVaYLxwiV4A&gid=1318941318" AS row1
WITH CASE
WHEN NOT row1.Year IS NULL THEN collect(row1.URL)
END AS fileURLs
UNWIND fileURLs as fileURL
CALL apoc.periodic.iterate(
'
LOAD CSV WITH HEADERS FROM $url AS row RETURN row
//CALL apoc.load.csv($url) YIELD map AS row RETURN row
','
MERGE (state:State {id: row.STATE_CODE_001})
MERGE (state)<-[:OF_STATE]-(county:County {id: row.COUNTY_CODE_003})
MERGE (county)<-[:OF_COUNTY]-(place:Place {id: row.PLACE_CODE_004})
MERGE (place)<-[:OF_PLACE]-(bridge:Bridge {id: row.STRUCTURE_NUMBER_008})
ON CREATE SET bridge.name = row.STRUCTURE_NUMBER_008,
bridge.latitude = row.LAT_016,
bridge.longitude = row.LONG_017,
bridge.yearbuilt = row.YEAR_BUILT_027,
bridge.deck_cond = row.DECK_COND_058,
bridge.superstructure_cond = row.SUPERSTRUCTURE_COND_059,
bridge.substructure_cond = row.SUBSTRUCTURE_COND_060,
bridge.channel_cond = row.CHANNEL_COND_061,
bridge.culvert_cond = row.CULVERT_COND_062,
bridge.structural_eval = row.STRUCTURAL_EVAL_067,
bridge.deck_geometry_eval = row.DECK_GEOMETRY_EVAL_068,
bridge.undclrence_eval = row.UNDCLRENCE_EVAL_069,
bridge.posting_eval = row.POSTING_EVAL_070,
bridge.waterway_eval = row.WATERWAY_EVAL_071,
bridge.appr_road_eval = row.APPR_ROAD_EVAL_072,
place.name = row.PLACE_CODE_004,
county.name = row.COUNTY_CODE_003,
state.name = row.STATE_CODE_001
',
{batchSize:1000, parallel:false, params:{url:fileURL}}) YIELD batches, total
RETURN batches, total
You will notice that //CALL apoc.load.csv($url) YIELD map AS row RETURN row
is commented out. While that is supposed to work. One of the files at the URL I was calling was throwing an error, so at this point I had to use LOAD CSV
instead.
EXPLANATION
The first part of the code
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/1sFcY7LFBCGXSFG336UPoOf72BBv3bmv_AVaYLxwiV4A/export?format=csv&id=1sFcY7LFBCGXSFG336UPoOf72BBv3bmv_AVaYLxwiV4A&gid=1318941318" AS row1
WITH CASE
WHEN NOT row1.Year IS NULL THEN collect(row1.URL)
END AS fileURLs
UNWIND fileURLs as fileURL
loads the file URLs from the shared Google Sheet. The WITH
clause is included to "ignore" rows where the file URL has not been verified. All that satisfy are put into a collection and then using the UNWIND
passed in to the next batch of code contained within the apoc.periodic.iterate
function. The code within apoc.periodic.iterate
load the data for each url and add it to the graph using MERGE
.