Loading multiple CSV files in Neo4j with each row of each CSV being a node

kasthuri · November 6, 2019, 5:00pm

I need to load multiple CSV files into Neo4j, with each row of each CSV file becoming a node. How can I do this effectively? I cannot concatenate the CSVs since each row is a label and the rows are the same across multiple CSVs.

It looks like the LOAD CSV command either takes a single CSV and generates a node for each row, or takes a bunch of CSVs and makes a node for each CSV. Thanks!

ameyasoft · November 6, 2019, 7:32pm

LOAD CSV reads only one file at a time. You have to run the import once for each file.

Another option is to copy the data from one file into the first file as a second column.

kasthuri · November 6, 2019, 10:19pm

Thanks. Can I automate the import with each file? Like writing LOAD CSV filename multiple times in a file and uploading the query file somewhere so each LOAD CSV query is executed? If so, how do I do it?

mckenzma · November 7, 2019, 1:54am

One way I have handled importing ~1000 files is to create another CSV file with the file names/urls. I use LOAD CSV to import the rows from the CSV that contains all the file names and create nodes for each file. Then iterate over the file nodes to import rows from each file.

kasthuri · November 7, 2019, 2:15am

Ok, I will try this. Is there a query that you could provide for such creation? Like a pseudo-code with Cypher statements. Sorry, I am pretty new to Neo4j. Thanks!

mckenzma · November 29, 2019, 3:07pm

Hi,
Here is an example of what I have been using:

// "url" is a placeholder for the location of the CSV that lists the files
LOAD CSV WITH HEADERS FROM "url" AS row
WITH row, row.URL AS fileUrl
MERGE (file:File {url: fileUrl})
ON CREATE SET file.folder = row.Folder,
              file.name = row.File,
              file.createdOn = timestamp()

Note that I create File nodes in my graph to store the URLs, and then iterate over the File nodes to import and connect the data.

MATCH (file:File)
WHERE NOT (file)-[:CONTAINS]->(:Row)
WITH collect(file.url) AS fileURLs
UNWIND fileURLs AS fileURL
CALL apoc.periodic.iterate(
'
CALL apoc.load.csv($url,{header:true,quoteChar:"\u0000"}) YIELD map AS row
RETURN row
','
CREATE (fileRow:Row)
SET fileRow += row,
    fileRow.url = $url,
    fileRow.createdOn = date()
',
{batchSize:10000,parallel:false,params:{url:fileURL}}) YIELD batches, total
RETURN batches, total

After this, iterate through the File nodes and connect them to their Row nodes. Once all of that is in the graph, I just start working with the raw data within the graph.
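The connecting step is not shown in the post. One possible sketch, assuming each Row can be matched to its File through the shared url property set during the import, and assuming the official neo4j Python driver is used to run it (the function and constant names here are made up for illustration):

```python
# Files that have no CONTAINS relationship yet (same filter as the import query).
PENDING_FILES = """
MATCH (file:File)
WHERE NOT (file)-[:CONTAINS]->(:Row)
RETURN file.url AS url
"""

# Link one File to every Row that carries the same url (assumed join key).
LINK_ONE_FILE = """
MATCH (file:File {url: $url})
MATCH (fileRow:Row {url: $url})
MERGE (file)-[:CONTAINS]->(fileRow)
"""


def link_rows_to_files(session):
    """Connect each unlinked File to its Rows; returns the number of files processed."""
    urls = [record["url"] for record in session.run(PENDING_FILES)]
    for url in urls:
        # One auto-commit transaction per file keeps each transaction small.
        session.run(LINK_ONE_FILE, url=url).consume()
    return len(urls)
```

Here session is expected to be an open driver session. An index on the url property of Row nodes would likely help on larger imports; the exact index syntax depends on the Neo4j version.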

I hope this helps.

tangxingbang · February 24, 2022, 4:45am

I use Python with the Neo4j Python driver and the os and csv modules, and let Python upload the files to the database one by one.
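A minimal sketch of that approach, assuming every CSV in the folder has a header row and that a session from the official neo4j Python driver is passed in. The Row label, the source property, and the function names are placeholders chosen for illustration, not something from the post. Rows are sent in batches through UNWIND, so each CSV row still becomes its own node:

```python
import csv
import os

# Creates one node per row; $rows is a list of {source, props} maps.
CREATE_ROWS = """
UNWIND $rows AS row
CREATE (n:Row)
SET n += row.props, n.source = row.source
"""


def iter_batches(folder, batch_size=1000):
    """Yield lists of {'source', 'props'} dicts, one dict per CSV row."""
    batch = []
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(".csv"):
            continue  # skip anything that is not a CSV file
        path = os.path.join(folder, name)
        with open(path, newline="", encoding="utf-8") as fh:
            for props in csv.DictReader(fh):
                batch.append({"source": name, "props": props})
                if len(batch) >= batch_size:
                    yield batch
                    batch = []
    if batch:
        yield batch  # remaining rows that did not fill a whole batch


def load_folder(session, folder, batch_size=1000):
    """Send every batch to Neo4j; returns the number of rows sent."""
    total = 0
    for batch in iter_batches(folder, batch_size):
        session.run(CREATE_ROWS, rows=batch).consume()
        total += len(batch)
    return total
```

With a driver created through GraphDatabase.driver(...) and a session opened from it, this would be called as load_folder(session, "path/to/csv_folder"). All values arrive as strings, as with LOAD CSV, so any type conversion still has to happen in Cypher or in Python before sending.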

sameer.gijare14 · February 24, 2022, 7:07am

Maybe you can write a custom procedure that takes as input the folder where you keep all the CSV files. You can loop over all the file handles, transform the data to fit your target data model, and finally load it into your graph DBMS.
Many thanks
Mr Sameer S Gijare
