Tip: Avoiding Slow & Messy Conditionals (or: Splitting Input) in Cypher for Bulk Import with LOAD CSV?

Thanks a lot for your feedback; you're totally right about denormalized import files.

I would even go one step further and split node creation from relationship creation.
Node creation, at least, can then also be parallelized.
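As a minimal sketch of that two-pass approach (the file path, column names, labels, and relationship type here are hypothetical placeholders; adapt them to your own CSV):

```cypher
// Pass 1: create nodes only (this pass can be batched / parallelized)
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
MERGE (p:Person {id: row.person_id})
SET p.name = row.name;

// Pass 2: create relationships, matching the nodes created in pass 1
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
MATCH (p:Person {id: row.person_id})
MATCH (c:Company {id: row.company_id})
MERGE (p)-[:WORKS_AT]->(c);
```

Because the second pass only MATCHes existing nodes, a failed or re-run relationship pass never creates duplicate nodes.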

Did you create a generalized script that basically uses a CSV-to-graph mapping (similar to the import tool)?
Because you'd also want to set some columns as properties on the nodes.

Some time ago, my colleague @lyonwj built a tool that does this online and generates the appropriate Cypher. You could even feed it a sample of your file (`head -10 file.csv`), grab the generated Cypher scripts, and run them against the full file.

https://neo4j-csv-import.herokuapp.com/

We also have a procedure in APOC that does this for you: apoc.import.csv

https://neo4j-contrib.github.io/neo4j-apoc-procedures/#_import_csv
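A rough example of calling it (the file names, labels, and relationship type are assumptions for illustration; the CSV files need to follow the header conventions described in the APOC docs):

```cypher
CALL apoc.import.csv(
  [{fileName: 'file:///persons.csv', labels: ['Person']}],
  [{fileName: 'file:///knows.csv', type: 'KNOWS'}],
  {delimiter: ',', stringIds: false}
)
```

The first list describes node files, the second relationship files, and the config map controls parsing details like delimiters and ID handling.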

Eventually, I'd love to have a proper graph model come out of a modeling tool, to which you'd then map your inputs (e.g. CSV, JSON, RDBMS), perhaps visually.
