I have some doubts about importing data into Neo4j.
I have a large volume of data: 100k JSON files, each containing 200k records.
What is the best way to import this data?
I am currently using PySpark together with neo4j-admin import. Is there an alternative method, or can I import this much data using PySpark alone?
Using Apache Spark alone will most likely result in deadlocks for large graphs, because concurrent transactions creating relationships lock both endpoint nodes. Generating CSV files and loading them with neo4j-admin import is currently the best option, I believe.
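A minimal sketch of that pipeline, assuming each JSON record has hypothetical fields `id`, `name`, `src`, and `dst` (substitute your own schema and paths): PySpark flattens the JSON into headerless CSV parts, and neo4j-admin import, which requires an empty offline database, loads them with the headers supplied as separate one-line files.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("neo4j-bulk-export").getOrCreate()

# Read all JSON files in parallel; adjust the path and schema to your data.
df = spark.read.json("/data/json/*.json")

# Nodes: rename columns to the header names neo4j-admin import expects.
# "id" and "name" are hypothetical field names.
nodes = (df
    .select(F.col("id").alias("id:ID"), "name")
    .withColumn(":LABEL", F.lit("Record"))
    .dropDuplicates(["id:ID"]))

# Relationships: "src"/"dst" are hypothetical reference fields.
rels = (df
    .select(F.col("src").alias(":START_ID"), F.col("dst").alias(":END_ID"))
    .withColumn(":TYPE", F.lit("RELATED_TO")))

# Write headerless CSV parts; the headers go in separate one-line files.
nodes.write.mode("overwrite").option("header", False).csv("/export/nodes")
rels.write.mode("overwrite").option("header", False).csv("/export/rels")

# Then, against an empty offline database (Neo4j 4.x syntax; Neo4j 5 uses
# `neo4j-admin database import full`):
#   echo 'id:ID,name,:LABEL' > /export/nodes-header.csv
#   echo ':START_ID,:END_ID,:TYPE' > /export/rels-header.csv
#   neo4j-admin import --database=neo4j \
#     --nodes="/export/nodes-header.csv,/export/nodes/part-.*" \
#     --relationships="/export/rels-header.csv,/export/rels/part-.*"
```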
One possibility is to run a clustering algorithm on your graph in Spark and load each cluster separately, which gets rid of the deadlocks because no two concurrent writers touch the same nodes. At the end you of course need to create the cross-cluster connections again, ideally in a single serial pass; see the sketch below.
This is in no way an easy solution though.
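A rough sketch of that idea, assuming the GraphFrames package is available and using its label propagation as the clustering step (all paths and the `RELATED_TO`-style field names are hypothetical):

```python
from pyspark.sql import SparkSession, functions as F
from graphframes import GraphFrame  # separate Spark package: graphframes

spark = SparkSession.builder.appName("cluster-split").getOrCreate()

# GraphFrames expects an `id` column on vertices and `src`/`dst` on edges.
vertices = spark.read.parquet("/export/vertices")  # hypothetical path
edges = spark.read.parquet("/export/edges")        # hypothetical path

g = GraphFrame(vertices, edges)

# Label propagation assigns each vertex a cluster `label`.
clusters = g.labelPropagation(maxIter=5)

# Tag every edge with the cluster of each endpoint.
tagged = (edges
    .join(clusters.selectExpr("id AS src", "label AS src_cluster"), "src")
    .join(clusters.selectExpr("id AS dst", "label AS dst_cluster"), "dst"))

# Intra-cluster edges: writers partitioned by cluster never lock the same
# nodes, so partitions can be loaded concurrently without deadlocking.
intra = tagged.where("src_cluster = dst_cluster").repartition("src_cluster")

# Cross-cluster edges: recreate these at the end with a single serial writer.
cross = tagged.where("src_cluster <> dst_cluster").coalesce(1)
```

Each `intra` partition could then be written through its own transaction (e.g. `foreachPartition` with the Neo4j Python driver), while `cross` is loaded last by one writer so it never contends with the parallel loaders.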