I have some doubts about importing data into Neo4j.
I have a large volume of data: 100k JSON files, each containing 200k records.
What is the best way to import this data?
I am currently using PySpark together with neo4j-admin import. Is there an alternative method, or can I import this much data using PySpark alone?
Using Apache Spark alone will most likely result in deadlocks for large graphs, because concurrent transactions creating relationships lock both endpoint nodes. Generating CSV files and loading them with neo4j-admin import is currently the best option, I believe.
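A minimal sketch of that pipeline, assuming each JSON record has hypothetical fields `id`, `name`, `src`, and `dst` (substitute your own schema and paths): PySpark flattens the JSON into headerless CSV parts, and neo4j-admin import, which requires an empty offline database, loads them with the headers supplied as separate one-line files.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("neo4j-bulk-export").getOrCreate()

# Read all JSON files in parallel; adjust the path and schema to your data.
df = spark.read.json("/data/json/*.json")

# Nodes: rename columns to the header names neo4j-admin import expects.
# "id" and "name" are hypothetical field names.
nodes = (df
    .select(F.col("id").alias("id:ID"), "name")
    .withColumn(":LABEL", F.lit("Record"))
    .dropDuplicates(["id:ID"]))

# Relationships: "src"/"dst" are hypothetical reference fields.
rels = (df
    .select(F.col("src").alias(":START_ID"), F.col("dst").alias(":END_ID"))
    .withColumn(":TYPE", F.lit("RELATED_TO")))

# Write headerless CSV parts; the headers go in separate one-line files.
nodes.write.mode("overwrite").option("header", False).csv("/export/nodes")
rels.write.mode("overwrite").option("header", False).csv("/export/rels")

# Then, against an empty offline database (Neo4j 4.x syntax; Neo4j 5 uses
# `neo4j-admin database import full`):
#   echo 'id:ID,name,:LABEL' > /export/nodes-header.csv
#   echo ':START_ID,:END_ID,:TYPE' > /export/rels-header.csv
#   neo4j-admin import --database=neo4j \
#     --nodes="/export/nodes-header.csv,/export/nodes/part-.*" \
#     --relationships="/export/rels-header.csv,/export/rels/part-.*"
```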
One possibility is to run a clustering algorithm on your graph in Spark and load each cluster separately, which gets rid of the deadlocks because no two concurrent writers touch the same nodes. At the end you of course need to create the cross-cluster connections again, ideally in a single serial pass; see the sketch below.
This is in no way an easy solution though.
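A rough sketch of that idea, assuming the GraphFrames package is available and using its label propagation as the clustering step (all paths and the `RELATED_TO`-style field names are hypothetical):

```python
from pyspark.sql import SparkSession, functions as F
from graphframes import GraphFrame  # separate Spark package: graphframes

spark = SparkSession.builder.appName("cluster-split").getOrCreate()

# GraphFrames expects an `id` column on vertices and `src`/`dst` on edges.
vertices = spark.read.parquet("/export/vertices")  # hypothetical path
edges = spark.read.parquet("/export/edges")        # hypothetical path

g = GraphFrame(vertices, edges)

# Label propagation assigns each vertex a cluster `label`.
clusters = g.labelPropagation(maxIter=5)

# Tag every edge with the cluster of each endpoint.
tagged = (edges
    .join(clusters.selectExpr("id AS src", "label AS src_cluster"), "src")
    .join(clusters.selectExpr("id AS dst", "label AS dst_cluster"), "dst"))

# Intra-cluster edges: writers partitioned by cluster never lock the same
# nodes, so partitions can be loaded concurrently without deadlocking.
intra = tagged.where("src_cluster = dst_cluster").repartition("src_cluster")

# Cross-cluster edges: recreate these at the end with a single serial writer.
cross = tagged.where("src_cluster <> dst_cluster").coalesce(1)
```

Each `intra` partition could then be written through its own transaction (e.g. `foreachPartition` with the Neo4j Python driver), while `cross` is loaded last by one writer so it never contends with the parallel loaders.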