Neo4j Import tools slow ingestion


I am trying to ingest a large dataset (around a billion nodes and relationships) using the Neo4j import tool, but ingestion gradually slows down over time. The first 100 million nodes are ingested quickly, after which it becomes increasingly slow. I've seen people mention ingesting 2-3 billion nodes in around 2 hours, so can someone point out what I am doing wrong? I have already taken care of deduplication. The import setup is as follows:

Command to ingest data (25 node files and 26 relationship files):
neo4j-admin import --database=neo4j --nodes=... --relationships=... --force=true --delimiter="\t" --skip-duplicate-nodes=true --skip-bad-relationships=true --high-io=true --cache-on-heap=true



Problem: I was running the Neo4j import tool on an NFS file system, which was slowing down the data ingestion. As per the Neo4j documentation (Linux file system tuning - Operations Manual), the EXT4 and XFS file systems are recommended. Once I switched to an XFS file system, I was able to ingest the data in 1 hour.
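
For anyone hitting the same slowdown, here is a rough way to check which file system a directory lives on before starting an import. This is only a sketch, assuming a Linux host with GNU coreutils (it relies on `df --output=fstype`). The helper names (`is_network_fs`, `fs_type_of`, `check_import_dir`) and the list of network file system types are my own illustrative choices, not anything provided by Neo4j, and the list is not exhaustive.

```shell
#!/bin/sh
# Sketch: warn when a directory is on a network file system before a bulk import.
# All function names here are hypothetical helpers, not part of Neo4j.

# Return 0 if the given file system type name looks like a network file system.
# The list below is illustrative and not exhaustive.
is_network_fs() {
    case "$1" in
        nfs|nfs4|cifs|smb3|smbfs|fuse.sshfs|glusterfs|ceph) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the file system type backing a path (assumes GNU df with --output).
fs_type_of() {
    df --output=fstype "$1" 2>/dev/null | tail -n 1
}

# Report whether a directory looks suitable as an import target.
# Returns 0 for a local file system, 1 for a network one, 2 if the type is unknown.
check_import_dir() {
    fstype=$(fs_type_of "$1")
    if [ -z "$fstype" ]; then
        echo "Could not determine the file system type of $1"
        return 2
    fi
    if is_network_fs "$fstype"; then
        echo "WARNING: $1 is on $fstype (network file system); the import may be slow"
        return 1
    fi
    echo "OK: $1 is on $fstype"
    return 0
}
```

Usage would be something like `check_import_dir /var/lib/neo4j/data`, where the path is just an example; point it at whatever directory your database files are actually written to. If it reports `nfs` or similar, moving the store to a local EXT4 or XFS volume is the change that fixed it in my case.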