The problem is very similar to what is described here:
I have done a lot of research, and everyone mentions deduplication, but I am fairly sure the ids are unique: they were generated from a dataset with unique ID values per type, and each :ID column in the header is marked with the node type it belongs to:
Sample Node Header:
Sample Relationship Header:
The dataset has around 900M nodes and 5B relationships.
Note that the ids are strings.
We can generate globally unique numeric ids for the nodes, and regenerate the node and relationship files to use them, if that would help.
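To illustrate what regenerating with globally unique numeric ids could look like, here is a minimal Python sketch. It is not our actual pipeline: the function names and the `Person`/`Movie`/`ACTED_IN` values are hypothetical, and an in-memory dict like this is unlikely to be practical for ~900M ids, so it only shows the idea of mapping each (node type, string id) pair to one integer and rewriting relationships accordingly.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# (node type, original string id); both values are illustrative assumptions.
NodeKey = Tuple[str, str]


def assign_numeric_ids(nodes: Iterable[NodeKey]) -> Dict[NodeKey, int]:
    """Give every distinct (node type, string id) pair its own integer, starting at 1."""
    mapping: Dict[NodeKey, int] = {}
    for key in nodes:
        if key not in mapping:
            # The next free integer is unique across all node types.
            mapping[key] = len(mapping) + 1
    return mapping


def remap_relationships(
    rels: Iterable[Tuple[NodeKey, NodeKey, str]],
    mapping: Dict[NodeKey, int],
) -> Iterator[List[object]]:
    """Rewrite (start, end, type) rows so both endpoints use the numeric ids."""
    for start, end, rel_type in rels:
        yield [mapping[start], mapping[end], rel_type]
```

With this scheme the same string id under two different node types ends up as two different integers, so the regenerated files would no longer depend on the per-type namespace to keep ids apart.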
Can someone please clearly specify what is meant by deduplication (the id/namespace combinations are already unique), and how exactly node ids need to be unique?
How can I make neo4j-admin import perform reasonably well?
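To make the uniqueness question concrete, this is a minimal Python sketch of what "unique per id/namespace combination" means here. It is illustrative only: `find_duplicate_ids` is a hypothetical helper, not part of neo4j-admin, and counting in memory would not scale to the full dataset.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def find_duplicate_ids(nodes: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return every (node type, id) pair that occurs more than once, sorted."""
    counts = Counter(nodes)
    return sorted(key for key, n in counts.items() if n > 1)
```

Under this definition the same id string appearing under two different node types is not a duplicate; only a repeat within one node type is.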
Also, I have noticed a huge slowdown from 3.3.5 to later versions, so I cannot even try the import on Neo4j 4+. I am probably doing something wrong there too, but I cannot find out what, as the commands have exactly the same options and run on the same machine with the same memory and disk.
I am currently using neo4j-3.3.5 community for the import, but would like to try 4.1.3 if it is possible to make the import speed comparable to 3.3.5.
An EC2 instance with 256GB RAM and an EBS storage volume is used for testing.