I'm trying to copy a database from version 3.5 to 4.4 using the neo4j-admin copy command. The database is huge (about 3 TB). Nodes and relationships were copied successfully during stages 1-2 in approximately 20 hours. However, the 3rd stage (Relationship linking) has been running for more than 2 days, and the progress is stuck at 70%. Judging by the high CPU and memory usage, I assume the process is still active.
There are two files changing from time to time:
neo4j-admin reported that it would need about 90 GB of RAM, but there is only 64 GB (with 100 GB of swap) on my machine. Could that be a potential problem? For now the process uses almost all of the RAM and half of the swap.
The database was created in 3.5.8.
Current version is 4.4.7
I had something similar recently with a bulk import, and it turned out that the memory limit was what made linking slow.
So if you can limit the heap memory to 4G or so, and leave more off-heap memory available for the process, that will speed it up.
Don't use swap (disable it if possible).
If you look at iotop or a similar tool, it's probably busy with IO. I guess/hope that you have an SSD to run this on.
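To check whether the process is actually hitting swap rather than just doing regular disk IO, something like the following might help. It is only a minimal sketch, assuming a Linux machine where /proc/vmstat exposes the pswpin and pswpout counters (pages swapped in and out since boot). The function names are made up for illustration.

```python
import time


def read_swap_counters(vmstat_text):
    """Return (pages swapped in, pages swapped out) parsed from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters.get("pswpin", 0), counters.get("pswpout", 0)


def swap_rate(before, after, seconds):
    """Return (pages in per second, pages out per second) between two samples."""
    return (
        (after[0] - before[0]) / seconds,
        (after[1] - before[1]) / seconds,
    )


def sample_swap_rate(interval=5, path="/proc/vmstat"):
    """Sample the counters twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = read_swap_counters(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = read_swap_counters(f.read())
    return swap_rate(before, after, interval)
```

Calling sample_swap_rate() a few times while the linking stage runs should show whether pages are constantly moving in and out of swap. Rates that stay well above zero would fit the theory that swap access is what slows the linking down.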
My colleague said:
He can also limit the amount of memory that the import process will take using --memory=40G or something. This will avoid the swap and instead do multiple rounds of linking, which I think will be much faster, since it's a couple of sequential passes instead of random swap access for the off-heap caching.
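As a rough back-of-the-envelope illustration of that trade-off (not how the tool actually plans its rounds, which I don't know), you could assume the number of linking rounds is roughly the required memory divided by the memory budget, rounded up. The function name and the model itself are assumptions.

```python
import math


def estimate_linking_rounds(required_gb, budget_gb):
    """Rough guess: rounds needed if each round can cache at most budget_gb.

    This is an assumed simplification, not the importer's real logic.
    """
    if required_gb <= 0 or budget_gb <= 0:
        raise ValueError("both sizes must be positive")
    return math.ceil(required_gb / budget_gb)


# Numbers from this thread: about 90 GB wanted, 40 GB budget.
rounds = estimate_linking_rounds(90, 40)
print(rounds)
```

Under that assumption, a 40 GB budget would mean about three sequential passes over the relationship store instead of one pass that spills into swap.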