Override the temporary path configuration used by neo4j admin

I am using the neo4j admin tool to import a very large number of nodes and edges into an empty graph db (all running on version 5.26.1).

It all runs smoothly until the following step:

Switching to bigger page cache for relationship import 22.15GiB
  using configuration:Configuration[numberOfWorkers=24, temporaryPath=D:\...\neo4j\temp, numberOfTrackedDense=10000, applyBatchSize=64]
Importing relationships

At this step, it becomes very slow: after 24 hours, it has not reached 20% of this step. The input data and the neo4j directory are both stored on an HDD. To speed things up, I want to use my 4TB NVMe. With Neo4j running entirely on that drive, the import ran more smoothly, but it ran out of storage on the NVMe.

So, I am trying to have neo4j run on HDD, and use the NVMe for anything that needs temporaryPath (since this is the slowest step). But I could not find any configuration that lets me change this one only.

I tried symlinking that folder (I created a symlinked temp directory where Neo4j expects it and pointed it at the NVMe). But Neo4j ended up deleting my symlink and recreating the temp directory at \neo4j\temp.

Any thoughts on how I can override the value of that setting, or create a symlink that neo4j will actually use? (Note that I do not want to have the entire neo4j database on the NVMe [since I will run out of storage], only the temporaryPath path).

I have not verified or tested this, but I think you can add something like "server.jvm.additional=-Djava.io.tmpdir=/desired/temp/dir" to neo4j-admin.conf (my guess is based on: System requirements - Operations Manual).

I set that configuration. It ended up creating a temp folder at the given location with the following files in it.

jna10286307437053657375.dll
jna10286307437053657375.dll.x

These were the only files in that temp directory; nothing else was added. And it still continued creating the large files in the temp directory under \neo4j\relate-data\dbmss\dbms-[uuid]\data\databases\neo4j\temp
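
For anyone who wants to confirm which directory is actually receiving the large files while the import runs, here is a minimal sketch in Python. The function names are hypothetical helpers of mine, not part of Neo4j, and the paths you pass in are whatever your own java.io.tmpdir folder and import temp folder happen to be.

```python
import os


def dir_size_bytes(root):
    """Return the total size in bytes of all files under root (0 if root is missing)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # Temp files may be deleted between the directory listing
                # and the size lookup, so skip anything that has vanished.
                pass
    return total


def report_sizes(paths):
    """Map each path to its current size in GiB."""
    return {p: dir_size_bytes(p) / 2**30 for p in paths}
```

Calling report_sizes with both folders every few minutes should show which one grows. In my case I would expect only the folder under the database directory to grow, which matches what I saw by hand.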

I suspected there was more than one temp path at play. Let me investigate and get back to you.

@hakan.lofqvist1 Thank you for looking into this! It is extremely helpful as we're currently blocked by this and may miss our deadlines. A solution to this will be very helpful for us!

I got a short comment: temporary files are split up and deleted throughout the import process. Unfortunately it looks like there is no config option to use a separate temp path during the import. Personally, I would use the fastest drive you have at your disposal. It needs to have sufficient space and I am not sure if you would benefit from multiple volumes (at least not for temp + target).

Is it an option for you to get a larger volume? If you are on a cloud vm, it usually pays off to get more muscle and be done with the import in a shorter period of time.
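
Since the temp files and the target store have to share one volume, it may help to check up front which of your drives has enough free space. The following is only a rough sketch in Python: volumes_with_room is a hypothetical helper, and required_bytes is your own estimate of store plus temp size (I do not have a formula for it). It checks free space only and says nothing about drive speed.

```python
import shutil


def volumes_with_room(candidates, required_bytes):
    """Return the candidate paths with at least required_bytes free, most free space first."""
    usable = []
    for path in candidates:
        try:
            free = shutil.disk_usage(path).free
        except OSError:
            # Path does not exist or the volume is not accessible; skip it.
            continue
        if free >= required_bytes:
            usable.append((free, path))
    usable.sort(reverse=True)
    return [path for _free, path in usable]
```

You would pass one directory per drive and then pick the fastest drive among those returned.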

Thanks for checking this.