I filled a database with nodes (CREATE) through a Python script using neo4j.GraphDatabase.
Every 1,000 entries I closed the session and transaction and began new ones.
This worked fine for a continuous set of ~30,000 nodes.
Then I tried it on a much larger dataset.
After ~1.5 million nodes the Python script stopped making progress. Checking the task manager showed Java running at 100% CPU.
To my understanding, this should not happen(?)
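For reference, here is a minimal sketch of the loading pattern described above: commit every 1,000 CREATEs, then start a fresh session and transaction. The node label, property name, and record shape are placeholder assumptions, not from the original script; `driver` would come from `neo4j.GraphDatabase.driver(uri, auth=...)`.

```python
BATCH_SIZE = 1000  # assumed batch size from the post

def load_one_by_one(driver, records):
    """Create one node per record, committing every BATCH_SIZE rows."""
    for start in range(0, len(records), BATCH_SIZE):
        batch = records[start:start + BATCH_SIZE]
        with driver.session() as session:            # new session per batch
            with session.begin_transaction() as tx:  # new transaction per batch
                for rec in batch:
                    # :Node and `name` are placeholder label/property names
                    tx.run("CREATE (:Node {name: $name})", name=rec["name"])
                tx.commit()
```

Note this runs one Cypher statement per node, so 1.5 million nodes means 1.5 million round trips, which is exactly where the batching advice below the question comes in.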
Batching also helps, if you are able. Doing an UNWIND over a batch of data (10k rows or so at a time) and processing the entire batch per transaction, rather than a single CREATE per transaction, will be more efficient.
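A hedged sketch of that suggestion: one parameterized UNWIND statement per ~10k rows instead of one CREATE per row. The label (`:Node`) and the `SET n = row` property mapping are placeholder assumptions.

```python
# One Cypher statement creates the whole batch server-side.
UNWIND_CREATE = """
UNWIND $rows AS row
CREATE (n:Node)
SET n = row
"""

def load_batched(session, rows, batch_size=10_000):
    """Send rows in chunks; each session.run call handles one chunk."""
    for start in range(0, len(rows), batch_size):
        session.run(UNWIND_CREATE, rows=rows[start:start + batch_size])
```

Each `rows` element must be a dict of primitive properties for `SET n = row` to work.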
(1) Creating only one session with several transactions let me insert all the data into the database.
(2) UNWIND: This sped up the insertion process by a factor of 5. Very nice hint.
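Both follow-up points can be sketched together: a single long-lived session, with one managed transaction committing one UNWIND batch at a time. Label, property mapping, and batch size are placeholders; `execute_write` is the managed-transaction API of the neo4j 5.x Python driver.

```python
CREATE_BATCH = "UNWIND $rows AS row CREATE (n:Node) SET n = row"

def insert_all(driver, rows, batch_size=10_000):
    with driver.session() as session:  # (1) one session for the whole load
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            # (2) one transaction + one UNWIND statement per batch
            session.execute_write(
                lambda tx, b: tx.run(CREATE_BATCH, rows=b).consume(), batch
            )
```

`execute_write` also retries the batch on transient errors, which plain `session.run` does not.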