I have a Neo4j server (via Docker) with around 500 databases. I wanted to upgrade it to a newer version, so I stopped the DBMS, then removed and re-created the container. Now when I restart Neo4j, it tries to start all databases at once and seems to run out of memory during startup:
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 2097152 bytes for AllocateHeap
The server has 192GB of memory and I set the recommended values:
I also tried lower values for the pagecache, like 90GB and 60GB.
My workaround to get the DBMS started was to rename all database folders in /var/lib/neo4j/data/databases/* so Neo4j doesn't start them. Then I started Neo4j again and renamed the database folders back to their original names.
Is there an option for a "slow start" without all databases?
I wanted to upgrade it to a newer version and stopped the DBMS
Which version were you running previously, and which version are you upgrading to?
Under the prior version, what were the values for heap and pagecache?
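The reason the heap and pagecache values matter can be sketched as a rough budget: whatever the host has left after heap and pagecache is all that remains for native allocations made while the databases start. The snippet below is only an illustration of that arithmetic. The 31 GiB heap, the 2 GiB OS reserve and the function names are assumptions, not values from this thread; the 192GB total, the 90GB/60GB pagecache sizes and the roughly 500 databases are from the original post.

```python
# Rough memory budget for a dedicated Neo4j host. All sizes are in GiB.
# The heap size and OS reserve used below are illustrative assumptions.

def native_headroom_gib(total_gib, heap_gib, pagecache_gib, os_reserve_gib=2.0):
    """Memory left over for everything outside the heap and the page cache."""
    return total_gib - heap_gib - pagecache_gib - os_reserve_gib


def headroom_per_database_mib(headroom_gib, database_count):
    """Average native headroom per database, in MiB."""
    return headroom_gib * 1024 / database_count


if __name__ == "__main__":
    total = 192.0   # from the original post
    heap = 31.0     # hypothetical value, the thread does not state it
    databases = 500  # "around 500" in the original post
    for pagecache in (90.0, 60.0):
        headroom = native_headroom_gib(total, heap, pagecache)
        per_db = headroom_per_database_mib(headroom, databases)
        print(f"pagecache={pagecache:.0f}G headroom={headroom:.0f}G "
              f"per_db={per_db:.0f}MiB")
```

If the per-database figure comes out small, lowering the pagecache alone may not be enough, which would match what was reported above.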
My workaround to get the DBMS started was to rename all database folders in /var/lib/neo4j/data/databases/*
That may have worked, but it is certainly very kludgy. A database is not just what is at /var/lib/neo4j/data/databases/* but also what is at /var/lib/neo4j/data/transactions/*, and the system database also keeps a record of which databases exist. It may have worked, but it is not a supported approach.
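One alternative that stays within the documented admin commands is to stop most databases with STOP DATABASE before shutting the DBMS down, and then bring them back in small batches with START DATABASE afterwards. The sketch below only generates the Cypher statements; the helper names and the batch size are hypothetical. It also assumes that a stopped database stays stopped across a restart, which should be verified on a test instance first. The statements would be run against the system database, for example through cypher-shell, with the names taken from SHOW DATABASES.

```python
# Generate START/STOP DATABASE statements in batches.
# Helper names and batch size are illustrative, not part of any Neo4j API.

def quote_name(name):
    """Backtick-quote a database name, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def batched_statements(verb, names, batch_size):
    """Yield lists of '<verb> DATABASE `name` WAIT' statements."""
    if verb not in ("START", "STOP"):
        raise ValueError("verb must be START or STOP")
    # The system database cannot be stopped, so it is skipped here.
    user_dbs = [n for n in names if n != "system"]
    for i in range(0, len(user_dbs), batch_size):
        chunk = user_dbs[i:i + batch_size]
        yield [f"{verb} DATABASE {quote_name(n)} WAIT" for n in chunk]


if __name__ == "__main__":
    example_names = ["system", "neo4j", "customer-001", "customer-002"]
    for batch in batched_statements("START", example_names, batch_size=2):
        print(";\n".join(batch) + ";")
        print("// check memory before running the next batch")
```

Running one batch at a time leaves room to watch memory between batches, which is the closest thing to a "slow start" I can think of with the existing commands.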
I upgraded from 5.17 to 5.28, but even a normal restart without an upgrade caused an out-of-memory error during startup.
A database is not just what is at /var/lib/neo4j/data/databases/* but also what is at /var/lib/neo4j/data/transactions/*, and the system database also keeps a record of which databases exist. It may have worked, but it is not a supported approach.
Yes, I noticed Neo4j puts the missing databases into a quarantine state.
But what is the right way? When I look at the server memory I can see that it exhausts free memory until the last GB, even though it has available memory in buffer/cache.
Yes, the Neo4j server was running with all the databases. The databases were created over time, not all at once. The issue only appears after it has been restarted.
I cannot tell you the exact memory values before the Neo4j start, since I invested a lot of time in getting it back running again and would like to avoid restarting it.
Since the server is dedicated to running Neo4j, almost all of its memory (192GB) was free. On Neo4j startup, buff/cache increased to roughly 100-110GB and free memory decreased over time with each new database being started. As soon as free memory went below 1GB, I got the JVM error mentioned above. I think the error also mentioned something about overcommitting memory, so maybe that's also relevant to this topic?
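To capture the numbers at the next restart, something like the following could be run on the Linux host. It is a minimal sketch that reads /proc/meminfo and /proc/sys/vm/overcommit_memory and prints free versus available memory next to the kernel's commit accounting. The function names are made up for this example; the field names (MemFree, MemAvailable, CommitLimit, Committed_AS) are the standard ones from /proc/meminfo.

```python
# Summarize free vs. available memory and overcommit settings on Linux.
import os


def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def summarize(values):
    """Convert the fields of interest from kB to GiB (missing fields become 0)."""
    def gib(kb):
        return kb / (1024 * 1024)

    return {
        "free_gib": gib(values.get("MemFree", 0)),
        "available_gib": gib(values.get("MemAvailable", 0)),
        "commit_limit_gib": gib(values.get("CommitLimit", 0)),
        "committed_gib": gib(values.get("Committed_AS", 0)),
    }


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            for name, value in summarize(parse_meminfo(f.read())).items():
                print(f"{name}: {value:.1f}")
    if os.path.exists("/proc/sys/vm/overcommit_memory"):
        with open("/proc/sys/vm/overcommit_memory") as f:
            # 0 = heuristic, 1 = always overcommit, 2 = strict accounting
            print("vm.overcommit_memory:", f.read().strip())
```

If available memory is still large when the allocation fails, the overcommit mode and the commit figures are worth posting alongside the JVM error.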