At that point I tried to execute some queries in the Neo4j browser, to understand what was going on, but the database wasn't responding to a simple SHOW INDEXES.
Note: the logs didn't indicate any OOM error at this stage.
Furthermore, my script executes everything inside a single transaction, and as far as I understand, a transaction still runs on a single core, so I don't get how it ends up using every possible core at 100%.
api-1 | 🚀 Server ready at http://localhost:4000/
neo4j-1 | Exception in thread "neo4j.Scheduler-1" java.lang.OutOfMemoryError: Java heap space
neo4j-1 |
neo4j-1 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "qtp168703427-289"
neo4j-1 |
neo4j-1 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "neo4j.Scheduler-1"
neo4j-1 |
neo4j-1 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "qtp168703427-287"
neo4j-1 |
neo4j-1 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-33-thread-1"
neo4j-1 |
neo4j-1 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "JNA Cleaner"
neo4j-1 | Exception in thread "pool-35-thread-1" java.lang.OutOfMemoryError: Java heap space
neo4j-1 | Exception in thread "qtp168703427-292" java.lang.OutOfMemoryError: Java heap space
neo4j-1 | Exception in thread "Log4j2-TF-3-Scheduled-1" java.lang.OutOfMemoryError: Java heap space
neo4j-1 |
neo4j-1 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "neo4j.IndexSampling-9"
neo4j-1 | 2024-04-14 18:36:01.113+0000 ERROR Client triggered an unexpected error [Neo.DatabaseError.Statement.ExecutionFailed]: Java heap space, reference 6251f318-7da7-43e1-b86a-2c553a445892.
Note: my script worked very well with Neo4j 4.4.32, so I don't understand what's happening here.
Note2: I double checked, and I have my indexes and unique constraints applied this time ;)
Neo4j 5.18.1 (Docker)
Python stack with neomodel and direct Cypher queries
Is the total RAM the same for the 4.4.32 and 5.18.1 setups?
Is the memory assigned to min/max heap and pagecache the same between 4.4.32 and 5.18.1?
Is your Python script creating a single transaction that then attempts to commit, for example, 10 million changes at once?
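If it does turn out to be one huge transaction, a minimal sketch of committing in smaller batches could look like the following. It assumes the official neo4j Python driver 5.x (`session.execute_write`, `tx.run`); the `Item` label, the `id`/`props` row layout, and the batch size are hypothetical placeholders, not taken from your script.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List

# Hypothetical ingest query for illustration; replace with the real one.
MERGE_QUERY = """
UNWIND $rows AS row
MERGE (n:Item {id: row.id})
SET n += row.props
"""


def chunked(rows: Iterable[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield successive lists of at most `size` rows."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def _write_batch(tx, rows: List[Dict[str, Any]]) -> None:
    # One UNWIND per batch keeps the per-transaction state bounded.
    tx.run(MERGE_QUERY, rows=rows).consume()


def load_in_batches(session, rows: Iterable[Dict[str, Any]], batch_size: int = 10_000) -> int:
    """Commit one transaction per batch and return the number of transactions."""
    committed = 0
    for batch in chunked(rows, batch_size):
        session.execute_write(_write_batch, batch)
        committed += 1
    return committed
```

With a real driver this would be called as `load_in_batches(driver.session(), rows)`. The trade-off is that the load is no longer atomic, so a failure partway through leaves earlier batches committed.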
Admittedly, it's apples to apples with respect to the lack of memory configuration, per my update in "Neo4j 5.18: 100% on all CPUs followed by OOM" (#4 by dana_canzano), but this is not typical. Defaults are but defaults and might not always be optimized.
Also, why 5.18.1? Any reason not to use 5.19? Its release notes include:
Scale and Availability
* A new and improved eagerness analysis algorithm reduces the number of eager operators and improves explainability and performance, and reduces memory utilization.
First, you'll want some guardrails to prevent the system from going out of heap memory. We have documentation on that here:
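As a rough sketch of what such guardrails might look like for a Docker deployment, the snippet below translates Neo4j 5 setting names into the environment variable names the official image expects (prefix `NEO4J_`, underscores doubled, dots turned into underscores). The sizes are placeholder assumptions, not recommendations; pick values that fit the RAM actually available to the container.

```python
from typing import Dict


def docker_env_name(setting: str) -> str:
    """Translate a neo4j.conf key into the official Docker image's env var name."""
    # Underscores are doubled first, then dots become single underscores.
    return "NEO4J_" + setting.replace("_", "__").replace(".", "_")


# Placeholder sizes (assumptions); tune them to the host.
GUARDRAILS: Dict[str, str] = {
    "server.memory.heap.initial_size": "4g",
    "server.memory.heap.max_size": "4g",
    "server.memory.pagecache.size": "2g",
    # Caps the memory all transactions in one database may use, so a runaway
    # query fails with a clear error instead of exhausting the heap.
    "db.memory.transaction.total.max": "2g",
}


def docker_env(settings: Dict[str, str]) -> Dict[str, str]:
    """Return the settings keyed by their Docker environment variable names."""
    return {docker_env_name(key): value for key, value in settings.items()}


if __name__ == "__main__":
    for name, value in docker_env(GUARDRAILS).items():
        print(f"{name}={value}")
```

The printed lines can be pasted into the `environment:` section of the compose file. Note that 4.4 used `dbms.memory.*` names for the heap and pagecache settings, so a config carried over from 4.4 needs renaming.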
Next, you'll want to understand why your query or queries are executing differently. You'll want to get some EXPLAIN plans of the query, both from the old system and the new.
Things you are looking for:
NodeByLabelScan or AllNodesScan operators, as these are highly expensive when you are ingesting data
Eager operators along the leftmost line of the plan.
There may be deeper tuning to do, but those are the two big things to watch for.
If you find that there are NodeByLabelScan or AllNodesScan operators in the plan on 5.18 but not on your other system, then you are lacking critical indexes that are probably contributing to the issue.
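To compare plans across versions without eyeballing them, something like the sketch below can list the suspect operators. It assumes the neo4j Python driver 5.x, where, as far as I know, `ResultSummary.plan` is a nested dict with `operatorType` and `children` keys; the helper names are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Operators worth flagging in an ingest query plan.
SUSPECT_OPERATORS = ("AllNodesScan", "NodeByLabelScan", "Eager")


def find_suspect_operators(
    plan: Dict[str, Any],
    suspects: Tuple[str, ...] = SUSPECT_OPERATORS,
    depth: int = 0,
) -> List[Tuple[int, str]]:
    """Walk the plan tree and return (depth, operator) for each suspect found."""
    hits: List[Tuple[int, str]] = []
    # Operator names can carry a suffix such as "@neo4j"; compare the base name.
    name = plan.get("operatorType", "").split("@")[0]
    if name in suspects:
        hits.append((depth, name))
    for child in plan.get("children", []):
        hits.extend(find_suspect_operators(child, suspects, depth + 1))
    return hits


def explain(session, query: str, **params: Any) -> Dict[str, Any]:
    """Hypothetical helper: fetch the plan for `query` without executing it."""
    summary = session.run("EXPLAIN " + query, **params).consume()
    return summary.plan
```

Running `find_suspect_operators(explain(session, query))` against both the working version and 5.18 and diffing the two lists should show whether a scan or an Eager appeared only on the newer version.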
I just pulled and launched my insertion scripts again, but it ends up the same way:
neo4j-1 | Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "neo4j.Scheduler-1"
neo4j-1 | Exception in thread "neo4j.StorageMaintenance-3" java.lang.OutOfMemoryError: Java heap space
neo4j-1 | Exception in thread "qtp1975880178-60" java.lang.OutOfMemoryError: Java heap space
neo4j-1 | Exception in thread "neo4j.CheckPoint-2" java.lang.OutOfMemoryError: Java heap space
neo4j-1 | Exception in thread "Log4j2-TF-9-Scheduled-2" java.lang.OutOfMemoryError: Java heap space
neo4j-1 | Exception in thread "qtp1975880178-63" java.lang.OutOfMemoryError: Java heap space
neo4j-1 | Exception in thread "neo4j.StorageMaintenance-5" java.lang.OutOfMemoryError: Java heap space
neo4j-1 | Exception in thread "qtp1975880178-67" java.lang.OutOfMemoryError: Java heap space
neo4j-1 | 2024-04-15 23:31:06.097+0000 ERROR [bolt-18] Terminating connection due to unexpected error
neo4j-1 | java.lang.OutOfMemoryError: Java heap space
neo4j-1 | 2024-04-15 23:31:09.568+0000 ERROR Client triggered an unexpected error [Neo.DatabaseError.General.UnknownError]: Could not initialize class org.neo4j.cypher.internal.CypherCurrentCompiler$, reference 15bbbfe0-512c-42aa-9a6a-2290258f0c9a.
Same heap space issue.
Also, so far I have tested up to Neo4j 5.13, and that release works.
The issue must have been introduced somewhere between 5.14 and 5.18.
Defaults are but defaults and might not always be optimized
Given the behavior of Neo4j, I don't think this is related to a simple memory threshold being crossed.
The other versions don't consume nearly as much memory, and the transaction is committed in a matter of seconds.
With Neo4j 5.18.1, it hangs for minutes while spinning up my CPU for no apparent reason. There is definitely something weird here.