I have been in big trouble for the last couple of weeks.
Here is the setup:
- neo4j 4.4.6 community on a dedicated machine
- database is small ~2-3GB
- dotnet client 4.1.26
- every query is parameterized
- I have the APOC plugins installed, but they are not used by these queries or by any other system touching the database
- this is the startup log of neo4j
I am processing ~500 messages per second which are coming from an external system. Every message generates exactly two read-only queries:
High CPU - despite having indices for everything I am querying for, the CPU is at 60-70% and slowly going up to 100%. I am fine with this because neo4j is working at this point.
After processing a couple of million messages, I see the error in the logs described here:
Detected VM stop-the-world pause:
Temporary fix (works for a couple of hours):
The only way to resurrect neo4j is to destroy the container and create a new one. After I do this everything is back to normal.
What I have tried:
- every possible combination of heap and page cache sizes. As I write this, the configuration is:
--env NEO4J_dbms_memory_heap_initial__size=10G --env NEO4J_dbms_memory_heap_max__size=10G --env NEO4J_dbms_memory_pagecache_size=17G
- machine with 8, 16, 32 GB RAM
- machine with 2, 4, 8 CPU
- any possible combination of the four above
- the less memory I give neo4j, the sooner it hits the GC pause
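To make the memory numbers concrete, here is a minimal sketch of the budget arithmetic for the configuration quoted above, checked against the machine sizes I tried. It assumes the heap and the page cache are the two main consumers and ignores JVM native memory and the OS itself; the `headroom` helper is just an illustrative name.

```python
# Memory budget check for the settings quoted above (all values in GB).
HEAP_GB = 10        # NEO4J_dbms_memory_heap_max__size
PAGECACHE_GB = 17   # NEO4J_dbms_memory_pagecache_size
MACHINE_RAM_GB = [8, 16, 32]  # machine sizes tried


def headroom(ram_gb, heap_gb=HEAP_GB, pagecache_gb=PAGECACHE_GB):
    """Return the RAM left after heap and page cache (negative = overcommitted)."""
    return ram_gb - (heap_gb + pagecache_gb)


for ram in MACHINE_RAM_GB:
    # Whatever is left must cover the OS and JVM native memory (assumption: not modeled here).
    print(f"{ram} GB machine: headroom {headroom(ram)} GB")
```

With these exact values, only the 32 GB machine has positive headroom; on the smaller machines I used proportionally smaller heap and page cache sizes.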
I really have no idea what I am doing wrong. Any suggestions I could try are warmly welcome.
Thank you in advance!