Neo4j GC Problems

m.coric · September 11, 2018, 7:27am

In December 2017 we started a large project, and since then our database has grown every day (it is at 40GB at the moment).

ID Allocation
Node ID: 70322184
Property ID: 362545374
Relationship ID: 84805115
Relationship Type ID: 49

We have a server with 16 vCPUs and 48GB of RAM, configured as follows:

16GB Memory Heap (initial and max)
24GB Page Cache
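
For reference, this memory split would look roughly like the fragment below in neo4j.conf. The setting names assume Neo4j 3.x; the actual lines are not shown in this post.

```
# Heap: initial and max set to the same value, so the JVM never resizes it
dbms.memory.heap.initial_size=16g
dbms.memory.heap.max_size=16g
# Page cache: off-heap memory used to cache the store files
dbms.memory.pagecache.size=24g
```

That leaves about 8GB of the 48GB for the operating system and other off-heap JVM memory.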

Throughout this period we have monitored the Neo4j server with Zabbix, and the graphs are not promising. We have been forced to restart the servers every 2-3 months at most, because Old GC pauses get out of hand (last night one took around 120 seconds). We also noticed that Mixed GCs (Young/Old generation space) do run at first, but after a few days they stop completely. From then on the Old generation space just grows until it fills about 90% of the heap, and only then does Old GC kick in.

I have read a lot of documentation on tuning G1GC and analyzed the GC logs, but with no success. Here are the relevant parts of the setup:

dbms.jvm.additional=-XX:MaxGCPauseMillis=750
dbms.jvm.additional=-XX:G1MixedGCLiveThresholdPercent=60
dbms.jvm.additional=-XX:G1HeapWastePercent=3
dbms.jvm.additional=-XX:G1MixedGCCountTarget=16
dbms.jvm.additional=-XX:+ParallelRefProcEnabled

I tried tuning "G1MixedGCLiveThresholdPercent", "G1HeapWastePercent" and "G1MixedGCCountTarget", but with no success. For the new setup I will enable:

dbms.jvm.additional=-XX:G1NewSizePercent=25
dbms.jvm.additional=-XX:G1MaxNewSizePercent=50

to try to force the Young/Old ratio.
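
One caveat about these two flags: on a Java 8 HotSpot JVM, G1NewSizePercent and G1MaxNewSizePercent are, to my knowledge, experimental options, so the JVM rejects them at startup unless experimental options are unlocked first. A minimal sketch, assuming Java 8 and that the unlock flag is not already set elsewhere in the config:

```
# Assumption: Java 8 HotSpot, where the two G1 sizing flags below are experimental.
# The unlock flag has to appear before the experimental flags it enables.
dbms.jvm.additional=-XX:+UnlockExperimentalVMOptions
dbms.jvm.additional=-XX:G1NewSizePercent=25
dbms.jvm.additional=-XX:G1MaxNewSizePercent=50
```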

Does anyone have an idea what we are missing here? At the moment we are out of ideas. Also, do the additional JVM settings actually affect the Neo4j server, or are they just useless?

michael.hunger · September 11, 2018, 9:23pm

Can you share more details on your read and write workloads? The best way is to enable the query log.

Also, for larger graphs you need to grow the page cache.
Usually you shouldn't need to do any GC tuning; G1GC works pretty well.

But in general you should restart your server regularly anyway to apply upgrades/patches.

Please also enable GC logging in neo4j.conf and share the GC logs over a few days.
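
A minimal sketch of what that could look like, assuming the Neo4j 3.x setting names. The rotation values are illustrative, not a recommendation:

```
# Turn on JVM GC logging (written to the Neo4j logs directory)
dbms.logs.gc.enabled=true
# Keep several rotated files so that a few days of history survive
dbms.logs.gc.rotation.keep_number=5
dbms.logs.gc.rotation.size=20m
```

The server needs a restart for these settings to take effect.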

stefan.armbruster · September 12, 2018, 7:47am

Additionally: are you using any add-on libraries?

m.coric · September 12, 2018, 7:58am

I don't think I can turn on query logging in Community Edition. Also, we are not able to move to Enterprise Edition: we have done a lot of query tuning, and I found that some Cypher queries do not work with the Enterprise Edition default plan (this was already reported in the Slack group a month ago). As for the workload, here are the last three GC logs (the third is the current one):

A 24GB page cache should be enough, because we faced the same problem when the DB was 2GB and we still face it now at 40GB.

Yup, I know that. But we need the DB to have 100% uptime between major releases (which is when we plan short downtimes).

Sure, I can provide the last 5 logs.

Also, the database runs on a virtualized host (I think it's VMware). Could that be a problem?

Just the APOC package, nothing special. APOC functions are used in only a few queries.

stefan.armbruster · September 12, 2018, 8:08am

I have experienced lengthy VM pauses, e.g. when the hypervisor performs a VM snapshot. In debug.log this gets reported as application threads stopped for xxxxx ms.
