Max Page Cache value not being respected

Hi,

We are facing issues with constantly growing Neo4j memory usage, which ultimately crashes the server.

We are running Neo4j 3.4 in production on an EC2 r4.8xlarge machine with Linux (32 cores, 244 GB RAM). Our DB is 2 TB in size.

We are using the following configuration:

dbms.memory.heap.initial_size=30900m
dbms.memory.heap.max_size=30900m
dbms.memory.pagecache.size=200000m

With this configuration we expect a maximum of about 231 GB of RAM utilization. But our server load is hitting 99.5%, and this is crashing our server.

These are some memory logs from debug.log:

2019-03-22 21:20:34.093+0000 INFO [o.n.k.i.DiagnosticsManager] dbms.memory.heap.initial_size=30900m
2019-03-22 21:20:34.094+0000 INFO [o.n.k.i.DiagnosticsManager] dbms.memory.heap.max_size=30900m
2019-03-22 21:20:34.094+0000 INFO [o.n.k.i.DiagnosticsManager] dbms.memory.pagecache.size=200000m
2019-03-22 21:20:34.097+0000 INFO [o.n.k.i.DiagnosticsManager] System memory information:
2019-03-22 21:20:34.143+0000 INFO [o.n.k.i.DiagnosticsManager] Total Physical memory: 240.08 GB
2019-03-22 21:20:34.144+0000 INFO [o.n.k.i.DiagnosticsManager] Free Physical memory: 206.85 GB
2019-03-22 21:20:34.144+0000 INFO [o.n.k.i.DiagnosticsManager] Committed virtual memory: 38.85 GB
2019-03-22 21:20:34.144+0000 INFO [o.n.k.i.DiagnosticsManager] JVM memory information:
2019-03-22 21:20:34.144+0000 INFO [o.n.k.i.DiagnosticsManager] Free  memory: 30.05 GB
2019-03-22 21:20:34.145+0000 INFO [o.n.k.i.DiagnosticsManager] Total memory: 30.18 GB
2019-03-22 21:20:34.145+0000 INFO [o.n.k.i.DiagnosticsManager] Max   memory: 30.18 GB
2019-03-22 21:20:34.152+0000 INFO [o.n.k.i.DiagnosticsManager] Memory Pool: Code Cache (Non-heap memory): committed=8.94 MB, used=8.86 MB, max=240.00 MB, threshold=0.00 B
2019-03-22 21:20:34.152+0000 INFO [o.n.k.i.DiagnosticsManager] Memory Pool: Metaspace (Non-heap memory): committed=17.50 MB, used=16.47 MB, max=-1.00 B, threshold=0.00 B
2019-03-22 21:20:34.152+0000 INFO [o.n.k.i.DiagnosticsManager] Memory Pool: Compressed Class Space (Non-heap memory): committed=2.25 MB, used=2.05 MB, max=1.00 GB, threshold=0.00 B
2019-03-22 21:20:34.152+0000 INFO [o.n.k.i.DiagnosticsManager] Memory Pool: G1 Eden Space (Heap memory): committed=1.59 GB, used=120.00 MB, max=-1.00 B, threshold=?
2019-03-22 21:20:34.152+0000 INFO [o.n.k.i.DiagnosticsManager] Memory Pool: G1 Survivor Space (Heap memory): committed=0.00 B, used=0.00 B, max=-1.00 B, threshold=?
2019-03-22 21:20:34.153+0000 INFO [o.n.k.i.DiagnosticsManager] Memory Pool: G1 Old Gen (Heap memory): committed=28.59 GB, used=0.00 B, max=30.18 GB, threshold=0.00 B

What are the best ways to fix these issues?

Thanks!

Please have a look at this page which diagrams how all of memory works in Neo4j:

Memory consists of OS Reserve + Page Cache + Heap Space + Transaction State. So Heap + Page Cache is not 100% of the memory you need.

This page explains all of this in detail and steps through other things you'll need. In the interest of a quick recommendation, though, please run neo4j-admin memrec --database and look at how its recommendation differs from your present memory config.
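
To make the budget point concrete, here is a minimal sketch of the arithmetic using the numbers posted above. It assumes the "m" suffix in the config means MiB and that the "GB" figures in debug.log are binary units (GiB). The helper name remaining_gib is made up for illustration and is not part of any Neo4j tooling.

```python
# Rough memory budget check using the settings quoted in this thread.
# Assumptions: "m" in the config means MiB, and the debug.log "GB" value
# is really GiB. remaining_gib is a hypothetical helper for illustration.

MIB_PER_GIB = 1024


def remaining_gib(total_physical_gib, heap_mib, pagecache_mib):
    """Return the GiB left for the OS reserve, transaction state and other
    native memory once heap and page cache are both fully used."""
    configured_gib = (heap_mib + pagecache_mib) / MIB_PER_GIB
    return total_physical_gib - configured_gib


heap_mib = 30900          # dbms.memory.heap.max_size=30900m
pagecache_mib = 200000    # dbms.memory.pagecache.size=200000m
total_gib = 240.08        # "Total Physical memory" from debug.log

configured_gib = (heap_mib + pagecache_mib) / MIB_PER_GIB
left_gib = remaining_gib(total_gib, heap_mib, pagecache_mib)

print(f"heap + page cache: {configured_gib:.2f} GiB")
print(f"left for everything else: {left_gib:.2f} GiB")
```

Under those assumptions, heap plus page cache already claim roughly 225 GiB of the 240 GiB the JVM sees, so everything else on the machine has to fit in about 15 GiB.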

Hi David,

We set the memory settings using neo4j-admin memrec --database.

After monitoring for a few days, we are noticing a slow rise in overall memory usage (it started at 90% and has reached 96% within 1 week).

Any tips on controlling this growth? Is it recommended to restart neo4j db every now and then?

This is very critical for us. Any help is appreciated.

Thanks
Shweta

Please provide your full config and memory settings. You've said you ran memrec, but I don't know what it recommended or what your current config is.

Also, please be more specific if you can about memory growing. As measured by what? If usage is growing but each category is still at or below its max, this can be totally acceptable, and you may need to lower your maxes to live within the memory budget of your machine. But I'm not sure what you mean by steadily growing.

Also relevant are other details about your machine, for example whether any processes other than Neo4j are running on it, whether you have Linux virtual memory configured, and so on.
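
As an example of why "measured by what" matters: on Linux, a percentage based on free memory counts the kernel's own file cache as used, so it can sit near 100% on a healthy machine. A minimal sketch, assuming the standard /proc/meminfo format (the MemAvailable field needs kernel 3.14 or later). The function names and the sample numbers are made up for illustration and are not from your server.

```python
# Compare two ways of computing "memory used" from /proc/meminfo-style text.
# The sample text below is invented for illustration only.

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def used_percentages(info):
    """Return (percent not free, percent not available)."""
    total = info["MemTotal"]
    not_free = 100.0 * (total - info["MemFree"]) / total
    not_available = 100.0 * (total - info["MemAvailable"]) / total
    return not_free, not_available


sample = """MemTotal:        1000 kB
MemFree:           50 kB
MemAvailable:     400 kB
Cached:           300 kB
"""

not_free, not_available = used_percentages(parse_meminfo(sample))
print(f"used (counting OS file cache): {not_free:.1f}%")
print(f"used (excluding reclaimable):  {not_available:.1f}%")
```

To run it against a real machine, replace the sample string with the contents of /proc/meminfo. If the two percentages differ a lot, much of the "used" memory is reclaimable cache rather than Neo4j itself.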

Ideally, you want your memory utilization to be at 100% and always to stay there. If it's less than 100% then you have some unused memory that does you no good. For a large database, making the page cache big to get as much of the database in memory is a good thing, balancing other factors.

What do you mean by "crashes the server"?

When the server crashes, what is in the debug.log file?