tl;dr: I have a large Neo4j dataset that I am trying to load into the page cache, but only part of the db gets loaded. Queries with count are very slow, and sysinfo and memrec report different db data volumes. How can I load the entire graph into memory and reduce the db hit counts?
I have a 25G Neo4j 3.5.12 Enterprise database running on a 52G server. I have currently set the page cache to 28G and the heap size to 12G in the conf file. I want to load the entire dataset into the page cache for fast querying.
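For reference, the relevant conf lines look roughly like the sketch below. The setting names are the standard Neo4j 3.5 memory settings; setting the initial heap equal to the max heap is an assumption here, since only the 12G heap size is stated above.

```
# neo4j.conf (sketch of the memory settings described above)
# Page cache, sized above the 25G store
dbms.memory.pagecache.size=28g
# Heap; initial size assumed equal to max size
dbms.memory.heap.initial_size=12g
dbms.memory.heap.max_size=12g
```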
1. Even after warming up with apoc.warmup.run(true, true, true), my queries still hit the database millions of times. I have also tried basic queries such as MATCH (n)-[r]-(m) RETURN count(r), count(n.name), but the same problem persists.
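Written out as statements, the warm-up attempts were along these lines. This is a sketch: the three boolean arguments are assumed to map to APOC's loadProperties, loadDynamicProperties and loadIndexes options, in that order.

```cypher
// Warm the page cache via APOC.
// Flags assumed: loadProperties, loadDynamicProperties, loadIndexes.
CALL apoc.warmup.run(true, true, true);

// Basic query that touches every node and relationship,
// plus the name property of each node.
MATCH (n)-[r]-(m)
RETURN count(r), count(n.name);
```

Prefixing the second statement with PROFILE shows the db hits per operator.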
Here is the result after the APOC warmup:
This is a sample profile query:
2. Another thing I find very hard to understand is that memrec and sysinfo seem to show different results. Memrec reports 7.7G of data volume and indexes, while sysinfo and du report 25G.
In line with the numbers reported by memrec, Neo4j is only using approx. 8G plus the 12G heap on the system:
Could someone please help me understand what I am doing wrong here? How can I get all of the graph's nodes and relationships into the page cache?