I have a read query (call it Query Y) that has been running for over 8 hours. My dataset is large, but not so large that a query should take over 8 hours. A similar query (call it Query X) completed in 61 minutes before I started this one.
The query starts by looking up a `Person` node with a given `id` (an index already exists on the `id` field). It then locates the friends and friends of friends of that `Person`, and goes on to find all `Forum` nodes that those friends are members of, using a `hasMember` relationship, such that the `joinDate` property on the relationship is after a given start date. Lastly, it returns the count of all `Post` nodes that were created by these friends and are contained in the forum.
All nodes are indexed on their `id` field.
Here's the query plan with `EXPLAIN` -
I could not get the `PROFILE` results because the query never completed.
When I looked at the `debug.log` file, I noticed a lot of Stop-the-World (STW) pauses sprinkled throughout the execution of these two long-running queries (X and Y). From what I have read, these pauses occur when the JVM garbage collector (GC) pauses all other application threads.
Here is the relevant portion of the `debug.log` file -
2020-01-07 15:39:11.648+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=180, gcTime=192, gcCount=1}
2020-01-07 15:39:11.824+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=130, gcTime=0, gcCount=0}
2020-01-07 15:49:57.542+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=189, gcTime=212, gcCount=1}
2020-01-07 15:50:01.690+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=119, gcTime=130, gcCount=1}
2020-01-07 16:01:23.829+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=162, gcTime=173, gcCount=1}
2020-01-07 16:21:25.807+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=130, gcTime=139, gcCount=1}
2020-01-07 16:21:25.995+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=109, gcTime=0, gcCount=0}
2020-01-07 16:22:28.916+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=125, gcTime=188, gcCount=1}
2020-01-07 16:22:30.558+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=314, gcTime=0, gcCount=0}
2020-01-07 16:25:31.980+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=135, gcTime=144, gcCount=1}
2020-01-07 16:25:32.231+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=169, gcTime=0, gcCount=0}
The complete set of STW logs while the two queries were running is in this GitHub gist, and the complete `debug.log` file is here. For reference, the relevant logs are at the very end, on 2020-01-07 (when I executed these two queries).
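To put the pauses in perspective, the excerpt can be totalled with a short script. This is a minimal sketch, assuming the log line format shown in the excerpt; `summarize_pauses` is a hypothetical helper name, and `SAMPLE` holds only the eleven payloads quoted in this post, not the full log.

```python
import re

# Matches the payload of a VmPauseMonitorComponent warning as it appears
# in the excerpt (format assumed from those lines only).
PAUSE_RE = re.compile(r"pauseTime=(\d+), gcTime=(\d+), gcCount=(\d+)")


def summarize_pauses(lines):
    """Return (number of pauses, total pauseTime, longest pauseTime)."""
    pauses = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match:
            pauses.append(int(match.group(1)))
    return len(pauses), sum(pauses), max(pauses, default=0)


# Payloads copied from the eleven log lines quoted in this post.
SAMPLE = [
    "{pauseTime=180, gcTime=192, gcCount=1}",
    "{pauseTime=130, gcTime=0, gcCount=0}",
    "{pauseTime=189, gcTime=212, gcCount=1}",
    "{pauseTime=119, gcTime=130, gcCount=1}",
    "{pauseTime=162, gcTime=173, gcCount=1}",
    "{pauseTime=130, gcTime=139, gcCount=1}",
    "{pauseTime=109, gcTime=0, gcCount=0}",
    "{pauseTime=125, gcTime=188, gcCount=1}",
    "{pauseTime=314, gcTime=0, gcCount=0}",
    "{pauseTime=135, gcTime=144, gcCount=1}",
    "{pauseTime=169, gcTime=0, gcCount=0}",
]

count, total, longest = summarize_pauses(SAMPLE)
print(count, total, longest)
```

On the quoted lines this gives 11 pauses with a combined `pauseTime` of 1762 and a longest pause of 314 (presumably milliseconds), spread over roughly 46 minutes of wall-clock time. Running the same function over every line of the full `debug.log` would show how much of the 8 hours the pauses could account for.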
What can I do in this case? I had already stopped and restarted both queries X and Y when they were taking too long. While query X completed in 61 minutes the second time, query Y ran for over 8 hours and never completed. For comparison, the other queries I execute take only 1-3 minutes on average on this dataset. My memory settings in the configuration file are heap (initial) `4G`, heap (max) `8G`, and page cache `6G`, based on the recommendations suggested here. I don't have the exact recommendations from the `memrec` tool, because I'm on JDK 11, and the tool seems to work only with JDK 8.
I have 16GB of RAM. Should I set the heap size (both initial and max) to maybe `12G`, based on the suggested config -
The heap memory size is determined by the parameters `dbms.memory.heap.initial_size` and `dbms.memory.heap.max_size`. It is recommended to set these two parameters to the same value. This will help avoid unwanted full garbage collection pauses.
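Before changing the heap, it may help to check the candidate values against the 16GB of RAM. The sketch below only does the arithmetic; `headroom_gb` is a hypothetical helper, and it assumes the page cache stays at `6G` and ignores any other consumers of native memory.

```python
def headroom_gb(total_ram_gb, heap_max_gb, pagecache_gb):
    """RAM left for the OS and everything else after heap and page cache."""
    return total_ram_gb - (heap_max_gb + pagecache_gb)


current = headroom_gb(16, 8, 6)    # settings in use now
proposed = headroom_gb(16, 12, 6)  # heap raised to 12G, page cache unchanged
print(current, proposed)
```

With the current settings this leaves 2 GB of headroom. Raising the heap to `12G` while keeping the page cache at `6G` gives -2, that is, 2 GB more than the machine has, so the page cache would presumably have to shrink if the heap grows that far.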
- neo4j version, desktop version, browser version - 3.5.14, 1.2.3
- what kind of API / driver do you use - Cypher in Neo4j Desktop