I have a script that runs a few hundred thousand individual transactions to insert data into Neo4j. Each of these Cypher queries uses the exact same format -- it's just the parameters that change from one query to the next.
However, at seemingly random intervals, an otherwise straightforward query will "hang" for a minute or two. Sometimes it eventually completes, but if the query reaches 120s it fails and the data is not imported.
I kept an eye on debug.log to see what might be happening during this time, and I found this when I cancelled the script as a transaction was struggling to be imported:
2021-11-19 00:37:02.887+0000 INFO [o.n.c.i.ExecutionEngine] [neo4j/c7a51d5e] Discarded stale query from the query cache after 77412 seconds. Reason: IndexSelectivity(IndexDescriptor(Node(LabelId(3)),List(PropertyKeyId(2)),Set(),org.neo4j.cypher.internal.planner.spi.IndexDescriptor$$$Lambda$3179/0x00000008411d8040@21456f65,org.neo4j.cypher.internal.planner.spi.IndexDescriptor$$$Lambda$3180/0x00000008411d9040@8e006bf,false)) changed from 3.811430018203847E-9 to 3.6296907073761936E-9, which is a divergence of 0.047682709628576375 which is greater than threshold 0.03576053374670699. Query id: 31553318
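For reference, the numbers in that log line can be checked by hand. This is only a minimal sketch: it assumes the "divergence" is the absolute change in selectivity divided by the larger of the two values, which is a guess that happens to match the logged figure, not something confirmed from Neo4j's source. The helper name `relative_divergence` is hypothetical.

```python
def relative_divergence(old: float, new: float) -> float:
    """Absolute change divided by the larger value (assumed formula)."""
    larger = max(abs(old), abs(new))
    if larger == 0.0:
        return 0.0  # both estimates are zero, so nothing has changed
    return abs(old - new) / larger


# Values copied verbatim from the debug.log line above.
old_selectivity = 3.811430018203847e-9
new_selectivity = 3.6296907073761936e-9
threshold = 0.03576053374670699
logged_divergence = 0.047682709628576375

divergence = relative_divergence(old_selectivity, new_selectivity)
exceeds_threshold = divergence > threshold

print(f"computed divergence: {divergence}")
print(f"logged divergence:   {logged_divergence}")
print(f"exceeds threshold:   {exceeds_threshold}")
```

So the index selectivity shifted by roughly 4.8%, which is above the roughly 3.6% threshold reported in the same line, and that is why the cached plan for the query was discarded.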
Is this "stale query" likely to be the reason behind why Neo4j stalls for a few minutes every now and then? If so, is there anything I can do to avoid this happening?
The query is completely standard, and if I stop and restart the script, the same query runs without a problem. It just seems that Neo4j will "hang" for a few minutes every now and then during this import script, and I don't know why.
At first I thought it might be some sort of garbage collection or "behind the scenes" memory management, but this "stale query" line from debug.log is the only clue I've got as to what may be causing the Cypher queries to stall during import.
Thanks in advance for any advice.