Memory issue for a query

ralfeus · April 14, 2025, 7:03pm

Hi all,
I recently upgraded Neo4j from 4.4.2 to 2025.03.0. After that the query has failed:

MATCH (:Person{id:'0000001'})<-[:PARENT*0..]-(n:Person) 
WHERE date(n.when_updated) < date('2025-04-14')
RETURN n.id, n.username, n.password ORDER BY n.id_normalized;

The error:

The allocation of an extra 2.0 MiB would use more than the limit 12.0 GiB. Currently using 12.0 GiB. dbms.memory.transaction.total.max threshold reached

The number of nodes is 3.5M, whole database uses 2.5GB on disk and the container where Neo4j is running, never consumer more than 3GB.

I tried to limit the output and on 200K records it could return the result. Container's memory consumption raised to 4.5GB. At the same time profiler has shown total memory allocation 5.3GB. So, I'd expect at least twice as big output limit to be executed successfully. So, it's very confusing.

I've increased heap and page size to the point when it couldn't start because it exceeded physical RAM capacity. So the memory allocation is almost maximum possible.

What puzzles me at most is that before the upgrade same query ran without problems. Can it be fixed somehow? I am considering splitting outcome and combining results in a calling program but I'd like to avoid that if possible.

Update: I tried to get result in chunks but running same query with SKIP 3000000 LIMIT 20 added also results same error. I don't really understand why LIMIT 200000 produces result by SKIP 3000000 LIMIT 20 does not.

peterpirogtf · April 14, 2025, 8:20pm

You cantry to change memory setting in the file neo4j.conf to:

dbms.memory.transaction.total.max=24g
dbms.memory.heap.initial_size=8g
dbms.memory.heap.max_size=8g

and to process the query in batches:

https://neo4j.com/labs/apoc/4.4/overview/apoc.periodic/apoc.periodic.iterate/

CALL apoc.periodic.iterate(
  "MATCH (:Person {id: '0000001'})<-[:PARENT*0..]-(n:Person) WHERE date(n.when_updated) < date('2025-04-14') RETURN n",
  "RETURN n.id, n.username, n.password",
  {batchSize: 5000, parallel: false}
)

glilienfield · April 15, 2025, 12:01am

BTW, when you have the following type of match MATCH (:Person{id:'0000001'})<-[:PARENT*0..]-(n:Person) , you will get a row back for all the intermediate paths, i.e.

MATCH (:Person{id:'0000001'})<-[:PARENT]-(n:Person) 
MATCH (:Person{id:'0000001'})<-[:PARENT]-()<-[:PARENT]-(n:Person) 
etc.

If you paths are very long, you are going to get a lot of intermediate paths back. Do you need these, or just the longest path? If just the longest, you can add a predicate to filter out the intermediate paths.

MATCH (:Person{id:'0000001'})<-[:PARENT*0..]-(n:Person) 
WHERE NOT EXISTS ( (n)<-[:PARENT]-()  ) and
date(n.when_updated) < date('2025-04-14')
RETURN n.id, n.username, n.password ORDER BY n.id_normalized;

ralfeus · April 15, 2025, 4:27pm

I'm not sure apoc.periodic.iterate is supposed to return anything. In fact it returns result, but it's only a report of execution, not actual dataset.

And could confirm that the issue doesn't occur in version 4.4.2 but occurs in 5.26 and 2025.03.0. So it seems newer versions have memory management problems. Also I could see that in 4.4.2 the container's memory consumption quite quickly raised to 6GB, that didn't happen for 2025.03.0. So that what I'd expect full dataset would take in memory.

ralfeus · April 15, 2025, 4:29pm

I don't need paths, I need all nodes, that are under the root one.

glilienfield:

MATCH (:Person{id:'0000001'})<-[:PARENT*0..]-(n:Person) 
WHERE NOT EXISTS ( (n)<-[:PARENT]-()  ) and
date(n.when_updated) < date('2025-04-14')
RETURN n.id, n.username, n.password ORDER BY n.id_normalized;

It would return only leaf nodes but I need all of them.

glilienfield · April 16, 2025, 12:20am

I get it. See if this run more efficiently.

MATCH path = (:Person{id:'0000001'})<-[:PARENT*0..]-(n:Person) 
WHERE NOT EXISTS ( (n)<-[:PARENT]-()  ) and
date(n.when_updated) < date('2025-04-14')
UNWIND nodes(path) as node
RETURN node.id, node.username, node.password 
ORDER BY node.id_normalized;

ralfeus · April 17, 2025, 1:37pm

Does UNWIND loose some part of context and thus is more memory optimized?

Topic		Replies	Views
Error with memory neo4j browser Neo4j Graph Platform browser	3	577	May 11, 2020
Memory errors while printing query results in python Cypher	2	249	February 19, 2023
Neo4j TransactionMemoryLimit Neo4j Graph Platform	1	200	February 21, 2023
OutOfMemory due to Embedded Neo4J server Cypher performance	2	244	April 22, 2022
Neo4j crashes while exporting Import / Export memory	4	1134	December 12, 2018

July Summer Fun!

Memory issue for a query

Related topics