Slow and crashing queries in driver, but fast in query browser

I’m testing a query searching through some complicated data. The query looks like this:

MATCH p=((a:Something {id: $id})-[:LINKED_TO_DATASET]-(:DataSet)
<-[:PART_OF_SET]-(:DataElement)
<-[:MAPPING]-(:DifferentDataElement)<-[:DATA_FLOW*0..3]-(:DifferentDataElement)
-[:PART_OF_SET]->(:DifferentDataset)
-[:BELONGS_TO_OTHER_THING]->(b:OtherThing)) return p, a, b

Ideally, I’d like to make it search for data flow relations of arbitrary length, but at length 3, the driver returns an error with this kind of weird output:

<--- Last few GCs --->

[23950:0x138008000]    56742 ms: Mark-Compact 4091.8 (4101.3) -> 4089.0 (4102.1) MB, pooled: 2 MB, 1002.88 / 0.00 ms  (average mu = 0.244, current mu = 0.014) allocation failure; scavenge might not succeed
[23950:0x138008000]    59364 ms: Mark-Compact (reduce) 4093.0 (4102.1) -> 4091.4 (4097.6) MB, pooled: 0 MB, 1242.92 / 0.00 ms  (+ 334.0 ms in 64 steps since start of marking, biggest step 8.5 ms, walltime since start of marking 1643 ms) (average mu = 0.34
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

 1: 0x104b7ff4c node::OOMErrorHandler(char const*, v8::OOMDetails const&) [/Users/mcv/.nvm/versions/node/v24.4.0/bin/node]
 2: 0x104d5e418 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/Users/mcv/.nvm/versions/node/v24.4.0/bin/node]
 3: 0x104fbc9f8 v8::internal::Heap::stack() [/Users/mcv/.nvm/versions/node/v24.4.0/bin/node]
...

I guess it runs out of memory. But even at length 0..2, the query takes 1919 seconds, whereas in the query browser, even searching for length 0..5 takes only 376 seconds.

Is it slow because it’s using too much memory? Or is the driver always slower than the browser?

How can I assign more memory to the driver?

And does this query really use outrageous amounts of memory? There are a lot of (:DifferentDataElement) (about 200,000, I think), but I’m only requesting the ones that are actually connected, right? And again, the query browser (at http://localhost:7474/) can handle this with no problem.

Browser uses the JavaScript driver under the hood :upside_down_face: so it can't really be the driver itself :thinking: I'm not sure where the difference is coming from, then.

What happens if you increase the memory size for node?
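For reference, a minimal sketch of how to check the heap limit Node is actually running with. The crash log above stops at roughly 4,090 MB, which suggests the limit is about 4 GB. The file name `heap-limit.js` and the 8192 MB value are just examples, not recommendations; `--max-old-space-size` is a standard Node/V8 flag and can also be passed via the `NODE_OPTIONS` environment variable.

```javascript
// Prints the V8 heap limit of the current Node.js process.
// Run it twice to compare (file name and size are illustrative):
//   node heap-limit.js
//   node --max-old-space-size=8192 heap-limit.js
const v8 = require('node:v8');

function heapLimitMiB() {
  // heap_size_limit is reported in bytes.
  return Math.round(v8.getHeapStatistics().heap_size_limit / (1024 * 1024));
}

console.log(`V8 heap limit: ${heapLimitMiB()} MiB`);
```

Raising the limit only buys headroom. If the result set keeps growing with the path length, the process will eventually hit the new limit as well.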

Another thing: browser truncates long result streams by default.

If you try

UNWIND range(1, 100_000) AS n
RETURN n

You might see a small warning appear in the bottom-right corner of the cell, stating something like

Fetch limit hit at 5,000 records. Started streaming after 35 ms and completed after 44 ms.

That explains it. My fetch limit is 50,000 now, and browser queries have become slower.
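To get a comparable cap in your own script, something like the following might work. It is only a sketch: `takeRecords` is a hypothetical helper, not part of the driver, and it assumes the `Result` returned by `session.run()` can be consumed with `for await` (which, as far as I know, is the case in the 5.x JavaScript driver). It is written against any async iterable, so it can be tried without a database.

```javascript
// Collects at most `limit` items from an async iterable, then stops reading.
// Hypothetical helper; assumes the driver's Result is async-iterable.
async function takeRecords(source, limit) {
  const records = [];
  let truncated = false;
  for await (const record of source) {
    if (records.length >= limit) {
      truncated = true; // at least one more record was available
      break;            // leaving the loop calls the iterator's return()
    }
    records.push(record);
  }
  return { records, truncated };
}

// Possible usage with the driver (untested assumption):
//   const result = session.run(query, { id });
//   const { records, truncated } = await takeRecords(result, 5000);
```

This keeps the script's memory bounded by the cap rather than by the full result, but like Browser's fetch limit it hides rows instead of making the query itself cheaper.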

I think this answers my question. I’ll have to rework my query to fetch fewer elements.

The big problem is the enormously large number of (:DifferentDataElement). I need them because they have the vital [:DATA_FLOW] relationship, but I’m not actually interested in them. I’m interested in the (:OtherThing) they connect to.

This query balloons in the middle, but the actual number of OtherThings it connects to is fairly limited. I don’t need every route to the same (:DifferentDataset).
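One direction that might be worth trying is to return only the distinct endpoints instead of every path. The sketch below is a guess at what that could look like, not a tested optimization: `buildQuery` is a hypothetical helper, and the pattern is copied from the query at the top of the thread. The hop bound is interpolated into the query text because, as far as I know, the bound in `*0..N` cannot be supplied as a query parameter, so it is validated first.

```javascript
// Builds the Cypher text for a given upper bound on DATA_FLOW hops.
// Hypothetical helper: returns distinct (a, b) pairs rather than whole paths.
function buildQuery(maxHops) {
  // The bound ends up in the query text, so only accept a plain integer.
  if (!Number.isInteger(maxHops) || maxHops < 0) {
    throw new RangeError('maxHops must be a non-negative integer');
  }
  return `
MATCH (a:Something {id: $id})-[:LINKED_TO_DATASET]-(:DataSet)
  <-[:PART_OF_SET]-(:DataElement)
  <-[:MAPPING]-(:DifferentDataElement)<-[:DATA_FLOW*0..${maxHops}]-(:DifferentDataElement)
  -[:PART_OF_SET]->(:DifferentDataset)
  -[:BELONGS_TO_OTHER_THING]->(b:OtherThing)
RETURN DISTINCT a, b`.trim();
}

console.log(buildQuery(3));
```

The database still has to expand the paths in the middle, so this probably won't make the query itself fast. It should, however, shrink what is sent to and held by the driver to one row per connected (:OtherThing).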

It might be worth opening another thread, perhaps in Neo4j Graph Platform > Cypher, to ask for help optimizing the query, or having a look at "How can I optimize my Cypher Queries?" to see whether any of the suggestions there help.

Unfortunately, I'm really not the right person to ask about Cypher optimization :see_no_evil_monkey: