I have millions of data rows I need to do this for, but I can't do it with a single UNWIND query because Neo4j crashes with a memory error. If I batch the data so that $data contains around 20,000 rows at a time, it seems to be OK. But is there a way to increase that batch size? Are there any tricks for dealing with this kind of situation?
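For concreteness, the kind of query I mean looks roughly like this (a simplified sketch; the label and property names are illustrative, not my exact query):

// $data is a list of ~20,000 maps, each containing a name_id key
UNWIND $data AS row
MERGE (n:Node {name_id: row.name_id})
SET n += row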
There may be more efficient solutions, but this scenario is essentially a data migration, and you can approach it in a simple way:
At the source-node level, maintain an attribute such as migrated = false.
On each pass, select only the records where migrated is false, with LIMIT 20000, MERGE them, and then mark them as migrated.
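A rough sketch of this idea in Cypher (the labels and property names here are placeholders, not taken from your actual data):

// Pick one unmigrated batch, merge it, then mark it as done
MATCH (src:SourceRecord)
WHERE src.migrated = false
WITH src LIMIT 20000
MERGE (n:Node {name_id: src.name_id})
// copy whatever other properties you need onto n here
SET src.migrated = true

Run this repeatedly until no rows with migrated = false remain; each run touches at most 20,000 rows, so the transaction stays small.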
I don't really understand what you're suggesting. Are you telling me to try to avoid merging any nodes that are already present in the graph?
I wonder if there is some other underlying problem here, since it's taking 20 minutes to merge around 100,000 nodes. Each node has around 10 attributes, one of which is name_id, for which I have set:
CREATE CONSTRAINT ON (n:Node)
ASSERT n.name_id IS UNIQUE
I see posts about people merging millions of nodes in a few minutes, so I am wondering what I'm doing wrong.