apoc.refactor.mergeNodes Performance

rezaahnadi99887 · November 20, 2022, 9:35am

Hi,

I am using Neo4j 4.4.8, and my database has 10M nodes and 3M relationships.

I want to merge nodes that have the same "Age" property, but the Cypher below does nothing: it does not hang, crash, or raise an error, and no nodes are merged.

There are 80 distinct Age values, so the subquery runs 80 times.

MATCH (n:Person) with distinct n.Age as props
UNWIND props as prop
call{
WITH prop
MATCH (m:Person {Age:prop})
WITH m order by m.Id ASC
with COLLECT(m) AS ns, count(m) as cn where cn > 1
CALL apoc.refactor.mergeNodes(ns, {properties:{`.*`: 'discard'}}) YIELD node RETURN count(*) as s
} RETURN count(s)

Memory config:

dbms.memory.heap.initial_size=8g
dbms.memory.heap.max_size=12g
dbms.memory.pagecache.size=6g

Is there a problem with mergeNodes, or is my Cypher at fault?
Please help me.
Thanks

glilienfield · November 20, 2022, 1:17pm

Try this:

MATCH (n:Person) 
with n.Age as age, collect(n) as ns, count(*) as cn
where cn > 1
call apoc.refactor.mergeNodes(ns, {properties:{`.*`: 'discard'}}) yield node 
return age, cn

rezaahnadi99887 · November 21, 2022, 5:34am

Thanks for your reply, but I want to merge nodes with the same age. In your Cypher this does not happen.

glilienfield · November 21, 2022, 1:17pm

It should. The ‘with’ clause groups by age and collects the nodes in each group, so each ‘ns’ collection contains exactly the nodes that share one age.
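
One way to confirm the grouping before merging anything is a read-only version of the same query. This is only a sketch: it assumes the Person label and Age property from the question, and it changes nothing in the database.

```cypher
// Read-only check: one row per Age value shared by more than one node,
// with the number of nodes that a merge would collapse into one.
MATCH (n:Person)
WITH n.Age AS age, count(*) AS cn
WHERE cn > 1
RETURN age, cn
ORDER BY cn DESC
```

If this returns one row per age with the expected counts, the ‘with’ grouping is doing its job and any problem lies in the merge step.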

rezaahnadi99887 · November 23, 2022, 6:58am

You are right, thank you for your answer.
However, like my original query, this one also does nothing. I worked around the problem with apoc.periodic.commit, but it runs very slowly.

call apoc.periodic.commit(
"MATCH (n:Person) with n limit $limit
with n.Age as age, collect(n) as ns, count(*) as cn
where cn > 1
call apoc.refactor.mergeNodes(ns, {properties:{`.*`: 'discard'}}) yield node 
return count(*)",{limit:10000})

glilienfield · November 23, 2022, 1:04pm

Periodic commit keeps executing the Cypher statement, each time in a new transaction, until the statement returns 0, at which point it stops. In your case, the query takes the first 10000 nodes, merges them by age, and repeats until all the nodes are merged and cn > 1 is no longer true for any group. A batch of 10000 nodes may not contain all the nodes for the ages represented in that batch, so it can take more than one round to merge a specific value of age. You should try Cypher’s ‘call {} in transactions of 10000 rows’ statement instead. If you are executing it in Neo4j Browser, you will need to add “:auto” at the very beginning of your query.

I also think the call subquery only needs to enclose the ‘write’ part of the query, which is the call to the apoc procedure.

https://neo4j.com/docs/cypher-manual/current/clauses/call-subquery/#subquery-call-in-transactions
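
A minimal sketch of what that could look like, assuming the Person label and Age property from the question and that the query is run from Neo4j Browser (hence “:auto”). The batch size of 1 row is my own choice for illustration, not something tested on this data: after grouping there is only one row per age, so a batch of 10000 rows would put every merge into a single transaction.

```cypher
:auto
// Read part: group the Person nodes by Age, one row per age.
MATCH (n:Person)
WITH n.Age AS age, collect(n) AS ns, count(*) AS cn
WHERE cn > 1
// Write part: only the merge runs inside the subquery,
// and each age group is committed in its own transaction.
CALL {
  WITH ns
  CALL apoc.refactor.mergeNodes(ns, {properties: {`.*`: 'discard'}}) YIELD node
  RETURN node
} IN TRANSACTIONS OF 1 ROWS
RETURN age, cn
```

Note that the read part still collects every node of a group before the merge starts, and each merge still handles one whole age group in a single transaction, so large groups can remain slow and memory hungry.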

rezaahnadi99887 · November 28, 2022, 5:28am

In the past few days, I first updated my Neo4j to 5.2, then I tried a variety of queries, including:

apoc.periodic.commit

apoc.periodic.iterate

call {} in transactions

apoc.cypher.parallel

I also tried combinations of these.
I was trying to reduce the duration and size of each transaction this way, but it didn't help.
In every case, one of the following two things happens:
1- The merge completes, but slowly.
2- System resources are heavily used, but nothing happens over time: no error is raised and no merge takes place.

It seems that the merge operation is not executed in parallel.
