Performance of apoc.refactor.mergeNodes on Windows vs Linux


(Harvey Nguyen) #1

Hi Neo4j admin,

I am using Neo4j version 3.4.7 and I have a question.
Why does this query:

MATCH (g:Group)
WHERE g.GroupId IN $GroupId
WITH collect(g) AS groups
CALL apoc.refactor.mergeNodes(groups) YIELD node
RETURN node

take 40ms on Linux, but hang on Windows Server? (For reference, both queries run on big groups with the same number of relationships, around 3m nodes.)

Thanks & Best Regards
Harvey Nguyen


(Harvey Nguyen) #2

Sorry, I have tested again on Linux and the same thing happens there. After a long time, the Neo4j server cannot respond to any request, not even from the web browser.
For your information, the CPU usage on the Windows server is very high (around 70-100%).
Is there a problem with the mergeNodes procedure, or is my query the problem?
Please help me.
Thanks


(Stefan Armbruster) #3

How large (in number of nodes) are your groups at maximum?


(Harvey Nguyen) #4

Thanks for the response.
It's more than 3m nodes, and the merging step runs in real time. Below are my memory settings:

dbms.memory.heap.initial_size=8g
dbms.memory.heap.max_size=8g
dbms.memory.pagecache.size=4g

Nodes: about 15m
Relationships: 55m


(Stefan Armbruster) #5

I guess your transaction is getting too large. Maybe do the operation in smaller chunks: merge ~10k nodes into one per transaction, using apoc.periodic.iterate or apoc.periodic.commit for transaction batching.
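
A minimal sketch of the chunked approach, using apoc.periodic.commit, which re-runs a statement in separate transactions until the statement returns 0. The chunk size of 10000 follows the ~10k figure above. The parameter name limit, the column name merged, and the {properties: 'discard'} config are assumptions that are not part of the original query; 'discard' is chosen here so that the surviving node keeps a single scalar GroupId and still matches the filter on the next pass. This has not been tested against the dataset in this thread.

```cypher
// Sketch only: merge the matching Group nodes in chunks of at most $limit
// nodes per transaction instead of all at once.
//
// Each pass merges one chunk onto the first node of that chunk. Survivors of
// earlier passes still match the GroupId filter, so later passes fold them
// together until at most one matching node is left.
//
// When fewer than two nodes match, the WHERE clause drops the row and the
// statement returns nothing; apoc.periodic.commit is expected to treat that
// as 0 and stop looping (assumption about its stop condition).
CALL apoc.periodic.commit(
  "MATCH (g:Group)
   WHERE g.GroupId IN $GroupId
   WITH g LIMIT $limit
   WITH collect(g) AS groups
   WHERE size(groups) > 1
   CALL apoc.refactor.mergeNodes(groups, {properties: 'discard'}) YIELD node
   RETURN size(groups) - 1 AS merged",
  {GroupId: $GroupId, limit: 10000}
)
```

The inner statement returns the number of nodes removed in that pass, so the loop repeats while nodes are still being merged. If most of the work comes from relationships attached to a few very large nodes rather than from the number of Group nodes, chunking by node count alone may not shrink each transaction much.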

Or make the heap memory significantly larger.