apoc.refactor.mergeNodes merging more nodes than it should

Hi there,

I've been going round in circles and hope someone can spot my mistake ...

I am trying to merge a series of Domain nodes that all have different IdObject values (messy domain names) but share common IdAlternative values (cleaned domain names). I am using the query below, but instead of producing hundreds of merged nodes, one per unique IdAlternative value, it merges all nodes into one single node, and I cannot figure out why ...

CALL apoc.periodic.iterate("
MATCH (n1:Domain:Invalid:Cleaned) 
WHERE NOT n1.IdAlternative is null and NOT n1.IdAlternative contains 'file extension' 
WITH n1.IdAlternative as idalternative, count(DISTINCT n1.IdObject ) as idobject_count 
WHERE idobject_count > 1 
RETURN DISTINCT idalternative", "
MATCH (n2:Domain:Invalid:Cleaned{IdAlternative:idalternative}) 
WITH collect(n2) as nodes 
CALL apoc.refactor.mergeNodes(nodes,{properties:'discard', mergeRels:true}) yield node 
WITH node 
SET node.IdObject = node.IdAlternative + '|merged' 
SET node :Merged 
WITH node 
RETURN *", 
{batchSize:1000, iterateList:true, parallel:false})

Hi,

I did not fully understand your exact requirement. However, I usually try something like the query below:

MATCH (n1:Domain:Invalid:Cleaned)
WITH n1.IdAlternative as IdAlternative , collect(n1) as nodes
CALL apoc.refactor.mergeNodes(nodes, {properties: "combine"}) YIELD node
RETURN node;
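
For what it's worth, the grouping can be checked with a read-only dry run before merging anything. This is only a sketch that reuses the labels and property names from the question; group_size is a name made up for illustration, and the LIMIT is arbitrary.

```cypher
// Read-only preview: nothing is merged or modified here.
MATCH (n:Domain:Invalid:Cleaned)
WHERE n.IdAlternative IS NOT NULL
// IdAlternative is the grouping key, so collect(n) yields one list per value.
WITH n.IdAlternative AS idalternative, collect(n) AS nodes
RETURN idalternative, size(nodes) AS group_size
ORDER BY group_size DESC
LIMIT 10;
```

If this returns many rows with sensible group sizes, the same WITH line should hand apoc.refactor.mergeNodes one list per cleaned name rather than one list for everything.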

YES! Thank you! That worked ... well, slightly changed, but it was all about the WITH statement collecting the nodes into one list per IdAlternative value for the CALL ...

MATCH (n1:Domain:Invalid:Cleaned)
WHERE n1.IdAlternative IS NOT NULL AND NOT n1.IdAlternative CONTAINS 'file extension'
// idalternative is the grouping key, so collect(n1) builds one list per cleaned name
WITH n1.IdAlternative AS idalternative, count(DISTINCT n1.IdObject) AS idobject_count, collect(n1) AS nodes
WHERE idobject_count > 1
CALL apoc.refactor.mergeNodes(nodes, {properties: 'discard', mergeRels: true}) YIELD node
WITH node
SET node.IdObject = node.IdAlternative + '|merged'
SET node:Merged
RETURN node.IdObject, node.IdAlternative, labels(node)
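
As a follow-up on why the original apoc.periodic.iterate version collapsed everything: with iterateList:true the second statement is, as far as I understand the APOC documentation, run once per batch with the batch rows unwound in front of it. A bare WITH collect(n2) AS nodes has no grouping key, so it would aggregate over every idalternative in the batch and hand mergeNodes one big list. Below is a hedged sketch of a batched variant that keeps idalternative as the grouping key. It is untested against this data, and everything except that one WITH line is taken from the question.

```cypher
// Sketch only: the key change is "WITH idalternative, collect(n2) AS nodes",
// which keeps one list of nodes per cleaned domain name within each batch.
CALL apoc.periodic.iterate("
  MATCH (n1:Domain:Invalid:Cleaned)
  WHERE n1.IdAlternative IS NOT NULL AND NOT n1.IdAlternative CONTAINS 'file extension'
  WITH n1.IdAlternative AS idalternative, count(DISTINCT n1.IdObject) AS idobject_count
  WHERE idobject_count > 1
  RETURN idalternative
", "
  MATCH (n2:Domain:Invalid:Cleaned {IdAlternative: idalternative})
  WITH idalternative, collect(n2) AS nodes
  CALL apoc.refactor.mergeNodes(nodes, {properties: 'discard', mergeRels: true}) YIELD node
  WITH node
  SET node.IdObject = node.IdAlternative + '|merged'
  SET node:Merged
  RETURN count(*)
", {batchSize: 1000, iterateList: true, parallel: false})
```

For a graph small enough to merge in a single transaction, the plain query above is simpler; the batched form is mainly useful when the merge is too large for one transaction.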