Struggling with matching two subgraphs

I have two graphs that I need to compare. I've already run the HASH to figure out that these two are different; now I need to know how different they are.

This is the data set:

merge (a:Delivery {inode:0, name:'A1', type:'Delivery',load:'Truck',compileunit:'route1'});
merge (a:Delivery {inode:1, name:'A2',type:'Delivery', load:'pallet', compileunit:'route1'});
merge (a:Delivery {inode:2, name:'A3',type:'Delivery' , load:'pallet', compileunit:'route1'});
merge (a:Delivery {inode:3, name:'A4',type:'Delivery' , load:'pallet', compileunit:'route1'});
match (a:Delivery {inode:0}), (b:Delivery {inode:1}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:1}), (b:Delivery {inode:2}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:1}), (b:Delivery {inode:3}) merge (a)-[x:drives]-(b);
merge (a:Delivery {inode:4, name:'A5',type:'Rest' , load:'coffee', compileunit:'route1'});
match (a:Delivery {inode:2}), (b:Delivery {inode:4}) merge (a)-[x:drives]-(b);
merge (a:Delivery {inode:5, name:'A6',type:'Delivery', load:'pallet', compileunit:'route1'});
merge (a:Delivery {inode:6, name:'A7',type:'Delivery' , load:'pallet', compileunit:'route1'});
merge (a:Delivery {inode:7, name:'A8',type:'Rest', load:'coffee', compileunit:'route1'});
merge (a:Delivery {inode:8, name:'A9',type:'Rest' , load:'lunch', compileunit:'route1'});
match (a:Delivery {inode:4}), (b:Delivery {inode:5}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:4}), (b:Delivery {inode:6}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:6}), (b:Delivery {inode:7}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:6}), (b:Delivery {inode:8}) merge (a)-[x:drives]-(b);
merge (a:Delivery {inode:9, name:'A10',type:'Rest' , load:'pallet', compileunit:'route1'});
match (a:Delivery {inode:3}), (b:Delivery {inode:9}) merge (a)-[x:drives]-(b);

merge (a:Delivery {inode:20, name:'A1', type:'Delivery',load:'Truck',compileunit:'route2'});
merge (a:Delivery {inode:21, name:'A2',type:'Delivery', load:'pallet', compileunit:'route2'});
merge (a:Delivery {inode:22, name:'A3',type:'Delivery' , load:'Boxes', compileunit:'route2'});
merge (a:Delivery {inode:23, name:'A4',type:'Delivery' , load:'pallet', compileunit:'route2'});
match (a:Delivery {inode:20}), (b:Delivery {inode:21}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:21}), (b:Delivery {inode:22}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:21}), (b:Delivery {inode:23}) merge (a)-[x:drives]-(b);
merge (a:Delivery {inode:24, name:'A5',type:'Rest' , load:'coffee', compileunit:'route2'});
match (a:Delivery {inode:22}), (b:Delivery {inode:24}) merge (a)-[x:drives]-(b);
merge (a:Delivery {inode:26, name:'A7',type:'Delivery' , load:'pallet', compileunit:'route2'});
merge (a:Delivery {inode:28, name:'A9',type:'Rest' , load:'lunch', compileunit:'route2'});
match (a:Delivery {inode:24}), (b:Delivery {inode:26}) merge (a)-[x:drives]-(b);
match (a:Delivery {inode:26}), (b:Delivery {inode:28}) merge (a)-[x:drives]-(b);
merge (a:Delivery {inode:29, name:'A10',type:'Rest' , load:'Boxes', compileunit:'route2'});
match (a:Delivery {inode:23}), (b:Delivery {inode:29}) merge (a)-[x:drives]-(b);

My first step is to find the nodes that do not exist in route2. While the inodes are different, the names should be the same. But in route2, names 'A6' and 'A8' are missing. For those I need the inode values of both (5, 7) so I can mark them as missing (set a.deleted = yes).

For the nodes that do match, there are differences in the loads. In route2, name 'A10' has load 'Boxes', and name 'A3' has 'Boxes' as well. So I need a list of the nodes that matched on name but differ in load (inodes 22, 29) so I can set the change flag (set b.changed = yes).

So I would like to get 2 lists.

I've tried diff and apoc but I keep getting cartesian products.

Thoughts?

Try this:

First part:

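// collect the names present in route2
match (d:Delivery) where d.compileunit = 'route2'
with collect(d.name) as d1
// collect the names present in route1
match (c:Delivery) where c.compileunit = 'route1'
with d1, collect(c.name) as c1
// keep the route1 names that have no counterpart in route2
with [n IN c1 WHERE NOT n IN d1] as listC
match (a:Delivery) where a.name in listC
return a.inode, a.compileunit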

Result:
[screenshot of the query output]
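
If the goal is to set the flag directly instead of building name lists first, a minimal sketch with OPTIONAL MATCH could look like the following. It assumes that names are unique within each route and that the string 'yes' is an acceptable flag value (the question only says a.deleted = yes); neither assumption is confirmed in this thread.

```cypher
// Sketch, not the accepted answer: flag route1 nodes with no same-named node in route2.
// Assumption: 'yes' as a string is the desired value for the deleted flag.
MATCH (a:Delivery {compileunit: 'route1'})
// Look up the same name in route2; b stays null when there is no counterpart.
OPTIONAL MATCH (b:Delivery {compileunit: 'route2', name: a.name})
WITH a, b
WHERE b IS NULL
SET a.deleted = 'yes'
RETURN a.inode, a.name
ORDER BY a.inode;
```

Because each route1 node is looked up by name, this avoids the cartesian product between the two routes. On the sample data above it should flag inodes 5 and 7.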

Second part:

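// pair each route1 node with every route2 node carrying the same load (d)
match (c:Delivery) where c.compileunit = 'route1'
match (d:Delivery) where d.compileunit = 'route2'
and d.load = c.load
// and with the route2 node that has the same name (e)
match (e:Delivery) where e.compileunit = 'route2'
and e.name = c.name
// n1: route2 names whose load also occurs in route1; n2: route2 names that also exist in route1
with collect(distinct d.name) as n1, collect(distinct e.name) as n2
// names in n2 but not in n1 carry a load that never occurs in route1
with n1, n2, apoc.coll.removeAll(n2, n1) as rmvd
// returns the nodes with those names from both routes
match(g:Delivery) where g.name in rmvd
return g.compileunit, g.inode, g.name order by g.name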

Result:

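
One caveat on the second query: it reports names whose route2 load does not occur anywhere in route1, which works here because 'Boxes' is new, but it would miss a node whose load changed to a value that some other route1 node already has (say, 'pallet' to 'coffee'). A hedged alternative that compares loads name by name and sets the flag in the same pass might look like this. The 'yes' string and the returned column aliases are assumptions for illustration.

```cypher
// Sketch: compare each route1 node with its same-named route2 node.
// Assumption: 'yes' as a string is the desired value for the changed flag.
MATCH (a:Delivery {compileunit: 'route1'})
MATCH (b:Delivery {compileunit: 'route2', name: a.name})
// Note: rows where either load is null are dropped, since null <> x is null.
WHERE a.load <> b.load
SET b.changed = 'yes'
RETURN b.inode AS route2_inode, b.name AS name,
       a.load AS route1_load, b.load AS route2_load
ORDER BY b.inode;
```

On the sample data this should return inodes 22 and 29, the two route2 nodes whose load is 'Boxes' while their route1 counterparts carry 'pallet'.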

Wow. Perfect. Thank you - I would be embarrassed to publish what I had. Thank you - amazing

You've marked your own response as the solution, @bill.dickenson? Shouldn't you have marked @ameyasoft's reply as the solution?

Damn - I am sorry - I didn't realize that applied to the comment specifically. Yes, I will fix. Thanks for the heads-up.
