Hi folks.
TL;DR: Is it possible with APOC to include a node label in labelFilter but exclude nodes with that label that are also connected to another type of node?
I have a schema that describes the relationships between projects, operations and materials. I want to write a query that finds the finalised materials that 'belong to' a particular project. This seems simple, but it is complicated by the fact that intermediate materials are sometimes 're-used' by secondary projects to produce more outputs, and I want to exclude those outputs from the material results of the initial project.
A simplified example of my problem can be created with the following:
create (p1:PROJECT:EXAMPLE{name:'p1'})
with p1
create (p2:PROJECT:EXAMPLE{name:'p2'})
with p1, p2
create (p1)<-[:BELONGS_TO]-(o1:OPERATION:EXAMPLE{name:'o1'})-[:OUT]->(m1:MATERIAL:EXAMPLE{name:'m1'})-[:IN]->(o2:OPERATION:EXAMPLE{name:'o2'})-[:OUT]->(m2:MATERIAL:EXAMPLE{name:'m2'})-[:IN]->(o3:OPERATION:EXAMPLE{name:'o3'})-[:OUT]->(m3:MATERIAL:USEFUL:EXAMPLE{name:'m3'})
create (p2)<-[:BELONGS_TO]-(o4:OPERATION:EXAMPLE{name:'o4'})<-[:IN]-(m2)
create (o4)-[:OUT]->(m4:MATERIAL:USEFUL:EXAMPLE{name:'m4'})
Note that operations produce materials and also belong to projects. Note also that material 'm2' has been re-used by project 'p2' to produce material 'm4'.
My first attempt at the query just uses apoc.path.expandConfig:
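The same structure can also be listed as rows with a query along these lines. This is only a sketch against the example data above; the column aliases (operation, project, inputs, outputs) are arbitrary names chosen here.

```cypher
// List each example operation with its project (if any) and its materials.
match (o:OPERATION:EXAMPLE)
optional match (o)-[:BELONGS_TO]->(p:PROJECT)
optional match (i:MATERIAL)-[:IN]->(o)
optional match (o)-[:OUT]->(m:MATERIAL)
// collect(distinct ...) removes the duplicates produced by combining the optional matches
return o.name as operation,
       p.name as project,
       collect(distinct i.name) as inputs,
       collect(distinct m.name) as outputs
order by operation
```

Operations with no BELONGS_TO relationship should show null in the project column.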
match (p:PROJECT{name:'p1'})<-[:BELONGS_TO]-(op:OPERATION)
call apoc.path.expandConfig(op, {relationshipFilter:'IN>|OUT>',
labelFilter:'OPERATION|MATERIAL|>USEFUL',
uniqueness: 'NODE_GLOBAL'
}) yield path
with last(nodes(path)) as usefulMaterials
return usefulMaterials.name as name
This does return 'm3', but it also incorrectly returns 'm4', the material that results from project 'p2' re-using 'm2'.
My second attempt uses a blacklistNodes approach to exclude a pre-matched set of operation nodes that belong to projects:
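To see why 'm4' shows up, it can help to return the whole route rather than just the end node. A small sketch using the same config as the first attempt; the aliases route and name are just names picked for illustration.

```cypher
match (p:PROJECT{name:'p1'})<-[:BELONGS_TO]-(op:OPERATION)
call apoc.path.expandConfig(op, {relationshipFilter:'IN>|OUT>',
  labelFilter:'OPERATION|MATERIAL|>USEFUL',
  uniqueness: 'NODE_GLOBAL'
}) yield path
// show every node visited on the way to each USEFUL end node
return [n in nodes(path) | n.name] as route,
       last(nodes(path)).name as name
```

In the example data, any route to 'm4' has to pass through 'm2' and 'o4', which is exactly the re-use by 'p2' that I want to cut off.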
match (o:OPERATION)-[:BELONGS_TO]->(p:PROJECT)
with collect(o) AS projectOps
match (p:PROJECT{name:'p1'})<-[:BELONGS_TO]-(op:OPERATION)
call apoc.path.expandConfig(op, {relationshipFilter:'IN>|OUT>',
labelFilter:'OPERATION|MATERIAL|>USEFUL',
uniqueness: 'NODE_GLOBAL',
blacklistNodes: projectOps
}) yield path
with last(nodes(path)) as usefulMaterials
return usefulMaterials.name as name
This approach correctly returns only 'm3' for 'p1', and if I swap to 'p2', it correctly returns 'm4'.
So it works, but it seems very inefficient: projectOps collects every operation that belongs to any project, so as my DB grows, that set is going to get very large.
Is it possible with APOC to include the OPERATION label in labelFilter but also exclude OPERATION nodes that connect to a project? That is, exclude any node matching
(o:OPERATION)-[:BELONGS_TO]->(p:PROJECT)
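To make the question concrete, the behaviour I am after could perhaps be approximated by filtering the returned paths instead of passing a blacklist. This is an untested sketch, not a known APOC feature: it assumes a pattern predicate is accepted inside none() on the Neo4j version in use, and it only discards paths after the expansion rather than pruning the traversal.

```cypher
match (p:PROJECT{name:'p1'})<-[:BELONGS_TO]-(op:OPERATION)
call apoc.path.expandConfig(op, {relationshipFilter:'IN>|OUT>',
  labelFilter:'OPERATION|MATERIAL|>USEFUL',
  uniqueness: 'NODE_GLOBAL'
}) yield path
with path
// tail() skips the start operation, which belongs to the project by construction;
// drop any path that passes through another node with a BELONGS_TO relationship
where none(n in tail(nodes(path)) where (n)-[:BELONGS_TO]->(:PROJECT))
return last(nodes(path)).name as name
```

Two caveats with this sketch: APOC still walks into the other project's part of the graph before the paths are thrown away, and with NODE_GLOBAL uniqueness a USEFUL node that is first reached through another project's operation would be lost even if a clean route to it exists. So I would still prefer a way to express the exclusion inside the expansion itself.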