I'm stuck on an issue and need your help to optimise a cypher query.
I have a Cypher query that pulls up a graph: match p=(f:Material)<-[*0..8]-(s:MatPlant)<-[*0..8]-(r:MatPlant)<-[*0..1]-(v:Vendor) where f.node_id in ['abc|xyz001'] and s.matl_typ in ['SEMI FINISHED GOODS','FINISHED GOODS'] and r.matl_typ in ['RAW/PKG MATERIALS','FINISHED GOODS'] return p
This returns a list of paths and the path consists of a List of Segments.
I currently run the above query from a microservice and parse the response to extract the nodes and relationships, then build a nested JSON whose structure mirrors the graph that the Neo4j browser renders for the same query.
The issue is that the paths share many segments, so the response as a whole contains a lot of duplicate nodes and relationships and can bloat to more than a GB.
What I need is to get all the distinct nodes and relations from all the paths combined.
I think you are getting multiple paths between the same 'f' and 'v' nodes because of the query segment '[*0..8]-(s:MatPlant)<-[*0..8]': the paths run through the same nodes but bind a different node to 's'. Are there multiple MatPlant nodes between the 'f' and 'r' nodes, more than one of which satisfies the predicate below? If so, you will get a separate path for each matching MatPlant (which will be represented by the 's' node in the result).
s.matl_typ in ['SEMI FINISHED GOODS','FINISHED GOODS']
Also, a zero-length path means both ends of the pattern are the same node, so it cannot match between two different labels unless a single node carries both of them. The pattern below will therefore not match with a length of zero; that would effectively be the same as having no relationship between the two nodes. As such, I simplified the query to match exactly one hop to the vendor node.
(r:MatPlant)<-[*0..1]-(v:Vendor)
I think the following refactored query gives the same results and eliminates the duplicates. Sorry, I don't have any test data to verify that it actually does so. Give it a try and let me know if it does not work and what the issues were, so I can try to resolve them. Test data would help.
match (f:Material) where f.node_id in ['abc|xyz001']
match p=(f)<-[*1..8]-(r:MatPlant)<--(v:Vendor)
where r.matl_typ in ['RAW/PKG MATERIALS','FINISHED GOODS'] and
any(i in nodes(p) where i:MatPlant and i.matl_typ in ['SEMI FINISHED GOODS','FINISHED GOODS'] )
return p
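If the returned paths still share segments, the nodes and relationships can be collected once each while parsing the response in the microservice. Here is a minimal sketch of that idea in Python. The language is an assumption (the post does not say what the service is written in), and so are the data shapes: each path is taken to be a list of (start node, relationship, end node) segments, and every node and relationship is a plain dict with a unique "id" key. The name distinct_graph and the sample ids are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed shapes (not taken from the original post):
# a node or relationship is a dict with a unique "id";
# a segment is (start_node, relationship, end_node); a path is a list of segments.
Segment = Tuple[dict, dict, dict]
Path = List[Segment]


def distinct_graph(paths: Iterable[Path]) -> Tuple[Dict[str, dict], Dict[str, dict]]:
    """Collect every node and relationship once, keyed by its id."""
    nodes: Dict[str, dict] = {}
    rels: Dict[str, dict] = {}
    for path in paths:
        for start, rel, end in path:
            # setdefault keeps the first copy seen and ignores later duplicates
            nodes.setdefault(start["id"], start)
            nodes.setdefault(end["id"], end)
            rels.setdefault(rel["id"], rel)
    return nodes, rels


# Two hypothetical paths that share the first segment (m <- mp1)
m = {"id": "m", "label": "Material"}
mp1 = {"id": "mp1", "label": "MatPlant"}
v1 = {"id": "v1", "label": "Vendor"}
v2 = {"id": "v2", "label": "Vendor"}
r1, r2, r3 = {"id": "r1"}, {"id": "r2"}, {"id": "r3"}

path_a = [(m, r1, mp1), (mp1, r2, v1)]
path_b = [(m, r1, mp1), (mp1, r3, v2)]

nodes, rels = distinct_graph([path_a, path_b])
print(len(nodes), len(rels))  # 4 3
```

This only trims what the service holds after parsing; it does not reduce the size of the response that Neo4j sends.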