Hi, I have a tiny database that contains dependencies between a SQL DB objects. I have three types of nodes: all nodes have the 'objects'
label, then they are split into 'procedures'
and 'tables'
labels. We are only interested in the 'objects'
label. Type of relationships also do not matter, only their direction.
All the nodes have a 'schema' property which we need for this query.
What I want Cypher to return are all the nodes from a certain schema and all their optional ancestors, regardless of the number of hops. It must include isolated nodes from the selected schema, as well.
In other words, starting from each node with the specified value for the 'schema' property, it must recursively traverse upwards against the relationships' direction, until it reaches nodes without inbound relationships (roots).
Seems to be quite a basic pattern, but for some reason I can't find similar posts with a good solution.
Per my example, I have 70 nodes in the selected schema, and the result set should contain 155 nodes.
I tried so far every pattern that seemed reasonable, the only one that returns at least somewhat close results to the correct ones was something like this:
MATCH p=(o:objects{schema:'val'})<-[*0..6]-(a:objects)
WITH *, RELATIONSHIPS(p) AS rels
RETURN o, rels, a;
This gives a graph with 150 nodes, instead of the expected 155. If I increase the depth, the number of nodes returned actually starts to decrease (how can it decrease at larger depths I do not understand).
While something like this:
MATCH p=(o:objects{schema:'val'})<-[*0..]-(a:objects)
WITH *, RELATIONSHIPS(p) AS rels
RETURN o, rels, a;
returns only 19 nodes. The only thing that I changed is relationship depth from [*0..6] to [*0..]. Seems that what it returns is just one path related to the first node from the subset, as per the input file (and id/index order, I guess). So how do I make it iterate over all 70ish paths and combine them into one ? Seems I need to use FOREACH or UNWIND somehow, but the docs are too basic and brief and do not cover any real life scenarios.
This variant returns the same result as above, with only one path.
MATCH (o:objects{schema:$param})
WITH COLLECT(o) as obj
MATCH p=(a:objects)-[*0..]->(c:objects)
WHERE c in obj
RETURN p
And another unsuccessful try that again returns just the first path as per first node of the input subset:
MATCH p=(o:objects{schema:$param})
WITH COLLECT(o) as obj
UNWIND obj as object
MATCH r=(object)<-[*]-()
RETURN r;
What I get is this:
What I need to get is this (obtained by starting with schema nodes and manually expanding relationships in Bloom), nodes colors are the various schemas, the pink one is my active schema in this case:
I use Neo4j Desktop 1.4.1 with a 4.2.3 database on a 64GB ThinkPad with performance configs as per memrec (about 25gb for heap and 28gb for pagecache). I run all of the above from Bloom search phrases as the Browser is not even able to return a result for most of these scripts, it just runs and runs forever (which seems ridiculous for a 239 nodes and 469 relationships database).
Btw, I had no issues implementing a similar pattern, when there is just one starting (anchor) node:
MATCH r=(p:objects {name: $param1, schema: $param2})-->(c)
WITH c
MATCH v=(c)<-[*]-()
RETURN v;
Here I go one hop down to grab the children of my input node, then recursively up to grab all the ancestors. Of course, here I have at most one connected graph.