How do I collect all of the nodes in a subgraph

bill_dickenson · October 17, 2020, 6:54pm

I have a very large graph that has one root node that fans out into multiple nodes. I need to get a list of ALL of the nodes in the subgraph. Other than the label, they don't have a common attribute. They are only connected by the "connections".

Looking at my very simplified graph, I wanted to start from B

MATCH (n {label:'B'}) and get a list of all the subnodes [E,C,H,F]

or Start with D and get [G] or start with C and get [H,F]

I thought this would get me where I needed to go

But I can't get the results.

This did not work either. The Label clause below is the equivalent of B.

match p=(:ProgNode {Label:'nonCompliantCwe1045()'})-[*0..]->(n) RETURN n

It successfully finds the root, but it won't walk the set underneath that.

The web interface does let me do it manually by expanding each node, but I need an automated way.

ameyasoft · October 17, 2020, 11:52pm

Try this:

MERGE (a1:Test {label: "A"})
MERGE (a2:Test1 {label: "B"})
MERGE (a3:Test2 {label: "C"})
MERGE (a4:Test3 {label: "D"})
MERGE (a5:Test4 {label: "E"})
MERGE (a6:Test5 {label: "F"})
MERGE (a7:Test6 {label: "G"})
MERGE (a8:Test7 {label: "H"})

MERGE (a1)-[:CONNECTS]->(a2)
MERGE (a2)-[:CONNECTS]->(a3)
MERGE (a2)-[:CONNECTS]->(a5)

MERGE (a3)-[:CONNECTS]->(a6)
MERGE (a3)-[:CONNECTS]->(a8)

MERGE (a1)-[:CONNECTS]->(a4)
MERGE (a4)-[:CONNECTS]->(a7)

RETURN *

Result:
[Screenshot of the graph created by the statements above]

//relationshipFilter......

MATCH (a:Test1)
CALL apoc.path.subgraphNodes(a, {relationshipFilter:'CONNECTS>'}) 
YIELD node
RETURN node


//labelFilter......

MATCH (a:Test1)
CALL apoc.path.subgraphNodes(a, {labelFilter:'-Test'}) 
YIELD node
RETURN node

Both produce the same results.

[Screenshot of the query result]

mishragautam1994 · October 18, 2020, 10:20am

@ameyasoft
I think he mentioned that the labels are the same for all nodes in the graph.
Why are you creating different labels?

ameyasoft · October 18, 2020, 5:13pm

The solution works whether all nodes share one label or each node has its own label. With the same label 'Test' on every node it still works. The only difference is that you have to match the start node on a node property.

MATCH (a:Test {label: "B"}) 
CALL apoc.path.subgraphNodes(a, {relationshipFilter:'CONNECTS>'}) 
YIELD node
RETURN node

You get the desired result with all nodes displayed with one color!

bill_dickenson · October 20, 2020, 5:13pm

Thank you! That solution worked perfectly.

andrew_bowman · October 20, 2020, 8:19pm

I am curious why the above didn't work for you. Provided that outgoing relationships are used to connect parent to child, and since it matched to at least the starting node here, it should have also returned all nodes beneath it in your tree.

With a tree like this, where each node only has a single parent and there are no loops, there's no real reason to use subgraphNodes(), unless you have some complex case where the various filters can be useful.

The solution proposed above using subgraphNodes() can be mimicked just as easily with:

MATCH (a:Test {label: "B"})-[:CONNECTS*0..]->(node)
RETURN node
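To make concrete what the variable-length pattern collects, here is a minimal sketch in plain Python rather than Cypher. It is an in-memory model of the sample graph from earlier in the thread, not the Neo4j API; the names EDGES, descendants, and include_start are assumptions made for illustration. It follows only outgoing edges, the way -[:CONNECTS*0..]-> does, and a lower bound of 0 is what makes the start node itself part of the result.

```python
from collections import deque

# Edges copied from the MERGE statements in the sample graph (parent -> child).
EDGES = [
    ("A", "B"), ("B", "C"), ("B", "E"),
    ("C", "F"), ("C", "H"),
    ("A", "D"), ("D", "G"),
]

def descendants(start, edges, include_start=True):
    """Return nodes reachable from start along outgoing edges, in BFS order.

    include_start=True mimics a *0.. lower bound (start node included);
    include_start=False mimics *1.. (start node excluded).
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    seen = {start}          # guards against revisiting nodes if loops exist
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order if include_start else order[1:]

if __name__ == "__main__":
    print(descendants("B", EDGES, include_start=False))
    print(descendants("D", EDGES, include_start=False))
    print(descendants("C", EDGES, include_start=False))
```

On a tree each node is reached by exactly one path, so the pattern match and this sketch agree. On a graph with several paths to the same node, the Cypher pattern returns one row per path, whereas the sketch visits each node once, which is closer to what subgraphNodes() or RETURN DISTINCT gives.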

bill_dickenson · October 20, 2020, 9:14pm

It worked on my test case, but another Neo4j expert who was working on it (Cobra) said it did not work. I will ask him to update this thread so you can understand.

I was disappointed in the final solution so if this could be made to work, I would be wickedly pleased.

andrew_bowman · October 20, 2020, 10:35pm

It would help to see the actual query. If the needs are as simple as the example, then the most common reasons why it wouldn't work include:

Typos in the query
Bad data in the database that does not match the query (for example, a bad import that left property keys with leading or trailing whitespace)
Relationship direction in the query mismatching against what's actually in the database

If the query returns no results, a PROFILE plan of the query may provide clues as to what's going wrong. Specifically, you could look for where rows between operators drop to 0, meaning the operation just before that drop resulted in all rows being filtered out.
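Two of the causes listed above, relationship direction and stray whitespace in property keys, can be illustrated with a small sketch in plain Python. This is a hypothetical in-memory stand-in for the sample graph, not the Neo4j API, and every name in it (EDGES, reachable, match_by_property, suspicious_keys, the imported list) is an assumption made for illustration.

```python
# Edges copied from the sample graph earlier in the thread (parent -> child).
EDGES = [
    ("A", "B"), ("B", "C"), ("B", "E"),
    ("C", "F"), ("C", "H"),
    ("A", "D"), ("D", "G"),
]

def reachable(start, edges, direction):
    """Nodes reachable from start following edges 'out', 'in', or 'both' ways."""
    neighbours = {}
    for parent, child in edges:
        if direction in ("out", "both"):
            neighbours.setdefault(parent, set()).add(child)
        if direction in ("in", "both"):
            neighbours.setdefault(child, set()).add(parent)
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen - {start}

def match_by_property(nodes, key, value):
    """Exact-key lookup, the way a property match in a query behaves."""
    return [n for n in nodes if n.get(key) == value]

def suspicious_keys(nodes):
    """Property keys that carry leading or trailing whitespace."""
    return sorted({k for n in nodes for k in n if k != k.strip()})

if __name__ == "__main__":
    # Same start node, three different answers depending on direction.
    for direction in ("out", "in", "both"):
        print(direction, sorted(reachable("B", EDGES, direction)))

    # Hypothetical bad import: the first node's key has a trailing space.
    imported = [{"label ": "B"}, {"label": "C"}]
    print(match_by_property(imported, "label", "B"))
    print(suspicious_keys(imported))
```

From the same start node, following the edges outward gives only the descendants, following them inward gives only the ancestors, and ignoring direction gives the rest of the connected graph. That is why a query with the wrong arrow can return far too little or far too much.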

Cobra · October 21, 2020, 6:45am

At least give the whole problem.

It worked on my test case, but another Neo4j expert who was working on it (Cobra) said it did not work.

This query was returning the whole graph, 0 nodes, or 3 nodes, depending on the relationship direction:

MATCH (:ProgNode {Label: 'nonCompliantCwe1045()'})-[*]->(n)
RETURN n

This query was returning the graph part he was interested in:

MATCH (nc:ProgNode {compileunit: "Cwe1057.java"})-[*]->(i:ProgNode)
WHERE nc.Label STARTS WITH "public void nonCompliant"
RETURN i

The difference between these two queries is the start node: they do not start from the same node, which is why the first query did not work for your use case but the second does. Since I don't know the use case in depth, I could not know there was another property I could use to discriminate the subgraph from the whole graph.

bill_dickenson · October 21, 2020, 1:53pm

Not correct, but let's drop it.
