Remove isolates within subgraph, only return connected component

julie.fisher · November 12, 2019, 2:14pm

I have a graph that is essentially bimodal, like the marvel database - Heroes connected to Comics and vice versa. No connections can be created between Heroes nor from Comic to Comic.

I'm working on a graph visualization and trying to remove noise. I have a list of heroes and want to return all the comics they're in. However, I only want the ones where the hero is connected via a comic to another hero within the list.

My solution thus far has been a pattern match with two where clauses and an OPTIONAL MATCH:

MATCH (c:Comic)-[:APPEARED_IN]-(:Hero)-[:APPEARED_IN]-(c1:Comic)
WHERE c.name in ["AVF 5", "COC 1", "H2 251", "H2 252", "M/PRM 35", "M/PRM 36", "M/PRM 37", "WI? 9", "AVF 4", "CM 51", "Q 14", "Q 16", "T 208", "T 214", "T 215", "T 216", "T 440", "CA3 36", "SLEEP 19", "SLEEP 1", "SLEEP 2", "W2 52", "W2 53", "XFOR 108", "XFOR 109"]
AND c1.name in ["AVF 5", "COC 1", "H2 251", "H2 252", "M/PRM 35", "M/PRM 36", "M/PRM 37", "WI? 9", "AVF 4", "CM 51", "Q 14", "Q 16", "T 208", "T 214", "T 215", "T 216", "T 440", "CA3 36", "SLEEP 19", "SLEEP 1", "SLEEP 2", "W2 52", "W2 53", "XFOR 108", "XFOR 109"]
AND id(c) <> id(c1) 
WITH distinct c
OPTIONAL MATCH (c)-[r:APPEARED_IN]-(h:Hero)
RETURN c, r, h

This has worked with a list of around 100. However, I tried a list of over 2,000 and killed the connection.

Is there a more elegant, optimized way to do this?

The profile plan looks like this:

andrew_bowman · November 12, 2019, 5:34pm

Just to note, the PROFILE plan won't be of much use unless you expand all elements of the plan so we can see the details of each operation.

In addition, your query doesn't seem to have a list of heroes but a list of comics. If you already have the comics you're interested in, then you only need half the pattern: MATCH (c:Comic)-[:APPEARED_IN]-(h:Hero) WHERE c.name in ... which will avoid the cartesian product of heroes appearing in the comics from your previous pattern.

julie.fisher · November 12, 2019, 8:16pm

Sorry Andrew, I'm using the Marvel data set because it's public and initially assumed I could do it Hero to Hero. When I was writing the example a hero was connected to too many comics so I had to flip the example. I'm still going for the same concept.

The example code finds comics in a list of names where there is a connection to another comic in the list via a hero, then returns those comics and all heroes that appear in the comic.

So if we had the below graph (blue is a comic, beige is a hero), we'd be trying to exclude 9 and 10 because they're not connected to any of the other comics via a Hero. Then we'd want to return all Heroes that showed up in each of the comics.

22%20PM

Here's the expanded PROFILE plan.

andrew_bowman · November 12, 2019, 9:32pm

Got it.

In that case we can use the approach I suggested, collecting the comics in question and using that list as a filter:

WITH ["AVF 5", "COC 1", "H2 251", "H2 252", "M/PRM 35", "M/PRM 36", "M/PRM 37", "WI? 9", "AVF 4", "CM 51", "Q 14", "Q 16", "T 208", "T 214", "T 215", "T 216", "T 440", "CA3 36", "SLEEP 19", "SLEEP 1", "SLEEP 2", "W2 52", "W2 53", "XFOR 108", "XFOR 109"] as comicNames // use parameters if possible
MATCH (c:Comic)
WHERE c.name IN comicNames
WITH collect(c) as comics
UNWIND comics as c
MATCH (c)-[r:APPEARED_IN]-(h:Hero)-[:APPEARED_IN]-(c2)
WHERE c2 IN comics
RETURN DISTINCT c, r, h

Topic		Replies	Views
Filtering nodes to not include those linked from the original collection Cypher	10	1379	November 29, 2018
Filter by relationship property and then connect unconnected nodes/subgraphs Cypher cypher	3	428	August 24, 2021
Finding all disconnected subgraphs in a graph Graph Algorithms/Graph Data Science	4	2542	October 15, 2020
Optimizing a query with a subgraph/subquery, only look at specific nodes Neo4j Graph Platform performance , cypher	4	395	June 10, 2021
Visualize entire graph but with one condition filter Neo4j Graph Platform	2	211	February 24, 2022

August Summer Fun!

Remove isolates within subgraph, only return connected component

Related topics