Get nodes connected to two different sub-graphs, efficiency question

ukirik · July 9, 2020, 1:38pm

The idea is to check whether a node is related to any of the many nodes under two different hierarchies. Imagine you are interested in finding diseases that ail both cats and dogs, but not other mammals like primates or rodents. It's a silly example, I know.

I have a query that is conceptually something like this:

MATCH p1=(x: Animal)-[:SUB_CLASS_OF*0..3]->(dog: Animal {species: "Canis familiaris"})
WITH nodes(p1) as sub_dog
MATCH p2=(y: Animal)-[:SUB_CLASS_OF*0..3]->(cat: Animal {species: "Felis catus"})
WITH sub_dog, nodes(p2) as sub_cat
MATCH (a1:Animal)-[r1]-(d:Disease)-[r2]-(a2:Animal)
WHERE a1 IN sub_cat AND a2 IN sub_dog
RETURN a1,r1,d,r2,a2

The query above takes forever, since there are many different species, many diseases, and MANY different ways (i.e. relationship types) in which diseases might be associated with animals. (Animals also have many different types of relationships with other entities in this large graph, which has about 3M nodes and 50M edges in total.)

I am primarily interacting with the graph via Neo4j Browser. On Safari, running this query crashes the page, which promptly reloads, so it never runs to completion. On Chrome it runs for over 10-15 minutes with the computer close to full CPU capacity.

I take that as a clear sign that the query isn't very well written. Any suggestions on how to tackle this?

mdfrenchman · July 9, 2020, 7:35pm

Try

MATCH (:Animal {species: "Canis familiaris"})<-[:SUB_CLASS_OF*0..3]-(dog:Animal)-[r1]-(d:Disease)-[r2]-(cat:Animal)-[:SUB_CLASS_OF*0..3]->(:Animal {species: "Felis catus"})
RETURN dog, r1, d, r2, cat

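If you can restrict r1 and r2 to a specific relationship type and direction, something like -[:HAS_DISEASE {transmission}]->, that could make it more efficient, since far fewer relationships have to be expanded from each animal.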
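
To make that suggestion concrete, here is a rough, untested sketch. It assumes the Animal label and species property from the question, and a hypothetical HAS_DISEASE relationship type pointing from Animal to Disease (swap in whatever types your graph actually uses). It also collects each sub-hierarchy once with collect(DISTINCT ...) instead of carrying one row per path, which should keep the intermediate row count down.

```cypher
// Sketch only. HAS_DISEASE and its direction are assumptions, not taken from the real graph.

// 1. Collect each sub-hierarchy once, de-duplicated.
//    With *0..3 the root species node itself is included in each list.
MATCH (x:Animal)-[:SUB_CLASS_OF*0..3]->(:Animal {species: "Canis familiaris"})
WITH collect(DISTINCT x) AS sub_dog
MATCH (y:Animal)-[:SUB_CLASS_OF*0..3]->(:Animal {species: "Felis catus"})
WITH sub_dog, collect(DISTINCT y) AS sub_cat

// 2. Start from the dog-side nodes, follow only the typed relationship
//    to diseases, and keep diseases that are also linked to a cat-side node.
UNWIND sub_dog AS a2
MATCH (a2)-[r2:HAS_DISEASE]->(d:Disease)<-[r1:HAS_DISEASE]-(a1:Animal)
WHERE a1 IN sub_cat
RETURN a1, r1, d, r2, a2
```

Prefixing the query with EXPLAIN (plan only) or PROFILE (runs the query and reports rows and db hits per operator) in Neo4j Browser should show whether the typed relationship actually cuts down the expansion.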

Hope that helps. Full disclosure, I didn't test or profile this, just off the cuff while taking a break.
Cheers!
Mike
