Achieving longestPath Using Cypher

cypher
longestpath
knowledge-base

(David Gordon) #1

While Cypher is optimized for finding the shortest path between two nodes, with such functionality as shortestPath(),
it does not have the same sort of function for longest path. In some cases, you may want this, and not the shortest route.

For example, if you wanted to do the longest path between a parent entity and a child (leaf node), and you don't know how many
there are in between:

MATCH p=(parent:Entity)-[r:HAS_CHILD*1..10]->(child:Person)
RETURN p;

The problem with this is that it will give you all paths, one for each hop, until you get to the leaf node. What you want is just
the longest path to see how the parent and leaf child are connected. To do this efficiently, do the following:

MATCH p=(parent:Entity)-[r:HAS_CHILD*1..10]->(child:Person)
WHERE size( (child)-[r:HAS_CHILD]->() ) = 0
RETURN p;

What the above query is doing: The variable length 1..10 will find all paths, including the longestPath, for any Parent-Child path that spans at most 10 hops. The WHERE clause is needed to filter the paths to only those where the leaf child nodes have no outgoing :HAS_CHILD relationships (i.e. it finds the end of the chain).


(Tony Panza) #2

Doesn't this still find all paths (of at most 10 hops) that happen to contain a leaf node? Not necessarily the longest.

Wouldn't all of these paths, for example, be matched?

(root)-[:HAS_CHILD]->(mid1)-[:HAS_CHILD]->(mid2)-[:HAS_CHILD]->(leaf)
(mid1)-[:HAS_CHILD]->(mid2)-[:HAS_CHILD]->(leaf)
(mid2)-[:HAS_CHILD]->(leaf)

(Andrew Bowman) #3

Correct, the paths returned will be from the starting node to each leaf. If you only want the single longest path to a leaf node, you would need to order by the length of the path:

// assuming you want to start at all :Entity nodes, otherwise needs additional filtering
MATCH p=(parent:Entity)-[r:HAS_CHILD*1..10]->(child:Person) 
WHERE size( (child)-[r:HAS_CHILD]->() ) = 0 
WITH p
ORDER BY length(p) DESC
LIMIT 1
RETURN p;