All shortest paths between a set of nodes

cypher
paths
unwind
shortest-path
knowledge-base

(Michael Hunger) #1

Consider a number of arbitrary nodes, A,B,C,D,E,F,.....

I wish to return all of the shortest paths between these nodes.
The nodes may have many edges between them, but anticipate a maximum of 4.
The graph is complex and non hierarchical (if this makes sense - any node may point to any other node).
A typical node has the form: match (n:Entity { name: 'xyz' })

How would I write the match expression to return the shortest paths between the above nodes, in no specific order?

Solution

  1. Find the set of nodes using an indexed lookup operation
  2. Collect them into a list
  3. Unwind the list twice, once for every side of the path
  4. Remove inverse pairs by id comparison
  5. match and return the paths
MATCH (n:Entity) where n.name IN {names}
WITH collect(n) as nodes
UNWIND nodes as n
UNWIND nodes as m
WITH * WHERE id(n) < id(m)
MATCH path = allShortestPaths( (n)-[*..4]-(m) )
RETURN path

(Tenkaleakshay94) #2

Thank you for the post Michael. I have similar scenario where i have to get all shortest path between nodes.
But, I have to list of 10000 nodes of type label1 and I have to find shortest path from these label1 nodes to list of label2 nodes having 10-12 nodes.
I have tried similar cypher but it is too much time consuming, it is taking more than 15 min cpu time. are there any parallel programming or threading technique to reduce cpu time consumption.


(Andrew Bowman) #3

Just to make sure we understand, do you need the shortest path from each of those 10k label1 nodes to every single one of those label2 nodes? Or do you just need the shortest path from each of the label 1 nodes to the nearest of one of those label2 nodes?

If you need the first, shortest path to every label2 node, then that's going to be working with about 100k rows for each relevant pair, executing the shortest path 100k times total. That may take some time.

If you need the second, shortest path to only the single nearest label2 node, then you need a different approach, and should probably use APOC path expander procs to do this, which should be much more efficient.


(Tenkaleakshay94) #4

I need second result. could you please provide any helping documents for same.


(Andrew Bowman) #5

For that approach (shortestPath from each of the starting nodes to the closest node within a pre-matched set) you can use APOC path expander procedures. We have the concept of endNodes, where you can supply a list of possible nodes that you want to expand to and end upon, and we can also supply a limit for the number of paths to return per procedure call. We'll use apoc.path.spanningTree() here, which uses a type of traversal uniqueness such that a single node is never visited more than once, and it also uses breadth-first for expansion. If you only needed nodes instead of paths, then you could use apoc.path.subgraphNodes() instead.

Here's an example of usage:

// first match to and collect end nodes
MATCH (n:label2)
WITH collect(n) as endNodes
MATCH (n:label1)
// proc call will be executed per n node, finding the first shortest path found from n to one of the end nodes
CALL apoc.path.spanningTree(n, {endNodes:endNodes, limit:1}) YIELD path
RETURN n, path as shortestPath