Return relationships traversed, not paths from shortest path

mike.webster · March 5, 2020, 6:12am

Hi all,
I'm just starting using neo4j, and I'm having a few problems finding all the details of what information is able to be extracted from paths in neo4j.

In particular, for the shortestPath and kShortestPaths algorithms, I seem to only be able to extract the nodes from the path, not the relationship elements.

The query I'm using is:

MATCH (start {REFERENCE:'NB7864'}), (end {REFERENCE:'NB4642'})
CALL algo.shortestPath.stream(start, end, 'weight')
YIELD nodeId, cost
RETURN algo.asNode(nodeId), cost

Or,
MATCH (start {REFERENCE:'NB7864'}), (end {REFERENCE:'NB4642'})
CALL algo.kShortestPaths.stream(start, end, 5, 'weight', {path:true})
YIELD path, costs
RETURN path, costs

But I'm not sure how to extract the edges from the returned path or values.

Thanks for any help.

Michael Webster

paulare · March 5, 2020, 6:12pm

Hi, @mike.webster maybe you could provide sample what output are you aiming for? Do you want to get relationships between nodes in this path?

apoc.algo.cover(nodes) yield rel - returns all relationships between this set of nodes

MATCH (start {title: 'The Birdcage'}), (end {title: 'The Matrix'}) 
CALL algo.shortestPath.stream(start, end)
YIELD nodeId
WITH collect(nodeId) as nodeList
CALL apoc.algo.cover(nodeList) 
YIELD rel
RETURN rel

mike.webster · March 6, 2020, 12:06am

Yes, that's just what I want, thanks. I'm just trying to adapt data in a GIS db about fibre optics, and I need to map back to the fibre cables that represent the links between nodes, rather than just the nodes themselves.

Thanks very much.

mike.webster · March 6, 2020, 12:30am

Actually - it's not exactly what I want, as it seems to return all edges from the nodes in the path, not the edges used in the path.

What I'd like ideally is that I would have returned:

node1 -> edgeTaken -> node2 -> edgeTaken -> ... lastNode

I notice the graphical view shows that, but I'm finding it hard to get back a textual representation of it.

paulare · March 6, 2020, 6:49am

So you want node-relationship-node-... ?

http://neo4j-contrib.github.io/neo4j-apoc-procedures/3.5/nodes-relationships/path-functions/
apoc.path.elements(path) returns a list of node-relationship-node-…

MATCH (start {title: 'The Birdcage'}), (end {title: 'The Matrix'}), path = shortestpath((start)-[*]-(end))
RETURN apoc.path.elements(path)

╒══════════════════════════════════════════════════════════════════════╕
│"apoc.path.elements(path)"                                            │
╞══════════════════════════════════════════════════════════════════════╡
│[{"title":"The Birdcage","tagline":"Come as you are","released":1996},│
│{"summary":"Slapstick redeemed only by the Robin Williams and Gene Hac│
│kman's stellar performances","rating":45},{"name":"Jessica Thompson"},│
│{"summary":"An amazing journey","rating":95},{"title":"Cloud Atlas","t│
│agline":"Everything is connected","released":2012},{},{"name":"Lana Wa│
│chowski","born":1965},{},{"title":"The Matrix","tagline":"Welcome to t│
│he Real World","released":1999}]                                      │
└──────────────────────────────────────────────────────────────────────┘

jaini.kiran · March 9, 2020, 10:12am

MATCH (start {REFERENCE:'NB7864'}), (end {REFERENCE:'NB4642'})
CALL algo.kShortestPaths.stream(start, end, 5, 'weight', {path:true})
YIELD index, nodeIds, costs
MATCH path = (a)-[r]-(b)
where id(a) in nodeIds and id(b) in nodeIds and r.weight in costs
with distinct(r) as rls
return rls

I hope this helps.

mike.webster · March 9, 2020, 11:33pm

Thanks very much for that. I don't think I can use the builtin shortest path for a weighted graph, and I can't get the path = shortestPath(...) syntax to work with the algo.shortestPath method.

I think jaini's solution below helps.

andrew_bowman · March 9, 2020, 11:41pm

Note also that your Cypher doesn't have any labels, so you're performing 2 AllNodesScans of all nodes in db to find your start and end nodes, which will get more and more expensive as your graph grows:

So use labels, and ensure you have indexes such that the EXPLAIN plan of your query shows index lookups are being used for each node.

mike.webster · March 9, 2020, 11:45pm

Thanks jaini, I can run that query, and it does extract the paths. I suspect you don't need distinct in the query, because that's guaranteed by the algorithm?

In any case, I've tried modifying it to get a format like this:

n1 , rel1, n2
n2, rel2, n3
.
.
n(k-1), rel(k-1),nk

with this query:

CALL algo.kShortestPaths.stream(start, end, 0, 'weight', {path:true})
YIELD index, nodeIds, costs
MATCH path = (a)-[r]->(b)
where id(a) in nodeIds and id(b) in nodeIds and r.weight in costs
//with distinct(r) as rls, a, b
return a, r, b

But the ordering seems out.

BTW: Does the MATCH path = (a)-[r]-(b) part of the query match back to the path returned by the shortestPaths algorithm, or is it matching against the entire graph? Because that could be problematic.

Thanks.

mike.webster · March 9, 2020, 11:59pm

Actually, I put an index on REFERENCE that I hoped would do the trick.

mike.webster · March 10, 2020, 12:00am

I'm now thinking I need to look into the cypher projection, and nodeQuery and relationshipQuery to solve this issue.

mike.webster · March 10, 2020, 12:04am

Just another question relating to using labels - I have several elements in the domain all of which can be in a path - namely buildings, enclosures, hubsites - and probably few others later.

Is it worthwhile creating a separate label that encompasses all those other labels? That would mean I could use a label in the query above, something like 'FibreTerminationPoint'.

andrew_bowman · March 10, 2020, 12:30am

mike.webster:

CALL algo.kShortestPaths.stream(start, end, 0, 'weight', {path:true})
YIELD index, nodeIds, costs
MATCH path = (a)-[r]->(b)
where id(a) in nodeIds and id(b) in nodeIds and r.weight in costs
//with distinct(r) as rls, a, b
return a, r, b

But the ordering seems out.

The ordering isn't right here, what you're doing in this query is doing the shortest path streaming, and then for each row result, performing a MATCH of a path for every occurrence of the pattern across all nodes of your graph, that will have terrible performance and won't give you what you want as a result.

Using labels is a good start for your start and end nodes, but since your intent is to lookup specific nodes by a property, you need to create indexes so your node lookups are quick.

As for your shortestPath calls, it looks like algo.kShortestPaths.stream() can YIELD a path variable, maybe that will work for you.

Do you have weight properties that need to be evaluated, or are your shortest paths only with respect to the length of the paths and don't need to take any kind of property weight into account? If the latter, there are other approaches we can use, some of which are good at filtering by label.

mike.webster · March 10, 2020, 12:49am

Yes - I don't think that solution is correct. I suspect I'm rematching against the entire graph, which means I could pick up extraneous relations.

I've tried returning path, but all it seems to contain is the cost property which I'm passing in as weight.

I do need to use some kind of 'weight' value, so the graph algorithm versions of shortestPath seem appropriate.

My current theory is that I need to use cypher projection and nodeQuery and relationshipQuery to help achieve what I'm after. Later on I will also want to put some further constraints on the paths that are followed, such as looking at parameters of nodes that indicate a measure of congestion.

I had hoped I might be able to pass a function into shortestPath to calculate a weight at runtime, but I haven't seen how to do this - maybe the nodeQuery and relationshipQuery parameters can help with this.

Possibly I can write a server extension if I can't find anything else to do what I want.

Currently I'm indexing every node based on it's REFERENCE field, eg. NB7864.

Thanks for the suggestions - I'm trying to very rapidly come up to speed.

andrew_bowman · March 10, 2020, 1:06am

Just keep in mind for indexes, your pattern MUST contain the label for which the index is under, as well as a predicate on the indexed property. You can use an EXPLAIN of the plan to double-check that an index lookup is being used, and not a more expensive NodeByLabelScan or AllNodesScan.

jaini.kiran · March 10, 2020, 6:27am

Yes. You are right. No need to write that distinct line in the query. My mistake.
Actually, i came through same problem earlier. Then i solved it like above.

mike.webster · March 12, 2020, 1:49am

Finally found that the apoc shortest path algorithm returns the relationship information I want:

MATCH (start {REFERENCE:'NB7864'}), (end {REFERENCE:'NB4642'})
CALL apoc.algo.dijkstra(start, end, 'FibreConnection','weight')
YIELD path, weight
RETURN path, weight

Topic		Replies	Views
How to retrieve relationship properties from shortest path algorithm? Graph Data Science / Graph Analytics shortest-path	13	1908	March 19, 2021
How to extract relationships from shortest path algorithm Graph Data Science / Graph Analytics shortest-path	3	1806	March 21, 2020
How to get shotest path according to relationships property Procedures & APOC operations	14	1407	May 28, 2021
Return a relationship Label in shortest Path Neo4j Graph Platform migrated	5	233	September 23, 2022
Is there an option to get the relationship properties on a shortest path Graph Data Science / Graph Analytics shortest-path	4	1322	August 29, 2023

Return relationships traversed, not paths from shortest path

Related topics