How to find shortest path limiting the distance from source?

lsnd · June 18, 2022, 4:27pm

I started to learn Neo4j a few days ago.

I'm using it to find best path and make some analyzes.

The logic is a Person (id, name) can go to a Restaurant (id, name) via some Street (id, name). The connection between them have a cost. PS: All streets have a connection between them. For example:

(Person {id: 1})-[CONNECTION {cost:10}]->(Street {id: 1})
(Person {id: 1})-[CONNECTION {cost:11}]->(Street {id: 2})
(Street {id: 1})-[CONNECTION {cost:4}]->(Street {id: 2})
(Street {id: 2})-[CONNECTION {cost:11}]->(Restaurant {id: 1})
(Street {id: 2})-[CONNECTION {cost:7}]->(Restaurant {id: 2})

I am using Dijkstra to find all best path to all Restaurant for a specific Person. But the problem is that I can't set the maximum depth, and I would like to limit a maximum of 3 streets. How could I do that?

CALL gds.graph.project(
  'Person-Street-Restaurant',    
  ['Person', 'Street' 'Restaurant'],   
  'CONNECTION',
  {
    relationshipProperties: 'cost'
  }
)

MATCH (source:Person{id:1})
CALL gds.allShortestPaths.dijkstra.stream('Person-Street-Restaurant', {
    sourceNode: source,
    relationshipWeightProperty: 'cost'
})
YIELD sourceNode, targetNode, totalCost, nodeIds
WHERE 'Restaurant' IN LABELS(gds.util.asNode(targetNode))
RETURN
    gds.util.asNode(sourceNode).name AS sourceNodeName,
    gds.util.asNode(targetNode).name AS targetNodeName,
    totalCost,
    [nodeId IN nodeIds | gds.util.asNode(nodeId).name] AS nodeNames,
    SIZE(nodeIds) AS hops
ORDER BY totalCost

Image example

The best path is: P->S#3->S#4->S#5->R, total cost 7. But if I limit to 2 streets, it should be P->S#1->S#2->R, total cost 10. PS: In this example is easy because we have fewer connections, but in real case we have a lot and all streets have connection between them.

I also tried to find all paths by MATCH p1(Person)-[:CONNECTION*3..3]->(Restaurant), but no success.

lsnd · June 18, 2022, 10:15pm

I created this solution, but it's very slow when matching relationships 3..3.

MATCH (c: Person{id: 1}), (r: Restaurant)
MATCH p3 = (c)-[r3:CONNECTION*3..3]->(r)
WITH r, p3,
REDUCE(path = "", n IN NODES(p3) | path + n.name + "->") AS p3Path,
REDUCE(total = 0, r IN RELATIONSHIPS(p3) | total + r.cost) AS p3Total
WITH r, MIN(p3Total) AS p3TotalMin, COLLECT({path: p3Path, total: p3Total}) AS paths
UNWIND paths AS p
WITH r, p, p3TotalMin
WHERE p3TotalMin = p.total
RETURN r.name, p3TotalMin, p.path

bennu_neo · June 19, 2022, 4:38pm

Hi @lsnd ,

Interesting use case! If you ask me, 2 things should be done:

1. In your case, I may use an specific type of relationship for the link between Persons and Streets (P_CONNECTION) as well for the link between Restaurants and Streets (R_CONNECTION). If you use one type of connection, you may find your self on awful cycles when traversing long paths i.e Going from the person to the street and from that street to a different person. Yes, you can add some conditions related to the cardinality of the Persons and Restaurant nodes on a path, but It will be kinda slow.

2. Based on a scenario where you change this relationships. Can you try this pipeline on your db? (you will need apoc on your db tho)

MATCH (c: Person{id: 2})
CALL apoc.path.expandConfig(c, {
    minLevel : 3,
    maxLevel : 3,
    relationshipFilter : 'P_CONNECTION>|CONNECTION|R_CONNECTION>',
    labelFilter : "/Restaurant",
    uniqueness : "NODE_PATH"
}) yield path
UNWIND relationships(path) as r
WITH r, startNode(r) as source, endNode(r) as target
WITH gds.alpha.graph.project('test', source, target, {
    sourceNodeLabels: labels(source),
    targetNodeLabels: labels(target)
    },{ 
        properties: r { cost : toInteger(r.cost) }         
}) AS g
RETURN
  g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels


MATCH (source:Person{id:2})
CALL gds.allShortestPaths.dijkstra.stream('test', {
    sourceNode: source,
    relationshipWeightProperty: 'cost'
})
YIELD sourceNode, targetNode, totalCost, nodeIds
WHERE 'Restaurant' IN LABELS(gds.util.asNode(targetNode))
RETURN
    gds.util.asNode(sourceNode) AS sourceNodeName,
    gds.util.asNode(targetNode) AS targetNodeName,
    totalCost,
    [nodeId IN nodeIds | gds.util.asNode(nodeId)] AS nodeNames,
    SIZE(nodeIds) AS hops
ORDER BY totalCost

You can configure the number of hops valid on your run with the min and max level of your apoc expansion before the projection.

Bennu

lsnd · June 19, 2022, 5:23pm

Hi @bennu_neo,

Thank you so much for replying and helping.

I'm following your instructions, but what relationship would you use for Streets and Streets?

People and Street: P_CONNECTION
Street and Restaurant: R_CONNECTION
Street and Street: ?

Because someone can go to a street, next to another street, then to the restaurant.

lsnd · June 19, 2022, 5:52pm

@bennu_neo , alright! I left connections between Street as CONNECTION, but when I try to execute your first code, to project the graph, it stuck on loading.

Maybe because I have a lot of data? Maybe because I'm testing on Neo4j Desktop?

MATCH (p: Person) RETURN COUNT(p): 1
MATCH (s: Street) RETURN COUNT(s): 693
MATCH (r: Restaurant) RETURN COUNT(r): 236
MATCH p=()-[:P_CONNECTION]->() RETURN COUNT(p): 668
MATCH rc=()-[:R_CONNECTION]->() RETURN COUNT(rc): 154105
MATCH c=()-[:CONNECTION]->() RETURN COUNT(c): 458452

bennu_neo · June 19, 2022, 6:01pm

Hi @lsnd!

Yes. You can let the Connection relationship between streets. What's the count of r up to line 9?

Hoe much ram u have configured for the transaction heap?

lsnd · June 19, 2022, 6:50pm

Hi @bennu_neo!

Unfortunately, I can't count rows up to line 9 as well. It stuck on loading.

I am using the default configuration from Neo4j.

dbms.memory.heap.initial_size=512m
dbms.memory.heap.max_size=1G
dbms.memory.pagecache.size=512m
dbms.tx_state.memory_allocation=ON_HEAP

Maybe this is the problem!

bennu_neo · June 19, 2022, 7:50pm

Hello @lsnd,

I like trying to fit complex problems on small ram, but yeah, if you can give more power to your db instance it can help a lot, let's try with initial and max heap size set on 2gb.

Which direction have P_CONNECTION and R_CONNECTION?

I assume they go from Person to Street and from Street to Restaurant respectively.

lsnd · June 20, 2022, 12:39am

Hi @bennu_neo!

I set initial and max heap size on 2gb, but the problem persists, and when I execute the query the database seems to stop to work.

Yes, you're right! I imported them from CSV files.

LOAD CSV WITH HEADERS FROM 'file:///streets.csv' AS line
MERGE (s:Street {name: line.name, id: toInteger(line.id)})

LOAD CSV WITH HEADERS FROM 'file:///restaurants.csv' AS line
MERGE (r:Restaurant {name: line.name, id: toInteger(line.id)})

MERGE (Person{id: 1})

LOAD CSV WITH HEADERS FROM 'file:///street_costs.csv' AS line
MATCH (s1: Street {id: toInteger(line.street_src_id)})
MATCH (s2: Street {id: toInteger(line.street_dst_id)})
MERGE (s1)-[:CONNECTION {cost: toInteger(line.cost)}]->(s2)

LOAD CSV WITH HEADERS FROM 'file:///street_restaurant_costs.csv' AS line
MATCH (s: Street {id: toInteger(line.street_id)})
MATCH (r: Restaurant {id: toInteger(line.restaurant_id)})
MERGE (s)-[:R_CONNECTION {cost: toInteger(line.cost)}]->(r)

LOAD CSV WITH HEADERS FROM 'file:///person_street_costs.csv' AS line
MATCH (c: Person {id: 1})
MATCH (s: Street {id: toInteger(line.street_id)})
MERGE (c)-[:P_CONNECTION {cost: toInteger(line.cost)}]->(s)

Topic		Replies	Views
Getting relationship properties of shortest path Neo4j Graph Platform migrated	3	252	November 25, 2022
Find all shortest Paths between given source and target Neo4j Graph Platform migrated	0	230	December 6, 2022
Is there an option to get the relationship properties on a shortest path Graph Algorithms/Graph Data Science shortest-path	4	1251	August 29, 2023
Source - Target - Path finding General migrated	3	228	November 23, 2022
Problem with relationships(path) of dijkstra algorithm Graph Algorithms/Graph Data Science	3	93	December 5, 2024

Get Certified in June!

How to find shortest path limiting the distance from source?

Related topics