
How to improve the speed of Cypher query summing the weights of a path


I have the following graph topology:
There are OSMNode nodes with lat, lon, and location properties. The OSMNodes are chained through NEXT relationships. Certain OSMNodes have a FIRST_NODE relationship pointing to them from an OSMWay node.
I would like to link the OSMWay nodes with a [:NEXT_WAY] relationship and add a length property to it. The length of the NEXT_WAY relationship should be the sum of the distance properties of all [:NEXT] relationships along the chain of OSMNodes linking the two OSMWay nodes.
I created a query and it works fine, but it is rather slow: it creates 10K NEXT_WAY relationships every 45 seconds on average.
This is my query:

CALL apoc.periodic.iterate(
'MATCH (fw:OSMWay)-[:FIRST_NODE]->(fn:OSMNode)
WHERE NOT (fw)-[:NEXT_WAY]->()
RETURN fw, fn',
'CALL {
    WITH fn
    MATCH (fn)-[rls:NEXT*]->(ln:OSMNode)<-[:FIRST_NODE]-(nw:OSMWay)
    WITH nw, rls LIMIT 1
    RETURN nw, REDUCE(s=0, r in rls | s+r.distance) AS length
}
CREATE (fw)-[:NEXT_WAY {length: length}]->(nw)',
{batchSize:10000, parallel:false});

(How) could we change this query to improve execution speed?
Thank you.



Not precisely answering your question, but since your nodes have spatial coordinates as attributes: have you taken a look at Neo4j's spatial functions (Spatial functions - Neo4j Cypher Manual)?
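To make the spatial-function suggestion concrete, here is a minimal sketch that recomputes the distance between consecutive OSMNodes from their coordinates and shows it next to the stored value. It assumes that location is stored as a Neo4j point value (for example WGS-84) and that the server is recent enough to provide point.distance(); older 4.x releases call the same function distance(). For geographic points the result is in metres; whether the stored distance property uses the same unit is not stated in the question.

```cypher
// Assumption: every OSMNode has a `location` property holding a Neo4j point.
// Compares the stored NEXT distance with one computed from the coordinates.
MATCH (a:OSMNode)-[r:NEXT]->(b:OSMNode)
RETURN r.distance AS storedDistance,
       point.distance(a.location, b.location) AS computedDistance
LIMIT 10;
```

This does not speed up the path summation by itself, but it is a quick way to check the stored distances against the coordinates.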

Second, is there any particular reason you are running iterate with parallel:false? As I see it, the point of using iterate is to run it multithreaded and hence faster.
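A sketch of what the parallel variant might look like, with the path summation wrapped in a CALL { } subquery. The batchSize of 1000 and the retries value are assumptions for illustration, not tuned settings. One caveat: parallel batches that create relationships can contend for locks on the same nodes (the nw of one row is the fw of another), so batches may fail or deadlock. That is why the sketch sets retries and returns the error columns for inspection.

```cypher
// Assumption: batchSize and retries are illustrative values only.
CALL apoc.periodic.iterate(
'MATCH (fw:OSMWay)-[:FIRST_NODE]->(fn:OSMNode)
 WHERE NOT (fw)-[:NEXT_WAY]->()
 RETURN fw, fn',
'CALL {
    WITH fn
    // Follow the NEXT chain until a node that starts another way is found
    MATCH (fn)-[rls:NEXT*]->(ln:OSMNode)<-[:FIRST_NODE]-(nw:OSMWay)
    WITH nw, rls LIMIT 1
    RETURN nw, REDUCE(s = 0, r IN rls | s + r.distance) AS length
 }
 CREATE (fw)-[:NEXT_WAY {length: length}]->(nw)',
{batchSize: 1000, parallel: true, retries: 3})
YIELD batches, total, failedBatches, errorMessages
RETURN batches, total, failedBatches, errorMessages;
```

If failedBatches is non-zero, falling back to parallel:false for the remaining ways (the WHERE NOT filter skips those already linked) is one way to finish the job.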
