shortestPath query is taking a long time

I have 50 million nodes and 64 million relationships in a Neo4j database. I am using the shortestPath query shown below (the third query) to build an origin-destination matrix, but it takes a long time because of the large number of db hits.
What alternatives would reduce the query time?

Query used to create nodes with the label "Point":

CALL apoc.periodic.iterate(
"
CALL apoc.load.csv('nodes.csv', {header:true,sep:',', ignore:['OBJECTID','CONNECTION_CNT'],
mapping:{
NODE_ID: {type:'int',name:'uid'},
X_COORD: {type:'float',name:'x'},
Y_COORD: {type:'float',name:'y'}
}
})
YIELD map as row
RETURN row
",
"
WITH row WHERE row.uid IS NOT NULL
CREATE (i:Point {uid: row.uid})
SET i.x=toFloat(row.x),
i.y = toFloat(row.y)
RETURN COUNT(*) as total
",
{batchSize:100000, iterateList:true, parallel:true}
)

Query used to create relationships of type "EDGE":

CALL apoc.periodic.iterate(
"
CALL apoc.load.csv('edges.csv', {header:true,sep:',',ignore:['POSTED_AVG_TRAVEL_TM'],
mapping:{
OBJECTID: {type:'int',name:'edgeId'},
FNODE: {type:'float',name:'u'},
TNODE: {type:'float',name:'v'},
TRAVEL_TM: {type:'float',name:'time'},
ROAD_LEN: {type:'float',name:'distance'}
}
})
YIELD map as edge
RETURN edge
",
"
WITH edge
WHERE edge.edgeId IS NOT NULL
MATCH (u:Point {uid: edge.u})
MATCH (v:Point {uid: edge.v})
CREATE (u)-[r:EDGE {edgeId: edge.edgeId}]->(v)
SET r.length = toFloat(edge.distance)
SET r.time = toFloat(edge.time)
RETURN COUNT(*) AS total
",
{batchSize:100000, iterateList:true, parallel:true}
)

Shortest path query that is taking a long time:

With 20 uids in the WITH list, it takes 2 min 20 s.
With 50 uids in the WITH list, it takes 21 min.

WITH [345920715, 345920716, 345920717, 345920718, 345920719, 345920720, 345920721, 345920722, 345920723, 345920724, 345920725, 345920726, 345920727, 345920728, 345920729, 345920730, 345920731, 345920732, 345920733, 345920734, 345920735, 345920736, 345920737, 345920738, 345920739, 345920740, 345920741, 345920742, 345920743, 345920744, 345920745, 345920746, 345920747, 345920748, 345920749, 345920750, 345920751, 345920752, 345920753, 345920754, 345920755, 345920756, 345920757, 345920758, 345920759, 345920760, 345920761, 345920762, 345920763, 345920764] AS uids
MATCH (from:Point), (to:Point)
WHERE from.uid IN uids AND to.uid IN uids
MATCH path = shortestPath((from)-[r:EDGE*]-(to))
WITH from.uid AS fromPoint, to.uid AS toPoint,path,
reduce(time = 0, r in relationships(path) | time + r.time) AS totalTime,
reduce(dist = 0, r in relationships(path) | dist + r.length) AS totalDistance
RETURN fromPoint, toPoint, totalDistance,totalTime
ORDER BY fromPoint, toPoint;

Hi @dt1,

You posted this in the GDS section. I have moved your thread to the more appropriate Cypher section in case someone there knows how to optimize your query.

That being said, you are welcome to look into the GDS shortest path algorithms. They run on an in-memory graph, so once the graph is projected they could potentially run faster.
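
As a rough illustration of that approach, here is a minimal sketch assuming GDS 2.x is installed and there is enough memory to project the graph. The graph name 'roads' is hypothetical, and only three uids are listed to keep it short. Note that Dijkstra minimizes the summed 'time' weight, whereas Cypher's shortestPath minimizes the number of hops, so the two may return different paths.

```cypher
// Step 1: project the graph into memory once ('roads' is a hypothetical name).
// UNDIRECTED mirrors the undirected pattern used in the original query.
CALL gds.graph.project(
  'roads',
  'Point',
  {EDGE: {orientation: 'UNDIRECTED', properties: ['time', 'length']}}
);

// Step 2: run weighted Dijkstra for each unordered pair of points.
WITH [345920715, 345920716, 345920717] AS uids
MATCH (from:Point) WHERE from.uid IN uids
MATCH (to:Point) WHERE to.uid IN uids AND from.uid < to.uid
CALL gds.shortestPath.dijkstra.stream('roads', {
  sourceNode: from,
  targetNode: to,
  relationshipWeightProperty: 'time'
})
YIELD totalCost
RETURN from.uid AS fromPoint, to.uid AS toPoint, totalCost AS totalTime
ORDER BY fromPoint, toPoint;
```

The projection is a one-time cost, and the in-memory graph can be reused for as many origin-destination pairs as needed.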

Let me know if you need any help with these GDS procedures.

Best regards,
Ioannis.

Try this:

WITH [345920715, 345920716, 345920717, 345920718, 345920719, 345920720, 345920721, 345920722, 345920723, 345920724, 345920725, 345920726, 345920727, 345920728, 345920729, 345920730, 345920731, 345920732, 345920733, 345920734, 345920735, 345920736, 345920737, 345920738, 345920739, 345920740, 345920741, 345920742, 345920743, 345920744, 345920745, 345920746, 345920747, 345920748, 345920749, 345920750, 345920751, 345920752, 345920753, 345920754, 345920755, 345920756, 345920757, 345920758, 345920759, 345920760, 345920761, 345920762, 345920763, 345920764] AS uids
UNWIND uids as uid
MATCH (n:Point{uid:uid})
WITH COLLECT(n) as Points
UNWIND Points as from
UNWIND Points as to
WITH from, to
WHERE from.uid < to.uid
MATCH path = shortestPath((from)-[r:EDGE*]-(to))
WITH from.uid AS fromPoint, to.uid AS toPoint,path,
reduce(time = 0, r in relationships(path) | time + r.time) AS totalTime,
reduce(dist = 0, r in relationships(path) | dist + r.length) AS totalDistance
RETURN fromPoint, toPoint, totalDistance,totalTime
ORDER BY fromPoint, toPoint;

Thanks @glilienfield for your reply. Earlier I was getting 2425 rows (50*49=2450) in the query result, and it was taking 21 mins. The query above now returns 1225 rows and takes 10 mins. Both the time and the number of rows are reduced by half.

Is there a way to bring the query time down to the range of 1 to 2 mins? Please suggest.

You should get half the results, because I wrote the query so that it does not calculate the shortest path in both directions, i.e. Node A -> Node B and Node B -> Node A. Since you did not specify a relationship direction in your query, the result should be the same either way.
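
If the full matrix with both directions is still needed, each computed row can be mirrored instead of running shortestPath a second time. This is a minimal sketch under the assumption that the pattern stays undirected. The names lo, hi and pair are introduced here for illustration only, and just three uids are listed.

```cypher
WITH [345920715, 345920716, 345920717] AS uids
UNWIND uids AS uid
MATCH (n:Point {uid: uid})
WITH collect(n) AS points
UNWIND points AS a
UNWIND points AS b
WITH a, b
WHERE a.uid < b.uid
// One shortestPath call per unordered pair
MATCH path = shortestPath((a)-[:EDGE*]-(b))
WITH a.uid AS lo, b.uid AS hi,
     reduce(t = 0.0, r IN relationships(path) | t + r.time) AS totalTime,
     reduce(d = 0.0, r IN relationships(path) | d + r.length) AS totalDistance
// Emit each result twice: once as (lo, hi) and once as (hi, lo)
UNWIND [[lo, hi], [hi, lo]] AS pair
RETURN pair[0] AS fromPoint, pair[1] AS toPoint, totalDistance, totalTime
ORDER BY fromPoint, toPoint;
```

This keeps the number of path searches at n*(n-1)/2 while still returning n*(n-1) rows.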

Do you have an index defined on property 'uid' for Point label?
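
One way to check this, sketched here assuming Neo4j 4.3 or later (the index name point_uid is hypothetical):

```cypher
// List any index covering :Point(uid) and its state (it should be ONLINE)
SHOW INDEXES YIELD name, labelsOrTypes, properties, state
WHERE 'Point' IN labelsOrTypes AND 'uid' IN properties
RETURN name, state;

// Create the index if none exists
CREATE INDEX point_uid IF NOT EXISTS FOR (p:Point) ON (p.uid);
```

Prefixing the slow query with EXPLAIN or PROFILE should then show a NodeIndexSeek rather than a NodeByLabelScan for the Point lookups.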

Yes, an index is defined on the uid property of the Point label.