Query runs too long without a response when retrieving all routes between two nodes with more than 30 hops

Hi,
The query below runs for too long and never returns a response when retrieving all routes between two nodes once the number of hops exceeds 30.

query = (f"""MATCH (from{{nodeName:"{srcNode}"}}),(to{{nodeName:"{dstNode}"}}),path=(from)-[:ots*1..{hops}]-(to)
WHERE NONE (n IN nodes(path) WHERE size([x IN nodes(path) WHERE n = x]) > 1 )
AND NOT NONE (n IN nodes(path) WHERE n.nodeName IN {includes})
AND NONE (n IN nodes(path) WHERE n.nodeName IN {excludes})
RETURN DISTINCT path AS shortestPath,
reduce(distance = 0, r in relationships(path) | distance + toInteger(r.distance)) AS totalDistance
ORDER BY totalDistance ASC LIMIT {limit}""")

Please advise.

Hi @mohamed_elzonko ,

This looks like a very specific traversal query. If you drop the second WHERE condition, AND NOT NONE (n IN nodes(path) WHERE n.nodeName IN {includes}), and the list of excluded names is not too big, this query should help, assuming APOC is installed (notice labelWithAnIndexOnNodeName):

MATCH (from:labelWithAnIndexOnNodeName{{nodeName:"{srcNode}"}}),(to:labelWithAnIndexOnNodeName{{nodeName:"{dstNode}"}})
WITH from, to
MATCH(exc:labelWithAnIndexOnNodeName)
WHERE exc.nodeName in {excludes}
WITH from, to, collect(exc) as blk
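// Braces are doubled because this text is a Python f-string, as in the MATCH above.
CALL apoc.path.expandConfig(from, {{
    relationshipFilter : 'ots',   // follow only :ots relationships, as in the original query
    minLevel : 1,
    maxLevel : {hops},
    uniqueness : 'NODE_PATH',     // no node may repeat within a path
    terminatorNodes : [to],       // keep only paths that end at the destination
    blacklistNodes : blk
}}) yield path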
WITH path
//WHERE NOT NONE (n IN nodes(path) WHERE n.nodeName IN {includes})
RETURN DISTINCT path AS shortestPath,
reduce(distance = 0, r in relationships(path) | distance + toInteger(r.distance)) AS totalDistance
ORDER BY totalDistance ASC LIMIT {limit}

For finer control over the traversal (if tuning the APOC path expander is not enough), you may want to look at the Traversal Framework: https://neo4j.com/docs/java-reference/current/traversal-framework/
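
For anyone building this query from Python, here is a minimal sketch of the same APOC expansion using Cypher parameters instead of f-string interpolation for the values. The helper name build_route_query, the label NetworkNode in the usage note, and the parameter names are assumptions for illustration and do not come from the thread. OPTIONAL MATCH is used so that an empty excludes list does not wipe out the result rows, and terminatorNodes restricts the expansion to paths that end at the destination. Newer APOC releases may expect denylistNodes instead of blacklistNodes, so check the version you have installed.

```python
import re


def build_route_query(label, src, dst, excludes, hops, limit):
    """Return (cypher, params) for an APOC path expansion between two nodes.

    Hypothetical helper: the name and parameter names are illustrative only.
    """
    # Labels cannot be passed as Cypher parameters, so validate before interpolating.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", label):
        raise ValueError(f"unsafe label: {label!r}")

    query = f"""
MATCH (from:{label} {{nodeName: $src}}), (to:{label} {{nodeName: $dst}})
// OPTIONAL MATCH keeps the row alive when $excludes is empty
OPTIONAL MATCH (exc:{label}) WHERE exc.nodeName IN $excludes
WITH from, to, collect(exc) AS blk
CALL apoc.path.expandConfig(from, {{
    relationshipFilter: 'ots',
    minLevel: 1,
    maxLevel: $hops,
    uniqueness: 'NODE_PATH',
    terminatorNodes: [to],
    blacklistNodes: blk
}}) YIELD path
RETURN path AS route,
       reduce(d = 0, r IN relationships(path) | d + toInteger(r.distance)) AS totalDistance
ORDER BY totalDistance ASC
LIMIT $limit
"""
    params = {
        "src": src,
        "dst": dst,
        "excludes": list(excludes),
        "hops": int(hops),
        "limit": int(limit),
    }
    return query, params
```

With the official neo4j Python driver, the result could then be run as session.run(query, params), for example with query, params = build_route_query("NetworkNode", "ETSLEACJ-2PS", "20500205-1PSX", [], 60, 7).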

Bennu

Thank you for your prompt reply. You wrote "list of excludes names is not too big". May I know the maximum number of nodes?

Regards,
Mohamed

Hi @mohamed_elzonko ,

There's no limit. But the query I shared collects the excluded nodes, so they are all loaded into memory. If there are 3 of them, only a few bytes of heap are needed; if there are millions, you can run into memory issues.

Also consider using a label with an index on the name.
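
As a rough illustration of the index suggestion, the sketch below builds a CREATE INDEX statement for a label and property. The helper name index_statement, the label NetworkNode and the generated index name are assumptions for illustration; the IF NOT EXISTS form is the syntax used by Neo4j 4.x and later, so older servers would need a different statement.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def index_statement(label, prop="nodeName"):
    """Build a Cypher statement that indexes `prop` on nodes carrying `label`.

    Hypothetical helper: label, property and index name are illustrative.
    """
    # Identifiers are interpolated into the statement, so validate them first.
    for name in (label, prop):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"CREATE INDEX {label}_{prop}_idx IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop})"
    )
```

Running the returned statement once lets the MATCH on nodeName use an index lookup instead of scanning every node.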

Great. I will try it and let you know. Thank you so much. Really appreciated!

Hi Bennu,
I tried the query, but it didn't return anything. I made sure that the data (node names) is valid and that the first query returns possible routes, but your query doesn't give any response. Would you please have a look? Maybe something is missing.

//
MATCH (from:labelWithAnIndexOnNodeName{nodeName:"ETSLEACJ-2PS"}),(to:labelWithAnIndexOnNodeName{nodeName:"20500205-1PSX"})
WITH from, to
MATCH(exc:labelWithAnIndexOnNodeName)
WHERE exc.nodeName in
WITH from, to, collect(exc) as blk
CALL apoc.path.expandConfig(from, {minLevel : 1,maxLevel :60,blacklistNodes : blk}) yield path
WITH path
RETURN DISTINCT path AS shortestPath,
reduce(distance = 0, r in relationships(path) | distance + toInteger(r.distance)) AS totalDistance
ORDER BY totalDistance ASC LIMIT 7
//

Hi,

As stated in my previous messages, labelWithAnIndexOnNodeName is a placeholder for an existing label in your model. You should replace it with a label that has an index on nodeName.
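
A small sketch of that substitution step in Python, in case the query is kept as a template: it swaps the placeholder for a real label and refuses anything that is not a plain identifier. The helper name apply_label and the label NetworkNode used in the usage note are assumptions for illustration; use whichever label your nodes actually carry.

```python
import re

PLACEHOLDER = "labelWithAnIndexOnNodeName"


def apply_label(query_template, label):
    """Replace the placeholder label in a Cypher template with a real label.

    Hypothetical helper: the real label must already exist in the graph.
    """
    # Reject labels that could inject Cypher or that are still the placeholder.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", label) or label == PLACEHOLDER:
        raise ValueError(f"not a usable label: {label!r}")
    if PLACEHOLDER not in query_template:
        raise ValueError("template does not contain the placeholder label")
    return query_template.replace(PLACEHOLDER, label)
```

For example, apply_label('MATCH (from:labelWithAnIndexOnNodeName {nodeName: "ETSLEACJ-2PS"}) RETURN from', "NetworkNode") yields the same MATCH with :NetworkNode in place of the placeholder.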