Neo4j crashing while running Cypher path query

(Abhatt22) #1

Hi,

I am a graduate student running Cypher path queries on an RDF graph dataset (about 500 MB, 3 million triples). When I run the Cypher query below, it never completes:

MATCH (s:Resource {uri:'http://acm.rkbexplorer.com/id/313934'}), (d:Resource {uri:'http://acm.rkbexplorer.com/id/org-5828f7e1c7b9b70c463900b3d5028c75'}), p=((s)-[*]-(d)) RETURN p;

After some time the command window becomes unresponsive, and I have to restart Neo4j before I can run anything else. The path I expect this query to return is 2 hops long.

I am using neo4j-community-3.5.3. Could someone please point out whether I am doing something wrong in the query?

Thanks a lot!

(Andrew Bowman) #2

This query attempts to find every possible path of any length between these two nodes, but you seem to want the shortest path between them.

In that case, use the shortestPath() function, which returns only the shortest path found and should finish extremely quickly:

MATCH (s:Resource {uri:'http://acm.rkbexplorer.com/id/313934'}), (d:Resource {uri:'http://acm.rkbexplorer.com/id/org-5828f7e1c7b9b70c463900b3d5028c75'}), p=shortestPath((s)--(d)) RETURN p;
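
If several paths tie for the minimum length and all of them are of interest, Cypher's built-in allShortestPaths() function is a closely related option. The following is a minimal sketch rather than a tested query against this dataset; the upper bound of 15 hops is an assumption added only to keep the search bounded, and it is not something stated in this thread.

```cypher
// Sketch: return every path that ties for the shortest length between s and d.
// The *..15 upper bound is an assumed safety cap, not a value from the thread.
MATCH (s:Resource {uri:'http://acm.rkbexplorer.com/id/313934'}),
      (d:Resource {uri:'http://acm.rkbexplorer.com/id/org-5828f7e1c7b9b70c463900b3d5028c75'}),
      p = allShortestPaths((s)-[*..15]-(d))
RETURN p;
```

Unlike the unbounded -[*]- pattern, this returns only paths of the minimum length, so it does not enumerate longer detours.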

(Abhatt22) #3

Thank you so much for your reply. I am actually trying to find all paths between these two nodes. Is there a way to make the all-paths query run faster? I did try the shortest-path query, and it ran very fast (it returned an answer within 283 ms).

Thanks a lot!

(Andrew Bowman) #4

You may be surprised by how many paths could possibly exist. Remember that finding one path does not end the search: the expansion continues past the end node to find any and all other possible paths beyond it. You can add a LIMIT after your RETURN, starting small and increasing it, to see how the number of possible paths keeps growing. I'm guessing the number of possible paths, or at least the number of paths to evaluate along the way, skyrockets beyond what you may expect. You may want to consider what other constraints to apply so that you get back exactly what you want.
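
As a rough illustration of the LIMIT suggestion, here is a sketch of the original query with two restrictions added. The upper bound of 3 hops and the LIMIT of 10 are assumptions chosen as small starting values (the expected answer was described as 2 hops long); neither number comes from this thread.

```cypher
// Sketch: cap both the path length and the number of returned paths.
// *..3 and LIMIT 10 are assumed starting values; raise them step by step.
MATCH (s:Resource {uri:'http://acm.rkbexplorer.com/id/313934'}),
      (d:Resource {uri:'http://acm.rkbexplorer.com/id/org-5828f7e1c7b9b70c463900b3d5028c75'}),
      p = (s)-[*..3]-(d)
RETURN p
LIMIT 10;
```

Raising the hop bound by one at a time, and watching how the run time changes, gives a feel for how quickly the search space grows before trying an unbounded pattern.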

(Michael Hunger) #5
  1. Note that a pure RDF model is not well suited to a property graph; you should at least turn all data triples into node properties.

  2. Also, returning billions or trillions of paths to the web browser will fail no matter what; you need a client that can consume them in a streaming manner.

  3. For these kinds of operations, Neo4j Enterprise with its better runtime is better suited. As a student you can use it, e.g. with Neo4j Desktop under a Personal Developer license.
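
Related to point 2, one way to gauge the size of the result without sending any paths to the browser is to aggregate on the server and return only counts. This is a sketch under assumptions: the 4-hop upper bound is an illustrative value, not one given in this thread.

```cypher
// Sketch: count paths per length instead of returning the paths themselves.
// The *..4 bound is an assumed value; the server still enumerates every
// matching path, so keep the bound low at first.
MATCH (s:Resource {uri:'http://acm.rkbexplorer.com/id/313934'}),
      (d:Resource {uri:'http://acm.rkbexplorer.com/id/org-5828f7e1c7b9b70c463900b3d5028c75'}),
      p = (s)-[*..4]-(d)
RETURN length(p) AS hops, count(*) AS paths
ORDER BY hops;
```

The result is one row per path length, which stays small even when the number of paths is very large.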
