
How to make multi level path traversal faster

Raj725

I want to find all the paths from the leaf nodes (E) to the root node (A).
This is not for any specific node, so there is no id or field filter here.

The data model is as shown in the screenshot.

I used a basic Cypher query to find the paths:

MATCH path=(:A)-[:USE*]->(:E) RETURN path

This returns all the paths from A to E. It took 254 seconds to return 18k paths.

I also want to return paths from B, C, and D to E. I wrote this query for that:

MATCH path=(n)-[:USE*]->(:E) RETURN path
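
Note that an unlabeled (n) matches every node that can reach an :E node, including :A nodes. A minimal sketch of a narrower variant, assuming the intermediate levels carry the labels B, C, and D as described in the question:

```cypher
// Only start from nodes labeled B, C, or D (labels assumed from the question).
MATCH path = (n)-[:USE*]->(:E)
WHERE n:B OR n:C OR n:D
RETURN path
```

This may not change the timing on its own, but it keeps the result set limited to the paths actually wanted.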

How can I improve this traversal to return results in 10-15 seconds?

System Configurations:
System memory is 16GB
Storage: SSD

Current Neo4j Conf:
heap size: min-3GB max-3GB
database size: 1.5GB

8 REPLIES

We'll need some information here.

Can you do a PROFILE of the query, expand all elements of the query plan, and add it here?

If we assume you're working with a tree, then it will be much faster if we start from :E nodes (leaves) rather than root nodes (since there is only one possible path from a leaf up to a root), and this is probably where the query is being planned suboptimally. If there are few roots and many leaves, the planner is likely using a label scan of :A first, which won't execute well (the planner doesn't understand this is a tree structure, and that for this particular use case it would be more efficient to start with :E nodes).

We need to provide a hint to the planner to start at :E nodes. We can do this through a scan hint.

Try this:

MATCH path=(:A)-[:USE*]->(e:E) 
USING SCAN e:E
RETURN path
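
Another way to express the same leaf-first idea, as a sketch rather than a guaranteed fix: bind the :E nodes in their own MATCH and expand upward from them. The planner is still free to choose its own plan, so it is worth confirming with PROFILE which node the expansion actually starts from.

```cypher
// Bind the leaves first, then walk the USE relationships back up to the root.
MATCH (e:E)
WITH e
MATCH path = (e)<-[:USE*]-(:A)
RETURN path
```

If the maximum depth of the tree is known, adding an upper bound to the variable-length pattern (for example [:USE*..4], where 4 is only an assumed depth) can also limit how far the expansion goes.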

@andrew.bowman I tried USING SCAN; there is no change.

When I run the query with PROFILE (with or without USING SCAN), it takes about the same time, 6 to 14 seconds, and it also returns the result, i.e. all 18k paths.
I think the issue is with Neo4j Browser: it tries to plot the graph, and because of the large size that takes a long time. So I am now trying to get the results from a program instead.

Here is the plan:
[Image: query plan screenshot]

I am facing a similar issue, @Raj725. Were you able to solve it?

@karthikeyan My issue was that queries were taking 6-14 seconds with PROFILE and 4-5 minutes without PROFILE. Are you facing the same issue?

If your issue is different, please provide the query plan with all elements expanded, as explained by @andrew.bowman in the reply above.

The actual issue in my case was:

Neo4j Browser was taking too much time to plot that many nodes, and eventually it froze. When I ran the query with PROFILE, the default result view was the query plan shown above, so it rendered in a few seconds.
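
One way to check whether the time goes into the traversal or into rendering is to return something Neo4j Browser does not have to draw. A minimal sketch, using the same pattern as the original query:

```cypher
// Count the paths only; nothing is plotted, so the timing reflects the traversal itself.
MATCH path = (:A)-[:USE*]->(:E)
RETURN count(path) AS pathCount
```

If the paths themselves are needed, returning a lightweight projection instead of full path objects should also keep the Browser out of graph view. Here id() is used only for illustration; any identifying property would do:

```cypher
// Return one list of node ids per path instead of the full path objects.
MATCH path = (:A)-[:USE*]->(:E)
RETURN [n IN nodes(path) | id(n)] AS nodeIds
```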

You should try indexing.

Indexing requires us to have a property value. Note that in the queries we're working with here, we don't have any properties to filter on or lookup via the index, we only have labels, so a label scan is the only possibility.
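
For contrast, this is roughly what an index-backed lookup would look like if the leaves had a property to filter on. The property name and the value 'leaf-42' are hypothetical and do not come from the data model in this thread. The index syntax shown is the Neo4j 4.x and later form; older versions use CREATE INDEX ON :E(name).

```cypher
// Hypothetical: assumes :E nodes have a `name` property, which this thread does not confirm.
CREATE INDEX e_name FOR (e:E) ON (e.name);

// With a property predicate, the planner can use an index seek on :E instead of a label scan.
MATCH path = (:A)-[:USE*]->(e:E {name: 'leaf-42'})
RETURN path;
```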

@Raj725 you will need to expand all elements of the plan, as we can't tell what parts of the query are associated with these operations. Use the double-down arrow in the lower right corner of the result frame to expand all plan elements, then export the image and add it here.

Thanks for asking.
My issue was solved; I posted the actual reason earlier in this thread. I can't provide the query plan because I no longer have that database on my laptop.

How did you solve this problem? I'm facing the same problem with path traversal queries. They take a very long time.