Performance of Graph DB

Hello,
I am executing a query to find the children of a node with the label 'A'. The query is below:
Match (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
Where ID(a)=0
return d
I always get the same execution time (7 ms), whether I increase the number of nodes to 1 million, deepen the hierarchy, or increase the parent-to-child ratio.
My question is: when will I see a deviation in the results?
At what point will the execution time start to increase?

The query finds the nodes that are three hops away from one specific node (the 'a' node). Unless you change the nodes connected to this specific node, the results will be the same regardless of how many other unrelated nodes there are in the database.

Are you referring to the query performance being the same regardless of db size? This is probably true, because neo4j can quickly find the ‘a’ node based on its node id, since neo4j maintains an index on node id. Once the node is found, extending the path to three hops is relatively fast, since neo4j stores direct relationships between nodes, so it’s just lookups and traversals to get the results.

If you want to see an example of the performance degrading with db size, change your query to find your ‘a’ node on a property without an index. In such a case neo4j will have to search through all nodes with the same label to find node ‘a’. This time will obviously increase as the number of nodes with label A increases.

You can investigate query performance by profiling the query. Prepend ‘PROFILE’ to the query and run it.
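
As a rough sketch of that comparison, the two profiled queries below differ only in how the start node is found. The property name `name` and the value "A1" are placeholders, and the operator names in the comments are what the planner typically reports; the actual plan can differ by version and data.

```cypher
// Start node found by internal id: the plan usually begins with a NodeByIdSeek,
// so the cost does not depend on how many :A nodes exist.
PROFILE
MATCH (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
WHERE id(a) = 0
RETURN d;

// Start node found by a property that has no index (assumed property: name):
// the plan usually shows a NodeByLabelScan followed by a Filter,
// and the db hits grow with the number of :A nodes.
PROFILE
MATCH (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
WHERE a.name = "A1"
RETURN d;
```

Comparing the db hits and rows reported per operator is usually more informative than the wall-clock time alone.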

Thanks for the detailed information. Let me just add some comments:
Unless you change the nodes connected to this specific node, the results will be the same.

1. Should I make the graph more complex, for example by creating loops or connecting random nodes?

Change your query to find your a node on a property without an index.
2. Sample query:
Match (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
Where a.Name="A1"
return d

Adding many more paths of greater length will definitely show performance degradation. You can use variable length paths. The following will find all paths of any length starting at node a.

Match path = (a:A {name: "foo"})-[*]-(b)
Return path

Thank you for the path-finding query.
Can you give me some suggestions on how to approach this use case?

I just gave an example of a query whose performance would be influenced by the complexity of your graph. The performance will depend on 1) the cost of finding node 'a' if you don't have an index on the 'name' property of label A, and 2) the complexity of the graph connected to node 'a', because of the variable length match pattern.
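
To look at those two factors separately, one possible sketch is to create an index so that the lookup cost drops out, and then bound the path length so that the traversal cost can be varied in steps. The index name `a_name_index` and the bound `1..3` are arbitrary choices for illustration.

```cypher
// Factor 1: index the lookup property so that finding 'a' no longer scans all :A nodes.
// The index name is an arbitrary example.
CREATE INDEX a_name_index IF NOT EXISTS FOR (a:A) ON (a.name);

// Factor 2: bound the variable length pattern and raise the upper limit step by step
// (1..3, 1..4, ...) to see how traversal cost grows with depth.
// count(path) avoids returning every path to the client.
PROFILE
MATCH path = (a:A {name: "foo"})-[*1..3]-(b)
RETURN count(path);
```

An unbounded `[*]` on a densely connected graph can take very long to finish, so a bounded pattern is usually the safer starting point for measurements.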

What are your objectives?

My objective is to measure the performance of Neo4j when finding children along a specific path.
Match (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
Where ID(a)=0
return d
The query I mentioned basically returns all the nodes with the D label that are related to A.
I'm noting down the time it takes.
For this use case, how can I see the time fluctuate?
You mentioned that I can look up nodes without an index, but whenever nodes are created in Neo4j, an id is assigned to them automatically.

If you are going to use the system-created id, then you should use:
Where id(a) = 0, if you are using Neo4j version < 5.x
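
For Neo4j 5.x, `id()` is deprecated in favor of `elementId()`, which returns a string rather than an integer. A minimal sketch of the same query under that assumption follows; `$aElementId` is a hypothetical parameter name, filled with a value obtained from the first statement.

```cypher
// Step 1: look up the element id of a start node (an opaque string in 5.x).
MATCH (a:A)
RETURN elementId(a) AS aElementId
LIMIT 1;

// Step 2: pass that string back in as a parameter.
// $aElementId is a hypothetical parameter name used only for this sketch.
MATCH (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
WHERE elementId(a) = $aElementId
RETURN d;
```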

That could be a complex experiment, as there are many factors that can impact performance. You would have to identify which factors you care about and devise an approach to observing and measuring the impact as you vary parameters.

Do you have a specific performance characteristic you would like to study?

Maybe you can find benchmarks that have been published already.

The system ‘id’ is always indexed.