Performance of Graph DB

anshulchaintha7 · July 31, 2023, 4:34am

Hello,
I am executing a complex query to find the children of a node with label 'A'. The query is below:
Match (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
Where ID(a)=0
return d
I always get the same result (7 ms), whether I increase the number of nodes to 1 million, deepen the hierarchy, or increase the parent-to-child ratio.
My question is: when will I see a deviation in the results?
When is the value going to increase?

glilienfield · July 31, 2023, 8:46am

The query finds the nodes that are three hops away from a specific node (the a node). Unless you change the nodes connected to this specific node, the results will be the same regardless of how many other unrelated nodes there are in the database.

Are you referring to the query performance remaining the same regardless of db size? This is probably true because neo4j can quickly find the ‘a’ node based on its node id, since neo4j maintains an index on node id. Once the node is found, extending the path to three hops is relatively fast since neo4j maintains these direct relationships, so it’s just lookups and traversals to get the results.

If you want to see an example of the performance degrading with db size, change your query to find your a node on a property without an index. In such a case neo4j will have to search through all nodes with the same label to find node a. This time will obviously increase as the number of nodes with label A increases.
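As a rough sketch of that experiment, you could bulk-create extra A nodes and rerun a property lookup after each batch. The batch size of 100,000 and the "bulk_" naming scheme are assumptions made up for illustration, not anything from your data:

```cypher
// Hypothetical test data: add 100,000 more :A nodes, each with a unique Name.
UNWIND range(1, 100000) AS i
CREATE (:A {Name: "bulk_" + toString(i)});

// Look up one node by the unindexed property. Without an index on Name,
// every :A node has to be checked, so this should slow down as :A grows.
MATCH (a:A)
WHERE a.Name = "bulk_100000"
RETURN a;
```

Repeating the first statement with a different prefix and timing the second after each run should make the growth visible.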

You can investigate query performance by profiling the query. Prepend ‘profile’ to the query and run it.
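For instance, with the query from the first post (only a sketch; the exact operators and numbers you see depend on your Neo4j version and data):

```cypher
// PROFILE runs the query and reports rows and db hits per plan operator.
PROFILE
MATCH (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
WHERE id(a) = 0
RETURN d
```

In the resulting plan, a NodeByIdSeek at the start would typically indicate the direct id lookup, whereas a NodeByLabelScan followed by a Filter would indicate a scan over all A nodes.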

anshulchaintha7 · July 31, 2023, 9:16am

Thanks for the detailed information. Let me just add some comments:
Unless you change the nodes connected to this specific node, the results will be the same.

1. Should I make the graph more complex, e.g. by creating loops or connecting random nodes?

Change your query to find your a node on a property without an index.
2. Sample query:
Match (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
Where a.Name="A1"
return d

glilienfield · July 31, 2023, 11:16am

Adding many more and longer paths will definitely show performance degradation. You can use variable-length paths. The following will find all paths of any length starting at node a.

Match path = (a:A {name: "foo"})-[*]-(b)
Return path
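Be aware that an unbounded [*] can run for a very long time on a densely connected graph. A hedged variant, assuming an upper bound of 5 hops is enough for your test (the bound is an arbitrary choice), caps the path length and returns only a count instead of every path:

```cypher
// Variable-length pattern limited to between 1 and 5 hops, in either direction.
MATCH path = (a:A {name: "foo"})-[*1..5]-(b)
RETURN count(path) AS pathCount
```

Raising the upper bound step by step is one way to watch the cost grow with the connectivity around node a.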

anshulchaintha7 · July 31, 2023, 1:18pm

Thank you for the path-finding query.
Can you give me some suggestions on how I can approach this use case?

glilienfield · July 31, 2023, 1:53pm

I just gave an example of a query whose performance would be influenced by the complexity of your graph. The cost comes from 1) finding node 'a', if you don't have an index on the name property of A nodes, and 2) the complexity of the graph connected to node 'a', due to the variable-length match pattern.
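To compare the indexed and unindexed cases, you could create an index on that property and rerun the same query. This is a sketch assuming Neo4j 4.x or later syntax; the index name a_name_idx is made up:

```cypher
// Create an index on the name property of :A nodes (no-op if it already exists).
CREATE INDEX a_name_idx IF NOT EXISTS
FOR (n:A) ON (n.name);
```

Running DROP INDEX a_name_idx removes it again, so you can time the query both ways.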

What are your objectives?

anshulchaintha7 · July 31, 2023, 2:09pm

My objective is to measure the performance of Neo4j in finding children using a specific path.
Match (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
Where ID(a)=0
return d
The query I mentioned basically returns all the nodes with label D that are related to A.
I'm noting down the time it takes.
For this use case, how can I see the time fluctuate?
You mentioned that I can create nodes without an index, but whenever nodes are created in Neo4j, an id is assigned to them automatically.

ameyasoft · July 31, 2023, 4:34pm

If you are going to use the system created id, then you should use:
Where id(a) = 0, if you are using Neo4j version < 5.x
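On 5.x, id() is deprecated in favor of elementId(), which returns a string rather than an integer. A sketch of the equivalent lookup, where $eid is a parameter you fill with the value returned by the first statement (the Name "A1" is taken from the sample query earlier in this thread):

```cypher
// First, fetch the element id of the starting node.
MATCH (a:A)
WHERE a.Name = "A1"
RETURN elementId(a) AS eid;

// Then use it as the starting point of the traversal.
MATCH (a:A)-[:r1]->(b:B)-[:r2]->(c:C)-[:r3]->(d:D)
WHERE elementId(a) = $eid
RETURN d;
```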

glilienfield · August 1, 2023, 12:24am

That could be a complex experiment, as there are many factors that can impact performance. You would have to identify which factors you care about and devise an approach to observing and measuring the impact as you vary parameters.

Do you have a specific performance characteristic you would like to study?

Maybe you can find benchmarks that have been published already.

The system ‘id’ is always indexed.
