Very slow performance when aggregate on node property

costa_fernando · November 20, 2020, 4:47pm

Hello,

I'm experimenting with Neo4j for the past 4 months and I'm progressively migrating my production environment from MySQL to Neo4j. I think Neo4j is a great database and is very good to model relationships in my data. But for some simpler queries, it is much slower than MySQL. I'm having a specific hard time when I try to query aggregated data, for some queries, it is 40x slower than the corresponding one in MySQL. I'll provide an example.

I have this data model in Neo4j:

.

And I want to get the aggregated students effort (questions answered and time spent studying) of all schools from a school manager to display in a dashboard for him. So I make this query:

match (:SchoolManager{schoolManagerId:"3aeb3009-07ca-48ae-b0c8-78b9f8f96bd9"})-[:MANAGER_OF_SCHOOL]->()<-[:CLASS_OF_SCHOOL]-()<-[:STUDENT_OF_CLASS]-()-[:ANSWERED]->(sa:StudentAnswer)
where sa.createdAt >= datetime("2020-01-01") and sa.createdAt <= datetime("2020-11-20")
return {questionsAnswered: count(sa), timeSpent: sum(sa.timeSpent)}

This query takes almost 16K ms. This is the PROFILE for the query:

In MySQL, the corresponding query is something like this:

select count(*) as questionsAnswered, sum(qa.timeSpent) as timeSpent from school_managers.users as sm left join schools.schools_members as sm on sm.user_id = sm.id left join schools.schools_classes as sc on sc.school_id = sm.school_id left join schools.schools_classes_members as scm on scm.school_class_id = sc.id left join quizzes.quizzes_attempts as qa on qa.user_id = scm.user_id where sm.id = 1 and qa.created_at >= "2020-01-01" and qa.created_at <= "2020-11-20";

It's much uglier, but much faster too, it runs in about 600 ms.

I already searched a lot online and know that the slowness is aggravated by the aggregation on a node property in sum(sa.timeSpent). If I don't access this node property, the query time decreases by half (which is still slow).

I little more info on my data:

:SchoolManager ~ 200 nodes
:School ~ 200 nodes
:SchoolClass ~ 1K nodes
:Student ~ 20K nodes
:StudentAnswer ~ 6M

I have indexes on :SchoolManager.schoolManagerId and :StudentAnswer.createdAt.

I'm using Neo4j Aura instances with 2GB RAM and 1 CPU. My browser and server configs are:

Neo4j Browser version: 4.1.4
Neo4j Server version: 4.0.0 (aura)

Is there something that I can do to improve the performance of this Neo4j query? Or for this kind of aggregated data on a property MySQL will always be a better choice?

fruitfulapproac · January 2, 2023, 6:52am

I think 20 seconds down to 3 seconds with better hosting option is great. Currently my workload is taking 10 seconds to complete, so it sounds like once I start paying for the Aura DB, that it should be useable as is. I have a plan though, if it ultimately is too slow. Keep for dev purposes and slowly migrate to a custom drop-in-replacement for Neo4j. That way all the application code stays exactly the same, I would just need to custom write perhaps the python Neomodel wrapper. Just the functions that I use.

mad_shivam111 · March 5, 2023, 4:46am

Hii, try rephrasing your query to filter first.
something like this
match (sa:StudentAnswer) where sa.createdAt >= datetime("2020-01-01") and sa.createdAt <= datetime("2020-11-20")
with sa
match (:SchoolManager{schoolManagerId:"3aeb3009-07ca-48ae-b0c8-78b9f8f96bd9"})-[:MANAGER_OF_SCHOOL]->()<-[:CLASS_OF_SCHOOL]-()<-[:STUDENT_OF_CLASS]-()-[:ANSWERED]->(sa)
return {questionsAnswered: count(sa), timeSpent: sum(sa.timeSpent)}

Topic		Replies	Views
Very slow performance when aggregate on node property Cypher performance	21	4620	December 21, 2020
Slow performance on time series queries Cypher performance , cypher	6	2924	January 28, 2019
My count() query is too slow Cypher cypher , counts	6	1436	June 18, 2021
Neo4j result Graph moves slow- Performance issue Browser performance	1	3005	January 24, 2019
Performance issues as database gets bigger Import / Export performance	11	63	March 23, 2025

Very slow performance when aggregate on node property

Related topics