Query taking unusually long to complete

gavvalrohit21 · July 17, 2020, 12:26pm

I have a Neo4j 4.1.0 Community Edition setup on an EC2 instance (Ubuntu 18.04) with 16 GB RAM. The database is 211 MB on disk, as reported by
du -hs /var/lib/neo4j/data/databases/neo4j/
It consists of about 93K nodes across 3 labels, each with a single property.

I have configured the following settings as suggested by neo4j-admin memrec.

dbms.memory.heap.initial_size=6g
dbms.memory.heap.max_size=6g
dbms.memory.pagecache.size=7g

I am running the following query, which takes about 6 minutes to complete:
MATCH (person:Person), (person:Person)-[r0:STUDIED_AT]-(college:College), (college:College)-[r]-(x) RETURN type(r) AS label, last(labels(x)) AS target, count(r) AS count ORDER BY count(r) DESC

Can someone help me understand why this query is taking so long to run although the size of the graph is pretty small and the system specs are good enough? Also, is there a way to speed up the execution considerably without modifying the query (because the query is coming from popoto.js and I do not have much control over it).

I have already tried the following:

CALL apoc.warmup.run()
Run the same query twice (expecting a better time at second execution)
Create index on all three labels (I do not need to write to the DB, it is largely read-only).

A couple more questions:

What limits the size/number of requests to the DB? How can I accommodate more?
Is caching results possible? I know that Neo4j caches the database and the query plans, but I am not sure whether query results can be cached. I saw a feature request in the GitHub issues but am not sure whether it was addressed.

webtic · July 17, 2020, 1:01pm

Can you post the output of the query with the keyword EXPLAIN prepended?
This shows the processing planned for the query and gives more insight.

See https://neo4j.com/docs/cypher-manual/current/query-tuning/how-do-i-profile-a-query/ for more info

gavvalrohit21 · July 17, 2020, 1:09pm

Here is the output of EXPLAIN. Please let me know if you need more details.

webtic · July 17, 2020, 1:28pm

You can probably rewrite it as follows, which avoids some cartesian duplication:

gavvalrohit21 · July 17, 2020, 1:37pm

Unfortunately, I can't edit the query. It's created internally by a JS library that I am using for my application. So first I am trying to assess whether this performance is to be expected given the size of the data and the machine configuration, and whether there is a way to configure Neo4j for faster performance.

webtic · July 17, 2020, 1:45pm

What JS library is that?

Even if you can't change the generated code, it would be interesting to know how the rewritten query compares to the generated one.

gavvalrohit21 · July 17, 2020, 1:48pm

The library is popoto.js

webtic · July 17, 2020, 1:59pm

Sorry, I'm not familiar with it. Perhaps others are :)

gavvalrohit21 · July 17, 2020, 2:09pm

Thanks for trying to help. Would you be able to comment on whether this performance (given the size of the data and the machine configuration) is warranted?

webtic · July 17, 2020, 2:17pm

6 minutes seems outrageously long. Which instance type are you using?
I would love to see how much time is shaved off with the query rewrite.
Even if you can't "fix" the query, it's good to know if this helps.

Are you able to download the dataset and try it on a local Neo4j Desktop instance?
Just to see how it compares to the EC2 instance.

webtic · July 17, 2020, 2:19pm

You might want to post the output of PROFILE as well, just to get a bit more insight.

gavvalrohit21 · July 17, 2020, 3:03pm

Here you go, thanks for looking

gavvalrohit21 · July 17, 2020, 3:04pm

I'm not sure if it is zoomable. Here is the link to the image in case it is not.

webtic · July 22, 2020, 3:06pm

As you can see, the query causes an enormous cartesian product, which is why it's so slow.
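
To give a feel for the scale of the blow-up described above, here is a minimal sketch in plain Python (no Neo4j involved). The PROFILE output is not reproduced in this thread, so the figures are invented. The sketch assumes each student has exactly one STUDIED_AT relationship to the college and that the college's degree counts every relationship attached to it, including those STUDIED_AT ones. The function name rows_produced is hypothetical.

```python
def rows_produced(colleges):
    """Estimate rows for (person)-[r0:STUDIED_AT]-(college)-[r]-(x) before aggregation.

    `colleges` maps a college name to (students, degree).
    Each STUDIED_AT relationship r0 is paired with every *other* relationship r
    on the same college, because Cypher does not reuse a relationship within
    a single MATCH clause (so r is never equal to r0).
    """
    total = 0
    for students, degree in colleges.values():
        total += students * (degree - 1)
    return total


# Invented figures: one college with 10,000 students and no other relationships.
example = {"Some College": (10_000, 10_000)}
print(rows_produced(example))  # 99990000
```

Under these assumptions a single well-connected college already yields close to 100 million intermediate rows, all of which must be produced before count(r) can aggregate them, regardless of how small the database is on disk.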

What is it you are trying to build?

I would look into getting popoto to generate a smarter query, or move away from popoto.

gavvalrohit21 · July 24, 2020, 12:45pm

Thanks. I am trying to build a web interface for neo4j to make a dataset available for users to explore. I figured out a way to edit the queries created by popoto on the server side. That resolved the issue. Thanks for looking into this.
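
The actual server-side change was not shared in this thread. Purely as an illustration, one way such a rewrite could be done is a lookup table in whatever proxy sits between the browser and Neo4j, mapping a known generated query to a hand-written replacement. Everything below (the names normalize, rewrite and REWRITES, and the replacement query itself) is hypothetical. The replacement simply expresses the same pattern as a single path; whether it actually runs faster would need to be checked with PROFILE.

```python
import re

# The query generated by the library, as quoted earlier in the thread.
GENERATED = (
    "MATCH (person:Person), (person:Person)-[r0:STUDIED_AT]-(college:College), "
    "(college:College)-[r]-(x) RETURN type(r) AS label, last(labels(x)) AS target, "
    "count(r) AS count ORDER BY count(r) DESC"
)

# Hypothetical hand-written replacement: same pattern written as one path.
REPLACEMENT = (
    "MATCH (person:Person)-[r0:STUDIED_AT]-(college:College)-[r]-(x) "
    "RETURN type(r) AS label, last(labels(x)) AS target, "
    "count(r) AS count ORDER BY count(r) DESC"
)


def normalize(query: str) -> str:
    """Collapse runs of whitespace so formatting differences do not matter."""
    return re.sub(r"\s+", " ", query).strip()


# Known generated queries (normalized) mapped to their replacements.
REWRITES = {normalize(GENERATED): REPLACEMENT}


def rewrite(query: str) -> str:
    """Return the replacement for a known query, or the query unchanged."""
    return REWRITES.get(normalize(query), query)
```

A proxy would call rewrite(incoming_query) before forwarding the statement to the database; queries that are not in the table pass through untouched.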

webtic · July 24, 2020, 1:04pm

Great, thanks for the update. Appreciated!

v.phanimadhavi85 · July 14, 2021, 4:04pm

Hi, could you please share how you edited the queries created by popoto on the server side? I would also like to know how to write custom queries in popoto.js. I am trying to do it with the help of the schema. Your help would mean a lot to me. Thanks in advance.
