I have a Neo4j 4.1.0 Community Edition setup on an EC2 instance (Ubuntu 18.04) with 16 GB RAM. The database is 211 MB on disk, as reported by
du -hs /var/lib/neo4j/data/databases/neo4j/
It contains about 93K nodes across 3 labels, with a single property each.
I have configured the following settings as suggested by
dbms.memory.heap.initial_size=6g
dbms.memory.heap.max_size=6g
dbms.memory.pagecache.size=7g
I am running the following query, which takes about 6 minutes to complete.
)-[r]-(x) RETURN type(r) AS label, last(labels(x)) AS target, count(r) AS count ORDER BY count(r) DESC
Can someone help me understand why this query takes so long to run even though the graph is fairly small and the machine has plenty of memory? Also, is there a way to speed up execution considerably without modifying the query? (The query is generated by popoto.js and I do not have much control over it.)
I have already tried the following:
- CALL apoc.warmup.run()
- Run the same query twice (expecting a better time at second execution)
- Create index on all three labels (I do not need to write to the DB, it is largely read-only).
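For reference, the index creation in the last item looked roughly like the sketch below. The label names (`LabelA`, `LabelB`, `LabelC`), the property name (`name`) and the index names are placeholders for illustration, not the real schema.

```cypher
// Placeholder labels and property; substitute the actual ones.
// One single-property index per label (Neo4j 4.x syntax).
CREATE INDEX label_a_name_idx FOR (n:LabelA) ON (n.name);
CREATE INDEX label_b_name_idx FOR (n:LabelB) ON (n.name);
CREATE INDEX label_c_name_idx FOR (n:LabelC) ON (n.name);

// List the indexes and check that their state is ONLINE.
CALL db.indexes();
```

All three indexes are reported as online, so the slowness is not caused by an index that is still populating.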
A couple more questions:
- What limits the size/number of requests to the DB? How can I accommodate more?
- Is caching results possible? I know that Neo4j caches the database and the query plans, but I am not sure whether query results can be cached. I saw a feature request in the GitHub issues, but I am not sure whether it was ever addressed.