Why are my queries so slow..?

p_xcx · August 12, 2022, 12:37pm

Hi there,

I am working with Neo4j for the first time for my thesis. I am using Neo4j Community 4.4.10 and my server has 180 GB RAM. For my project I am using a dataset that is 200 GB in size and consists of nodes labeled :e and their relationships.

I am using Cypher Shell, and the first thing I do is load the nodes into RAM with:

CALL apoc.warmup.run(TRUE,TRUE,TRUE);

After that roughly 30GB RAM is being used. This takes at least four minutes.

Then I have two different queries I want to run.

The first one collects up to 1000 paths starting at the entity with nodeid "Q886" and is 'fast' at about 2 seconds:

MATCH (n:e {nodeid: 'Q886'})
CALL apoc.path.expandConfig(n, {minLevel: 1, maxLevel: 1}) YIELD path
RETURN nodes(path) AS nodes, relationships(path) AS relations
LIMIT 1000;

If I don't limit it to 1000, it takes longer because it goes through more nodes in the first step. Some nodes also take much longer than others.

The second query returns the number of outgoing relationships of a node. For the same node "Q886" this takes a minute.

Notably, it goes through more nodes at the beginning.

So my question: What could cause this?

I am guessing it has to do with the RAM usage. Maybe I have to index all nodes that have a nodeid.

Thank you very much!

best regards from a frustrated student

glilienfield · August 12, 2022, 6:27pm

You should definitely create an index as @Cobra suggested, to replace the initial NodeByLabel scan with an indexed lookup. In addition, I think you can rewrite the query more efficiently. In the first query you are setting both minLevel and maxLevel to 1. This is equivalent to a direct relationship, so you can use a simple pattern match instead. From the explain plan for your second query, it looks like you are matching on the same node and finding the number of relationships with the 'size' method (note, this usage of size has been deprecated). If my interpretation is correct, then you can combine the two queries into one, as follows:

MATCH (n:e{nodeid: 'Q886'})-[r]-(m)
WITH n, r, m
LIMIT 1000
WITH n, collect(m) as children, collect(r) as relations
RETURN n + children as nodes, relations, size(relations) as count

Do you really have more than 1000 nodes related to this single node? I believe @Cobra was addressing the same concern.
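
If you only need the number of outgoing relationships, a pattern comprehension should do it without the size-on-pattern usage mentioned above. This is only a sketch: I have not seen your actual second query, so I am assuming the same label and property as in the first one, and the alias "outgoing" is just a name I picked.

```cypher
// Count the outgoing relationships of a single node.
// Assumption: the node is identified by label :e and property nodeid.
MATCH (n:e {nodeid: 'Q886'})
RETURN size([(n)-->() | 1]) AS outgoing;
```

The comprehension builds one list entry per outgoing relationship and size() counts the entries, so no paths or neighbour nodes are returned to the client.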

cobra · August 12, 2022, 12:43pm

Hello @p_xcx

Do you have a UNIQUE CONSTRAINT on your "e" label for the "nodeid" property?

CREATE CONSTRAINT constraint_e_node_id IF NOT EXISTS FOR (n:e) REQUIRE n.nodeid IS UNIQUE;
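
Once the constraint exists, you could check that it is there and that the planner picks it up. A minimal sketch, assuming the constraint name from the statement above. Note that creating the constraint will fail if two :e nodes share the same nodeid.

```cypher
// List existing constraints; constraint_e_node_id should appear here.
SHOW CONSTRAINTS;

// Show the plan without executing the query. With the constraint in place,
// the plan should start with an index seek instead of a label scan.
EXPLAIN
MATCH (n:e {nodeid: 'Q886'})
RETURN n;
```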

Moreover, how many nodes and relationships are in your database? Your database looks oversized.
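
Two short queries should give you both numbers. This is a sketch, and the aliases are just names I picked. As far as I know, Neo4j can answer these simple counts from its internal count store, so they should return quickly even on a large database.

```cypher
// Total number of nodes in the database.
MATCH (n)
RETURN count(n) AS nodes;

// Total number of relationships; the directed pattern counts each one once.
MATCH ()-[r]->()
RETURN count(r) AS relationships;
```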

Regards,
Cobra
