Performance Recommendation Needed

spraja08
Node

I am using Neo4j 5.1.0 on a single node (Docker) with 16 vCPUs and 64 GB RAM. I have a query that retrieves up to 5 degrees of connections for a given node. The result has 72K records and the query latency is about 78 seconds. My Docker container configuration is below. Calling dbms.listConfig() lists the same values, so the settings are being applied.

 

 

NEO4J_server_memory_pagecache_size=27g 
NEO4J_server_memory_heap_initial__size=23g 
NEO4J_server_memory_heap_max__size=23g 
NEO4J_db_memory_pagecache_warmup_preload=true 
NEO4J_server_memory_pagecache_scan_prefetchers=32 
 
I also have one Btree index on the nodes. How can I check whether the warm-up has fully completed? Is there anything I can do to reduce the query latency? The target is to get it below 30 seconds.

glilienfield
Ninja

Can you share the query?

spraja08
Node

Thanks glilienfield. The query is below:

 

match p = (node {name:"f4730cad-d48d-4fb6-85c2-096fb90152b8"})-[*1..5]-(toNode)
return p

steggy
Neo4j

@spraja08 have you tried profiling the query to see the plan, page cache hits, etc?
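For reference, a minimal sketch of what profiling the query from this thread could look like. It simply prefixes the original query with PROFILE; the resulting plan should list db hits per operator, along with page cache hits and misses where the runtime reports them.

```cypher
// Same query as posted above, prefixed with PROFILE so the
// executed plan and its per-operator statistics are returned.
PROFILE
MATCH p = (node {name: "f4730cad-d48d-4fb6-85c2-096fb90152b8"})-[*1..5]-(toNode)
RETURN p
```

Note that PROFILE actually executes the query, so it takes at least as long as a normal run.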

Thanks steggy. I have not and will explore that now...

Hi Steggy,

I have attached the query profile output. I noticed that the page cache is not used at all. I would expect at least the ProduceResults operator to use the cache rather than incurring hundreds of thousands of db hits. This looks like where the performance is lost. Could you recommend how to get the engine to use the page cache? Much appreciated.

steggy
Neo4j

Does it change if you execute the query a second time?

Nope. It is always consistent: the db hits count does not change even when the same query is run several times consecutively. I am deeply puzzled...

Hi @spraja08 ,

Does your graph contain any cycles? How long does the query take when you return only the count of paths? Was this behavior different on previous Neo4j versions? What are you planning to do with these paths, considering the relationships have no type?
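A count-only variant of the query could be sketched as follows. It keeps the same pattern but returns a single aggregate instead of the paths themselves; the alias pathCount is just an illustrative name.

```cypher
// Same traversal as the original query, but only the number of
// matching paths is returned, so no path data has to be materialised
// and sent back to the client.
MATCH p = (node {name: "f4730cad-d48d-4fb6-85c2-096fb90152b8"})-[*1..5]-(toNode)
RETURN count(p) AS pathCount
```

Comparing the timing of this against the full query gives a rough idea of how much of the latency comes from the traversal and how much from producing and returning the results.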

Bennu

Oh, y’all wanted a twist, ey?

Thanks bennu. The relationships have a property called "type". The graph has cycles, but within a cyclic path the relationship property "type" has different values, so these cycles are valid.

If I only count the paths, the query is very efficient. This suggests that the last operator (ProduceResults) generates a lot of db hits, which contributes to the high latency. I wonder whether this operator is not written to leverage the page cache.

The paths are the key insight that consumers will use in business scenarios.

Hi @spraja08,

How are you measuring the execution time? Are you checking the query.log file?
