Using indexed nodes and simple queries my cypher queries are still taking around 500ms. Can this be further optimized? Looking for advice

sl77 · December 18, 2020, 5:48pm

Here is the simple query I am running that serves as the base for more complex queries. I am trying to optimize these queries to run faster, but am hitting a wall when even this simple query is running at 500ms. I dont understand how to make this query any more bare bones, but perhaps there is something I am missing or another way I can write it? Here is the query:

explain MATCH (o:Order)-[:CONTAINS_PRODUCT]->(p) 
WHERE o.accountGUID = 'xxxxxxxxxxxxxxx' AND p.id IN ['4753230364775'] AND o.createdAt >= 1576775048 
return o, p LIMIT 10

With results as

Started streaming 10 records after 502 ms and completed after 515 ms.

Or when profiling around 600ms.

I understand it is the p.id check that is taking the longest as it has to sort through all products attached to each order to find orders that contain that product until it hits the limit, but am not sure how to make that any faster. For example running

MATCH (o:Order)-[:CONTAINS_PRODUCT]->(p) 
WHERE o.accountGUID = 'xxxxxxxxxxxxxxx' AND o.createdAt >= 1576775048 
return o, p LIMIT 10

Completes in less than 1ms

Any advice would be appreciated.

clem · December 18, 2020, 6:00pm

One clever thing you can do, is move LIMIT before the RETURN statement. Otherwise, the query creates a huge list, only to trim it to 10 entries. (This isn't obvious.)

See: 5 Secrets to More Effective Neo4j 2.2 Query Tuning - Neo4j Graph Data Platform

andrew_bowman · December 18, 2020, 8:08pm

What label is on p? :Product? If so, is there a unique constraint on :Product(id)? If so we may be able to optimize this a bit:

MATCH (o:Order), (p:Product)
WHERE o.accountGUID = 'xxxxxxxxxxxxxxx' AND p.id IN ['4753230364775'] AND o.createdAt >= 1576775048 
MATCH (o:Order)-[:CONTAINS_PRODUCT]->(p1) 
WHERE p1 = p
RETURN o, p
LIMIT 10

This is assuming :Product(id) is unique. It should pre-match the nodes, then perform the expansion, and instead of having to do property checking on p to see if it's the right product, it will compare the nodes themselves, which is more efficient (it's basically comparing the internal graph ids, which should be fast).

You could also try adding a composite index on :Order(accountGUID, createdAt).

sl77 · December 18, 2020, 9:19pm

Thank you Andrew that is very helpful! Yes (p) is :Product and has a uniqueness constraint on id.

Using your advice and the other user who commented on limiting WITH statement I was able to significantly speed up my queries.

However I am facing another confusing issue dealing with indexed searching i was hoping you could help with?

My queries for :Session nodes are taking extremely long, like 25 seconds or more, despite the fact they they are indexed. When I use the explain or Profile planners I can see that these are taking so long because they are scanning by label rather than using the indexes I have. Could you possibly explain why this is happening?

Here is the query:

MATCH (n:Session) where n.accountGUID = 'xxxxxxxxxxxxxxx' 
RETURN n LIMIT 25

It takes 25+ seconds
I have attached a picture of the EXPLAIN plan.

Using :schema in the cypher terminal i can see that is is Indexed:

ON :Session(accountGUID, profileGUID, sessionKey) ONLINE (for uniqueness constraint)

And here are the constaints:

ON ( session:Session ) ASSERT (session.accountGUID, session.profileGUID, session.sessionKey) IS NODE KEY

Does this mean I have to provide all 3, accountGUID, profileGUID, and sessionKey? All my other nodes are Indexed with accountGUID and ID, but i can use one or the other to search by index, so im wondering what the case is here.

Very much appreciate the help!

andrew_bowman · December 18, 2020, 9:22pm

You're correct, when there is a composite index (on its own or because of a node key constraint) all indexed properties must be present for the index to be leveraged. You would have to create an index or unique constraint on :Session(accountGUID) in order to use the index when only accountGUID is used for lookup.

sl77 · December 18, 2020, 9:29pm

Thanks so much! Appreciate the help!

Topic		Replies	Views
Help with optimizing query Cypher performance , browser , cypher , operations	3	463	September 21, 2021
Performance question - how to work with indexes Cypher performance , cypher , data-modeling	5	408	November 25, 2021
Indexes required to make this query fast Cypher	1	611	September 2, 2019
Is this query speed normal? Neo4j Graph Platform	5	713	October 23, 2020
Why do these two queries differ a lot in speed? Neo4j Graph Platform	8	618	June 24, 2021

Take the Course Then Join The Aura Agent Hackathon

Using indexed nodes and simple queries my cypher queries are still taking around 500ms. Can this be further optimized? Looking for advice

Related topics

Take the Course Then Join
The Aura Agent Hackathon