Cypher query optimization

busymo16 · February 20, 2021, 10:37am

Hello everyone,

Basically I have the following query:

MATCH (f:Family)-[:BELONGS_TO]-(pr:Product)
UNWIND range(0, 100) as id
WITH f, pr, [x in keys(pr)] as props
WHERE toLower(f.name) = toLower('family_name') AND toLower('specification') IN [word in pr[props[id]] | toLower(toString(word))]
RETURN DISTINCT f.name AS Family

With this query I want to find the family that the product belongs to based of a quality that the product has. The quality of the product is stored as a property value (in a list of values). When I use PROFILE, for the query above I get 622795 total db hits in 126ms. When I just reduce the range of UNWIND (0, 50), I do get 248773 total db hits in 69 ms.

Is there any better way how to check if a 'specification' is inside a list of property values of the Product node without knowing the property key and using UNWIND (because this seems expensive)??

Thank you in advance.

dana_canzano · February 20, 2021, 10:20pm

2 things

you usage of toLower(f.name) = toLower('family_name') is going to negate any usage of an index on :Family(name) if such an index existed. is your data such that there are :Family nodes with a name and some have a value of family name and some have a value of FamilY NaMe and some have a value of faMily NAME such that the only way to solve this is to do a toLower() on both sides of the expression?
you can move that portion of the WHERE clause higher up for example

MATCH (f:Family)
WHERE toLower(f.name) = 'family_name'
MATCH (f)-[:BELONGS_TO]-(pr:Product)
UNWIND range(0, 100) as id
WITH f, pr, [x in keys(pr)] as props
WHERE 'specification' IN [word in pr[props[id]] | toLower(toString(word))]
RETURN DISTINCT f.name AS Family

and this still keeps the toLower() on both sides of the expression and again still negating any usage of an index, if such an index existed on :Family(name), but is your data such that you must you really use toLower ?
Additionally I removed changed WHERE toLower('specification') IN ....... since toLower is unnecessary since 'specification' is already in its lower form. The same with toLower('family_name');

busymo16 · February 22, 2021, 1:09pm

Thank you for your answer. I can see some optimizations happening. Another question: this optimization can be one when AND operator in the WHERE clause but not when OR operator in WHERE clause right?

But let's say that I have this node (f:Family {'key1':'value1', 'key2':'value2'....}. Is there any way that I can do a where condition over value1, value2 without knowing key1, key2 and without using UNWIND?

dana_canzano · February 22, 2021, 1:24pm

correct. But I can see some optimizations happening. ????? how so? whereas the query used to take time XXXms it now takes ???ms ????

regarding

not sure i understand? you want to restrict for values value1 and value2 but you dont know if these values are with a property named X or Y or Z or ???

busymo16 · February 22, 2021, 1:35pm

From 110ms to 40ms

UNWIND range(0, 100) as id followed by WHERE 'specification' IN [word in pr[props[id]] | toLower(toString(word))] is costing me a lot of time, because it will take in consideration all numbers as id. I do that because I do not know the property(key) and I want to check over property(value)?

dana_canzano · February 22, 2021, 1:44pm

so its now down to 40ms and you looking to get it even faster?

And regarding

UNWIND range(0, 100) as id

this assumes every Node has 100 properties. Is that the case with your datamodel?

you could also do

UNWIND range(0, size ( keys(pr) )  ) as id

which would perform a range from 0 to the number of properties in a :Product node.

But here again still we are basically saying there is a :Product node and somewhere within of of the properties, any property, that its name is 'specification'.

Is there no relative 'model'? i.e. some :Product nodes may have a specification but record it in a property named SpeCification2021 other :Product nodes record the same data in a property named specification_2021 and others with specification-2021

busymo16 · February 22, 2021, 1:50pm

Nope, basically this is very good. Thank you!!

Nope, I just have to accept everything and then try to find out which product CONTAIN that specification as a property(value), but for this, I do not know in which property(key) this might be true. The system has to be dynamic that means different specifications might come as an input and that is what I need to check with the properties.

Topic		Replies	Views
Optimising query performance with a relatively simple match Cypher performance	3	633	July 3, 2020
Is there a more efficient version of this query? Neo4j Graph Platform	20	1484	May 31, 2020
Trying to squeeze as much performance as I can out of this complex query Cypher	2	503	December 31, 2019
Neo4j Cypher query execution optimization Neo4j Graph Platform migrated	2	100	July 19, 2022
Search through whole database Cypher	3	265	February 25, 2021

Cypher query optimization

Related topics