String comparison performance

quan.nguyen · December 13, 2019, 3:16am

Hi all,

I am wondering about the performance of the following string comparison in where clause:
Case 1:

MATCH (p: Person)
WHERE NOT p.name='Person 1'
AND NOT p.name='Person 2'
RETURN p

Case 2:

MATCH (p: Person)
WHERE NOT (p.name='Person 1' OR p.name='Person 2')
RETURN p

Case 3:

MATCH (p: Person)
WHERE NOT p.name IN ['Person 1', 'Person 2']
RETURN p

I have tried on my database but the result is not united, sometimes case 3 gets be best response time and db hits, sometimes the best is case 1 and case 2.
Can you please make this clear for me?

12kunal34 · December 13, 2019, 6:15am

Hi @quan.nguyen

in this case i have one suggestion for you for the best performance .

if you want to searchinf for node on the basis of name then you should create index on this property
create index on :node(name)

this would resolve all your issues regarding db hits

please let me know if any other info required

Cheers

quan.nguyen · December 13, 2019, 6:55am

Thank you for your reply, @12kunal34

My case is more complicated and this is my example for the comparison. I have to consider this because my the property, that I need to filter out, is rarely used and it's not so selective (it has less then 10 different values).

Do you have any suggestions for this?

astridparamita · December 13, 2019, 4:27pm

Hi @quan.nguyen,

Have you tried using the keyword EXPLAIN and PROFILE on before your query?

EXPLAIN will not run the query, but show you the predicted cost of the query
PROFILE will run the query and show the cost of the query

In the your case, the Case 1 & Case 2 should perform with around the same cost. Basically it would scan all node label Person two times, one time for 'Person 1' the second time for 'Person 2'.

In Case 3 it will only scan all the node label Person once, checking both string 'Person 1' and 'Person 2' in one go.

The query Case 3 with the IN keyword would perform even better in comparison with the other two when you actually have more strings to compare (e.g p.name IN ['Person 1', 'Person 2, 'Person 3']

Hope this helps you with your decision!

quan.nguyen · December 13, 2019, 4:33pm

Thank you @astridparamita for your explaination!

Topic		Replies	Views
Difference in specifying the properties inside node and in the where clause Cypher	4	445	April 22, 2020
Difference in performance between inline and where clause Cypher cypher	2	606	July 24, 2019
How to Optimize a Query with Huge Where Clause Cypher performance , cypher	7	780	June 16, 2020
Cypher query Cypher cypher	3	338	June 16, 2020
Why are these two queries perform differently? Neo4j Graph Platform	4	341	November 19, 2020

String comparison performance

Related topics