Optimizing a query and understanding the profiler

eric7 · June 3, 2021, 10:02pm

Hello,

I am trying to optimize a query I have been working on, but I do not understand why the Cypher/Neo4j profiler reports as many db hits as it does.

The query below tries to find all mutual contacts for a given user $user_id.

1st pass

profile MATCH (u1:User {user_id: $user_id})-[:CONTACT]->(u2:User)
where exists ((u2)-[:CONTACT]->(u1))
return u1,u2

2nd pass (better but still not great)

profile MATCH (u1:User {user_id: $user_id})-[:CONTACT]->(u2:User)
with u1,u2
match ((u2)-[:CONTACT]->(u1))
return u1,u2

My understanding is that by using WITH I am signaling to the 2nd MATCH clause that the start and end nodes are already bound. However, the profiler seems to tell me that this step incurs the most db hits: https://ibb.co/JHR9CzB . I'm confused about best practices for cases like this and how to optimize my query. Thank you!

andrew_bowman · June 5, 2021, 12:03am

Either approach should work. Generally I would favor the first query.

Given that you already have a unique constraint on :User(user_id), there's not much more tuning you can do here.
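
For anyone reading along who has not set this up yet, a constraint of that kind can be created roughly as shown here. This is only a sketch: the constraint name user_id_unique is made up for illustration, and the syntax is the Neo4j 4.x form (later versions use FOR ... REQUIRE in place of ON ... ASSERT), so check it against the version you are running.

```cypher
// Hypothetical constraint name; Neo4j 4.x syntax.
// A uniqueness constraint also creates a backing index, so the lookup of
// (u1:User {user_id: $user_id}) can use an index seek instead of a label scan.
CREATE CONSTRAINT user_id_unique
ON (u:User) ASSERT u.user_id IS UNIQUE
```

With the constraint in place, the PROFILE output should start from an index seek on :User(user_id) and not from a scan over all :User nodes.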

As for the number of db hits, perhaps the output of this query will show you how many relationships the query must consider before it arrives at the 28 resulting rows:

match (u1:User {user_id: $user_id})-[:CONTACT]->(u2:User)
with u1,u2
match (u2)-[:CONTACT]->()
return count(*) as relationshipsRequiringFiltering
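
If that total turns out to be large, a per-contact breakdown may show where it comes from. The query below is a sketch along the same lines as the one above, just grouped by u2. The aliases contact and outgoingContacts and the LIMIT of 10 are arbitrary choices of mine, not something from your data model.

```cypher
// For each contact of the given user, count how many outgoing :CONTACT
// relationships that contact has. These are the relationships that may need
// to be looked at when checking whether u2 points back at u1.
MATCH (u1:User {user_id: $user_id})-[:CONTACT]->(u2:User)
MATCH (u2)-[r:CONTACT]->()
RETURN u2.user_id AS contact, count(r) AS outgoingContacts
ORDER BY outgoingContacts DESC
LIMIT 10
```

A handful of contacts with very many outgoing :CONTACT relationships would account for most of the db hits, even though only 28 rows come back at the end.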
