
Optimizing a query and understanding the profiler



I am trying to optimize a query I have been working on, but I do not understand why the Cypher/Neo4j profiler reports as many db hits as it does.

The query below tries to find all mutual contacts for a given user $user_id.

1st pass

PROFILE MATCH (u1:User {user_id: $user_id})-[:CONTACT]->(u2:User)
WHERE exists((u2)-[:CONTACT]->(u1))
RETURN u1, u2

2nd pass (better but still not great)

PROFILE MATCH (u1:User {user_id: $user_id})-[:CONTACT]->(u2:User)
WITH u1, u2
MATCH (u2)-[:CONTACT]->(u1)
RETURN u1, u2

My understanding is that by using WITH, I am telling the second MATCH clause that the start and end nodes are already known. However, the profiler tells me that this step incurs the most db hits (screenshot: Screen-Shot-2021-06-03-at-2-58-21-PM, ImgBB). I'm confused about best practices for cases like this and how to optimize my query. Thank you!


Either approach should work. Generally I would favor the first query.

Given that you already have a unique constraint on :User(user_id), there's not much more tuning you can do here.

As for the number of db hits, perhaps the output of this query will show you how many relationships the query must consider before it arrives at the 28 resulting rows:

match (u1:User {user_id: $user_id})-[:CONTACT]->(u2:User)
with u1,u2
match (u2)-[:CONTACT]->()
return count(*) as relationshipsRequiringFiltering
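
To illustrate why the filtering step dominates, here is a minimal Python sketch. It is a toy model, not how Neo4j's planner actually executes the query. It represents CONTACT relationships as a dictionary of adjacency sets (the names and data are made up) and counts how many outgoing relationships have to be inspected to find a user's mutual contacts, which is the same quantity the count(*) query above reports.

```python
# Toy model: a directed CONTACT graph as adjacency sets (hypothetical data).
contacts = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "erin", "frank"},
    "carol": {"erin"},
    "dave": {"alice", "bob", "carol", "erin"},
}


def mutual_contacts(graph, user):
    """Return (mutual contacts, number of outgoing relationships inspected)."""
    mutual = []
    inspected = 0
    for other in sorted(graph.get(user, ())):
        outgoing = graph.get(other, ())
        # Every outgoing CONTACT of `other` is a candidate that may need to be
        # checked against `user`; this sum is what the count(*) query measures.
        inspected += len(outgoing)
        if user in outgoing:
            mutual.append(other)
    return mutual, inspected


mutual, inspected = mutual_contacts(contacts, "alice")
print(mutual, inspected)
```

In this toy graph, eight relationships are inspected to yield two mutual contacts. A similar gap between relationships considered and rows returned on real data would show up as db hits in the profile.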