PROFILE MATCH
(u:User {Id: '<USER_ID>'})-[:MEMBER_OF]->(g:Group),
(t:Tenant)-[:CHILD_OF*0..5]->(root:Tenant)<-[:BELONGS]-(g)
RETURN t.Id, g.Id
So we want to retrieve all the groups for the given user. However, groups can be inherited and it is why we traverse a hierarchy of tenants.
Issue
The query returns correct results but it is slow.I profiled the query and I believe that the problem is here:
The engine firstly finds all possible combinations of users and groups what gives around 2 mln records and from all these records it selects only 5K for a given user.
From my perspective, it is not optimal. It would be much faster to start from a given user and find groups for it.
Do you have any suggestions? So far I tried to play with hints, different structure of query but all the time the query plan is the same.
My first opinion - (good news)
Just looking at the two steps of the profile - i see that there is a huge pagecache hits and 0 misses, which means the page cache is used to cache the data and avoiding accessing disk which will typically make the query run faster.
If can you post the full query plan, the count of the nodes, your infrastructure settings, and neo4j version (enterprise / community) that would be helpful for me and others in the community to provide more insights.
I found out that described problem can be fixed by rewriting the query in the following way:
MATCH (u:Usert {Id: '<USER_ID>'})-[:MEMBER_OF]->(g1:Group) WITH COLLECT(g1.Id) AS groups
MATCH(t:Tenant)-[:CHILD_OF*0..5]->(root:Tenant)<-[:BELONGS]-(g2) WHERE g2.Id IN groups
RETURN t.Id, g2.Id
The main changes are:
In line 2 I eagerly find ids of groups that I'm interested in.
In line 4 I use found groups to filter results.
In this case, the engine does not try to generate all possible combinations of groups and users.