Neo4j multiple level match query with same condition

Hi there!

The following query:

MATCH (n)-[:friend*0..3]->(b)
WHERE n.salary >= 19999 AND n.address = "Paris" AND b.salary >= 19999 AND b.address = "Paris"
RETURN DISTINCT n.name
ORDER BY n.name

doesn't give me what I want:

I'm trying to display the names of all persons living in Paris whose friends, up to 3 levels deep, all live in Paris too and all have a salary of at least 19999.

The above query doesn't do that. What am I doing wrong?

For example: if Buddy lives in Paris, earns 20k+, and all of his friends up to 3 levels deep also live in Paris and earn 20k+, then Buddy should be returned. Otherwise he shouldn't be.

Hello @buddy and welcome to the Neo4j community :)

You need to use predicate functions, and more particularly all():

MATCH path=(n)-[:friend*0..3]->() 
WHERE all(x IN nodes(path) WHERE x.salary >= 19999 AND x.address = "Paris") 
RETURN DISTINCT n.name AS name 
ORDER BY name

Regards,
Cobra

This one should be good:

MATCH path=(n)-[:friend*0..3]->()
WITH n.name AS name, all(x IN nodes(path) WHERE x.salary >= 19999 AND x.address = "Paris") AS check
WITH name, collect(check) AS checks
WHERE all(x IN checks WHERE x = true)
RETURN DISTINCT name
ORDER BY name

Thanks, it's working fine!

I didn't know we could alias the result of a predicate function.

What's the meaning of x = true though? Why should we run all() on checks, which is itself the result of an all()?

And what if I had a different condition for each level of the path?

Thanks a lot Cobra!

Buddy

What's the meaning of x = true though? Why should we run all() on checks, which is itself the result of an all()?

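In your case, the paths fan out from each person like a tree, and each path produces its own check value. collect(check) gathers those values per name, and the all() in the WHERE clause keeps a name only if every branch passed.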
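To see that outer all() in isolation, here is a minimal sketch over a literal list standing in for checks. The list values are made up for illustration and are not taken from any real graph.

```cypher
// Three paths were checked; the third one failed its condition.
RETURN all(x IN [true, true, false] WHERE x = true) AS allBranchesPass
// allBranchesPass is false; with [true, true, true] it would be true
```

Because the list elements are already booleans, WHERE x behaves the same as WHERE x = true here.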

If you have a different condition for each level, this query won't work as it is. It has to change a bit, because the condition must then depend on the node's level in the path.
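One way this could look, as a rough sketch only: iterate over the positions in the path and pick the condition with CASE. The per-level rules below are hypothetical, invented purely to show the shape of the query.

```cypher
// Hypothetical per-level rules (assumptions, not from the question):
//   level 0 (the person): lives in Paris and earns >= 19999
//   level 1:              lives in Paris
//   levels 2 and 3:       earns >= 19999
MATCH path = (n)-[:friend*0..3]->()
WITH n, nodes(path) AS ns
WITH n, all(i IN range(0, size(ns) - 1) WHERE
       CASE i
         WHEN 0 THEN ns[i].address = "Paris" AND ns[i].salary >= 19999
         WHEN 1 THEN ns[i].address = "Paris"
         ELSE ns[i].salary >= 19999
       END) AS check
// One check per path; keep a person only if every path passed.
WITH n, collect(check) AS checks
WHERE all(c IN checks WHERE c)
RETURN DISTINCT n.name AS name
ORDER BY name
```

The index i is the node's position in the path, so position 0 is always the starting person and position 3 is the furthest friend.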


It looks like you're not really concerned about paths, just that all friends in a 3-depth radius adhere to the conditions.

In that case, we can collect all of their distinct friends in that radius, then apply the all() predicate function:

MATCH (n)
WHERE n.salary >= 19999 AND n.address = "Paris"
CALL {
 WITH n
 MATCH (n)-[:friend*..3]->(b)
 WITH n, collect(DISTINCT b) AS friends
 WHERE all(friend IN friends WHERE friend.salary >= 19999 AND friend.address = "Paris")
 RETURN n.name AS name
}
RETURN name
ORDER BY name

I'm using a subquery to constrain the scope of the collect() aggregation, otherwise it may stress the heap depending on the number of nodes in your graph.

If the WHERE all() in the subquery fails, then that row will be dropped and won't continue in the query.
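One thing to watch for: with a plain MATCH inside the subquery, a person who has no outgoing :friend relationships produces no rows there, so that person is dropped as well. If such people should be returned (the original query used *0..3, which always matches the person alone), a possible variant is to use OPTIONAL MATCH. This is a sketch under that assumption, not a tested replacement:

```cypher
MATCH (n)
WHERE n.salary >= 19999 AND n.address = "Paris"
CALL {
  WITH n
  // OPTIONAL MATCH keeps one row with b = null when n has no friends
  OPTIONAL MATCH (n)-[:friend*..3]->(b)
  // collect() skips nulls, so a friendless person gets an empty list
  WITH n, collect(DISTINCT b) AS friends
  // all() over an empty list is true, so the row survives
  WHERE all(friend IN friends WHERE friend.salary >= 19999 AND friend.address = "Paris")
  RETURN n.name AS name
}
RETURN name
ORDER BY name
```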

Your :friend*..3 variable-length relationship is directed. Was that intentional? If a single :friend relationship indicates it is reciprocal, and not one-way, then you might want to drop the direction in that MATCH pattern so it can traverse a :friend relationship in either direction.

You really should be using labels here; otherwise this is an AllNodesScan. You should also consider an index, at the least on the label and address, though you would get better lookup times from a composite index on address and salary.
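As an illustration, assuming the nodes carry a label such as Person (the label and the index name are assumptions, since the thread doesn't show any), a composite index could be created like this in Neo4j 4.x and later:

```cypher
// Assumed label :Person; the index name is arbitrary.
CREATE INDEX person_address_salary IF NOT EXISTS
FOR (p:Person) ON (p.address, p.salary);

// With the label in the pattern, the planner can use the index
// for the starting nodes instead of scanning every node.
MATCH (n:Person)
WHERE n.address = "Paris" AND n.salary >= 19999
RETURN n.name AS name
ORDER BY name
```

Running the query with EXPLAIN or PROFILE shows whether the index is actually picked up.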
