Optional match giving wrong result

ri8ika · April 22, 2021, 7:06pm

This gives me wrong result, an unexpected behavior:

MATCH (n) WHERE id(n) = 123
OPTIONAL MATCH (a:A) -- (n)
WITH n, a, collect(a.actor) as actor
OPTIONAL MATCH (a) -- (n)

When I repeat the WHERE clause, it gives me the correct result:

MATCH (n) WHERE id(n) = 123
OPTIONAL MATCH (a:A) -- (n)
WITH n, a, collect(a.actor) as actor
OPTIONAL MATCH (a) -- (n) WHERE id(n) = 123

I know the behavior with WHERE clause is different when using OPTIONAL MATCH, but how can we avoid the repeat WHERE clause? I have multiple OPTIONAL MATCH queries so, this is not good to go with. What I am missing?

andrew_bowman · April 23, 2021, 10:49pm

Well first, your last OPTIONAL MATCH doesn't serve any purpose. n and a are already bound variables, the OPTIONAL MATCH isn't introducing any additional variables, so it actually isn't doing (or shouldn't be doing) anything meaningful. OPTIONAL MATCH only has meaning when it is introducing new variables, so if there's a difference in what is returned, then there's a bug here somewhere.

It would help if you provided an example of correct vs incorrect results, and also let us know the version of Neo4j you are using.

ri8ika · April 24, 2021, 5:23am

@andrew_bowman
Okay, let me give you an example.

I only have the following in the database:

Node2{id:456} - [:FOO] -> Node1{id:123}
Node2{id:456} - [:BAR] -> Node1{id:789}

Now, when I use the following statement:

MATCH (n1:Node1) WHERE id(n1) = 123
OPTIONAL MATCH (n2:Node2) - [:FOO] -> (n1)
WITH n1, n2, collect(id(n2)) as foo
OPTIONAL MATCH (n2) - [:BAR] -> (n1)
RETURN {foo: foo, bar: collect(id(n2))}

This is giving me wrong result. I think the WHERE clause is not satisfying with all optional matches?

Output:

{
  "foo": [456],
  "bar": [456]
}

Expected:

{
  "foo": [456],
  "bar": []
}

Because, Node1 with id 123 has no BAR relation.

I am using the latest version of neo4j.

tard_gabriel · April 24, 2021, 4:24pm

Hello @ri8ika

Can you provide us what your graph database must looks like with a tool like arrows.app.
And can you, in short explain in english not Cypher want you want from it?

If your Cypher is wrong it's a lot harder for us to understand what you want from it.

ri8ika · April 26, 2021, 12:28pm

I have given an example. Please check.

andrew_bowman · April 26, 2021, 2:21pm

By latest version I'm assuming 4.2.5?

In any case, this is not a bug, but a misunderstanding of how OPTIONAL MATCH behaves.

Once variables are bound from earlier in the query (n1 and n2) an OPTIONAL MATCH will never be able to change those variables, that would be a filtering operation, and OPTIONAL MATCH never filters out results. That's why I said earlier:

OPTIONAL MATCH only has meaning when it is introducing new variables

There are several ways to get the results you want.

One is to do the OPTIONAL MATCH with a new variable in place of n2, but use a WHERE clause to restrict possible values:

MATCH (n1:Node1) WHERE id(n1) = 123
OPTIONAL MATCH (n2:Node2) - [:FOO] -> (n1)
WITH n1, n2, collect(id(n2)) as foo
OPTIONAL MATCH (x) - [:BAR] -> (n1)
WHERE x = n2
RETURN {foo: foo, bar: collect(id(x))}

You could also use subqueries similarly.

One other thing to mention, your aggregation is flawed:

WITH n1, n2, collect(id(n2)) as foo

Because n2 is in the grouping key, this will not produce the results you want. You will get a row per distinct n1, n2. foo will always be a list of the id for that n2. It will be at least a single element. If there are multiple :FOO relationships between the two specific n1 and n2 nodes on that row, then there may be multiple elements in foo, a repeating of n2's id() for however many :FOO relationships are between those two specific nodes.

For example, assume this graph exists, where there are multiple :FOO relationships between n1 and n2:

Node2{id:456} - [:FOO] -> Node1{id:123}
Node2{id:456} - [:FOO] -> Node1{id:123}
Node2{id:456} - [:FOO] -> Node1{id:123}
Node2{id:456} - [:BAR] -> Node1{id:789}

Then the output of the first half of the query, at this line WITH n1, n2, collect(id(n2)) as foo, is:

{
 n1: Node1{id:123}, n2: Node1{id:456}, foo:[456, 456, 456]
}

If there were other nodes connected by :FOO relationships, you would get similar rows per distinct n1 and n2 pairs, with a similar list of at least one element, though there could be more depending on the number of duplicate :FOO relationships between the pair of nodes for that row.

That's not likely what you want. What is it you want from that aggregation? What is its intended behavior?

ri8ika · April 26, 2021, 8:45pm

OPTIONAL MATCH only has meaning when it is introducing new variables

Now, I got your point. Thanks.

Topic		Replies	Views
Maybe I misunderstood OPTIONAL MATCH. If not there should be something wrong Cypher cypher	4	399	October 18, 2021
Optional match - many clauses Cypher	2	593	June 24, 2019
MATCH following OPTIONAL MATCH Cypher	5	386	April 21, 2020
Optional Match returns null when node is present Cypher	3	241	March 17, 2021
OPTIONAL MATCH causing unexpected behavior Cypher	1	177	August 20, 2021

August 🏄 🏖️ 🏊

Optional match giving wrong result

Related topics