I've encountered an inconsistent behavior in your system when running two variations of a similar query. Below are the details:
Query 1:
MATCH (m:Movie {id: 1})
RETURN collect(m.title) as movies
Actual Result: [{'movies': []}] (as expected)
Query 2:
WITH 'a' as a
MATCH (m:Movie {id: 1})
RETURN a, collect(m.title) as movies
Expected Result: [{'a': 'a', 'movies': []}] Actual Result: [] (unexpected)
It appears that including a WITH clause changes the behavior unexpectedly, leading to a completely empty result set. This inconsistency could signify a bug in the query processing mechanism.
Thank you very much for the reference. It is good to know it is expected behavior and that the optional match avoids the issue.
I still could argue that the behavior is not consistent. In one case you don't have any rows and you return a result and in the other you also don't have any rows and you do return a result.
I read the explanation in the article about why you can't get a zero count value when there is a grouping factor. I think it is not intuitive. I think if you have an aggregate function, which will be an eager operation, that no rows should be thrown away so you can calculate the aggregation for those grouping values that have no match.
I would have liked to get a row with bob showing zero emails. This does not seem unreasonable since a query without a grouping value and zero rows returns zero instead of "no changes, no records".
I understand this is a matter of design preference and that is fine. Fortunately, there is a work around using optional match to get the desired result.
That all being said, the behavior is actually the same in sql. If I have a query that counts records and there are zero matches and there is no "group by" clause, then I get a row returned with zero count. In contrast, if I add a group by clause I get zero records back. In my example above, using an optional match is kinda equivalent to a left join, which would return a zero value for persons that don't have any emails.
There are a few different approaches you could use to get the kind of behavior you want.
As you saw and mentioned, OPTIONAL MATCH (which behaves like a left join as you pointed out) can be used when we are not sure if such a pattern / node exists, and and rather than removing (or not emitting) a row for the lack of result, it will instead emit a null for the variable value, so that's an important tool.
An alternate approach may be to use a subquery, as subqueries execute per row, and you do not have to keep an imported variable around within the subquery when you no longer need it, which can allow you to perform an aggregation without a grouping key:
MATCH (n:Person)
CALL {
WITH n
MATCH (m:Email {recipient:n.name})
RETURN count(m) as noOfEmails
}
RETURN n.name, noOfEmails