Why can't I use a conditional on an aggregating function?

andrew_bowman · May 27, 2021, 11:24pm

The problem is the scope of the aggregation: if an aggregating function were allowed in a WHERE clause, it would not be clear what the grouping key is.

Would it be asking for the average of all persons who acted in all movies (so the same average would display on every result row)? Would persons be counted multiple times because they act in multiple movies, and should that duplication be captured in the average? Or is it asking for the average per movie? What if the pattern was more complicated, or there were more variables in scope?

Consider this MATCH and WHERE clause:

MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)-[:ATTENDED_COLLEGE_AT]->(college:College)
WHERE AVG(p.born) > 1967

How could we tell what the grouping key is here? Is it per movie? Per college? Or neither, because you want the average over all persons who acted in a movie and attended college? What if they acted in multiple movies, or attended multiple colleges?

When we restrict aggregations to WITH clauses, the grouping key becomes much clearer, as in the workaround you cited: you're getting, per movie, the average birth year (p.born) of that movie's actors, and then filtering each row based on whether that average is over 1967.
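For reference, here is a minimal sketch of what that WITH-based form might look like. It assumes the usual Movies example graph; the m.title and college.name properties and the avgBorn alias are assumptions for illustration and do not come from the original question. In a WITH clause, the non-aggregated expressions form the grouping key, so the choice of grouping is written out explicitly:

```cypher
// Per movie: m is the only non-aggregated expression in WITH,
// so m is the grouping key for avg().
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
WITH m, avg(p.born) AS avgBorn
WHERE avgBorn > 1967          // filters the already-aggregated rows
RETURN m.title AS movie, avgBorn;

// Per college instead: same idea, but the grouping key is now college.
// A person who acted in several movies yields one row per movie, so
// deduplicate (college, p) pairs before averaging.
MATCH (:Movie)<-[:ACTED_IN]-(p:Person)-[:ATTENDED_COLLEGE_AT]->(college:College)
WITH DISTINCT college, p
WITH college, avg(p.born) AS avgBorn
WHERE avgBorn > 1967
RETURN college.name AS college, avgBorn;
```

Whether to deduplicate, as in the second query, depends on whether each person should be counted once or once per matched path, which is exactly the ambiguity a bare WHERE AVG(...) could not resolve.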
