Understanding grouping key in aggregations

I have a currently small database with the following relevant part:

            (:UsedMeasure) // multiple of these!!!

The relevant properties here are: User.createdAt (DateTime), Address.addressCityCode (String), UsedMeasure.reductionAvg (Number).

Now I want to list the User with Date (createdAt), its city code and the sum of the reductionAvg (divided by 2, but that does not matter) of all their UsedMeasures.

But when I use a query like this:

MATCH (u:User {isBetUser: true})-->(m:UsedMeasure) WITH u, m, u.createdAt.epochMillis AS date MATCH (u)-->(a:Address) RETURN ROUND((SUM(m.reductionAvg) / 2), 1) AS sum, a.addressCityCode AS PLZ, SUM(m.reductionAvg), apoc.date.format(date, 'ms', 'dd.MM.yyyy') as Datum;

the result is grouped by the Datum, which makes sense due to:

Grouping keys are non-aggregate expressions, that are used to group the values going into the aggregate functions.

How can I get exactly one row per user without adding ID(user) to the RETURN list? (I don't want to get the id here, the data will be sent away in a CSV file.) Is there any way to add a different grouping key to the (ROUND(SUM(m.reductionAvg) / 2), 1) part of the query without returning it?

Yes, a key understanding here is that you do not have to do all your aggregations in the RETURN, you can use WITH clauses to perform work like aggregations, and perform later matches from them afterward:

MATCH (u:User {isBetUser: true})-->(m:UsedMeasure) 
WITH u, ROUND((SUM(m.reductionAvg) / 2), 1) AS sum, SUM(m.reductionAvg) as sumReductionAvg
MATCH (u)-->(a:Address)
RETURN u, apoc.date.format(u.createdAt.epochMillis, 'ms', 'dd.MM.yyyy') AS Datum, a.addressCityCode AS PLZ, sum, sumReductionAvg

So here, we get the sums with respect to each u node, and only after we perform the aggregations do we MATCH out to the rest and project out the properties we need.

1 Like

Thanks for your suggestion, it is way more easy than my former solution.

"MATCH (u:User {isBetUser: true})-->(m:UsedMeasure) WHERE u.createdAt > datetime($fromDate) " +
    "WITH {id: ID(u), sum: ROUND(SUM(m.reductionAvg) / 2, 1), betCode: COALESCE(u.betCode, '')} AS userMeas UNWIND userMeas as uM " +
    "MATCH (u)-->(a:Address) WHERE ID(u) = uM.id WITH u.createdAt.epochMillis AS date, uM, a" +
    " RETURN apoc.date.format(date, 'ms', 'dd.MM.yyyy') as Date," +
    " uM.sum AS `Some Column Title`, a.addressCityCode as PLZ, " +
    "uM.betCode AS `KLIMAWETTE - Code (optional)`"

So UNWIND might never be useful for processing MATCH results.

I wouldn't say that. UNWIND can be very useful, it's just a tool, you will need to develop experience to understand when that particular tool is needed. Just think of it like the opposite of collect(), it takes elements of a list, and emits a row for each of them. There are cases where a FOREACH can't be used, so an UNWIND might be the next thing to try. Or a case where you had to collect something, and need to expand from the nodes collected, so you need to UNWIND them back to rows first.