I still don't quite understand how the number of rows is determined by a WITH/RETURN clause in Cypher.
What I guess:
- Unique data gives one row each. For example, if you have 10 users in your DB and you ask for them with
MATCH (u:User) RETURN u;
you will get 10 rows. If you add a count, like `RETURN u, COUNT(u);`, the result will still be 10 rows, and the count will be 1 for each row. But what does unique mean exactly?
- For example:
WITH [1, 2, 4, 2, 3] as nums UNWIND nums as num return num;
gives 5 rows, because the list has 5 elements (they could just as well be nodes, relationships, maps etc.).
- If you use an aggregation function in the RETURN clause, the elements that go into the aggregation are grouped by a grouping key. The grouping key is the non-aggregated part of the RETURN clause, like `u` in `RETURN u, COUNT(u)`.

Now I know how to be careful with aggregation in most cases. But sometimes the problem is not that the results are grouped too aggressively, but that they are multiplied. I don't understand the following case:
**Please note that the IDs are actually numbers, not `*****`.**
Query:
MATCH (u:User)-->(personalTypeMeasurement:PersonalTypeMeasurement)<--(typeValue:PersonalTypeValue) WHERE ID(u) = *****
AND ID(personalTypeMeasurement) = *****
WITH u, personalTypeMeasurement, typeValue
MATCH (u)-->(belongsTo:Household) WHERE ID(belongsTo) = *****
WITH u, personalTypeMeasurement, belongsTo, typeValue
MERGE (personalTypeMeasurement)-[:BELONGS_TO {createdAt: datetime.transaction('Europe/Berlin')}]->(belongsTo)
WITH personalTypeMeasurement, ID(personalTypeMeasurement) AS id, belongsTo, typeValue, u // adding u or not does not change anything
RETURN personalTypeMeasurement{.*, id, belongsTo, typeValue}
Result:
"personalTypeMeasurement"
{"typeValue":{"name":"heatingMid","value":4,"startAt":"2021-06-30T08:37:46[Europe/Berlin]"},"id":***,"belongsTo":{"name":"Zweitwohnsitz","hhMember":1,"hhArea":56,"hhApartmentsHeatSupply":"","hhHouseType":"singleHouse"},"sector":"heating","createdAt":"2021-07-21T08:37:58.286405000[Europe/Berlin]","ghgDomain":"housing","supportedType":"heating"}
{"typeValue":{"name":"heatingMid","value":4,"startAt":"2021-06-30T08:37:46[Europe/Berlin]"},"id":***,"belongsTo":{"name":"Zweitwohnsitz","hhMember":1,"hhArea":56,"hhApartmentsHeatSupply":"","hhHouseType":"singleHouse"},"sector":"heating","createdAt":"2021-07-21T08:37:58.286405000[Europe/Berlin]","ghgDomain":"housing","supportedType":"heating"}
There is exactly one `(personalTypeMeasurement:PersonalTypeMeasurement)` involved; it is even checked against its ID. So why do I get two rows in the result? Would a `DISTINCT` help in this case? How can I improve multi-step queries like this one to get the most precise result?
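
To make the two effects I mean concrete, here is a minimal sketch that uses plain lists instead of my real data. The lists and the aliases `occurrences`, `letter` and `digit` are made up for illustration and do not come from my database. The first query shows grouping by the non-aggregated column; the second shows rows being multiplied when two `UNWIND`s are chained, which I assume is similar to what happens when a `MATCH` finds more than one match for an incoming row.

```cypher
// Grouping: num is the grouping key, so the two 2s collapse into one row.
WITH [1, 2, 4, 2, 3] AS nums
UNWIND nums AS num
RETURN num, count(num) AS occurrences;
// 4 rows: the value 2 has occurrences = 2, every other value has 1

// Multiplication: every incoming row is combined with every element
// of the second list, so the row counts multiply.
UNWIND ['a', 'b'] AS letter
UNWIND [1, 2, 3] AS digit
RETURN letter, digit;
// 6 rows (2 x 3)
```

If my real query behaves like the second example, one of the matched variables (for instance `typeValue`, or one of the untyped relationships) would have to match twice, but I have not verified that.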