You were so close! It took me a while to figure this one out the first time I encountered it.
Aggregation functions are calculated per row. To get the total, you have to kinda run the query twice, with two aggregations. One for one row counting the records, and another for the results you want to return.
Simplified Example:
WITH [1,3] as x, [1,2] as y
unwind x as a
unwind y as b
WITH distinct a, b
WITH count(*) as total, [1,3] as x, [1,2] as y
unwind x as a
unwind y as b
RETURN distinct a, b, total
Result:
a b total 1 1 4 1 2 4 3 1 4 3 2 4
Applied to your query
Notes:
- Were it not for the need for distinct, we could reuse
neighbours
in the second MATCH. - The result includes a total on each row, duplicating data.
- These results can be though of as "distinct pairs," or as removing multiple rels between pairs. If you reframe your problem to look at Relationships instead of Nodes, you might find a better solution.
MATCH (u1 {id:522})-[r1:InteractsWith]-(u2)
WITH collect(distinct u2.id) as neighbours
MATCH (u1 {id: 522})-[r1:InteractsWith]-(u2)-[r2:InteractsWith]-(u3)
WHERE u3.id in neighbours
WITH distinct u2.id, u3.id
WITH count(*) as total
// deja vu +total
MATCH (u1 {id:522})-[r1:InteractsWith]-(u2)
WITH total, collect(distinct u2.id) as neighbours
MATCH (u1 {id: 522})-[r1:InteractsWith]-(u2)-[r2:InteractsWith]-(u3)
WHERE u3.id in neighbours
RETURN distinct u2.id, u3.id, total