Summing up relationship attributes through a chain of nodes

I need to go through a chain of nodes and sum up values from their relationships. Each relationship has two values I am interested in, walkTime and rideTime.. Each has a duration time value that I created with the relationship.
I need to traverse these nodes and sum up how much time the person was walking and how much they were riding during a trip or vacation, whatever.
How do I do this? I found this.. maybe there is a better (simpler) way?

Neo4j / Cypher: Returning sum of value in relationship between nodes within the node itself - Stack Overflow

CREATE (temp)
WITH temp
MATCH (tf:Transfer {id:'some_id'})
MATCH (tf)<-[in:IN_TO]-(in_account:Account)
MATCH (tf)-[out:OUT_TO]->(out_account:Account)
SET temp += PROPERTIES(tf)
WITH temp, SUM(in.value) AS sum_in, SUM(out.value) AS sum_out, COLLECT(in_account) AS in_accounts, COLLECT(out_account) AS out_accounts
SET temp.sum_in = sum_in
SET temp.sum_out = sum_out
WITH temp, PROPERTIES(temp) AS props, in_accounts, out_accounts
DELETE temp
RETURN props, in_accounts, out_accounts

unwind maybe.. Thoughts?
thanks Guys!

You do not need the 'temp' node. You can do it without:

MATCH (tf:Transfer {id:100})
MATCH (tf)<-[in:IN_TO]-(in_account:Account)
MATCH (tf)-[out:OUT_TO]->(out_account:Account)
WITH PROPERTIES(tf) as props, SUM(in.value) AS sum_in, SUM(out.value) AS sum_out, COLLECT(in_account) AS in_accounts, COLLECT(out_account) AS out_accounts
RETURN props, in_accounts, out_accounts, sum_in, sum_out

Even so, this does not work because you are forming a cartesian product with the second and third matches. This causes your sums to sum duplicate values and your collections to also have duplicate values. Here is an example:

Test Data:

CREATE (tf:Transfer {id:100})
CREATE (tf)<-[:IN_TO{value:100}]-(:Account)
CREATE (tf)-[:OUT_TO{value:-100}]->(:Account)
CREATE (tf)<-[:IN_TO{value:100}]-(:Account)
CREATE (tf)-[:OUT_TO{value:-100}]->(:Account)
CREATE (tf)<-[:IN_TO{value:100}]-(:Account)
CREATE (tf)-[:OUT_TO{value:-100}]->(:Account)

Modifying the return of the nodes to show the node ids, you can see the in_account and out_account nodes repeat three times. You can also see the sum is three times what it should be. This is all because the second match creates three rows and the third match creates three rows, resulting in nine total rows.

Here is one solution:

MATCH (tf:Transfer {id:100})
WITH
    tf,
    [(tf)<-[in:IN_TO]-(in_account:Account) | {rel: in, node: in_account}] as in_account_data,
    [(tf)-[out:OUT_TO]->(out_account:Account) | {rel: out, node: out_account}] as out_account_data
RETURN
    tf,
    [i in in_account_data | i.node] as in_accounts,
    [i in out_account_data | i.node] as out_accounts,
    reduce(s = 0, i in in_account_data | s + i.rel.value) as in_account_sum,
    reduce(s = 0, i in out_account_data | s + i.rel.value) as out_account_sum

Result with test data. Note the sums are correct and there are no duplication nodes.

LOL.. see.. this is why I ask you.. you just cannot trust what people put on the web.. always go to someone that knows what they are talking about.. LOL

And I .. know such a person..

I will try this out over the weekend, if not sooner.. thanks man!

Thanks Again..

ray lukas

1 Like

Slight change in what I need sir.. if you dont mind.. the summing mechanism is in a thread of nodes connected together. Each node has a GPS point in it and the relationships between them have the two settings, one is the time that they walked between these two points and then other is the time that they drove between these two points.. so I need to Traverse (match then an unwind??) this thread of nodes and add up all the times the person walked (totalTimeWalked) and all the times when they drove (totalTimeDriven). then we just return these values into my java code and issue awards.. Make sense Captain?..

Is this between two given nodes? Will there be multiple paths between these two nodes?

Do you have sample data?

Do you f he ave the initial part of the query that finds the path? I can add the calculations.

Is this between two given nodes?

Yes sir. These Destination Nodes will be a chain of connected Destination Nodes.. one for each place that person goes to. So we need a mechanism that starts at the first node in the chain, which I can pass into this cypher chunk, and traverse the entire length of the chain of nodes. There will be one Traversed relationship between each of these nodes. That Traversed relationship will have three attributes TraversedType (which will hold Walked or Drove) and a walkedTime and a droveTime. The walkedTime and a droveTime will be durations. So for each relationship that connects these Destination Nodes together in the chain, sum up the DroveDurationTime and WalkedDurationTime for the entire chain. If TraversedType is Drove then totalDroveTime = totalDroveTime + duration in the current relationship. If TraversedType is Walked then totalWalkedTime = totalWalkedTime + duration in the current relationship. Duration attribute doesn’t care it is just the diff between the two timestamps. At the end of the chain processing, return both totalWalkedTime and totalDroveTime durations.. see?

Will there be multiple paths between these two nodes?

No sir.. Only one relationship/path. That relationship will hold attributes TraversedType (walked or drove), DroveDurationTime and WalkedDurationTime that we care about (there will be others, distance, etc. but, who cares).

Each Destination Node will hold a Neo4j timestamp of when they got to that destination, I use to calculate the Duration inside the relationship.. so we are trying to figure out in the chain of Destination Nodes, how much time did they spend walking or driving or… make sense Captain?

Do you have sample data?

Let me put that together for you now.. Thanks man.. I will send along the Cypher for this.. no problem.. I have to build that for us, and will, right after this meeting ends..

Do you have the initial part of the query that finds the path? I can add the calculations.

I think I can do that.. sure.. Not sure.. I have the starting node.. then we would use an unwind you think? I can try and use your thinking from my other question on removing relationship throughout a chain of nodes.

Thanks man!

ray lukas

merge (n1:Destination{Name: "D1"})-

[:traverse{TraversedType:"Drove", duration:duration("PT6M17.91S")}]->

(n2:Destination{Name: "D2"})-

[:traverse{TraversedType:"Drove", duration:duration("PT2M1.3S")}]->

(n3:Destination{Name: "D3"})-

[:traverse{TraversedType:"Walked", duration:duration("PT5M2.1S")}]->

(n4:Destination{Name: "D4"})-

[:traverse{TraversedType:"Drove", duration:duration("PT51M3S")}]->

(n5:Destination{Name: "D5"})-

[:traverse{TraversedType:"Walked", duration:duration("PT5M2.1S")}]->

(n6:Destination{Name: "D6"})-

[:traverse{TraversedType:"Drove", duration:duration("PT5M3S")}]->

(n7:Destination{Name: "D7"})

So for walked add PT5M2.1S + PT5M2.1S yielding totalWalked and for Drove add PT6M17.91S + PT2M1.3S + PT51M3S + PT5M3S yielding totalDrove times.. see?

ray lukas

Here is an approach using variable length paths:

match(start:Destination{Name: 'D1'})
match path=(start)-[:traverse*]->(end:Destination)
where not exists ( (end)-[:traverse]->(:Destination) )
with    
    start,
    [i in relationships(path) where i.TraversedType = 'Drove' | i.duration] as driveDurations,
    [i in relationships(path) where i.TraversedType = 'Walked' | i.duration] as walkDurations 
return
    start,
    reduce(s=duration({seconds:0}), i in driveDurations | s+i) as totalDriveTime,
    reduce(s=duration({seconds:0}), i in walkDurations | s+i) as totalWalkTime

Here is a solution using qualified paths:

match(start:Destination{Name: 'D1'})
match path=(start)-[r:traverse]->*(end:Destination)
where not exists ( (end)-[:traverse]->(:Destination) )
with    
    start,
    [i in r where i.TraversedType = 'Drove' | i.duration] as driveDurations,
    [i in r where i.TraversedType = 'Walked' | i.duration] as walkDurations 
return
    start,
    reduce(s=duration({seconds:0}), i in driveDurations | s+i) as totalDriveTime,
    reduce(s=duration({seconds:0}), i in walkDurations | s+i) as totalWalkTime

Is this what you were thinking?

Thank you.. for.. saving me.. yet again.. sigh.. just .. thanks man..

I will build this into our solution today and tonight..
ah.. check that.. over the weekend.. sigh.. to many meetings.. LOL

Thanks man

ray lukas

1 Like