Quantify the strength of relationships of a group of nodes

Hello,

I have a graph made of companies (nodes) related to each other by transactions (edges). In addition, several companies can be related to each other through capital (ownership) links. So far I have not modeled these links as a new edge type; instead, each node carries a "Group_Id" property.

For some companies with the label "Scope", I'm interested in quantifying the strength of outgoing edges towards direct neighbors. This is easy: each edge has a property that quantifies how strong that edge is for the sender company (the ratio of the edge amount to all outgoing amounts of the node), so I can just read that property.

The problem comes when I try to quantify the strength of outgoing edges for a capitalistic Group, which is made of multiple companies. I want to know the strongest dependencies of that Group towards the outside (so I'm not interested in capturing the links between the companies belonging to that Group).

To summarize, here are the steps that I have in mind:

  1. Identify all companies with label "Scope"
  2. For those companies identify the "Group_Id" they belong to
  3. For those groups, identify the strongest relationships towards the outside at order 1 neighborhood
  4. Quantify the strength of the relationship: the ratio between (1) the sum of the outgoing amounts of the companies in the Group towards a specific neighbor and (2) the sum of all outgoing amounts of the Group
  5. Create a new property for each company in the Group which is the strength of the relationship calculated in 4.

Does anyone have an idea of what the Cypher query should look like?

Thank you very much!

Can you give me a small script to generate some test data?

Does the following graph illustrate your data?

In this example, if I want to calculate the "strength" of the relationship between the Group_Id = 100 nodes (shown as orange nodes) and the one green node, would it be the ratio (45 + 10) / (45 + 10 + 15 + 50 + 10)?

I went ahead and assumed my assumptions were correct. I can change the query if they are not.

The following derives the two values and returns their ratio, which represents the strength of the relationship between the group (nodes with Group_Id = 100) and the target node (node with key = 3).

Test Data:

create
(n1:Scope{key:0, Group_Id:100}),
(n2:Scope{key:1, Group_Id:100}),
(n3:Scope{key:2, Group_Id:100}),
(n4:Scope{key:3, Group_Id:101}),
(n5:Scope{key:4, Group_Id:102}),
(n6:Scope{key:5, Group_Id:104}),
(n1)-[:RELATED_TO{strength:5}]->(n2),
(n2)-[:RELATED_TO{strength:4}]->(n3),
(n3)-[:RELATED_TO{strength:6}]->(n1),
(n1)-[:RELATED_TO{strength:45}]->(n4),
(n3)-[:RELATED_TO{strength:10}]->(n4),
(n1)-[:RELATED_TO{strength:15}]->(n6),
(n2)-[:RELATED_TO{strength:50}]->(n6),
(n3)-[:RELATED_TO{strength:10}]->(n5)

Query:

// Collect the group's nodes, their outgoing relationships, and the nodes they point to
match(n:Scope{Group_Id:100})
match(n)-[r:RELATED_TO]->(m)
with 
    collect(distinct n) as group_nodes, 
    collect(distinct r) as all_relationships, 
    collect(distinct m) as related_nodes
// Keep only relationships that leave the group, and pick the target node (key = 3)
with 
    [i in all_relationships where not endNode(i) in group_nodes] as out_of_group_relationships,
    [j in related_nodes where j.key = 3][0] as target_node
// Relationships from the group that end at the target node
with 
    out_of_group_relationships,
    [k in out_of_group_relationships where endNode(k) = target_node] as target_node_relationships
// Sum the strengths: towards the target (numerator) and towards all outside nodes (denominator)
with
    reduce(sum = 0, rel in target_node_relationships | sum + rel.strength) as numerator,
    reduce(sum = 0, rel in out_of_group_relationships | sum + rel.strength) as denominator
return numerator, denominator, toFloat(numerator) / toFloat(denominator) as strength
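
For comparison, here is a minimal sketch of the same calculation using Cypher's built-in sum() aggregation instead of collect/reduce. It assumes, as in the test data, that every node carries a Group_Id, and it treats a relationship as leaving the group when its two endpoints have different Group_Id values. That last rule is my assumption, not what the query above does.

```cypher
// Denominator: total strength going from group 100 to nodes outside the group
// (assumption: "outside" means a different Group_Id)
match (n:Scope {Group_Id: 100})-[r:RELATED_TO]->(m)
where m.Group_Id <> n.Group_Id
with sum(r.strength) as denominator
// Numerator: the part of that total that goes to the target node (key = 3)
match (n:Scope {Group_Id: 100})-[r:RELATED_TO]->(target {key: 3})
where target.Group_Id <> n.Group_Id
with denominator, sum(r.strength) as numerator
return numerator, denominator, toFloat(numerator) / denominator as strength
```

Assuming only the test data above is loaded, this should give numerator 55, denominator 130 and a strength of roughly 0.423, the same as the query above.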

Do you want this value calculated for each related node? In this case, all three nodes outside the group. I can change it to do that.

Hello @glilienfield,

Thank you very much for your extensive response, it's awesome!

To answer your questions:

  1. Yes the graph that you displayed perfectly illustrates what I had in mind
  2. The calculation in the example is also the result that I'm expecting
  3. In the query, you calculate this strength for one particular group (Group_Id = 100) towards one particular node (the node with Group_Id 101). Is there a way to generalize the query so that it applies to all groups and all target nodes at once?

Something I realized while reading your answer is that the property I would like to SET should not be on the nodes of the Group, but on the relationships: every relationship pointing towards the same external node should carry, as a property, the strength of that node with respect to the whole Group. So in your example, the 2 relationships pointing towards the node with Group_Id 101 should each have a property "Group_strength" of about 0.42 (55 / 130).

Try this:

create
(n1:Scope{key:0, Group_Id:100}),
(n2:Scope{key:1, Group_Id:100}),
(n3:Scope{key:2, Group_Id:100}),
(n4:Scope{key:3, Group_Id:101}),
(n5:Scope{key:4, Group_Id:102}),
(n6:Scope{key:5, Group_Id:104}),
(n1)-[:RELATED_TO{strength:5}]->(n2),
(n2)-[:RELATED_TO{strength:4}]->(n3),
(n3)-[:RELATED_TO{strength:6}]->(n1),
(n1)-[:RELATED_TO{strength:45}]->(n4),
(n3)-[:RELATED_TO{strength:10}]->(n4),
(n1)-[:RELATED_TO{strength:15}]->(n6),
(n2)-[:RELATED_TO{strength:50}]->(n6),
(n3)-[:RELATED_TO{strength:10}]->(n5),
(n7:Scope{key:0, Group_Id:200}),
(n8:Scope{key:1, Group_Id:200}),
(n9:Scope{key:2, Group_Id:200}),
(n10:Scope{key:3, Group_Id:201}),
(n11:Scope{key:4, Group_Id:202}),
(n12:Scope{key:5, Group_Id:204}),
(n7)-[:RELATED_TO{strength:15}]->(n8),
(n8)-[:RELATED_TO{strength:14}]->(n9),
(n9)-[:RELATED_TO{strength:16}]->(n7),
(n7)-[:RELATED_TO{strength:55}]->(n10),
(n9)-[:RELATED_TO{strength:20}]->(n10),
(n7)-[:RELATED_TO{strength:25}]->(n12),
(n8)-[:RELATED_TO{strength:60}]->(n12),
(n9)-[:RELATED_TO{strength:20}]->(n11),
(n12)-[:RELATED_TO{strength:100}]->(n6)

In the queries below, a group consists of the Scope nodes that share the same Group_Id and have at least one outgoing relationship.

match(n:Scope)
match(n)-[r:RELATED_TO]->(m)
with 
    n.Group_Id as group_id,
    collect(distinct n) as group_nodes, 
    collect(distinct r) as all_relationships, 
    collect(distinct m) as related_nodes
with 
    group_id,
    [i in all_relationships where not endNode(i) in group_nodes] as out_of_group_relationships,
    [j in related_nodes where not j in group_nodes] as target_nodes
with
    group_id, 
    target_nodes,
    out_of_group_relationships,
    reduce(sum = 0, n in out_of_group_relationships | sum + n.strength) as denominator
unwind target_nodes as target_node
with 
    group_id,
    target_node,
    denominator,
    [k in out_of_group_relationships where endNode(k) = target_node] as target_node_relationships
with
    group_id,    
    target_node,
    denominator,
    reduce(sum = 0, n in target_node_relationships | sum + n.strength) as numerator
return group_id, target_node, numerator, denominator, toFloat(numerator) / toFloat(denominator) as strength
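
A shorter variant of the same idea, sketched with sum() grouped by Group_Id. As an assumption on my side, it decides whether a relationship leaves the group by comparing the Group_Id of both endpoints, so target nodes without a Group_Id are silently skipped (the comparison yields null).

```cypher
// Denominator per group: total strength of relationships that leave the group
match (n:Scope)-[r:RELATED_TO]->(m)
where m.Group_Id <> n.Group_Id
with n.Group_Id as group_id, sum(r.strength) as denominator
// Numerator per group and external target node
match (n:Scope {Group_Id: group_id})-[r:RELATED_TO]->(target)
where target.Group_Id <> group_id
with group_id, target, denominator, sum(r.strength) as numerator
return group_id, target.key as target_key, numerator, denominator,
       toFloat(numerator) / denominator as strength
order by group_id, strength desc
```

Assuming only the second test data set is loaded in an otherwise empty database, group 100 should come out with 0.5 towards key 5 (65 / 130), about 0.423 towards key 3 (55 / 130) and about 0.077 towards key 4 (10 / 130).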

If the above calculations are correct, the following refactored query will set the relationships with a new property called derived_strength equal to the ratios calculated in the original query:

match(n:Scope)
match(n)-[r:RELATED_TO]->(m)
with 
    n.Group_Id as group_id,
    collect(distinct n) as group_nodes, 
    collect(distinct r) as all_relationships, 
    collect(distinct m) as related_nodes
with 
    group_id,
    [i in all_relationships where not endNode(i) in group_nodes] as out_of_group_relationships,
    [j in related_nodes where not j in group_nodes] as target_nodes
with
    group_id, 
    target_nodes,
    out_of_group_relationships,
    reduce(sum = 0, n in out_of_group_relationships | sum + n.strength) as denominator
unwind target_nodes as target_node
with 
    group_id,
    target_node,
    denominator,
    [k in out_of_group_relationships where endNode(k) = target_node] as target_node_relationships
with
    denominator,
    target_node_relationships,
    reduce(sum = 0, n in target_node_relationships | sum + n.strength) as numerator
with
    target_node_relationships,
    toFloat(numerator) / toFloat(denominator) as derived_strength
forEach(r in target_node_relationships | 
    set r.derived_strength = derived_strength
)
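
One caveat with the query above: group_nodes only contains nodes that have at least one outgoing relationship, so a relationship to a group member with no outgoing relationships of its own would be counted as leaving the group. If that can happen in your data, a possible alternative is to compare Group_Id values directly. The following is only a sketch under that assumption (every node has a Group_Id); it writes the same derived_strength property in a single pass.

```cypher
// Per group and external target: the relationships involved and their summed strength
match (n:Scope)-[r:RELATED_TO]->(m)
where m.Group_Id <> n.Group_Id
with n.Group_Id as group_id, m as target, collect(r) as rels, sum(r.strength) as numerator
// Per group: keep the per-target results and add them up to get the denominator
with group_id,
     collect({rels: rels, numerator: numerator}) as per_target,
     sum(numerator) as denominator
unwind per_target as t
unwind t.rels as r
// Every relationship towards the same target gets the same group-level ratio
set r.derived_strength = toFloat(t.numerator) / denominator
```

Since it overwrites derived_strength, it can be run after the query above without side effects on other properties.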

Browse Results:

match(n)-[r:RELATED_TO]->(m)
where n.Group_Id <> m.Group_Id
return n.Group_Id as group_id, n.key as group_node_key, m.Group_Id as target_node_group_id, m.key as target_key, r.derived_strength as derived_strength
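
As a quick sanity check (a sketch, assuming derived_strength was set by one of the queries above), the ratios towards the distinct external targets of a group should add up to approximately 1:

```cypher
// One row per group and target node, since all relationships to a target share the same ratio
match (n:Scope)-[r:RELATED_TO]->(m)
where n.Group_Id <> m.Group_Id
with distinct n.Group_Id as group_id, m, r.derived_strength as derived_strength
return group_id, sum(derived_strength) as total
```

Small deviations from 1 are expected because of floating point rounding.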


Thanks a lot @glilienfield, that's perfect!
