How to show aggregate info in Graph

groverjatin17 · January 10, 2019, 9:28am

Hi All,

I want to do some analysis(identify a Pattern or see how many times a person is sendering money) on the data using the graphs.

I have a table which have 5 columns.

Txn ID	Trade Number	Sender	Broker	Receiver
1	12345	Bank 1	Broker 1	Receiver 1
2	56789	Bank 2	Broker 2	Receiver 2
3	56789	Bank 2	Broker 2	Receiver 2
4	98765	Bank 2	Broker 3	Receiver 3

To create a relationships I separated it into 4 CSVs. And independently Load them in Neo4j.

Label 1:-Trade Number

Txn ID	Trade Number
1	12345
2	56789
3	56789
4	98765

Label 2: SenderBank

Txn ID	Trade Number	Sender
1	12345	Bank 1
2	56789	Bank 2
3	56789	Bank 2
4	98765	Bank 2

Label 3: Broker

Txn ID	Trade Number	Broker
1	12345	Broker 1
2	56789	Broker 2
3	56789	Broker 2
4	98765	Broker 3

_Label 4: Receiver

Txn ID	Trade Number	Receiver
1	12345	Receiver 1
2	56789	Receiver 2
3	56789	Receiver 2
4	98765	Receiver 3

Since for many transactions the Sender and Receiver are same except of "Txn ID" for all the data.
I just want to check with how many Brokers/ Receivers the distinct Sender Banks are having relationships and how many times(aggregations) Like for x connections x relationships popping out of the Bank.

How can I sort of show only 1 Node for "Bank 2" instead of 3 and show 3 relationships going to 3 Brokers with only 1 Node for "Broler 2" Insted of 2. I can perform aggregations result as rows but I want to see the bigger picture.

Note:-

I tried using

apoc.refractor.mergeNodes

which actually merges
all 4 Sender nodes into 2 nodes,
all 4 Broker Nodes into 3
all 4 Receiver Nodes into 3
and multiple relationships pinting towards respective nodes.

But it also diminished the "Txn Id" property of (:SENDER) nodes or all that I chose to merge.
And I used it on my current data of 64k records. It is super slow after all merges.

Please Help if anything can be done or what am i doing wrong.

michael.hunger · January 10, 2019, 11:50am

You can just use aggregation, with count or sum.

And to visualize it as graph not as tables you can use virtual nodes with apoc, I wrote it up here.

groverjatin17 · January 16, 2019, 7:26am

Hi Michael,

Thank a lot. That is actually a huge information so worthy of calling a blog.
But for someone like me who is pure novice doing it for sake of PoC, its hard to comprehend.

I did check virtual nodes in APOC libraries, but i fail to verify if it can suit my use case.

MATCH (a)-[r]->(b)
WITH head(labels(a)) AS l, head(labels(b)) AS l2, type(r) AS rel_type, count(*) as count
CALL apoc.create.vNode([l],{name:l}) yield node as a
CALL apoc.create.vNode([l2],{name:l2}) yield node as b
CALL apoc.create.vRelationship(a,rel_type,{count:count},b) yield rel
RETURN *;

In the above documentation provided example they are selecting all the different labels and creating vNode/vRelationships for count of different types of relationships they desire. which is perfect to eyeball what connects with what.

However, my goal is to create different nodes of all distinct values in 1 LABEL and then see the count of relationships of those particular type of value with other nodes.

Like, assuming in LABEL :Person there are 10 records/nodes with Name property as "Jatin".
And "Jatin" has [:ACTED_IN] with 3 times in "Movie 1", 4 times in "Movie 2", 3 times in "Movie 3" from LABEL. :Movie.

The use case is to see:

1 node of "Jatin" with 3 relationships (count as a property) with 3 nodes of "Movie 1", "Movie 2", and "Movie 3".

OR
1 node of "Jatin" and 10 relationships protruding with 3 relationships for "Movie 1", 4 relationships to "Movie 2" and 3 relationships to "Movie 3".

Apart from that I also tried Grouping methods that you mentioned.
But it is not even returning same result as the documentation example.

match (n) set n.century = toInteger(coalesce(n.born,n.released)/100) * 100;
call apoc.nodes.group(['Person','Movie'],['century']);

NOTE:- The dataset used is Movie Graph provided by default.

Please help me with your expertise if above mentioned use case is possible.

Thank you

groverjatin17 · January 16, 2019, 9:38am

HI,

Also one more thing Michael

In the following
Label :SenderEntity nodes(13k with 151 distinct) have senderEntityName as property.
Label :CounterParty nodes(13k as well) have counterPartyName as property.
The Relationship schema is (SenderEntity)-[:BROKER]->(CounterParty)

Like my original goal is to show which ones and how many times a counterparty is attached to each senderEntity.

I tried the Grouping method in APOC.

call apoc.nodes.group(['SenderEntity','CounterParty'],['senderEntityName','counterPartyName']) yield nodes, relationships
UNWIND nodes as node
UNWIND relationships as rel
RETURN node, rel;

It does give 151 distinct sender nodes with count in each nodes. but it does not give any counterparty nodes or the relationships with it.

michael.hunger · January 16, 2019, 10:13am

This should give you your summary if you nodes are distinct. Just a regular Cypher aggregation.

MATCH (se:SenderEntity)-[:BROKER]->(cp:CounterParty)
RETURN se, cp, apoc.create.vRelationship(se, 'ALL_BROKER', {count:count(*)}, cp) as rel

If only the names are distinct:

MATCH (se:SenderEntity)-[:BROKER]->(cp:CounterParty)
WITH se.senderEntityName as sender,  cp.counterPartyName as counter, count(*) as count
WITH collect(distinct sender) as senders, collect(distinct counter) as counters, collect([sender, counter, count]) as rels
WITH apoc.map.groupBy([seName IN senders | apoc.create.vNode(['SenderEntity'],{name:seName})], 'name') as senders, 
apoc.map.groupBy([cpName IN counters | apoc.create.vNode(['CounterParty'],{name:cpName})], 'name') as counters, rels
UNWIND rels as rel
RETURN senders[rel[0]], counters[rel[1]], apoc.create.vRelationship(senders[rel[0]], 'ALL_BROKER', {count:rel[2]}, counters[rel[1]]) as rel

michael.hunger · January 16, 2019, 10:14am

The grouping should actually do exactly what you want.

You might need to add an tempoary "name" property to both node types.

groverjatin17 · January 16, 2019, 2:48pm

Hi Michael,

It is giving this error:-

Neo.ClientError.Statement.SyntaxError: Invalid input 'p': expected 'n/N' (line 8, column 2 (offset: 653))
"apoc.map.groupBy(senders, 'name') as senders, " ^

michael.hunger · January 16, 2019, 3:53pm

fixed teh copy & paste error

groverjatin17 · January 17, 2019, 10:13am

I have extended my query to more levels and have included more labels.
Tons of thanks Michael. You have been of great help.

I will make your answer as solution.

God Bless you

Topic		Replies	Views
How can I display groups of nodes which realized more than n realtionships with different nodes in the graph Cypher apoc , cypher , operations	8	3124	February 17, 2020
Neo4j returning incorrect number of rows Procedures & APOC cypher	1	920	December 6, 2018
Can we merge multiple Nodes based on Single common property Browser cypher	12	12920	March 4, 2020
Counting Number of Relationships Per Grouped Nodes Cypher apoc , performance , cypher , operations , relationship , index	2	1178	November 21, 2020
Aggregation by Group By Import / Export	4	559	December 19, 2019

How to show aggregate info in Graph

Related topics