How to show aggregate info in Graph

apoc
cypher
merge

(Groverjatin17) #1

Hi All,

I want to do some analysis(identify a Pattern or see how many times a person is sendering money) on the data using the graphs.

I have a table which have 5 columns.

Txn ID Trade Number Sender Broker Receiver
1 12345 Bank 1 Broker 1 Receiver 1
2 56789 Bank 2 Broker 2 Receiver 2
3 56789 Bank 2 Broker 2 Receiver 2
4 98765 Bank 2 Broker 3 Receiver 3

To create a relationships I separated it into 4 CSVs. And independently Load them in Neo4j.

Label 1:-Trade Number

Txn ID Trade Number
1 12345
2 56789
3 56789
4 98765

Label 2: SenderBank

Txn ID Trade Number Sender
1 12345 Bank 1
2 56789 Bank 2
3 56789 Bank 2
4 98765 Bank 2

Label 3: Broker

Txn ID Trade Number Broker
1 12345 Broker 1
2 56789 Broker 2
3 56789 Broker 2
4 98765 Broker 3

_Label 4: Receiver

Txn ID Trade Number Receiver
1 12345 Receiver 1
2 56789 Receiver 2
3 56789 Receiver 2
4 98765 Receiver 3

Since for many transactions the Sender and Receiver are same except of "Txn ID" for all the data.
I just want to check with how many Brokers/ Receivers the distinct Sender Banks are having relationships and how many times(aggregations) Like for x connections x relationships popping out of the Bank.

How can I sort of show only 1 Node for "Bank 2" instead of 3 and show 3 relationships going to 3 Brokers with only 1 Node for "Broler 2" Insted of 2. I can perform aggregations result as rows but I want to see the bigger picture.

Note:-

I tried using

apoc.refractor.mergeNodes

which actually merges
all 4 Sender nodes into 2 nodes,
all 4 Broker Nodes into 3
all 4 Receiver Nodes into 3
and multiple relationships pinting towards respective nodes.

But it also diminished the "Txn Id" property of (:SENDER) nodes or all that I chose to merge.
And I used it on my current data of 64k records. It is super slow after all merges.

Please Help if anything can be done or what am i doing wrong.


(Michael Hunger) #2

You can just use aggregation, with count or sum.

And to visualize it as graph not as tables you can use virtual nodes with apoc, I wrote it up here.


(Groverjatin17) #3

Hi Michael,

Thank a lot. That is actually a huge information so worthy of calling a blog.
But for someone like me who is pure novice doing it for sake of PoC, its hard to comprehend.

I did check virtual nodes in APOC libraries, but i fail to verify if it can suit my use case.

MATCH (a)-[r]->(b)
WITH head(labels(a)) AS l, head(labels(b)) AS l2, type(r) AS rel_type, count(*) as count
CALL apoc.create.vNode([l],{name:l}) yield node as a
CALL apoc.create.vNode([l2],{name:l2}) yield node as b
CALL apoc.create.vRelationship(a,rel_type,{count:count},b) yield rel
RETURN *;

In the above documentation provided example they are selecting all the different labels and creating vNode/vRelationships for count of different types of relationships they desire. which is perfect to eyeball what connects with what.

However, my goal is to create different nodes of all distinct values in 1 LABEL and then see the count of relationships of those particular type of value with other nodes.

Like, assuming in LABEL :Person there are 10 records/nodes with Name property as "Jatin".
And "Jatin" has [:ACTED_IN] with 3 times in "Movie 1", 4 times in "Movie 2", 3 times in "Movie 3" from LABEL. :Movie.

The use case is to see:

  • 1 node of "Jatin" with 3 relationships (count as a property) with 3 nodes of "Movie 1", "Movie 2", and "Movie 3".

    OR

  • 1 node of "Jatin" and 10 relationships protruding with 3 relationships for "Movie 1", 4 relationships to "Movie 2" and 3 relationships to "Movie 3".

Apart from that I also tried Grouping methods that you mentioned.
But it is not even returning same result as the documentation example.

match (n) set n.century = toInteger(coalesce(n.born,n.released)/100) * 100;
call apoc.nodes.group(['Person','Movie'],['century']);

NOTE:- The dataset used is Movie Graph provided by default.

Please help me with your expertise if above mentioned use case is possible.

Thank you


(Groverjatin17) #4

HI,

Also one more thing Michael

In the following
Label :SenderEntity nodes(13k with 151 distinct) have senderEntityName as property.
Label :CounterParty nodes(13k as well) have counterPartyName as property.
The Relationship schema is (SenderEntity)-[:BROKER]->(CounterParty)

Like my original goal is to show which ones and how many times a counterparty is attached to each senderEntity.

I tried the Grouping method in APOC.

call apoc.nodes.group(['SenderEntity','CounterParty'],['senderEntityName','counterPartyName']) yield nodes, relationships
UNWIND nodes as node
UNWIND relationships as rel
RETURN node, rel;

It does give 151 distinct sender nodes with count in each nodes. but it does not give any counterparty nodes or the relationships with it.


(Michael Hunger) #5

This should give you your summary if you nodes are distinct. Just a regular Cypher aggregation.

MATCH (se:SenderEntity)-[:BROKER]->(cp:CounterParty)
RETURN se, cp, apoc.create.vRelationship(se, 'ALL_BROKER', {count:count(*)}, cp) as rel

If only the names are distinct:

MATCH (se:SenderEntity)-[:BROKER]->(cp:CounterParty)
WITH se.senderEntityName as sender,  cp.counterPartyName as counter, count(*) as count
WITH collect(distinct sender) as senders, collect(distinct counter) as counters, collect([sender, counter, count]) as rels
WITH apoc.map.groupBy([seName IN senders | apoc.create.vNode(['SenderEntity'],{name:seName})], 'name') as senders, 
apoc.map.groupBy([cpName IN counters | apoc.create.vNode(['CounterParty'],{name:cpName})], 'name') as counters, rels
UNWIND rels as rel
RETURN senders[rel[0]], counters[rel[1]], apoc.create.vRelationship(senders[rel[0]], 'ALL_BROKER', {count:rel[2]}, counters[rel[1]]) as rel

(Michael Hunger) #6

The grouping should actually do exactly what you want.

You might need to add an tempoary "name" property to both node types.


(Groverjatin17) #7

Hi Michael,

It is giving this error:-

Neo.ClientError.Statement.SyntaxError: Invalid input 'p': expected 'n/N' (line 8, column 2 (offset: 653))
"apoc.map.groupBy(senders, 'name') as senders, " ^


(Michael Hunger) #8

fixed teh copy & paste error


(Groverjatin17) #10

I have extended my query to more levels and have included more labels.
Tons of thanks Michael. You have been of great help.

I will make your answer as solution.

God Bless you