Help on creating repeating time-sequenced pattern

ranvirm
Node

I'm using Neo4j to find instances of financial crime at the bank where I work. Our data is loaded with clients as nodes and transactions between those clients as directed edges. Certain clients have an attribute "is_high_risk". Given a certain starting client (client_id = 99999 in the queries below), I want to find ways in which money could flow to any high-risk client while ensuring that each transaction happens after the previous one.

I've been able to do this so far by writing queries like the below:

match (n1:Client)-[t1:Transaction]->(n2:Client)-[t2:Transaction]->(n3:Client)
where n1.client_id = 99999
and n3.is_high_risk = 1
and t1.eff_date <= t2.eff_date <= (date(t1.eff_date) + duration('P7D'))
return *

This query matches the logic: find any flow from client 99999 (n1) to any high-risk client (n3) where each transaction in the chain happens no more than 7 days after the previous transaction. In this case I've allowed the path to be two hops long. To extend the path I'd have to write a new query like the one below:

match (n1:Client)-[t1:Transaction]->(n2:Client)-[t2:Transaction]->(n3:Client)-[t3:Transaction]->(n4:Client)
where n1.client_id = 99999
and n4.is_high_risk = 1
and t1.eff_date <= t2.eff_date <= (date(t1.eff_date) + duration('P7D'))
and t2.eff_date <= t3.eff_date <= (date(t2.eff_date) + duration('P7D'))
return *

As you can see, writing a query to allow for increasingly longer paths becomes increasingly difficult.

Is there a way to write a query that allows a variable path length while ensuring that each transaction in the path occurs after the previous one?

1 ACCEPTED SOLUTION

glilienfield
Ninja

Try this.  I set an upper limit of 10 hops, but you can change it to what you need.

match p=(n1:Client)-[:Transaction*..10]->(n2:Client)
where n1.client_id = 99999
and n2.is_high_risk = 1
with n1, n2, relationships(p) as r
with n1, n2, r, range(1, size(r)-1) as indexes
where all(x in indexes where r[x-1].eff_date <= r[x].eff_date <= (date(r[x-1].eff_date) + duration('P7D')))
return n1, n2, size(r) as hops
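
To see how the index check behaves, it can help to run the query above against a tiny throwaway graph first. The sketch below is only illustrative: the client ids 1, 2 and 3 and all the dates are made up, and it assumes eff_date is stored as a Cypher date, which the thread does not confirm.

```cypher
// Hypothetical sample data for an empty scratch database (all ids and dates invented).
CREATE (a:Client {client_id: 99999, is_high_risk: 0}),
       (b:Client {client_id: 1, is_high_risk: 0}),
       (c:Client {client_id: 2, is_high_risk: 1}),
       (d:Client {client_id: 3, is_high_risk: 1}),
       // First hop out of the starting client.
       (a)-[:Transaction {eff_date: date('2022-01-01')}]->(b),
       // Second hop 4 days after the first: inside the 7-day window.
       (b)-[:Transaction {eff_date: date('2022-01-05')}]->(c),
       // Second hop 31 days after the first: outside the 7-day window.
       (b)-[:Transaction {eff_date: date('2022-02-01')}]->(d)
```

On this data the path ending at client 2 should pass the filter and the path ending at client 3 should be dropped. For a path of n relationships, range(1, size(r)-1) yields the indexes 1 to n-1, so each r[x] is compared with its predecessor r[x-1]. For a single-hop path the range is empty and all(...) is true, so a direct transaction to a high-risk client is always returned.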

You may want to consider using a label, instead of a property, to indicate that a client is high risk. Nodes can have multiple labels.
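
As a rough sketch of the label suggestion: the label name HighRisk below is an assumption chosen for illustration, not something defined in the original data model. The first statement tags the flagged clients once, and the second is the same path query with the end node matched by label rather than by property.

```cypher
// One-off migration: add a (hypothetical) HighRisk label to every flagged client.
MATCH (c:Client)
WHERE c.is_high_risk = 1
SET c:HighRisk;

// Same time-ordered path search, with the end node filtered by label.
MATCH p = (n1:Client)-[:Transaction*..10]->(n2:HighRisk)
WHERE n1.client_id = 99999
WITH n1, n2, relationships(p) AS r
// Each transaction must fall within 7 days after the one before it.
WHERE all(x IN range(1, size(r) - 1)
          WHERE r[x-1].eff_date <= r[x].eff_date <= date(r[x-1].eff_date) + duration('P7D'))
RETURN n1, n2, size(r) AS hops
```

The two statements are separated by a semicolon, so they need a client that accepts multiple statements (cypher-shell, for example) or can be run one at a time. The is_high_risk property can stay in place alongside the label.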

Thank you! This worked brilliantly.

Used together with some other logic, this has already let me identify over 100 suspected criminals. I hope that makes your time spent on this worthwhile.
