Query optimization for fraud rings matching

borism · November 1, 2018, 1:05pm

Hello,
I have the following case :
We are looking for fraud were 2 persons are connected to one car in one police case
and the same 2 persons are connected to 2 different cars in another police case.
unnamed

The script for database creation :

CREATE (police1:PoliceFile {FileID:1})
CREATE (police2:PoliceFile {FileID:2})
CREATE (car1:Car {CarID:1})
CREATE (car2:Car {CarID:2})
CREATE (car3:Car {CarID:3})
CREATE (inv1:Involved {PesronID:1})
CREATE (inv2:Involved {PesronID:2})
CREATE (cl1:Claim {ClaimID:1})
CREATE (cl2:Claim {ClaimID:2})
CREATE (police1)-[:ConnectedTo]->(car1)
CREATE (police2)-[:ConnectedTo]->(car2)
CREATE (police2)-[:ConnectedTo]->(car3)
CREATE (car1)-[:ConnectedTo]->(inv1)
CREATE (car1)-[:ConnectedTo]->(inv2)
CREATE (car2)-[:ConnectedTo]->(inv1)
CREATE (car3)-[:ConnectedTo]->(inv2)
CREATE (inv1)-[:ConnectedTo]->(cl1)
CREATE (inv1)-[:ConnectedTo]->(cl2)
CREATE (cl1)-[:ConnectedTo]->(police1)
CREATE (cl2)-[:ConnectedTo]->(police2)

I use the following query to find the ring :

MATCH p=(c3:Car)-[:ConnectedTo]-(inv2:Involved)-[:ConnectedTo]-(c2:Car)-[:ConnectedTo]-(p1:PoliceFile)-[:ConnectedTo]-(c1:Car)-[:ConnectedTo]-(inv1:Involved)-[:ConnectedTo]-(c3:Car)-[:ConnectedTo]-(p2:PoliceFile)
where ID(p1)<>ID(p2) and ID(inv1)<>ID(inv2)
return p

It works with the sample data, but in the real database with 2 million nodes it works hours and does not bring results back.

How should I change the query in order to efficiently find fraud rings in the database?

Thank you,
Boris

paul.thomas · November 1, 2018, 2:32pm

Could be the data, could be not enough RAM on the box ...

But your query is looking for a long chain not a ring!

Any ring loops back to the start for example, notice p1 at the start and at the end

match pathx=(p1:Person)--(c1:Claim)--(p2:Person)--(c2:Claim)--(p1)
return pathx

ameyasoft · November 3, 2018, 2:15am

Hi,

May be using the same label for all relationship is causing the problem?.

You can try this.
It may be better to collect all the claims filed by PersonID = 1 and use the collected claim ids to see all other persons involved in these claims.

Here is the query:

MATCH (c:Claim)-->(p)-->(d:Car)--(i:Involved)
WHERE i.PesronID = 1
WITH COLLECT (c) as n1
UNWIND n1 as n2
MATCH (n2)-->(p)-->(d:Car)--(i:Involved)
RETURN n2, p, d, i;
Result:

fraudcase2

If I may suggest a better model, here it is:

fraudcase .

-Kamal

Topic		Replies	Views
Extracting fraud ring based on client name Cypher browser , cypher , neo4j-desktop	1	339	April 13, 2021
Doubt with "Finding Duplicate" Cypher Query Cypher cypher	9	3124	November 15, 2018
When a relationship is created, also new nodes are created Cypher	4	133	January 25, 2024
Help needed on making graph database of a fraud detection system Neo4j Graph Platform	1	738	March 25, 2019
Returning millions of rows from cypher-shell taking too long Cypher	5	1229	April 12, 2020

Query optimization for fraud rings matching

Related topics