After creating clusters, one of them looks wrong: it has far more nodes than the others


(Mehdi Ajroud) #1

I am working with companies and contracts, so I started by importing both, along with the relationships between them.
Then I ran this query to link the companies that collaborated on the same contract:

MATCH (at1:Attributaires)<-[r:ASSIGNED_TO]-(c:Contrats)-[x:ASSIGNED_TO]->(at2:Attributaires)
MERGE (at1)-[s:CollaborateWith]-(at2)

After that, I deleted all contracts and their relationships, so the graph now contains only companies and the collaboration links between them.
For the last part, building the clusters, I ran the following:

CALL algo.unionFind(
'match (n) return id(n) as id',
'match (n1:Attributaires)--(n2:Attributaires) return id(n1) as source, id(n2) as target',
{graph:'cypher', write:true, partitionProperty:"clusterId"})
YIELD nodes, setCount, loadMillis, computeMillis, writeMillis;

Finally, I ran this query to create a Cluster node for each cluster ID and attach its companies to it:

match (n1:Attributaires)-[r]-(n2:Attributaires)
with n1.clusterId as clusterId, count(n1) as clusterSize,count(distinct r) as numberOfRels
where clusterSize > 5 AND  numberOfRels > 10
match (n2) where n2.clusterId = clusterId
MERGE (x:Cluster {id: clusterId})
MERGE (n2)-[:IN_CLUSTER]->(x)

After exporting the results, I realized that one cluster contains 21039 companies, while every other cluster contains between 6 and 62 companies.
Is there an explanation? Why am I getting one huge cluster?
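
In case it helps to reproduce the numbers without exporting, the sizes can also be read straight from the clusterId property that algo.unionFind wrote. This is only a minimal sketch: it assumes every company carries the Attributaires label and the clusterId property, as in the queries above, and the LIMIT of 10 is an arbitrary choice.

```cypher
// Count the companies per cluster and list the largest clusters first.
MATCH (n:Attributaires)
WHERE n.clusterId IS NOT NULL
RETURN n.clusterId AS clusterId, count(*) AS companies
ORDER BY companies DESC
LIMIT 10;
```

Unlike the Cluster query above, this counts each company once per cluster and does not apply the clusterSize and numberOfRels filters, so it also shows the small clusters.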