Find the largest clusters of a database

mack
Node

Hello, I'm new to Neo4j and Cypher.

I have a database of transactions between persons. I load the data from a CSV file:



LOAD CSV WITH HEADERS FROM 'file:///low.csv' AS line FIELDTERMINATOR ','
MERGE (O:Ordenante {nombre: line.NOMBRE_COMPLETO_ORDENANTE})
MERGE (B:Beneficiario {nombre: line.NOMBRE_COMPLETO_BENEFICIARIO})
CREATE (O)-[R:Envió]->(B)
SET O.estado_de_ordenante = line.ESTADO_ORDENANTE
SET B.estado_de_beneficiario = line.ESTADO_BENEFICIARIO
SET B.estado_de_apertura = line.SUC_APERTURA
SET R.monto_enviado = line.MONTO_EN_PESOS

For each row I create two nodes and one relationship. The graph looks like this:

Captura.PNG

I want to return only the clusters with 5 or more nodes.

Captura2.PNG

I'm using this query, but I get all the nodes. What am I doing wrong?

 

match (O)-[R]->(B)
with count(O) as mnt   
where mnt > 2
match (O)-[R]->(B)
return R, O, B
1 REPLY

glilienfield
Ninja

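When you use an aggregate function without a grouping key, it aggregates over all the rows. In your case, 'count(O)' counts every path matched, so 'mnt' is a single number for the whole graph. Your WITH clause also carries only 'mnt' forward, so O, R, and B in the second MATCH are fresh, unbound variables and that MATCH returns everything. If you want to know how many outgoing relationships each O node has (which equals the number of related B nodes when each pair is connected at most once), include O as a grouping key, like the following: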

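match (O)-[R]->(B)
// O is the grouping key, so count(O) is computed per O node
with O, count(O) as mnt
where mnt > 2
// match again to get the relationships and B nodes of the O nodes that remain
match (O)-[R]->(B)
return R, O, B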
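
To see which values the filter is working with, a small diagnostic query can list the counts per sender. This is only a sketch: it assumes the labels, the relationship type, and the 'nombre' property from the LOAD CSV statement in the question, and the aliases 'ordenante', 'envios', and 'beneficiarios' are made up for illustration.

```cypher
// Non-aggregated return items (here O.nombre) act as the grouping key.
MATCH (O:Ordenante)-[R:Envió]->(B:Beneficiario)
RETURN O.nombre AS ordenante,
       count(R) AS envios,                 // rows per sender = relationships
       count(DISTINCT B) AS beneficiarios  // distinct receiving nodes
ORDER BY envios DESC
```

Because the load uses CREATE for the relationship, the same pair can be connected more than once, so the two counts may differ.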

If you want to avoid matching a second time after the filtering to get R, O, and B back, you can try something like the following:

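match (O)-[R]->(B)
// collect each O's relationships and target nodes while counting them
with O, collect({R:R, B:B}) as items, count(O) as mnt
where mnt > 2
// expand the collected items back into one row per relationship
unwind items as item
return O, item.R, item.B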
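
The question asks for clusters of 5 or more nodes, while the queries above filter on 'mnt > 2'. Assuming each cluster is one Ordenante together with the Beneficiario nodes it sends to (an assumption, since the screenshots are not available here), a cluster has 1 plus the number of distinct receivers, so the threshold might be written as in this sketch. The alias 'receivers' is hypothetical.

```cypher
MATCH (O:Ordenante)-[R:Envió]->(B:Beneficiario)
// Count distinct receivers so repeated transfers to the same person count once.
WITH O, collect({R: R, B: B}) AS items, count(DISTINCT B) AS receivers
// 1 sender + at least 4 receivers = at least 5 nodes (assumes star-shaped clusters)
WHERE receivers >= 4
UNWIND items AS item
RETURN O, item.R AS R, item.B AS B
```

If clusters can chain through several senders and receivers, a per-sender count will not capture them. In that case a connected-components algorithm, such as weakly connected components in the Graph Data Science library, would be the thing to look at.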

 
