List the nodes forming unrelated clusters

masavini · January 12, 2023, 10:38am

Hi everybody!

I have a graph with a few thousand nodes which form unrelated clusters. For each cluster, I'd like to get the list of its nodes that have a certain label, i.e. a list of lists, regardless of the relationship types linking them.

Let's consider this simple example:

CREATE
    (:Person {name: "Andrea"})-[:LOVES]->(:Person {name: "Bob"})-[:LIKES]->(:Food {name: "pizza"}),
    (n:Person {name: "Mike"})<-[:KNOWS]-(:Person {name: "Paul"})-[:KNOWS]->(:Person {name: "Jen"})<-[:LOVES]-(n)

If I wanted to know which persons are somehow related, I would expect to get:

[
    ["Andrea", "Bob"],
    ["Mike", "Paul", "Jen"]
]

Could you please help me build such a query? Thanks!

masavini · January 12, 2023, 11:05pm

What can I say? OUTSTANDING job, guys. I've never seen such good support before. Congratulations!

masavini · January 12, 2023, 11:39am

Thank you very much, that is exactly what I was looking for.

So I first projected a new graph:

CALL gds.graph.project(
  'myGraph',
  'Person',
  '*'
)

Then I retrieved information about each cluster (aka component) in myGraph:

CALL gds.wcc.stream('myGraph')
YIELD nodeId, componentId
RETURN gds.util.asNode(nodeId).name AS name, componentId
ORDER BY componentId, name


╒════════╤═════════════╕
│"name"  │"componentId"│
╞════════╪═════════════╡
│"Andrea"│0            │
├────────┼─────────────┤
│"Bob"   │0            │
├────────┼─────────────┤
│"Jen"   │2            │
├────────┼─────────────┤
│"Mike"  │2            │
├────────┼─────────────┤
│"Paul"  │2            │
└────────┴─────────────┘

Could you suggest how to group names inside a list of lists?

cuneyttyler · January 12, 2023, 10:55am

You might need to extract the connected components. Check the Weakly Connected Components (WCC) documentation: https://neo4j.com/docs/graph-data-science/current/algorithms/wcc/

glilienfield · January 12, 2023, 3:02pm

This groups the names by componentId. Do you want further grouping?

CALL gds.wcc.stream('myGraph')
YIELD nodeId, componentId
RETURN collect(gds.util.asNode(nodeId).name) AS names, componentId
ORDER BY componentId
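
If a single list of lists is the goal, one possible approach is to aggregate twice: first collect the names per component, then collect those lists into one outer list. The following is only a sketch. It assumes the 'myGraph' projection created earlier in the thread still exists, and the alias clusters is an illustrative name rather than anything defined by the library.

```cypher
// Sketch: assumes the 'myGraph' projection from earlier in the thread.
CALL gds.wcc.stream('myGraph')
YIELD nodeId, componentId
// First aggregation: one row per component, holding that component's names.
WITH componentId, collect(gds.util.asNode(nodeId).name) AS names
ORDER BY componentId
// Second aggregation: no grouping key is left, so all rows collapse into one list.
RETURN collect(names) AS clusters
```

For the example data this should return a single row whose value contains two inner lists, one per component. The order of names inside each inner list is not guaranteed unless the rows are sorted before the first collect.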

cuneyttyler · January 12, 2023, 1:30pm

Can you give an example of a 'list of lists'? What kind of data is this?

masavini · January 12, 2023, 1:37pm

In the example above, something like this:

[
    ["Andrea", "Bob"],
    ["Mike", "Paul", "Jen"]
]

That is, a list of components, where each component is represented by a list of its node names.

cuneyttyler · January 12, 2023, 1:58pm

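Cypher has no explicit GROUP BY clause. Grouping is implicit: the non-aggregated expressions in a RETURN or WITH act as the grouping keys for the aggregating functions next to them. Here I will use count() as the aggregating function. I suppose something similar to this would work for you: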

WITH [
    ["Andrea", "Bob", "Andrea"],
    ["Mike", "Paul", "Jen", "Bob"]
] AS myList
UNWIND myList AS subList
UNWIND subList AS name
RETURN name, count(name)

returns

"Andrea"	2
"Bob"	2
"Mike"	1
"Paul"	1
"Jen"	1
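
The same implicit grouping works with collect() instead of count(), which is closer to what the question asks for. The sketch below uses no GDS call. It feeds in literal rows copied from the WCC output shown earlier in the thread (the rows and row names are illustrative) and groups the names by componentId.

```cypher
// Illustrative literal data, copied from the WCC result table in this thread.
WITH [
    {name: "Andrea", componentId: 0},
    {name: "Bob",    componentId: 0},
    {name: "Jen",    componentId: 2},
    {name: "Mike",   componentId: 2},
    {name: "Paul",   componentId: 2}
] AS rows
UNWIND rows AS row
// componentId is not aggregated, so it acts as the grouping key for collect().
RETURN row.componentId AS componentId, collect(row.name) AS names
ORDER BY componentId
```

This should produce one row per componentId, each carrying the list of names that share it.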
