Network graph clustering

samasamaan · October 30, 2022, 10:03am

Hello

I have a large-scale network consisting of 62,578 nodes. It represents the network topology of an SDN network (Software Defined Networking).

- I want to partition the graph into a number of clusters, each controlled by an SDN controller.

- I tried the k-means algorithm, but it doesn't take the relationships between the nodes into account. It relies only on the nodes and their properties.

- Then I tried the node similarity algorithm, but it calculates a similarity score between two nodes and creates a new relationship holding that value between them. As a result, I couldn't use its output as input for k-means.

- Louvain and Leiden don't allow the number of clusters to be specified in advance. Is there any way to do that with these algorithms?

Suggestions, please. Any idea would be a great help to me. Many thanks.

@alicia_frame1

@koji

@michael_hunger

ponceortiz · October 30, 2022, 1:15pm

Hi, it would help if you shared the database schema, but you could try:

1.- Create a features property for your nodes that includes the sdnID. For example, if you have:
(Node)-[:CONTROLLED_BY]->(SDN)
include the sdnID and the other properties of each Node in a single property called "features", then provide this to the K-means algorithm.

2.- The latest version of Bloom has some GDS algorithms; try investigating with it.

samasamaan · October 30, 2022, 1:24pm

Thanks for your reply.

The graph is homogeneous. There are no SDN nodes. The first picture shows part of the whole graph.

cobra · October 30, 2022, 6:38pm

Hello @samasamaan

Are there any disconnected subgraphs? If so, you should have a look at the Weakly Connected Components (WCC) algorithm.
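
To make the check concrete, here is a minimal sketch of what a weakly connected components pass does, written in plain Python with a union-find structure. It is only an illustration of the idea, not the GDS implementation, and the node names and edge list are a made-up toy topology.

```python
def weakly_connected_components(nodes, edges):
    """Group nodes into components, ignoring edge direction."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the components of the two endpoints of every edge.
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    components = {}
    for n in nodes:
        components.setdefault(find(n), []).append(n)
    return list(components.values())


# Hypothetical toy topology: two switches with hosts, plus one isolated pair.
nodes = ["s1", "s2", "h1", "h2", "h3", "h4"]
edges = [("s1", "s2"), ("s1", "h1"), ("s2", "h2"), ("h3", "h4")]
components = weakly_connected_components(nodes, edges)
print(len(components), "component(s)")
```

If this reports a single component, the graph has no islands and WCC alone will not split it.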

Regards,
Cobra

samasamaan · October 30, 2022, 6:45pm

Hello @Cobra

No, there are no disconnected subgraphs. No islands! It is a network topology that consists of nodes (switches and hosts).

Any suggestions, please?

cobra · October 31, 2022, 7:55am

Can the double relations be replaced by a single relation, or are they significant?

samasamaan · October 31, 2022, 8:00am

Here they aren't significant. I used the double relations for path-finding algorithms.

What about using FastRP for node embeddings and then using these embeddings in the K-means algorithm?
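
As a rough illustration of that idea, the sketch below builds embeddings by repeatedly averaging random vectors over each node's neighbours and then runs a basic k-means with a fixed k on them. It is a simplified stand-in written in plain Python, not the actual FastRP or K-means implementation in GDS (it omits degree normalisation and iteration weights, for instance). The function names, parameters, and the toy adjacency list are all made up for the example.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def embed(adj, dim=16, iterations=3, seed=42):
    """Simplified random-projection embedding: average neighbours' vectors."""
    rng = random.Random(seed)
    current = {n: [rng.choice((-1.0, 1.0)) for _ in range(dim)] for n in adj}
    total = {n: [0.0] * dim for n in adj}
    for _ in range(iterations):
        nxt = {}
        for n, neighbours in adj.items():
            avg = [sum(current[m][i] for m in neighbours) / max(len(neighbours), 1)
                   for i in range(dim)]
            nxt[n] = normalize(avg)
        current = nxt
        # The final embedding is the sum of the per-iteration vectors.
        for n in adj:
            total[n] = [t + c for t, c in zip(total[n], current[n])]
    return total


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=42):
    """Basic Lloyd's k-means; returns a cluster label for every node."""
    rng = random.Random(seed)
    names = sorted(points)
    centroids = [list(points[n]) for n in rng.sample(names, k)]
    labels = {}
    for _ in range(iterations):
        for n in names:
            labels[n] = min(range(k),
                            key=lambda c: squared_distance(points[n], centroids[c]))
        for c in range(k):
            members = [points[n] for n in names if labels[n] == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy graph: two triangles joined by the edge c-d (undirected).
adjacency = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
embeddings = embed(adjacency)
clusters = kmeans(embeddings, k=2)
print(clusters)
```

The point is that k is chosen up front, while the embeddings carry the relationship structure that plain k-means on node properties would miss.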

If you'd like, take a look at this suggestion on Stack Overflow.

cobra · October 31, 2022, 8:48am

Yeah, it's a good solution.
