# How to achieve the following problem with Neo4j?

There is a relatively simple graph, where the edge has weight.
I would like to subdivide graph based on minimum weights.

1. When one node is part of one sub-graph, it can't be part of a different sub-graph.
2. input parameter is the max. size of the sub-graphs. (default is 6)
3. not all the nodes are connected with each other (it's like a forest)

There are similar algorithm, which can be used, however i'm stuck all of them:

1. minimum/k-minimum spanning tree - it does not work on forest.
2. all pair shortest path - the result does not show any partition related info (forest),
and i can't filter out nodes from the first sub-graph afterwards.

My idea is:

Combine the modularity algorithm with the weekly connected components algorithm and the all pair shortest path algorithm.

1. The modularity algoritm works on higher weights
2. the all shortest path on lower distances.

We need to either reverse the weight value on the fly or use an other edge.

Either the all pair shortest path or the modularity has a weakness, as you can't figure out whether the nodes are in the same partition.

Weekly connected components can help on the partition.

As Neo4j is highly transnational and slow on writes (you need to put a kafka at the front of it), and has a hefty fee (it's not just subscription, additional memory, virtual cores etc.) while running on multiple machines, for rule queries we need to combine it with mongodb. Nodes and relations can have a generated id by mongodb, so it can be sharded.

As in my case rules are relative rules which are only applicable for strict set of nodes, i need to store the result periodically to generate the sub-graphs dynamically.

Rules and weights can be dynamically changing, even edges can disappear (sub-graph needs to reach the input size again)

I have a relation table where active relations has status 'A'.

where status = 'A' sortBy partitionId, componentId, distance

What do you think?

As i'm already using Neo4j any suggestion or optimization is welcome, however GDS algorithm has a gap, does not return back the partition, if they are running on unconnected graphs.