This query sets a community_id property on each node,
where community_id is the size of the connected component (network) the node belongs to.
Components of the same size therefore share the same community_id:
MATCH (a)-[*]-(b)
WITH id(a) AS id, apoc.coll.sort(apoc.coll.toSet(collect(DISTINCT id(b)) + [id(a)])) AS nodes_list
WITH DISTINCT nodes_list, size(nodes_list) AS size
WITH size, apoc.coll.flatten(collect(nodes_list)) AS nodes_list
// First statement selects the nodes, second statement updates them in batches
CALL apoc.periodic.iterate('
MATCH (n)
WHERE id(n) IN $nodes_list
RETURN n
', '
SET n.community_id = $community_id
', {batchSize:1000, params:{nodes_list:nodes_list, community_id:size}}) YIELD batch, operations
RETURN 1
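Afterwards, if you want to delete the connected components that have fewer than 5 nodes, run the query below. Nodes without any relationship never match (a)-[*]-(b), so they have no community_id and are left untouched: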
CALL apoc.periodic.iterate('MATCH (n) WHERE n.community_id < 5 RETURN n', 'DETACH DELETE n', {batchSize:1000})
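If it helps to see the logic outside of Cypher, here is a minimal Python sketch of what the two steps do: label every node with the size of its connected component, then keep only the nodes whose component has at least 5 nodes. It does not talk to Neo4j. The function names (component_sizes, drop_small_components) and the sample edge list are made up for the illustration, and like the Cypher above it only sees nodes that have at least one relationship.

```python
from collections import deque


def component_sizes(edges):
    """Map each node to the size of its connected component (edges are undirected)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    community_id = {}
    for start in adjacency:
        if start in community_id:
            continue  # already labelled as part of an earlier component
        # Breadth-first search collects every node reachable from start
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        # Same idea as SET n.community_id = size
        for node in seen:
            community_id[node] = len(seen)
    return community_id


def drop_small_components(community_id, min_size=5):
    """Keep only nodes whose component has at least min_size nodes."""
    return {node: size for node, size in community_id.items() if size >= min_size}


# Hypothetical sample graph: components of size 3, 2 and 5
edges = [(1, 2), (2, 3), (4, 5), (6, 7), (7, 8), (8, 9), (9, 10)]
sizes = component_sizes(edges)
kept = drop_small_components(sizes)
print(sizes)
print(sorted(kept))
```

With this sample, only nodes 6 to 10 survive, because theirs is the only component with 5 nodes.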
Regards,
Cobra