Using similarity to find nodes that share more than one relationship? Or another algorithm?

I have a graph with multiple node labels (ip, device, url, etc.) and a "main" node (main_id) that always has a connection to the previously mentioned nodes (the relationships are all directed from the main_id to the other nodes).

(:ip)<-[:HAS_IP]-(:main_id)-[:USING_DEVICE]->(:device), etc

Some nodes (main_id) will have a specific relationship with a special labelled node (special_id).

(:main_id)-[:HAS_SPECIAL_ID]->(:special_id)

I'm trying to discover other main_id nodes that are similar to the main_id nodes connected to a special_id.

The connections don't ALL need to be the same, so I thought about giving each relationship a different weight (ip - 0.6, device - 0.2, etc.) and triggering a merge of the relationship between the main_id and the special_id if the resulting value is higher than X (I haven't decided this yet, maybe > 0.8).
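To make the weighting idea concrete, here is a minimal Python sketch, under assumptions that are not in the post: a weight counts once if two main_id nodes share at least one target of that relationship type, and the scores are simply summed. Only the ip and device weights come from the description above; the `VISITED_URL` type, its weight, the function names and the sample data are hypothetical.

```python
# Weights per relationship type. HAS_IP and USING_DEVICE use the values from
# the post; VISITED_URL and its weight are hypothetical placeholders.
WEIGHTS = {"HAS_IP": 0.6, "USING_DEVICE": 0.2, "VISITED_URL": 0.2}


def shared_type_score(a, b, weights=WEIGHTS):
    """Sum the weights of relationship types where a and b share a target.

    a and b map a relationship type to the set of target node keys
    reached from one main_id through that relationship type.
    """
    score = 0.0
    for rel_type, weight in weights.items():
        # Assumption: one shared target is enough to count the full weight.
        if a.get(rel_type, set()) & b.get(rel_type, set()):
            score += weight
    return score


def should_link(a, b, threshold=0.8, weights=WEIGHTS):
    """Return True when the score is strictly above the threshold."""
    return shared_type_score(a, b, weights) > threshold


# Made-up sample data: one main_id already linked to a special_id,
# and two candidates to compare against it.
flagged = {
    "HAS_IP": {"10.0.0.1"},
    "USING_DEVICE": {"dev-1"},
    "VISITED_URL": {"example.com"},
}
same_everything = {
    "HAS_IP": {"10.0.0.1", "10.0.0.9"},
    "USING_DEVICE": {"dev-1"},
    "VISITED_URL": {"example.com"},
}
same_ip_only = {
    "HAS_IP": {"10.0.0.1"},
    "USING_DEVICE": {"dev-7"},
    "VISITED_URL": {"example.org"},
}

print(should_link(flagged, same_everything))  # all three types overlap
print(should_link(flagged, same_ip_only))     # only the ip overlaps
```

With these numbers, a shared ip alone (0.6) stays under a 0.8 threshold, so at least one more relationship type has to match before the link would be created.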

I thought about using similarity, but I'm not sure whether it can use this weight property or even compare multiple relationships.

Is there a better algorithm for comparing these relationships?

NodeSimilarity seems like the right place to start: it supports weights and can run over multiple relationship and node types. It calculates the similarity between source nodes based on the overlap in their target nodes. See the Node Similarity page in the Neo4j Graph Data Science documentation.
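For intuition about what a weighted overlap looks like, here is a small Python sketch of weighted Jaccard similarity, which, as far as I understand the GDS documentation, is what Node Similarity computes when `relationshipWeightProperty` is set. This is not the library's code; the function name and the sample nodes and weights are made up for illustration.

```python
def weighted_jaccard(a, b):
    """Weighted Jaccard similarity between two source nodes.

    a and b map a target node key to the weight of the relationship
    pointing at it. A target that a node does not reach counts as 0.
    """
    targets = set(a) | set(b)
    numerator = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in targets)
    denominator = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in targets)
    # Two nodes with no weighted targets at all are treated as dissimilar.
    return numerator / denominator if denominator else 0.0


# Hypothetical data: both main_id nodes share an ip (weight 0.6)
# but use different devices (weight 0.2 each).
flagged = {"ip:10.0.0.1": 0.6, "device:dev-1": 0.2}
candidate = {"ip:10.0.0.1": 0.6, "device:dev-2": 0.2}

print(weighted_jaccard(flagged, candidate))
```

Here the shared ip contributes 0.6 to the numerator, and the denominator adds the two unshared devices on top of that, giving roughly 0.6. An unweighted Jaccard over the same targets would be 1 shared out of 3, so putting a heavier weight on ip pulls these two nodes closer together.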


I was using it wrong lol.

CALL gds.nodeSimilarity.stream('graph-undirected', {nodeLabels: ['main_id']})
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).domain AS node1, gds.util.asNode(node2).domain AS node2, similarity
ORDER BY similarity DESC

That gave me something really close to what I was looking for. Now I'll apply weights to the relationships and see how they change my results.
Thank you