There are two people with distinct interactions. Bob had a phone call, placed an order, received an email, and made a website visit. John placed two orders, received an email, and made a website visit.
These two "Person" nodes are not connected by relationships, but I'd like to compute the similarity between them. After some research, I found that there are graph similarity algorithms, but those seem to rely mostly on relationships.
Is there a way to compute similarity between nodes based on the labels of the nodes they are connected to?
In this case, Bob and John would be roughly 75% similar, as they share 3 of the 4 distinct node labels between them (Order, Email, WebsiteVisit).
CALL gds.nodeSimilarity.stream('myGraph')
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).name AS Person1, gds.util.asNode(node2).name AS Person2, similarity
ORDER BY similarity DESCENDING, Person1, Person2
I suppose you could compute the Jaccard Similarity measure between each pair of Person nodes using your definition of similar.
Applying the definition to your use case, I believe you get the following:
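Jaccard Similarity Score = (Number of distinct labels that the connected nodes of both Person nodes have in common)/(Total number of distinct labels among the connected nodes of the two Person nodes)
For Bob and John, that is 3/4 = 0.75, which matches the 75% you expected.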
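To make the arithmetic concrete, here is a minimal sketch of that score in plain Python, outside the database. It assumes you have already collected the labels of the nodes connected to each Person into a list. The function name label_jaccard and the label name PhoneCall are made up for illustration; Order, Email, and WebsiteVisit come from the question.

```python
def label_jaccard(labels_a, labels_b):
    """Jaccard similarity over the distinct labels in two collections."""
    a, b = set(labels_a), set(labels_b)  # duplicates (e.g. two Orders) collapse
    union = a | b
    if not union:
        # Neither person has any connected nodes; treat as no similarity.
        return 0.0
    return len(a & b) / len(union)


# Labels of the nodes connected to each Person ("PhoneCall" is an assumed name).
bob = ["PhoneCall", "Order", "Email", "WebsiteVisit"]
john = ["Order", "Order", "Email", "WebsiteVisit"]

score = label_jaccard(bob, john)
print(score)  # 0.75
```

Because the labels are turned into sets, John's second order does not change the result. If you want repeated interactions to count, you would need a weighted variant instead of plain Jaccard.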