Find similarity between two node clusters that are not connected

henry007 · April 12, 2023, 2:35pm

Hello, I have data structured in a way as seen in the image below:

There are two people with distinct interactions. Bob had a phone call, placed an order, received an email, and visited the website. John placed two orders, received an email, and visited the website.

These two "Person" nodes are not connected by relationships, but I'd like to compute the similarity between them. After some research, I found graph similarity algorithms, but those seem to rely mostly on relationships.

Is there a way to compute similarity between nodes based on Labels?

In this case, Bob & John would be roughly 75% similar as they share 3 node labels (Order, Email, WebsiteVisit).

Thank you!

alison.cossette · May 11, 2023, 7:26pm

Hi Henry:

```cypher
CALL gds.nodeSimilarity.stream('myGraph')
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).name AS Person1,
       gds.util.asNode(node2).name AS Person2,
       similarity
ORDER BY similarity DESCENDING, Person1, Person2
```

Details about the above are in the Node Similarity section of the Neo4j Graph Data Science documentation.

You will want to make sure that the graph projection you are using contains the nodes and relationships you note in your question above.
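
For reference, a minimal sketch of such a projection, assuming GDS 2.x and that projecting every node label and relationship type is acceptable. The name 'myGraph' comes from the query above; the wildcard projections are an assumption, not something stated in the question:

```cypher
// Project all node labels and all relationship types into an in-memory graph.
// '*' is a wildcard; narrow it to specific labels and types if the database is large.
CALL gds.graph.project('myGraph', '*', '*')
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
```

Keep in mind that Node Similarity scores two nodes by the neighbor nodes they share, so two Person nodes that each point to their own separate Order or Email nodes will only be scored if they also have some neighbor in common.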

glilienfield · May 12, 2023, 12:05am

I suppose you could compute the Jaccard Similarity measure between each pair of Person nodes using your definition of similar.

Applying the definition to your use case, I believe you get the following:
Jaccard Similarity Score = (Number of distinct labels shared by the nodes connected to both Person nodes) / (Number of distinct labels across the nodes connected to either Person node)
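
The definition above could be sketched in plain Cypher roughly as follows. This is only an illustration under a few assumptions: the interaction nodes are attached directly to the Person nodes (relationship direction is ignored), and each Person has a `name` property as in the earlier query.

```cypher
// Collect the distinct labels of the nodes connected to the first Person.
MATCH (p1:Person)--(n1)
UNWIND labels(n1) AS label1
WITH p1, collect(DISTINCT label1) AS l1
// Do the same for every other Person; the id comparison keeps each pair once.
// (id() is deprecated in Neo4j 5; elementId() can be used there instead.)
MATCH (p2:Person)--(n2)
WHERE id(p1) < id(p2)
UNWIND labels(n2) AS label2
WITH p1, p2, l1, collect(DISTINCT label2) AS l2
// Intersection of the two label sets.
WITH p1, p2, l1, l2, [l IN l1 WHERE l IN l2] AS shared
// |A ∩ B| / |A ∪ B|, with |A ∪ B| = |A| + |B| - |A ∩ B|.
RETURN p1.name AS Person1,
       p2.name AS Person2,
       toFloat(size(shared)) / (size(l1) + size(l2) - size(shared)) AS similarity
ORDER BY similarity DESC
```

If the labels in the question are PhoneCall, Order, Email, and WebsiteVisit, Bob's set has four labels and John's has three, all three of which are shared, so the score is 3 / 4 = 0.75. Note that John's second order does not change the result, because only distinct labels are counted.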

michael.hunger · May 13, 2023, 9:05am

If you just generally want to compare them for a "diff", you can also look at apoc.diff.nodes.
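
A minimal usage sketch, assuming the APOC library is installed and that the two Person nodes can be looked up by a `name` property (the names are taken from the question):

```cypher
// Compare the two Person nodes and return a map describing how they differ.
MATCH (a:Person {name: 'Bob'}), (b:Person {name: 'John'})
RETURN apoc.diff.nodes(a, b) AS diff
```

As far as I know, this compares the properties of the two nodes themselves rather than the labels of their neighbors, so it is more of a property-level diff than a similarity score.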
