How to find the similarity between common nodes of multiple type nodes?

I'm using the Pima Indians Diabetes dataset,
and I created these node types:

  1. Person {age, id}
  2. BMI {bmi level}
  3. Outcome {outcome}
  4. Blood Pressure {blood pressure level}, and so on.

I want to find the similarity between all the persons whose age is between 21-25 and who have been diagnosed with diabetes.
I would like the answer to look something like this:
BMI similarity: 0.82
BP similarity: 0.67
I have looked through all the graph algorithms but didn't find anything relevant.
Can we achieve this using Neo4j?

P.S. All the examples I have seen compute similarity over a single relationship type.

Welcome to the community. Can you describe more of what you mean by similarity or give a link to an example you have considered?

OK, so I have a dataset that contains the following columns:

1. ID
2. AGE
3. BMI
4. BP
5. INSULIN
6. OUTCOME

I took ID as a node and added age as its property.
Then I made separate nodes for all the other columns, like BMI, BP, INSULIN, etc.
I created relationships so that each "ID" node is connected to the nodes holding its "BMI", "BP", "INSULIN", etc. values.

Now my query is this:
"Find the mean of BMI of all the persons whose age is <=25?"

Is creating a dedicated node set for each column the most efficient way?

Our node similarity algorithm calculates the similarity of nodes based on their neighboring nodes (think of a (:Person)-[:LIKES]->(:Instrument) graph: we measure how similar Person nodes are based on the number of Instruments they both like vs. the number they don't share).
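To make the neighbor-based measure concrete, here is a minimal Python sketch of the Jaccard score that Node Similarity is based on, worked by hand on made-up data. The names `jaccard` and `likes`, and the people and instruments, are purely illustrative and not part of any Neo4j API.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection / size of union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty neighborhoods: treat as 0.0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical (:Person)-[:LIKES]->(:Instrument) neighborhoods.
likes = {
    "alice": {"guitar", "piano"},
    "bob": {"guitar", "drums"},
    "carol": {"violin"},
}

# alice and bob share 1 instrument out of 3 distinct ones.
sim_alice_bob = jaccard(likes["alice"], likes["bob"])
# alice and carol share nothing.
sim_alice_carol = jaccard(likes["alice"], likes["carol"])

print(f"alice/bob:   {sim_alice_bob:.2f}")
print(f"alice/carol: {sim_alice_carol:.2f}")
```

The score depends only on which neighbor nodes two people share, which is why the values you want to compare need to exist as nodes rather than as properties.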

If you wanted to use that algorithm, you would need to turn the things you want to measure similarity on (e.g. outcomes) into nodes. If you have a schema where Person is a node label with age and id attributes, and Outcome is a node label with a description attribute, you could use nodeSimilarity in this way:

CALL algo.nodeSimilarity.stream(
     'MATCH (p:Person) WHERE p.age < 25 RETURN id(p) AS id',
     'MATCH (p:Person)-[:HAS_OUTCOME]->(o:Outcome) RETURN id(p) AS source, id(o) AS target',
     {graph: 'cypher'})

In your reply to @nsmith_piano, you're asking about a mean value. Check out our documentation on aggregating functions here: https://neo4j.com/docs/cypher-manual/current/functions/aggregating/ .


Thanks for the info. As you mentioned, node similarity calculates similarity over only one relationship type, "LIKES" in your example. For instance, "A likes guitar and piano" and "B likes keyboard and guitar", so they are 50% similar. What I want is this: "A likes guitar and lives in London" and "B likes guitar and lives in Mumbai", so A and B are 50% similar because they like the same instrument but live in different places. I know we can do this by measuring similarity over the "LIKES" relationship once and then over "LIVES" once. But what if I want to compare using both relationships at the same time? By the way, sorry if I framed the question wrongly. I was just confused.

You can combine multiple node and relationship types for the purpose of running an algorithm -- either by pre-loading a named graph (see section 2.3.4 loading multiple relationship types and node labels), or by using a cypher projection that references the nodes and relationships you want to consider.

For the musical instrument example, if we add in a Place node and a LIVES_IN relationship, you could use a Cypher projection like this:

CALL algo.nodeSimilarity.stream(
     'MATCH(n) WHERE n:Person or n:Instrument or n:Place RETURN id(n) as id', 
     'MATCH (s:Person)-[]->(t) RETURN id(s) as source, id(t) as target',
{graph:'cypher', direction:'outgoing'})
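
As a rough illustration of what a combined projection like this measures, the Python sketch below computes Jaccard scores by hand, first per relationship type and then over the merged neighborhood. It assumes A and B both like guitar but live in London and Mumbai; the `neighbors` layout and the helper names are hypothetical and not Neo4j API.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection / size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical neighborhoods, grouped by relationship type.
neighbors = {
    "A": {"LIKES": {"guitar"}, "LIVES_IN": {"London"}},
    "B": {"LIKES": {"guitar"}, "LIVES_IN": {"Mumbai"}},
}


def per_type_similarity(p, q):
    """One Jaccard score per relationship type (e.g. a separate LIKES score)."""
    rel_types = sorted(set(neighbors[p]) | set(neighbors[q]))
    return {
        t: jaccard(neighbors[p].get(t, set()), neighbors[q].get(t, set()))
        for t in rel_types
    }


def combined_similarity(p, q):
    """A single Jaccard score over all neighbors, whatever the relationship type."""
    # Tag each neighbor with its relationship type so that names cannot collide.
    merged_p = {(t, n) for t, ns in neighbors[p].items() for n in ns}
    merged_q = {(t, n) for t, ns in neighbors[q].items() for n in ns}
    return jaccard(merged_p, merged_q)


by_type = per_type_similarity("A", "B")
combined = combined_similarity("A", "B")
print(by_type)
print(round(combined, 3))
```

In this sketch the merged score is 1/3 rather than 50%, because Jaccard divides the one shared neighbor by the three distinct neighbors. If you want one number per attribute (a BMI similarity, a BP similarity, and so on), the per-type variant, or one algorithm run per relationship type, is closer to that.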

Solved my issue. Thanks a lot!


Hey Alicia, great solution!
How can we return the node label instead of node id?

You can use the asNode function -- in the YIELD statement, return the nodeId, and then you can use algo.asNode to access labels and attributes. For example:

CALL algo.nodeSimilarity.stream('Person | Instrument', 'LIKES', {
  direction: 'OUTGOING'
})
YIELD node1, node2, similarity
RETURN algo.asNode(node1).name AS Person1, algo.asNode(node2).name AS Person2, similarity
ORDER BY similarity DESCENDING, Person1, Person2

I have a similar question.
How can we apply node similarity based on an edge property value?
I have a graph in which stock names are nodes.
Dates are nodes too.
A price relationship links each stock node to the date nodes.
So how can I apply node similarity across different stocks?


Hi Alicia

I think nodeSimilarity is now deprecated. I tried to run this Cypher projection with Jaccard similarity, but I get the error "Procedure call does not provide the required number of arguments: got 3 expected 2."

@mangesh.karangutkar Node Similarity has not been deprecated: https://neo4j.com/docs/graph-data-science/current/algorithms/node-similarity/

The error message you received from jaccard indicates that you've provided more inputs than it expects. The jaccard function expects a pair of inputs (the two nodes being compared); perhaps that's the issue. I would look to the docs for more information on the syntax: https://neo4j.com/docs/graph-data-science/current/alpha-algorithms/jaccard/