Calculate similarity for Nodes in the same level and calculate similarity between two sub-graph depths

mariaplerou · August 26, 2021, 1:21pm

In my graph I have nodes labelled Suite (6 nodes), Test (18 nodes) and Keyword (600 nodes), which have relationships with each other; for example, a Test calls a Keyword (a Keyword acting as a sub-test). I would like to find similarities between nodes of the same type and, at the same time, compare the depth of the sub-graph under each node.
I began my investigation with the Node Similarity algorithm, using this procedure call to create a virtual graph:

CALL gds.graph.create(
    'myGraph1',
    ['Test', 'Keyword'],
    {
        CALLS: {
            type: 'CALLS'
        }
    }
);

and then

CALL gds.nodeSimilarity.stream('myGraph1')
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).name AS From, gds.util.asNode(node2).name AS To, similarity
ORDER BY similarity DESCENDING, From, To

with no result.

Also I tried the following:

CALL gds.graph.create.cypher(
    'my-cypher-graph_8',
    'MATCH (t:Test) RETURN id(t) AS id',
    'MATCH (t:Test)-[r:NEXT]->(k:Test) RETURN id(t) AS source, id(k) AS target'
)

giving as a result

The similarity call below seems syntactically correct, but it also returns no result:

CALL gds.nodeSimilarity.stream('my-cypher-graph_8')
YIELD node1, node2, similarity
RETURN *

Could you please tell me whether I can find the similarity between nodes of the same kind while also comparing the sub-graph sequence, and how I can improve my queries?

I run Neo4j 4.2.7 in a container and use the APOC plugin.

Thank you in advance.

alicia_frame1 · August 28, 2021, 4:00pm

It shouldn't return nothing. Can you try the following to debug what's going on:

1. What are the metrics returned when you run gds.graph.create: how many nodes and how many relationships are loaded? If it's 0 relationships, that's a sign that the data you loaded is incorrect.
2. Try dropping the YIELD statement and just running: CALL gds.nodeSimilarity.stream('myGraph1')
3. Run statistics mode, gds.nodeSimilarity.stats, to see how many nodes are compared.

Off the top of my head, you may end up with no results due to the directionality of the relationships (node similarity is built for a bipartite graph, where you'd have (:Test)-[:CALLS]->(:Keyword), or

mariaplerou · August 30, 2021, 7:47am

Thank you, Alicia, for your reply. I went through the debugging steps you proposed, but there is no good news.
Regarding bullet 1: it seems that gds.graph.create does load some nodes and relationships, as in the following picture:

Regarding bullet 2: even without the YIELD statement I get a "no changes, no records" result.
Regarding bullet 3:

mariaplerou · August 30, 2021, 11:42am

I am adding another question, which I have already posted to the #cypher channel. Besides the above case (which I still need to investigate), I would like to ask whether there is a way to compare graphs or subgraphs that don't share common nodes, based on their properties, in order to calculate similarities. The node similarity algorithms (Similarity - Neo4j Graph Data Science) do not seem to fit my case, as the graph nodes I want to compare don't share common nodes.

alicia_frame1 · August 30, 2021, 6:06pm

Probably the reason you're not getting any results is, then, due to the fact that your nodes don't have any similarity (set similarityCutoff to 0 to test the hypothesis). We calculate similarity between pairs of nodes based on the number of common neighbors (using Jaccard). If no nodes have common neighbors, then they're not similar.
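To make the point about common neighbors concrete, here is a minimal Python sketch of Jaccard similarity over neighbor sets. It is not GDS code: the test and keyword names are made up, and the `cutoff` parameter only loosely mimics similarityCutoff (a tiny positive default is assumed, so zero scores are dropped unless the cutoff is set to 0).

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_pairs(neighbors, cutoff=1e-42):
    """Return (node1, node2, score) for every pair whose score is >= cutoff."""
    pairs = []
    for n1, n2 in combinations(sorted(neighbors), 2):
        score = jaccard(neighbors[n1], neighbors[n2])
        if score >= cutoff:
            pairs.append((n1, n2, score))
    return pairs

# Hypothetical data: each Test maps to the set of Keywords it CALLS.
disjoint = {"T1": {"k1", "k2"}, "T2": {"k3"}, "T3": {"k4", "k5"}}
overlapping = {"T1": {"k1", "k2"}, "T2": {"k2", "k3"}}

print(similar_pairs(disjoint))               # no shared keywords: nothing passes the cutoff
print(similar_pairs(disjoint, cutoff=0.0))   # every pair appears, each with score 0.0
print(similar_pairs(overlapping))            # one shared keyword out of three in total
```

With the disjoint data, the result is empty until the cutoff is lowered to 0, at which point every pair shows up with a score of 0.0. That matches the behaviour described in this thread.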

We don't have anything out of the box to compare the similarity of entire graphs. You can use graph algorithms and compare, for example, average number of communities or average number of nodes per community, but we don't offer - for example - full graph embeddings, or graph isomorphism.
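As a rough illustration of comparing graphs through summary statistics, the Python sketch below takes a node-to-community assignment (such as a community detection algorithm might produce) and reports the number of communities and the average number of nodes per community. The function name `community_summary` and both assignments are hypothetical, and this is only one possible way to put the two graphs side by side.

```python
from collections import Counter

def community_summary(assignment):
    """Summarise a {node: community_id} mapping as two simple metrics."""
    sizes = Counter(assignment.values())   # community_id -> number of nodes
    count = len(sizes)
    mean_size = sum(sizes.values()) / count if count else 0.0
    return {"communities": count, "avg_nodes_per_community": mean_size}

# Hypothetical community assignments for two separate graphs.
graph_a = {"a": 0, "b": 0, "c": 1, "d": 1}
graph_b = {"x": 0, "y": 0, "z": 0, "w": 1, "v": 2}

summary_a = community_summary(graph_a)
summary_b = community_summary(graph_b)
print(summary_a)
print(summary_b)
```

Two graphs with similar summaries are not necessarily similar in structure, so this is a coarse comparison at best.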

One option, if you don't have nodes with common neighbors, but you still want to look at similarity, is the Node2Vec embedding - which can encode structural similarity, as well as topological - in combination with KNN (cosine similarity). Check out this blog post: Bringing traditional ML to your Neo4j Graph with node2vec | Dave Voutila
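The KNN step in that suggestion can be pictured with a small Python sketch: given one embedding vector per node, rank the other nodes by cosine similarity. The embedding values are invented for illustration (real ones would come from an embedding algorithm such as Node2Vec), and `top_k` is a hypothetical brute-force helper, not a GDS procedure.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def top_k(embeddings, k=1):
    """For each node, return its k most similar other nodes as (name, score)."""
    result = {}
    for name, vec in embeddings.items():
        scored = [(other, cosine(vec, other_vec))
                  for other, other_vec in embeddings.items() if other != name]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        result[name] = scored[:k]
    return result

# Hypothetical 2-dimensional embeddings for three Test nodes.
embeddings = {"T1": [1.0, 0.0], "T2": [0.9, 0.1], "T3": [0.0, 1.0]}
nearest = top_k(embeddings, k=1)
for name, matches in nearest.items():
    print(name, matches)
```

Unlike the Jaccard approach, this produces a score for nodes that share no neighbors at all, because the comparison runs on the vectors rather than on neighbor sets.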

mariaplerou · August 31, 2021, 7:15am

Indeed, Alicia, setting similarityCutoff to 0 gives 0.0 similarity as a result. Thank you for your answer; I will look into the options you proposed.
