Text Similarity: Compare text property of one node to all other nodes and create relationship

sagarhowal · June 15, 2020, 6:03am

I have a graph database which will be populated with nodes containing text messages. Every time a node is saved, I need to calculate its similarity to the other nodes. The similarity metric can be any of the text similarity functions available in APOC (https://neo4j.com/docs/labs/apoc/current/misc/text-functions/#text-functions-text-similarity). When the similarity is more than (say) 0.5, the query should create a SIMILAR_TO relationship between the two nodes compared.

My graph looks kind of like this:

As of now, this is a learning project/PoC.
I am looking for a cypher query or a stored procedure.
Can someone give me pointers on how to structure the query and anything else I must know before doing this?

I am aware that the cost will grow quickly as the number of nodes increases, since every new node is compared against all existing ones. But for now, I am not worrying about that.

I am using Neo4j 4.0.3 and the Python driver to create nodes.

Thanks.

michael.hunger · June 15, 2020, 1:08pm

When you create your node, you can do the comparison right after the insertion and create the relationship in the same query.

CREATE (m:Message {...})
MATCH (o:Message) 
WITH o,m, apoc.text.distance(m.Text, o.Text) as similarity
WHERE similarity > 0.5
CREATE (m)-[:SIMILAR {similarity:similarity}]->(o)

sagarhowal · June 18, 2020, 1:55am

Thank you, Michael.

This is the error I was getting.

Neo.ClientError.Statement.SyntaxError

WITH is required between CREATE and MATCH (line 2, column 1 (offset: 48)) "MATCH (o:Message)"

I added a WITH:

CREATE (m:Message {...})
WITH m // Edit
MATCH (o:Message) 
WITH o,m, apoc.text.distance(m.Text, o.Text) as similarity
WHERE similarity > 0.5
CREATE (m)-[:SIMILAR {similarity:similarity}]->(o)

This also creates a relationship from the newly created node to itself.
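The self-relationship is expected with this shape of query: the MATCH runs after the CREATE, so the new node is itself one of the :Message nodes that o can bind to. A minimal sketch of one way to exclude it is below. Two things in it are assumptions rather than part of the thread: $text is a hypothetical query parameter standing in for the elided {...}, and apoc.text.levenshteinSimilarity is swapped in for apoc.text.distance, because the latter returns an edit distance (a count of edits) rather than a score between 0 and 1, which makes a 0.5 threshold hard to interpret.

```cypher
// Sketch only: $text is a hypothetical parameter replacing the elided {...}.
CREATE (m:Message {Text: $text})
WITH m
MATCH (o:Message)
WHERE o <> m   // skip the node that was just created
// Assumption: a 0..1 similarity score is wanted, so levenshteinSimilarity
// is used here instead of apoc.text.distance (an edit distance).
WITH m, o, apoc.text.levenshteinSimilarity(m.Text, o.Text) AS similarity
WHERE similarity > 0.5
CREATE (m)-[:SIMILAR {similarity: similarity}]->(o)
```

Any of the other APOC similarity functions from the linked page could take the place of levenshteinSimilarity; the o <> m filter is the part that addresses the self-relationship.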

So then I matched the node first and then created the node like this:

MATCH (o:Message) 
WITH o
CREATE (m:Message {...})
WITH o,m, apoc.text.distance(m.Text, o.Text) as similarity
WHERE similarity > 0.5
CREATE (m)-[:SIMILAR {similarity:similarity}]->(o)

But this creates 2 additional nodes, and I don't understand why that happens.
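A likely explanation for the extra nodes: Cypher runs each clause once per incoming row, so a CREATE placed after MATCH (o:Message) executes once for every message that was matched. With two existing messages that would give two new nodes, and with an empty database it would create none at all. If matching first is preferred, one option is to collapse the matches into a single row with collect() before the CREATE and expand them again afterwards. This is a rough sketch under the same kind of assumptions as any example here: $text is a hypothetical parameter standing in for the elided {...}, and apoc.text.levenshteinSimilarity is used as an assumed 0..1 similarity function.

```cypher
// Sketch only: $text is a hypothetical parameter replacing the elided {...}.
MATCH (o:Message)
WITH collect(o) AS others   // aggregation yields exactly one row, even if nothing matched
CREATE (m:Message {Text: $text})   // so this runs exactly once
WITH m, others
UNWIND others AS o   // back to one row per pre-existing message
WITH m, o, apoc.text.levenshteinSimilarity(m.Text, o.Text) AS similarity
WHERE similarity > 0.5
CREATE (m)-[:SIMILAR {similarity: similarity}]->(o)
```

Because the existing messages are collected before the new node exists, the new node cannot end up compared with itself in this variant.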
