Hi,
I am a university student and I am doing my thesis on Neo4j.
The aim of the thesis is to find the similarity between various nodes.
The nodes have three properties, but I am only interested in one, called "text", which is a very long string (a newspaper article).
Does anyone know of a function that can measure the similarity between two such long texts?
Hello,
You may want to check this page of APOC for text -
There are some similarity functions here.
Maybe also check -
Good luck!
Try this:
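// Two sample sentences to compare
WITH "I am a university student and i am doing my thesis on neo4j" AS txt1,
     "I am a University student and I am doing my project on Neo4j" AS txt2
// apoc.text.clean lowercases the text and strips everything except alphanumeric characters
WITH apoc.text.clean(txt1) AS norm1, apoc.text.clean(txt2) AS norm2
// Scale the score to a whole-number percentage
RETURN toInteger(apoc.text.jaroWinklerDistance(norm1, norm2) * 100) AS similarity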
Result: 94% similarity
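To run the same comparison on the "text" property of the nodes themselves, something along these lines might work. This is only a sketch: the label `Article` is an assumption (the question does not say which label the nodes carry), and since every pair of nodes is compared, the query gets slow quickly as the number of nodes grows.

```cypher
// Hypothetical label `Article`; the `text` property is the one from the question.
MATCH (a:Article), (b:Article)
WHERE id(a) < id(b)   // visit each unordered pair once, skip self-comparison
// Normalize both texts the same way as in the two-string example
WITH a, b, apoc.text.clean(a.text) AS norm1, apoc.text.clean(b.text) AS norm2
RETURN id(a) AS first, id(b) AS second,
       toInteger(apoc.text.jaroWinklerDistance(norm1, norm2) * 100) AS similarity
ORDER BY similarity DESC
```

If only a handful of articles need to be compared, adding a filter to the MATCH (for example on one of the other properties) keeps the number of pairs manageable.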
Check this: https://neo4j.com/labs/apoc/4.1/misc/text-functions/