Text similarity

Hi,
I am a university student and I am doing my thesis on Neo4j.
The aim of the thesis is to find the similarity between various nodes.
Nodes have 3 properties, but I am only interested in one, which is called "text" and holds a very long string (a newspaper article).
Does anyone know if there is a function that is able to compare the similarity between 2 such long texts?

Hello,
You may want to check this page of APOC for text -

There are some similarity functions here.

Maybe also check -

Good luck!

Try this:

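// Two sample strings to compare
WITH "I am a university student and i am doing my thesis on neo4j" AS txt1,
     "I am a University student and I am doing my project on Neo4j" AS txt2
// apoc.text.clean strips everything except alphanumeric characters and lowercases the text
WITH apoc.text.clean(txt1) AS norm1, apoc.text.clean(txt2) AS norm2
// Scale the 0..1 score to a whole-number percentage
RETURN toInteger(apoc.text.jaroWinklerDistance(norm1, norm2) * 100) AS similarity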

Result: 94% similarity
Check this: https://neo4j.com/labs/apoc/4.1/misc/text-functions/
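
To run the same comparison on the "text" property of stored nodes instead of literal strings, something along these lines might work. This is only a sketch: the label Article is an assumption (the question does not say which label the nodes have), and the score is read as a similarity, as in the query above. The WHERE clause keeps each pair once and skips comparing a node with itself.

```cypher
// Hypothetical label :Article; replace it with the label used in your graph
MATCH (a:Article), (b:Article)
WHERE id(a) < id(b)
// Normalize both texts the same way as in the query above
WITH a, b, apoc.text.clean(a.text) AS norm1, apoc.text.clean(b.text) AS norm2
RETURN id(a) AS first, id(b) AS second,
       toInteger(apoc.text.jaroWinklerDistance(norm1, norm2) * 100) AS similarity
ORDER BY similarity DESC
LIMIT 10
```

Note that this compares every pair of nodes, so the work grows quadratically with the number of articles. On a large graph it is probably worth narrowing the MATCH first.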