I'm seeing an odd result when using Neo4jVector.from_existing_graph that I hope someone can shed some light on.
In short: after embedding a property that holds a string value, a similarity search for that exact string does not return a perfect match (a score of 1.0).
The attached Python notebook compares two methods of embedding a text property on a single node labelled "EmbeddingTest".
Method 1 creates a vector index manually, embeds the string value, then saves that vector back to Neo4j. This vector is EmbeddingTest.embedding_text_1.
Method 2 uses Neo4jVector.from_existing_graph to create the index, perform the embedding, and save the vector back to Neo4j in a single step. This vector is EmbeddingTest.embedding_text_2.
A similarity search is then run against each vector. The Method 1 score is 1.0 as expected, but the Method 2 score is 0.973. Why? This should be an exact match.
Attached are the Python notebook with this test scenario and a screenshot showing that the two vectors are indeed different, even though the embedding settings are the same.
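For anyone who wants to quantify the difference rather than compare the vectors by eye, here is a minimal sketch of a direct cosine comparison. The numbers are placeholders I made up for illustration, not the real values from the node; in practice the two lists would be embedding_text_1 and embedding_text_2 read back from EmbeddingTest.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Placeholder values only (assumption): stand-ins for the two stored properties.
embedding_text_1 = [0.10, 0.20, 0.30]
embedding_text_2 = [0.10, 0.25, 0.28]

# A vector compared with itself should give 1.0 (up to floating-point rounding).
same_vector = cosine_similarity(embedding_text_1, embedding_text_1)
# Two different vectors give something below 1.0.
cross_vector = cosine_similarity(embedding_text_1, embedding_text_2)

print(same_vector, cross_vector)
```

If the two stored vectors were identical, the second number would also be 1.0.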
My only hunch is that Method 2 is embedding some metadata in addition to the node property value, but I can't find any evidence that this is the case.
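To illustrate what that hunch would look like, here is a toy sketch. It does not call LangChain or a real embedding model: toy_embed is a made-up stand-in based on character trigram counts, and the "\ntext: " wrapper is purely a hypothetical example of extra text being embedded alongside the value. It only shows that any added text would pull the score below a perfect match.

```python
import math
from collections import Counter


def toy_embed(text, n=3):
    """Stand-in for a real embedding model (assumption): character n-gram counts."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


query = "some example sentence"

# Method 1 analogue: the stored vector comes from the raw property value.
stored_raw = toy_embed(query)
# Hunch analogue: the stored vector comes from the value plus extra text.
# The wrapper string here is hypothetical, chosen only for illustration.
stored_wrapped = toy_embed("\ntext: " + query)

score_raw = cosine(toy_embed(query), stored_raw)
score_wrapped = cosine(toy_embed(query), stored_wrapped)

print(score_raw, score_wrapped)
```

One way to test the hunch against the real data would be to embed a few candidate wrapped strings with the same model and see whether any of them reproduces embedding_text_2 exactly.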
Any ideas or insight would be greatly appreciated.
embedding_test.py.txt (3.9 KB)