Hi Community
I am using VectorCypherRetriever to perform vector search with graph traversal. It returns results with a similarity score greater than 0.7, but anything below 0.7 comes back blank. I want to see the results below 0.7 as well. Is there a default threshold setting that needs to be tweaked?
VectorCypherRetriever is from neo4j_graphrag.retrievers, not LangChain's retriever.
You’re right that VectorCypherRetriever is from neo4j_graphrag.retrievers and not LangChain’s retriever.
Because of that, it has its own internal logic for vector search and graph traversal, including a default similarity threshold that filters out lower-scoring results.
The retriever generates a Cypher query that looks roughly like this:
MATCH (n)
WHERE n.embedding IS NOT NULL
WITH n, vector.similarity.cosine(n.embedding, $queryEmbedding) AS score
WHERE score >= $similarity_cutoff // default is around 0.7
RETURN n, score
ORDER BY score DESC
LIMIT $top_k
So anything with similarity below the cutoff is filtered out before results are returned, which is why nodes below 0.7 came back blank.
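To see the effect of a cutoff in isolation, here is a minimal pure-Python sketch of the same filter-then-limit logic. It is not the retriever's actual implementation: the `search` helper, the toy two-dimensional embeddings, and the 0.7 default are assumptions for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(nodes, query_embedding, similarity_cutoff=0.7, top_k=5):
    """Score every node, drop those below the cutoff, return the top_k best."""
    scored = [
        (name, cosine_similarity(embedding, query_embedding))
        for name, embedding in nodes.items()
    ]
    kept = [(name, score) for name, score in scored if score >= similarity_cutoff]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

# Hypothetical toy data: node name -> embedding
nodes = {
    "a": [1.0, 0.0],
    "b": [2.0, 1.0],
    "c": [1.0, 2.0],
    "d": [0.0, 1.0],
}
query = [1.0, 0.0]

strict = search(nodes, query)                          # cutoff 0.7
relaxed = search(nodes, query, similarity_cutoff=0.0)  # no effective cutoff

print("cutoff 0.7:", [name for name, _ in strict])
print("cutoff 0.0:", [name for name, _ in relaxed])
```

With the cutoff at 0.7 only the two closest nodes survive. With the cutoff at 0.0 all four come back, still ordered by score.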
This behavior comes from the GraphRAG retriever itself, not from Neo4j or LangChain.
You can disable or lower the similarity cutoff by setting it explicitly:
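from neo4j_graphrag.retrievers import VectorCypherRetriever

# `driver` is an already-connected neo4j.Driver instance.
# Keyword arguments are as given in this reply; check them against the
# constructor signature of your installed neo4j_graphrag version.
retriever = VectorCypherRetriever(
    driver,
    embedding_dim=1536,
    top_k=20,
    similarity_cutoff=0.0,  # show all scores, including < 0.7
)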
Or, if you still want some filtering, set a lower cutoff such as similarity_cutoff=0.3.
Other factors still apply even with a lower cutoff:
top_k still limits how many results you get.
Vector search results are still ordered by similarity, descending.
Weak results (< 0.3) may be noisy, depending on the embedding model.
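One way to work with these trade-offs is to fetch generously and apply your own threshold afterwards, so weak matches stay visible but are kept apart from strong ones. The sketch below assumes the results have already been reduced to (text, score) pairs. `split_by_score` and the sample data are hypothetical and not part of neo4j_graphrag.

```python
def split_by_score(results, threshold=0.7, top_k=None):
    """Sort (text, score) pairs by score descending, optionally keep top_k,
    then split them into strong (>= threshold) and weak (< threshold) lists."""
    ranked = sorted(results, key=lambda pair: pair[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    strong = [pair for pair in ranked if pair[1] >= threshold]
    weak = [pair for pair in ranked if pair[1] < threshold]
    return strong, weak

# Hypothetical sample results
sample = [("doc A", 0.91), ("doc B", 0.42), ("doc C", 0.75), ("doc D", 0.18)]

strong, weak = split_by_score(sample, threshold=0.7)
for text, score in strong:
    print(f"{score:.2f}  strong  {text}")
for text, score in weak:
    print(f"{score:.2f}  weak    {text}")
```

Keeping the weak list separate lets you inspect low-scoring matches without mixing them into whatever is passed downstream.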