Approximate nearest neighbour algorithm does not result with similar nodes

triin.ruutli · May 8, 2020, 11:12am

I'm trying to find similar images using the approximate nearest neighbour algorithm with cosine similarity. This is the query:

MATCH (p:Image)
WITH {item: id(p), weights: p.vec} AS userData
WITH collect(userData) AS data
CALL gds.alpha.ml.ann.write({
  nodeProjection: '',
  relationshipProjection: '',
  data: data,
  topK: 20,
  algorithm: 'cosine',
  writeRelationshipType: "SIMILAR_APPROX",
  similarityCutoff: 0.1,
  p: 0.5,
  maxIterations: 50
})
YIELD nodes, similarityPairs, computations
RETURN nodes,
       apoc.number.format(similarityPairs) AS similarityPairs,
       apoc.number.format(computations) AS computations

But when I search for images similar to one specific image, none of the results are from the same category as that image (dolphin). I have 9119 nodes in my database. Here's the query for finding images similar to one specific image:

MATCH (r:Image) WHERE id(r)=1932
WITH r,
[(r)-[:SIMILAR_APPROX]->(i)| i.path ] AS similarNodes
RETURN similarNodes

input image:
one example of output images:

Am I missing some parameters in the algorithm? Why am I getting results from other categories when I clearly have more similar images in the database?

Thank you in advance!

alicia.frame · May 18, 2020, 10:58pm

What are you passing to ANN to measure similarity on? The node property in p.vec?

I would check the similarity of the two images using cosine similarity directly (see "Similarity functions" in the Neo4j Graph Data Science documentation). It's possible that there's something off in your image embedding that's causing the two vectors to be quite similar. The categories you're referencing aren't available to ANN, so the result is based solely on the values in p.vec.
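As a rough sketch of that direct check, something like the query below could compare the query image with one image you expect to match, such as another dolphin. It assumes the GDS 1.x function name gds.alpha.similarity.cosine (the same alpha tier as the ANN procedure you're calling), and $otherId is a placeholder parameter I've made up for the internal id of the image you pick:

```cypher
// Compare the query image (id 1932, from the original post) with one image
// that should be similar to it. $otherId is a hypothetical parameter:
// set it to the internal node id of an image from the same category.
MATCH (a:Image), (b:Image)
WHERE id(a) = 1932 AND id(b) = $otherId
RETURN a.path AS queryImage,
       b.path AS candidate,
       // Assumed function name for GDS 1.x; check it against your installed version.
       gds.alpha.similarity.cosine(a.vec, b.vec) AS cosine
```

If this value is lower than the scores of the unrelated images ANN returned, the embedding itself is the likelier culprit rather than the algorithm's parameters.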

You're also returning the top 20 most similar images, with a cutoff of 10%... which could give you some fairly dissimilar images. If you return the similarity scores of the pairs, what are they? And do you get the same value from cosine similarity run over that pair?
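To look at those scores, a query along these lines might help. It is only a sketch: it assumes the ANN write step stored the similarity on each SIMILAR_APPROX relationship under the default property name score (no writeProperty was set in your call), and it again assumes gds.alpha.similarity.cosine is available in your GDS version:

```cypher
// List the neighbours ANN wrote for the query image, with the stored score
// next to cosine similarity recomputed directly from the two vectors.
MATCH (r:Image)-[s:SIMILAR_APPROX]->(i:Image)
WHERE id(r) = 1932
RETURN i.path AS similarImage,
       s.score AS storedScore,  // assumption: default writeProperty 'score'
       gds.alpha.similarity.cosine(r.vec, i.vec) AS directCosine
ORDER BY storedScore DESC
```

If storedScore and directCosine agree and are high for images from other categories, the vectors in vec are not separating the categories. If they are low, raising similarityCutoff or lowering topK should filter those pairs out.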
