Trying to write a Cypher query for node similarity to design a movie recommendation system

abhishekshankar88 · March 20, 2024, 12:27am

node_labels
1 ["Movie"]
2 ["ProductionCompany"]
3 ["Genre"]
4 ["SpokenLanguage"]
5 ["ProductionCountry"]
6 ["Person"]
7 ["User"]

relationship_type
1 "RATING"
2 "ACTED_IN"
3 "PRODUCED_IN"
4 "PRODUCED_BY"
5 "CREWED_IN"
6 "HAS_GENRE"
7 "HAS_SPOKEN_LANGUAGE"
8 "SIMILAR"

I have the above nodes and relationships and have the required gds projection as well.

CALL gds.graph.project('movies', 
  ['User', 'Movie', 'ProductionCompany', 'Genre', 'SpokenLanguage', 'ProductionCountry', 'Person'], 
  {
    RATING: {orientation: 'UNDIRECTED'},
    HAS_GENRE: {orientation: 'UNDIRECTED'},
    HAS_SPOKEN_LANGUAGE: {orientation: 'UNDIRECTED'},
    ACTED_IN: {orientation: 'UNDIRECTED'}
  }
);

I am trying to use Node Similarity to come up with possible movie recommendations within a Genre or similar to more movies. I am a little lost when applying gds.alpha.nodeSimilarity.filtered.stream or the other APIs. I would love some clarity on how to frame my Cypher query, or on where I might be going wrong.

florentin_dorre · March 20, 2024, 8:42am

Hey @abhishekshankar88,
For movie recommendations, you want to find (:Movie)-[:SIMILAR_TO]-(:Movie).

With node similarity, you will find the movies that share the most neighbor nodes.
I would try CALL gds.nodeSimilarity.filtered.stream('movies', {sourceNodeFilter: 'Movie', targetNodeFilter: 'Movie'}).
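
Written out in full, the call might look like the sketch below. It assumes the 'movies' projection from the question and a `title` property on Movie nodes; the `topK` value is an arbitrary illustration. On older GDS releases the procedure may still sit in the alpha tier as gds.alpha.nodeSimilarity.filtered.stream.

```cypher
// Sketch: compare Movie nodes only with other Movie nodes.
// Assumes the in-memory graph 'movies' already exists.
CALL gds.nodeSimilarity.filtered.stream('movies', {
  sourceNodeFilter: 'Movie',   // only movies act as the query side
  targetNodeFilter: 'Movie',   // only movies may be returned as matches
  topK: 5                      // illustrative value: keep 5 matches per movie
})
YIELD node1, node2, similarity
// Map the internal node IDs back to nodes to read their titles
RETURN gds.util.asNode(node1).title AS movie,
       gds.util.asNode(node2).title AS similarMovie,
       similarity
ORDER BY movie, similarity DESC;
```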

An alternative approach would be to first create node embeddings, for example with FastRP (Fast Random Projection - Neo4j Graph Data Science).
This can be combined with filtered KNN to find similar Movies. I would assume this gets you better recommendations, as you look at more than the immediate neighbors.
You can find this workflow at End-to-end workflow - Neo4j Graph Data Science.

Hope this gives a better starting point :)

abhishekshankar88 · March 20, 2024, 11:52pm

Hello @florentin_dorre, thanks for the input. I tried FastRP and got similar movies, but I also wanted to know if it's possible to add multiple node filters. For example, what if I want to find the most similar movies inside a particular genre? I tried adding a list to the NodeFilter options but was met with errors. Any advice on this? I have added my code below.

CALL gds.graph.project('movies', 
              ['Movie','User','Genre','SpokenLanguage'], 
              {
                RATING:{orientation: 'UNDIRECTED',properties: 'rating'},
                HAS_GENRE:{orientation: 'UNDIRECTED'},
                HAS_SPOKEN_LANGUAGE:{orientation:'UNDIRECTED'}
              }
            );

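CALL gds.fastRP.mutate(
  'movies',
  {
    // embeddingDimension was given twice (100 and 4); only one entry is kept
    embeddingDimension: 100,
    randomSeed: 42,
    // mutateProperty takes a single property name, not a list
    mutateProperty: 'similarities',
    iterationWeights: [1, 1, 1, 1]
  }
)
YIELD nodePropertiesWritten;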

MATCH (m1:Movie)-[:HAS_GENRE]->(:Genre {name: "Fantasy"})<-[:HAS_GENRE]-(m2:Movie)-[:SIMILAR]-(m3:Movie)
WHERE m1 <> m3 AND m1 <> m2 AND m2 <> m3 // Ensure distinct movies
RETURN DISTINCT m3.title AS SimilarMovie

This was one approach I was trying. Do let me know where I might be going wrong.

florentin_dorre · March 25, 2024, 8:52am

In your code example, you must have omitted the call to gds.knn.write?
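
For reference, a minimal sketch of that missing step, assuming the FastRP embedding was stored under the node property `similarities` as in the call above. The `topK` value and the `score` property name are illustrative assumptions, not taken from the original code.

```cypher
// Sketch: turn the embeddings into SIMILAR relationships between movies.
CALL gds.knn.write('movies', {
  nodeLabels: ['Movie'],             // only compare Movie nodes
  nodeProperties: ['similarities'],  // the FastRP embedding property
  topK: 10,                          // illustrative: 10 neighbors per movie
  writeRelationshipType: 'SIMILAR',  // relationship type used in the MATCH query
  writeProperty: 'score'             // hypothetical name for the similarity score
})
YIELD nodesCompared, relationshipsWritten
RETURN nodesCompared, relationshipsWritten;
```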

I don't understand why m1, m2 and m3 should be distinct.
Reading m2 as the query movie and m3 as the recommendation, shouldn't you make sure that both m2 and m3 have the requested genre?
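
One way to express that constraint is sketched below. It assumes SIMILAR relationships already exist between movies; `$title` is a hypothetical query parameter holding the title of the query movie.

```cypher
// Sketch: recommend movies similar to m2 that share the requested genre.
MATCH (g:Genre {name: 'Fantasy'})<-[:HAS_GENRE]-(m2:Movie {title: $title})
// m3 must be linked to m2 by SIMILAR and belong to the same genre node g
MATCH (m2)-[:SIMILAR]-(m3:Movie)-[:HAS_GENRE]->(g)
RETURN DISTINCT m3.title AS SimilarMovie;
```

The filtered procedures also accept a list of node IDs for sourceNodeFilter and targetNodeFilter, so the genre restriction could alternatively be applied before the similarity is computed.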

You could also dictionary-encode the genres and use them as feature properties for the FastRP embedding.
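
As a rough sketch of the encoding step, the query below stores one 0/1 entry per genre on each movie. This multi-hot layout is one possible reading of "dictionary encode", and `genreVector` is a hypothetical property name. The property would then still have to be included in the graph projection and passed to FastRP through its `featureProperties` setting.

```cypher
// Sketch: write a fixed-length genre vector onto every Movie node.
MATCH (g:Genre)
WITH g ORDER BY g.name
WITH collect(g.name) AS allGenres          // stable genre order defines the vector layout
MATCH (m:Movie)
OPTIONAL MATCH (m)-[:HAS_GENRE]->(mg:Genre)
WITH m, allGenres, collect(mg.name) AS movieGenres
// 1.0 where the movie has the genre, 0.0 otherwise
SET m.genreVector = [name IN allGenres |
      CASE WHEN name IN movieGenres THEN 1.0 ELSE 0.0 END];
```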

abhishekshankar88 · March 27, 2024, 1:02am

Hey @florentin_dorre, thanks for your reply. I was able to experiment with gds.knn.write and create a similarity relationship among the movies. Yes, I can see that I made an error in the query, and I have corrected it (I was just experimenting with the query, so it was not of much importance). Thanks for your help.
