I am new to Cypher and trying to figure out how to write some kinds of queries that are relatively straightforward in SQL. One that comes to mind is computing Jaccard similarity between movies based on their shared actors. I know there is a dedicated procedure for that, but say I wanted to do it natively.
In SQL, I could just write:
WITH Edges AS (
  -- every pair of movies that share an actor, one row per shared actor
  SELECT m1.movie AS movie1, m2.movie AS movie2, m1.actor AS actor
  FROM movie_actors AS m1
  JOIN movie_actors AS m2 ON m1.actor = m2.actor
),
Intersection AS (
  -- number of shared actors per movie pair
  SELECT movie1, movie2, COUNT(DISTINCT actor) AS num
  FROM Edges
  GROUP BY movie1, movie2
),
Counts AS (
  -- number of actors per movie
  SELECT movie, COUNT(DISTINCT actor) AS num
  FROM movie_actors
  GROUP BY movie
)
SELECT I.movie1, I.movie2,
       1.0 * I.num / (C1.num + C2.num - I.num) AS similarity  -- 1.0 avoids integer division
FROM Intersection AS I
JOIN Counts AS C1 ON I.movie1 = C1.movie
JOIN Counts AS C2 ON I.movie2 = C2.movie;
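To make the target concrete, here is a minimal Python sketch of the result I am after. The sample rows and the function name jaccard_by_movie_pair are made up for illustration. Unlike the SQL join, it lists each unordered pair of movies once and does not pair a movie with itself.

```python
from itertools import combinations

# Made-up (movie, actor) rows standing in for the movie_actors table.
movie_actors = [
    ("Alpha", "Ann"), ("Alpha", "Bob"), ("Alpha", "Cy"),
    ("Beta", "Bob"), ("Beta", "Cy"), ("Beta", "Dee"),
    ("Gamma", "Eve"),
]


def jaccard_by_movie_pair(rows):
    """Return {(movie1, movie2): similarity} for movies sharing at least one actor."""
    # Intermediate result 1: the set of actors per movie (the "Counts" step).
    casts = {}
    for movie, actor in rows:
        casts.setdefault(movie, set()).add(actor)

    result = {}
    for m1, m2 in combinations(sorted(casts), 2):
        # Intermediate result 2: shared actors per pair (the "Intersection" step).
        shared = len(casts[m1] & casts[m2])
        if shared == 0:
            continue  # an inner join would not produce this pair either
        # |A union B| = |A| + |B| - |A intersect B|
        union = len(casts[m1]) + len(casts[m2]) - shared
        result[(m1, m2)] = shared / union
    return result


similarities = jaccard_by_movie_pair(movie_actors)
print(similarities)
```

With these sample rows, Alpha and Beta share 2 of 4 distinct actors, so their similarity is 0.5, and Gamma shares no actors with either so it does not appear.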
How would you do this kind of chaining with WITH clauses in Cypher? I would need to store and reuse intermediate results (the per-pair intersection and the per-movie counts), and I am not sure how to do that.