I am very new to Cypher and only know the basic queries. Recently I came across the following CQL while reading an article on recommendation systems with the Northwind database. Please refer to Query 23 in this article: Article
Can someone explain the query step by step so that it is easy for me to understand? Any help is greatly appreciated. Thank you.
WITH 1 as neighbours
MATCH (me:Customer)-[:SIMILARITY]->(c:Customer)-[r:RATED]->(p:Product)
WHERE me.customerID = 'ANTON' and NOT ( (me)-[:RATED|PRODUCT|ORDER*1..2]->(p:Product) )
WITH p, COLLECT(r.rating)[0..neighbours] as ratings, collect(c.companyName)[0..neighbours] as customers
WITH p, customers, REDUCE(s=0,i in ratings | s+i) / size(ratings) as recommendation
ORDER BY recommendation DESC
RETURN p.productName, customers, recommendation LIMIT 10
WITH 1 as neighbours
Binds the constant 1 to the variable neighbours, producing a single row that the rest of the query builds on.
MATCH (me:Customer)-[:SIMILARITY]->(c:Customer)-[r:RATED]->(p:Product)
Match every Customer (me) that has a SIMILARITY relationship to another Customer (c).
That other customer must also have RATED a Product (p).
WHERE me.customerID = 'ANTON' and NOT ( (me)-[:RATED|PRODUCT|ORDER*1..2]->(p:Product) )
Filter for the above MATCH. The first customer, me, is ANTON, and there must be no path of 1 to 2 hops from me to that product over RATED, PRODUCT, or ORDER relationships. In other words, me has not already rated or ordered the product.
WITH p, COLLECT(r.rating)[0..neighbours] as ratings, collect(c.companyName)[0..neighbours] as customers
Take the result of that MATCH clause and group it by the p variable. For each product, collect the ratings and the customers' company names into lists, keeping only the first neighbours entries of each list.
WITH p, customers, REDUCE(s=0,i in ratings | s+i) / size(ratings) as recommendation
Execute the REDUCE() function on the ratings: sum them, then divide by the number of ratings to get the average.
Then finally order the result set by that value, highest first, and return the first 10 rows.
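To make the slice and REDUCE steps concrete, here is a rough plain-Python analogue (not Cypher). The rating values are made up for illustration. Python's list slice has the same exclusive upper bound as Cypher's [0..neighbours]. One difference to keep in mind: Python's / is true division, while Cypher's / on two integers truncates, so results could differ if the ratings are stored as integers.

```python
from functools import reduce

neighbours = 1

# Hypothetical ratings collected for one product (made-up values).
collected_ratings = [4.0, 2.0, 5.0]

# Cypher: COLLECT(r.rating)[0..neighbours]
# The upper bound is exclusive, so with neighbours = 1 only the first rating is kept.
ratings = collected_ratings[0:neighbours]

# Cypher: REDUCE(s=0, i in ratings | s+i) / size(ratings)
# Sum the kept ratings, then divide by how many were kept: an average.
recommendation = reduce(lambda s, i: s + i, ratings, 0) / len(ratings)

print(ratings, recommendation)
```

With neighbours set to 1 the "average" is just the single rating that survived the slice; raising neighbours would average over more similar customers.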
WITH 1 as neighbours <= [Assign value 1 to a variable neighbours]
MATCH (me:Customer)-[:SIMILARITY]->(c:Customer)-[r:RATED]->(p:Product) <= [A customer, namely "me", who has a SIMILARITY relationship to other customers who rated a product]
WHERE me.customerID = 'ANTON' <= [Qualify me with a customer ID, 'ANTON']
AND NOT ((me)-[:RATED|PURCHASED|PRODUCT*1..2]->(p:Product) ) <= [me is NOT connected to that Product by a path of 1 or 2 hops, e.g. via RATED (1 hop) or via PURCHASED then PRODUCT (2 hops). Note that I could not find the ORDER relationship in the schema.]
WITH p, <= Using WITH clause to store the intermediate result which has p,
collect(r.rating)[0..neighbours] as ratings, <= a list of ratings (at most 1 element here, because neighbours is 1 and the slice's upper bound is exclusive)
collect(c.companyName)[0..neighbours] as customers <= a list of customer company names (likewise at most 1 element)
WITH p, customers, <= Pass on to another intermediate result
REDUCE(s=0,i in ratings | s+i) / size(ratings) as recommendation <= The values in the ratings list are summed and then divided by the size of the list (i.e. the average rating from the list)
ORDER BY recommendation DESC <= Get the highest recommendations
RETURN p.productName, customers, recommendation LIMIT 10 <= Return the 10 highest recommendations
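For readers who find it easier to follow imperative code, the whole query can be sketched as a plain-Python analogue over toy data. This is only an illustration under stated assumptions: the customer names, product names, ratings, and the already_known set are all hypothetical, and the real query works on graph patterns rather than tuples.

```python
neighbours = 1

# Hypothetical rows reached from 'me': (similar customer, product, rating).
rows = [
    ("Customer A", "Chai", 3.0),
    ("Customer B", "Chai", 5.0),
    ("Customer A", "Tofu", 4.0),
    ("Customer B", "Ikura", 2.0),
]
# Hypothetical products 'me' is already connected to;
# the WHERE ... AND NOT (...) pattern filters these out.
already_known = {"Ikura"}

# WITH p, collect(...): p acts as the grouping key for both lists.
grouped = {}
for customer, product, rating in rows:
    if product in already_known:
        continue
    ratings, customers = grouped.setdefault(product, ([], []))
    ratings.append(rating)
    customers.append(customer)

# Slice each list to `neighbours` entries, then average the kept ratings.
results = []
for product, (ratings, customers) in grouped.items():
    kept = ratings[0:neighbours]
    results.append((product, customers[0:neighbours], sum(kept) / len(kept)))

# ORDER BY recommendation DESC ... LIMIT 10
results.sort(key=lambda row: row[2], reverse=True)
top = results[:10]
print(top)
```

Here the customers list simply records whose rating produced each score. In Cypher the order of collected values is not guaranteed unless the rows are sorted first, so which similar customer survives the slice may vary from run to run.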
On review, that article seems to have a lot of errors in the Cypher snippets. Variables are case-sensitive, for example, and there are several places (in relationship creation) where the case of some variables doesn't match, so either relationships won't be created or, when MERGE is used, relationships will be created to new blank nodes.
We'll see if we can get some attention on fixing these.
Can you please explain the significance of c.companyName, as in why we are showing it in the final RETURN statement? How does it help with the recommendation? Thanks.
Here the similarity property, which stores the cosine similarity value, is not being used. All the customers will have a SIMILARITY relationship to each other, so even if we leave out line 2 of the query we should get the same result, right? Can you please explain? Thanks.