Let's see what we can learn about movies for the first weekly challenge. Specifically, movies that contain "Life" in the title. Is there a genre or director you think came up with the best movie on this list? Let us know!
Here is what we want you to do:
Create a recommendations sandbox at sandbox.neo4j.com.
For this sandbox, write a query that returns one row for every movie in the graph that contains "Life" in its title.
Use a parameter named testString with a value of "Life" for the predicate.
The rows will be returned in movie year order.
Each row will contain:
Movie (title of the movie)
Year (the year the movie was released)
Genres (list of genre names for the movie)
Directors (list of director names for the movie)
Please post your solution code in your response and a screenshot of the solution results. We will be giving away a T-shirt to one of the submissions between 7-29-2022 and 8-5-2022, submit your solution to be included in the drawing!
:param testString => 'Life';
MATCH (m:Movie)
WHERE m.title CONTAINS $testString
RETURN m.title AS title, m.year AS year, [(m)-[:IN_GENRE]->(g) | g.name] AS genres, [(m)<-[:DIRECTED]-(d) | d.name] AS directors
ORDER BY year;
MATCH (m:Movie)
WHERE m.title CONTAINS $testString
RETURN m.title AS title, m.year AS year, [(m)-[:IN_GENRE]->(g) | g.name] AS genres, [(m)<-[:DIRECTED]-(d:Director) | d.name] AS directors
ORDER BY year;
MATCH (m:Movie)
WHERE m.title CONTAINS $testString
WITH m
MATCH (m)<-[:DIRECTED]-(d)
WITH m, collect(d.name) AS directors
MATCH (m)-[:IN_GENRE]->(g)
RETURN m.title AS title, m.year AS year, directors, collect(g.name) AS genres
ORDER BY year
Your query performed as well as the solution that we had which is:
PROFILE
MATCH (movie:Movie)-[:IN_GENRE]->(g)
WHERE movie.title CONTAINS $testString
WITH movie, collect(g.name) AS genres
MATCH (movie)<-[:DIRECTED]-(director)
WITH movie, genres, collect(director.name) AS directors
RETURN movie.title AS Movie, movie.year AS Year, genres AS Genres, directors AS Directors ORDER BY movie.year
I just gave this query a go and while I got an answer quickly I was struggling to understand why I was getting higher DB hits than both of you (not by much though). I widdled it down to this specific difference in the example below
PROFILE
MATCH (m)-[gi:IN_GENRE]->**(**g:Genre)
WHERE m.title CONTAINS $testString
WITH m, collect(g.name) AS genres
RETURN m.title AS Title
Using just (g) vs (g:Genre) results in fewer DB hits (1325 vs 1482). I would have expected the answers to be the same as there aren't any [:IN_GENRE] relationships with anything other than a genre right? I tried to validate this assumption by using the following where I replaced (g:Genre) with (g), both returned the same 20340 count but using (g) results in 228475 db hits where (g:Genre) results in 20400. I'm now left even more confused because this would imply g:genre is the more efficient method, so why is it worse in the weekly challenge query?