Weekly Challenge #1: Movies of "Life"

TrevorS · July 13, 2022, 2:12pm

Let's see what we can learn about movies for the first weekly challenge. Specifically, movies that contain "Life" in the title. Is there a genre or director you think came up with the best movie on this list? Let us know!

Here is what we want you to do:

Create a recommendations sandbox at sandbox.neo4j.com.
For this sandbox, write a query that returns one row for every movie in the graph that contains "Life" in its title.
Use a parameter named testString with a value of "Life" for the predicate.

The rows will be returned in movie year order.
Each row will contain:

Movie (title of the movie)
Year (the year the movie was released)
Genres (list of genre names for the movie)
Directors (list of director names for the movie)

Please post your solution code in your response and a screenshot of the solution results. We will be giving away a T-shirt to one of the submissions between 7-29-2022 and 8-5-2022, submit your solution to be included in the drawing!

elaine_rosenber · August 8, 2022, 3:52pm

@Cobra it is a great query. The solution is one where the elapsed ms goes down to as low as 26 ms

Cobra · August 8, 2022, 6:55am

Hello @TrevorS

Here is my query:

:param testString => 'Life';

MATCH (m:Movie) 
WHERE m.title CONTAINS $testString 
RETURN m.title AS title, m.year AS year, [(m)-[:IN_GENRE]->(g) | g.name] AS genres, [(m)<-[:DIRECTED]-(d) | d.name] AS directors 
ORDER BY year;

Here is the screenshot:

Regards,
Cobra

Cobra · August 8, 2022, 4:46pm

I was able to reduce to 18ms with this query:

MATCH (m:Movie) 
WHERE m.title CONTAINS $testString 
RETURN m.title AS title, m.year AS year, [(m)-[:IN_GENRE]->(g) | g.name] AS genres, [(m)<-[:DIRECTED]-(d:Director) | d.name] AS directors 
ORDER BY year;

Do you have db hits info?

Regards,
Cobra

Cobra · August 8, 2022, 4:10pm

I see, do you have any hints?

elaine_rosenber · August 8, 2022, 5:30pm

Your query returns 81 rows because there is a row with no directors. I can get your query down to 23 ms. Total db hits of 2518.

The solution I have has 17ms with total db hits of 2499.
Did you add any indexes to the graph? You should not.

Cobra · August 9, 2022, 9:16am

Here is my new query:

MATCH (m:Movie) 
WHERE m.title CONTAINS $testString 
WITH m 
MATCH (m)<-[:DIRECTED]-(d) 
WITH m, collect(d.name) AS directors 
MATCH (m)-[:IN_GENRE]->(g) 
RETURN m.title AS title, m.year AS year, directors, collect(g.name) AS genres
ORDER BY year

Regards,
Cobra

elaine_rosenber · August 9, 2022, 12:10pm

Great job @Cobra !

Your query performed as well as the solution that we had which is:

PROFILE
MATCH (movie:Movie)-[:IN_GENRE]->(g)
WHERE movie.title CONTAINS $testString
WITH movie, collect(g.name) AS genres
MATCH (movie)<-[:DIRECTED]-(director)
WITH movie, genres, collect(director.name) AS directors
RETURN movie.title AS Movie, movie.year AS Year, genres AS Genres, directors AS Directors ORDER BY movie.year

Cobra · August 9, 2022, 8:46am

The query must return movies who have at least 1 director and 1 genre?

What is the expected number of rows for the query: 80, 81 or other?

No, I didn't add any indexes, I just relaunched multiple times the query

JoshKos · November 30, 2022, 4:43am

Hi!

I just gave this query a go and while I got an answer quickly I was struggling to understand why I was getting higher DB hits than both of you (not by much though). I widdled it down to this specific difference in the example below

PROFILE

MATCH (m)-[gi:IN_GENRE]->**(**g:Genre)

WHERE m.title CONTAINS $testString

WITH m, collect(g.name) AS genres

RETURN m.title AS Title

Using just (g) vs (g:Genre) results in fewer DB hits (1325 vs 1482). I would have expected the answers to be the same as there aren't any [:IN_GENRE] relationships with anything other than a genre right? I tried to validate this assumption by using the following where I replaced (g:Genre) with (g), both returned the same 20340 count but using (g) results in 228475 db hits where (g:Genre) results in 20400. I'm now left even more confused because this would imply g:genre is the more efficient method, so why is it worse in the weekly challenge query?

Appreciate any insight!
Thanks,
Josh

MATCH (m)-[:IN_GENRE]->(g:Genre)

RETURN count(g)

Topic		Replies	Views
Weekly Challenge #3 Lots of directors Archive migrated	5	328	August 17, 2022
Help on query Introduce-Yourself	2	263	May 23, 2022
Limit results per year in movies example Neo4j Graph Platform	3	21	October 19, 2024
Weekly Challenge #4 Movies with limited actors Archive migrated	3	303	November 30, 2022
Neo4J Movies Database: Which movie has the most number of actors in it? General migrated	4	174	November 30, 2022

July Summer Fun!

Weekly Challenge #1: Movies of "Life"

Related topics