Exercise 4.11 in intro-neo4j-exercises

Hi all!

I am going through the exercises one gets with the command :play intro-neo4j-exercises in the neo4j browser. I have trouble understanding if one of the solutions given in the exercises will work in a special case.

The prompt for Exercise 4.11 is this:

Retrieve the movies and their actors where one of the actors also directed the movie, returning the actors names, the director’s name, and the movie title.

The solution given is this:

MATCH (a1:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(a2:Person)
WHERE exists( (a2)-[:DIRECTED]->(m) )
RETURN  a1.name as Actor, a2.name as `Actor/Director`, m.title as Movie

My question is: Consider the special case where there is a Movie node in the database that is connected to only one Person node in the database and that person both acted and directed the movie. Will this query retrieve this Movie node also? I feel like it wouldn't, since the MATCH pattern requires atleast two nodes with relationship type ACTED_IN to be connected to a Movie node. If my feeling is right, how do I write the query that works even in the special case aforementioned?

Thanks for your help in advance.

Hi Venkatesh.
Are you from telugu states? I'm from Andhra Pradesh.

MATCH (a1:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(a1)
OPTIONAL MATCH (a2:Person)-[:ACTED_IN]->(m)
RETURN  a1.name as Actor, a2.name as `Actor/Director`, m.title as Movie

Hope this helps you.

1 Like

Hi Chandra

Thanks for your reply. I first ran your query and got this:

This actually gave pretty much the same output as the query in the exercise solution except for the additional record where Actor = Actor/Director. This makes sense since "OPTIONAL MATCH could be considered the Cypher equivalent of the outer join in SQL". Another minor detail is that the aliasing in your query should be a1.name as `Actor/Director`, a2.name as Actor, m.title as Movie.

Then to check if your query addresses the special case, I did this:

CREATE (TheVenki:Movie {title:'The Venki', released:2019, tagline:'Welcome to the Real World'})
CREATE (Venki:Person {name:'Venki', born:1996})
CREATE
  (Venki)-[:ACTED_IN {roles:['Venki']}]->(TheVenki)
CREATE  
  (Venki)-[:DIRECTED]->(TheVenki)

.. and then ran your query. It indeed worked as shown below!

Thanks for your help, I didn't know about OPTIONAL MATCH before.

I am from Chennai BTW. So I am a Tamil. But I do have a lot of Telugu friends! :slight_smile:

Yes... My aliasing is wrong... My mistake

This is a good question, and the solution query in the exercise probably isn't the best, now that I look at it.

As you say, in a special case with a movie with a single actor who both acted in and directed the movie, the solution query will NOT find this pattern, for exactly the reason you specified, the MATCH pattern requires at least two actors in the movie.

Another problem is that for movies with multiple actors who-are-also-the-directors, we're going to get duplicate results, since we get the full cast of results per actor-who-is-also-the-director.

A better solution might be something like:

MATCH (m:Movie)<-[:DIRECTED]-(a2:Person)
WHERE (m)<-[:ACTED_IN]-(a2)
WITH m, collect(a2.name) as directors
OPTIONAL MATCH (m)<-[:ACTED_IN]-(a1)
WHERE NOT a1 IN directors
RETURN  a1.name as Actor, directors as `Actor/Director(s)`, m.title as Movie
1 Like

Hi Andrew

Thanks for your reply. I ran your query after adding a Movie node of my own to the DB with just one Person node who is both the actor and director to replicate the sepcial case I mentioned in my OP (See my reply to @jaini.kiran 's answer for the exact Cypher statements I used to do this). This is what I got (in the text tab of the result set):

This query does reduce the number of duplicates. But, we end up having nulls in the result set wherever the special case occurs. I am guessing this happens since the query has the clause: WHERE NOT a1 IN `Actor/Director(s)` .

Another interesting thing that happened was the result set by default was a graph visualization instead of a table. I was not expecting that to happen when I ran the query! I don't understand why exactly that happened. I suspect the collect had something to do with that...

When we return graph elements, then the browser can display them in a graph result view. I modified the query to return a collection of extracted names rather than the nodes themselves, that should ensure it only renders tabular data.

Yes, with an OPTIONAL MATCH there will be a null value here, since in your added movie node there are no other actors besides the actor/director, and if we want to display the row for the movie at all we have to use null for the missing value.

Now if you don't want to work with separate rows per movie, and only want a single row per movie, then we can collect up the actor results instead:

MATCH (m:Movie)<-[:DIRECTED]-(a2:Person)
WHERE (m)<-[:ACTED_IN]-(a2)
WITH m, collect(a2.name) as directors
WITH m, directors, [(m)<-[:ACTED_IN]-(a1) WHERE NOT a1 IN directors | a1.name] as actors
RETURN actors, directors as `Actor/Director(s)`, m.title as Movie

We're using a list comprehension in the second-to-last line, this is like an OPTIONAL MATCH and collect all in one, resulting in a list from projected elements of matching paths (in this case the names of actors).

Okay so I coded this alternate query, which uses simultaneous match queries

  1. To match the movies which have a director who have acted in the same movie
  2. Find all the actors in that movie
match (p:Person)-[:DIRECTED]->(m:Movie), (p1:Person)-[rel:ACTED_IN]->(m)
where exists((p)-[:ACTED_IN]->(m))
return p1.name, m.title, 
CASE exists((p1)-[:DIRECTED]->(m))
WHEN true THEN "Director/Actor"
ELSE "Actor"
END AS Roles

This differs from the expected result set, but it provides the required matches for both cases where there's only one actor in a movie directed by the same actor; and a movie directed by multiple directors who have also acted in the movies.

Please note that this query does not return directors who have directed the movie but not have acted in it in a movie, which has other directors who have acted in the movie, as the question asks to return only those people who have acted in the movie.

Any feedback as to how the query can be optimized is highly appreciated - I'm just a newbie getting his hands dirty.