Counting actors' total number of roles

Hi all,

I'm studying Neo4j and faced with a problem I can't seem to solve by myself. I'm working with the sample movie dataset and having troubles solving the following problem: Which five actors have acted in the largest number of movies and what is the total number of roles they have acted in?

So far I've come up with the following query, which finds the names of the actors, the number of movies and a list that contains lists of all their roles. How should I count all items in the list?

    count(m) as num_movies,
    collect(rel.roles) as roles
return as actor, num_movies, roles
order by
    num_movies desc

The first two result rows of this query look like this:

"actor"         │"num_movies"│"roles"                                                               │
│"Tom Hanks"     │12          │[["Joe Fox"],["Sam Baldwin"],["Joe Banks"],["Mr. White"],["Zachry","Dr│
│                │            │. Henry Goose","Isaac Sachs","Dermot Hoggins"],["Dr. Robert Langdon"],│
│                │            │["Paul Edgecomb"],["Jim Lovell"],["Chuck Noland"],["Rep. Charlie Wilso│
│                │            │n"],["Hero Boy","Father","Conductor","Hobo","Scrooge","Santa Claus"],[│
│                │            │"Jimmy Dugan"]]                                                       │
│"Keanu Reeves"  │7           │[["Neo"],["Neo"],["Neo"],["Kevin Lomax"],["Shane Falco"],["Johnny Mnem│
│                │            │onic"],["Julian Mercer"]]                                             │

Thanks in advance.



Hi @arttu.rasanen !

Welcome to the jungle :smile:

You can try,

RETURN as actor, count(distinct m) as num_movies, count(distinct rel.roles) as num_roles
ORDER BY num_movies DESC LIMIT 5


PS: I added distinct to the movies as well considerin' an acotr tht may have 2 differents roles on the same movie. Like that with Eddie Murphy tho.