Assume the following data for the discussion. It consists of two movies nodes, each with a list of three languages.
create(n:Movie{languages:['english', 'french', 'arabic']})
create(n:Movie{languages:['english', 'german', 'spanish']})
- MATCH (m:Movie)
The match clause searches for all nodes with label "Movie". This will produce a result row for each Movie node. Sorry, I used 'n' in my queries instead of 'm' that is used in the question's query.
- UNWIND m.languages AS language
Property m.language is a list of languages. The unwind converts this list of languages into rows, one element of the language list per row. This is done for each Movie node in the result set. The other elements on each row are repeated for each language element. The result is a follows. Notice how the two result rows from above got converted into three rows each. The node 'n' corresponding to the language is repeated for each new row. I outputted the node's 'id', so you can see which rows are from the same node.
- WITH language, collect(m) AS movies
The 'WITH' clause is used to pass data between stages of a multi-stage query. In this case, we also have an aggregate function (collect) in the with clause, so the data is getting grouped by the properties outside the collect method's argument. In this case, the data is getting grouped by 'language', so the result will be a row for each different value of language and the collect will gather all the movie nodes that have that value of language. As you can see, the 'movies' column contains a list of movies. The list are the movie nodes that contained that language. The 'English' language has two movies nodes in its list, since each movie had 'English' as a language.
- MERGE (l:Language {name:language})
The 'language' and 'movies' values are now available to the 'match' statement since they were passed in the 'with' clause. This merge is creating a new 'Language' node, with the name property each to the value of language. In our example, this will create five language nodes, one for each row of the above result.
- WITH l, movies
The 'with' clause is passing the newly created Language node and the corresponding list of movies for this language. A 'with' clause is required between certain sets of clauses, such as a 'merge' followed by a 'match'. See how each row is one new 'language' node and has that language's corresponding list of movies that contain that language.
- UNWIND movies AS m
The unwind is taking the list of movie nodes on each row and unwinding them into rows. Each movie node from the list will also have its corresponding language node as part of the row. Only the English language node has multiple movies, so unwind its movies resulted in the first two rows. The other language nodes had only one element their movies lists, so only one row resulted from each language node.
- WITH l,m
The 'with' clause is passing the language node and movie node through to the next stage of the query. This will be the 'l' and 'm' columns from the above result set. This is now available for further processing.
- MERGE (m)-[:IN_LANGUAGE]->(l);
The 'merge' is creating a relationship between the 'movie' node and 'language' node for each row passed by the 'with'. Since the line is terminated with a ';', this cypher is complete.