Reading Relationships As Line From CSV

Please, I am trying to import a csv file into Neo4j desktop, which has a column that already defines the relationship between the two nodes in the file. Below is a simple example of what I am talking about:

Name       Action         Movie
Tom        DIRECTED       Mega Day
Harry      PRODUCED       Falling Sun

Now, I want to read in the csv file as line with cypher. Is it possible to read the relationship column as line as well, such that the nodes are automatically connected with each other via the relationship between them per line?

The "load csv" method loads each line and makes each column available, either with the column names if headers are included, or by index if not. Nothing is automatically created/connected by reading each line. You need to specify what you want to create and connect using cypher.

In your case, you are conditionally creating one of two relationships based on the Action column value. Cypher does not really have conditional workflow to support this directly, nor can you set a node label or relationship type with a variable.

One approach to implementing conditional logic is to use the family of 'do' methods in the apoc library.

Another way is to use 'call' subqueries that execute based on conditions. The following cypher uses this approach as a potential solution to your requirement:

load csv with headers from "file:///Book1.csv" as line
merge(a:Actor{name: line.Name})
merge(m:Movie{title: line.Movie})
with a, m, line
call {
    with a, m, line
    with a, m, line
    where line.Action = "DIRECTED"
    merge(a)-[:DIRECTED]->(m)
}
call {
    with a, m, line
    with a, m, line
    where line.Action = "PRODUCED"
    merge(a)-[:PRODUCED]->(m)
}

In the above query, each row of the csv file is read and assigned to the variable 'line.' Then the actor and movie nodes are merged, which will match to them if they already exist or create them if they do not exist. Each is bound to a variable, 'a' for actor and 'm' for movie.

A 'call' subquery is used to conditional execute the creation of the relationship between the actor and movie nodes based on the line.Action value. Each call subquery will execute per 'line' and only the one with 'true' condition will execute. The use of two 'with' statements is a work around, as the parser complains that the 'with' in a 'call' clause has to be 'simple', i.e. can't have a 'where' clause or operations. Including the second 'with' clause gets around this, by allowing the first 'with' clause to be 'simple.'

Result:

Does this help?

1 Like

Thanks so much Gary for this comprehensive answer to my question. This is exactly what I am looking for. With this, I can more easily create graphs from csv file with pre-defined relationships. Thanks a lot man!

1 Like