cancel
Showing results for 
Search instead for 
Did you mean: 

Reading Relationships As Line From CSV

mavisnconsultin
Node Link

Please, I am trying to import a csv file into Neo4j desktop, which has a column that already defines the relationship between the two nodes in the file. Below is a simple example of what I am talking about:

Name       Action         Movie
Tom        DIRECTED       Mega Day
Harry      PRODUCED       Falling Sun

Now, I want to read in the csv file as line with cypher. Is it possible to read the relationship column as line as well, such that the nodes are automatically connected with each other via the relationship between them per line?

1 ACCEPTED SOLUTION

glilienfield
Ninja
Ninja

The "load csv" method loads each line and makes each column available, either with the column names if headers are included, or by index if not. Nothing is automatically created/connected by reading each line. You need to specify what you want to create and connect using cypher.

In your case, you are conditionally creating one of two relationships based on the Action column value. Cypher does not really have conditional workflow to support this directly, nor can you set a node label or relationship type with a variable.

One approach to implementing conditional logic is to use the family of 'do' methods in the apoc library.

Another way is to use 'call' subqueries that execute based on conditions. The following cypher uses this approach as a potential solution to your requirement:

load csv with headers from "file:///Book1.csv" as line
merge(a:Actor{name: line.Name})
merge(m:Movie{title: line.Movie})
with a, m, line
call {
    with a, m, line
    with a, m, line
    where line.Action = "DIRECTED"
    merge(a)-[:DIRECTED]->(m)
}
call {
    with a, m, line
    with a, m, line
    where line.Action = "PRODUCED"
    merge(a)-[:PRODUCED]->(m)
}

In the above query, each row of the csv file is read and assigned to the variable 'line.' Then the actor and movie nodes are merged, which will match to them if they already exist or create them if they do not exist. Each is bound to a variable, 'a' for actor and 'm' for movie.

A 'call' subquery is used to conditional execute the creation of the relationship between the actor and movie nodes based on the line.Action value. Each call subquery will execute per 'line' and only the one with 'true' condition will execute. The use of two 'with' statements is a work around, as the parser complains that the 'with' in a 'call' clause has to be 'simple', i.e. can't have a 'where' clause or operations. Including the second 'with' clause gets around this, by allowing the first 'with' clause to be 'simple.'

Result:

Does this help?

View solution in original post

3 REPLIES 3

glilienfield
Ninja
Ninja

The "load csv" method loads each line and makes each column available, either with the column names if headers are included, or by index if not. Nothing is automatically created/connected by reading each line. You need to specify what you want to create and connect using cypher.

In your case, you are conditionally creating one of two relationships based on the Action column value. Cypher does not really have conditional workflow to support this directly, nor can you set a node label or relationship type with a variable.

One approach to implementing conditional logic is to use the family of 'do' methods in the apoc library.

Another way is to use 'call' subqueries that execute based on conditions. The following cypher uses this approach as a potential solution to your requirement:

load csv with headers from "file:///Book1.csv" as line
merge(a:Actor{name: line.Name})
merge(m:Movie{title: line.Movie})
with a, m, line
call {
    with a, m, line
    with a, m, line
    where line.Action = "DIRECTED"
    merge(a)-[:DIRECTED]->(m)
}
call {
    with a, m, line
    with a, m, line
    where line.Action = "PRODUCED"
    merge(a)-[:PRODUCED]->(m)
}

In the above query, each row of the csv file is read and assigned to the variable 'line.' Then the actor and movie nodes are merged, which will match to them if they already exist or create them if they do not exist. Each is bound to a variable, 'a' for actor and 'm' for movie.

A 'call' subquery is used to conditional execute the creation of the relationship between the actor and movie nodes based on the line.Action value. Each call subquery will execute per 'line' and only the one with 'true' condition will execute. The use of two 'with' statements is a work around, as the parser complains that the 'with' in a 'call' clause has to be 'simple', i.e. can't have a 'where' clause or operations. Including the second 'with' clause gets around this, by allowing the first 'with' clause to be 'simple.'

Result:

Does this help?

Thanks so much Gary for this comprehensive answer to my question. This is exactly what I am looking for. With this, I can more easily create graphs from csv file with pre-defined relationships. Thanks a lot man!

Hi Gary. Thanks again for your suggestion. While I still think it is the solution, I tried to implement it with my actual dataset, but got an error. Here is my actual cypher statement:

load csv with headers from "file:///pixar.csv" as line
merge (p:Person{name:line.name})
merge (m:Movie{title:line.film})
with p, m, line
call {
    with p, m, line
    with p, m, line
    where line.role_type = "DIRECTED"
    merge (p)-[:DIRECTED]->(m)    
}
call {
    with p, m, line
    with p, m, line
    where line.role_type = "PRODUCED"
    merge (p)-[:PRODUCED]->(m)    
}
call {
    with p, m, line
    with p, m, line
    where line.role_type = "WAS_SCREEN_WRITER_FOR"
    merge (p)-[:WAS_SCREEN_WRITER_FOR]->(m)    
}
call {
    with p, m, line
    with p, m, line
    where line.role_type = "WAS_STORY_WRITER_FOR"
    merge (p)-[:WAS_STORY_WRITER_FOR]->(m)    
}
call {
    with p, m, line
    with p, m, line
    where line.role_type = "PERFORMED_SOUNDTRACK_FOR"
    merge (p)-[:PERFORMED_SOUNDTRACK_FOR]->(m)    
}

Running that gave me the error message below:

Neo.ClientError.Statement.SyntaxError

Query cannot conclude with CALL (must be RETURN or an update clause) (line 29, column 1 (offset: 703))
"call {"

Please, what do you think?

 
 
Nodes 2022
Nodes
NODES 2022, Neo4j Online Education Summit

On November 16 and 17 for 24 hours across all timezones, you’ll learn about best practices for beginners and experts alike.