I am importing a CSV file from the import directory with Cypher, and I would like to create nodes whose labels come from the CSV file. It would look something like this:
LOAD CSV WITH HEADERS FROM "file:///B.csv" AS csv
CREATE (c:csv.Type {name:csv.Name})
return c
I know this is wrong, but I hope you can show me the right way to do it.
If you have APOC installed, this should be possible via:
load csv with headers from 'file:///B.csv' as row
call apoc.create.node([row.label],{name: row.name})
yield node return count(node);
For example, if B.csv has the following content:
label,id,name
P1,1,Dana
P2,2,Armen
then the result is
neo4j@neo4j> match (n:P1) return n;
+----------------------+
| n                    |
+----------------------+
| (:P1 {name: "Dana"}) |
+----------------------+
1 row available after 6 ms, consumed after another 2 ms
neo4j@dana> match (n:P2) return n;
+-----------------------+
| n                     |
+-----------------------+
| (:P2 {name: "Armen"}) |
+-----------------------+
Thanks. It seems APOC is more useful than the built-in functions.
For simple CREATE statements as part of LOAD CSV, this is acceptable. Part of the reason labels cannot be parameterized is that the Neo4j query planner would struggle to find the correct plan for a more complex query. If your query were, for example:
match (n:$param1), ( n2: $param2) where n.id=n2.id create (n)-[:FOLLOWS]->(n2);
then there is no way for the planner to know that, for the first row of the CSV, $param1 has the value :Person and $param2 has the value :Actor, so the most efficient plan is X, whereas for the second row $param1 is still :Person but $param2 is :SportsStar, so the most efficient plan is now Y. And what if there is an index on :Person(id) but not on :SportsStar(id)? For these reasons, Neo4j does not allow passing labels via parameters.
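If APOC is not available and the set of labels is small and known in advance, one possible workaround is to run one LOAD CSV pass per label, so that every label is a literal the planner can see. This is only a sketch: it assumes the B.csv shown earlier in this thread (columns label, id, name) and that the only labels are P1 and P2.

```cypher
// Pass 1: keep only the rows whose label column is 'P1'.
// The label in CREATE is a literal, so the plan is fixed up front.
LOAD CSV WITH HEADERS FROM 'file:///B.csv' AS row
WITH row WHERE row.label = 'P1'
CREATE (:P1 {name: row.name});

// Pass 2: same file, filtered for 'P2'.
LOAD CSV WITH HEADERS FROM 'file:///B.csv' AS row
WITH row WHERE row.label = 'P2'
CREATE (:P2 {name: row.name});
```

This reads the file once per label, so it only makes sense for a handful of labels. For many or unknown labels, the apoc.create.node approach is the more practical one.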
Thanks for the detailed explanation!