I am reading data from a file and want to create nodes and relations between them. There is a chance that either of the two nodes of the relation are not existing by now.
I solved this by:
LOAD CSV WITH HEADERS FROM "foobar.csv" AS ROW
MERGE (a:TypeA { name: row.name, revision: toInt(row.revision) })
MERGE (b:TypeB { name: row.name }) ;
LOAD CSV WITH HEADERS FROM "foobar.csv" AS ROW
MATCH (a:TypeA { name: row.name, revision: toInt(row.revision) })
MATCH (b:TypeB { name: row.name })
MERGE (a) -[r:is_instance]-> (b) ;
This loads the csv two times, which is not optimal.
I have the following cases:
(a), (b), (r) do not exists: In this case I could have simply used one single MERGE command
(b) exists, (a), (r) do not exists: In this case I would need a MATCH against (b) followed by a merge of the relation
(a), (b), (r) exists: a simple merge or a match against all three all would be fine.
Using the second variant that MATCH against (b) fails to create anything in the first case, because there is no MATCH at all.
Is there any other way to avoid using the double loop over csv?
I think doing it all in one go will cause an EAGER (run it through an EXPLAIN). If you have a large CSV this could lead to problems. Might be better to iterate over the file twice. Just something to keep an eye on, if it works loading all at once then no worries.