My problem is that whenever I import data from a large CSV and connect each row to a central node, the import makes many copies of the central node (see photo). I want the brown nodes connected to just one "Earth Justice" node. I realize I can merge duplicate nodes afterwards, but I would like to get it right as I load the data.
:auto Using periodic commit
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' as row
CREATE (l:labid) SET l+=row
CREATE (o:Origin {name:"Earth Justice"})
MERGE (l)<-[:Created]-(o)
MERGE can be a bit confusing to use, I agree; there are nuances that even experienced users run into again (and again). At a glance, I think the duplicates come from the explicit CREATE (o:Origin ...): it runs once per CSV row, so every row after the first adds another "Earth Justice" node. The MERGE statement itself is probably OK.
Try this:
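LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
// Look up the single existing origin node instead of creating one per row
MATCH (o:Origin {name:"Earth Justice"})
// Keep row in scope; "WITH o" alone would drop it and break SET l += row
WITH o, row
CREATE (l:labid) SET l += row
MERGE (l)<-[:Created]-(o)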
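Note that the MATCH above returns nothing if the origin node is not in the database yet, in which case no rows get loaded at all. A minimal sketch of one way around that, assuming you are happy to create the node once before the import:

```cypher
// Run once before the import; MERGE only creates the node if it is missing
MERGE (o:Origin {name:"Earth Justice"})
RETURN o
```

After that, the per-row MATCH should find exactly one node.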
To ensure that MERGE recognizes the node you are addressing, you should create a uniqueness constraint on one property of the node label before loading the data.
CREATE CONSTRAINT ON (orig:Origin) ASSERT orig.name IS UNIQUE
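The ON ... ASSERT form is the older syntax. On Neo4j 4.4 and 5.x the same constraint is written with FOR ... REQUIRE; a sketch, where the constraint name origin_name_unique is just a name picked for this example:

```cypher
// Neo4j 4.4+ / 5.x form; the constraint name origin_name_unique is made up here
CREATE CONSTRAINT origin_name_unique IF NOT EXISTS
FOR (o:Origin) REQUIRE o.name IS UNIQUE
```

Either form fails if duplicate "Earth Justice" nodes are already in the database, so remove or merge those first.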
Then you will be able to MERGE the "Origin"-labelled node without recreating it: the first occurrence of "Earth Justice" creates the node, and every later one matches it.
:auto Using periodic commit
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' as row
CREATE (l:labid) SET l+=row
MERGE (o:Origin {name:"Earth Justice"})
MERGE (l)<-[:Created]-(o)
Note that a MERGE like this normally takes the origin name from the file, which is not the case in your Cypher: the name "Earth Justice" is hard-coded and no column is referenced.
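One version caveat: USING PERIODIC COMMIT was removed in Neo4j 5. If you are on 4.4 or later, the batched load can be written with CALL { ... } IN TRANSACTIONS instead. A rough sketch of the same import, where the batch size of 1000 is an arbitrary choice:

```cypher
:auto LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
CALL {
  WITH row
  // Finds the single origin node, or creates it on the first row
  MERGE (o:Origin {name:"Earth Justice"})
  CREATE (l:labid)
  SET l += row
  MERGE (o)-[:Created]->(l)
} IN TRANSACTIONS OF 1000 ROWS
```

The :auto prefix is still needed in Neo4j Browser so that the query runs in an implicit transaction.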
If the origin node already exists, just match it:
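:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
// MATCH comes first: a MATCH directly after CREATE needs an intervening WITH (at least in Neo4j 4.x)
MATCH (o:Origin {name:"Earth Justice"})
CREATE (l:labid) SET l += row
MERGE (o)-[:Created]->(l)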
In some cases, for instance if you are delivering a new origin to which every node from the source has to be connected, but a node with that name already exists, you will have to:
create a temporary label, e.g. tempOrigin,
create a constraint on this label,
load the data,
drop the constraint (change "CREATE CONSTRAINT ON ..." into "DROP CONSTRAINT ON ..."),
relabel all nodes with label tempOrigin (there should be only one) to label Origin.
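The steps above could look roughly like this. It is only a sketch, assuming the Neo4j 4.x constraint syntax used elsewhere in this thread; the label tempOrigin comes from the list above, and each statement is meant to be run on its own.

```cypher
// Temporary label plus a uniqueness constraint on it
CREATE CONSTRAINT ON (t:tempOrigin) ASSERT t.name IS UNIQUE;

// Load the data against the temporary origin node
:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
MERGE (t:tempOrigin {name:"Earth Justice"})
CREATE (l:labid) SET l += row
MERGE (t)-[:Created]->(l);

// Drop the temporary constraint again
DROP CONSTRAINT ON (t:tempOrigin) ASSERT t.name IS UNIQUE;

// Relabel the temporary node (there should be only one) as Origin
MATCH (t:tempOrigin)
REMOVE t:tempOrigin
SET t:Origin;
```

If a uniqueness constraint on Origin.name is also in place, the last statement will be rejected because the name is already taken, so that constraint would have to be dropped first.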
:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
// Variant: match the origin first, then set the row properties last
MATCH (o:Origin {name:"Earth Justice"})
CREATE (l:labid)
MERGE (o)-[:Created]->(l)
SET l += row
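Or MERGE the origin as well, so that it is created if it does not exist yet. Keep the node MERGE separate from the relationship MERGE: merging the whole pattern (o:Origin {...})-[:Created]->(l) in one clause never matches for a freshly created l, so it would create a new Origin node on every row.
:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///EJ_Sample.csv' AS row
CREATE (l:labid)
// Finds the existing origin, or creates it on the first row
MERGE (o:Origin {name:"Earth Justice"})
MERGE (o)-[:Created]->(l)
SET l += row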
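To check the result after loading, a quick count along these lines should help. It is a sketch using only the labels and relationship type from this thread; the column names originNodes and linkedRows are made up for the example.

```cypher
// How many "Earth Justice" origin nodes exist, and how many labid nodes hang off them
MATCH (o:Origin {name:"Earth Justice"})
OPTIONAL MATCH (o)-[:Created]->(l:labid)
RETURN count(DISTINCT o) AS originNodes, count(l) AS linkedRows
```

If the import worked as intended, originNodes should be 1 and linkedRows should equal the number of data rows in the CSV.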