Load csv with fields with multiple values to create multiple relationships but with different properties

Hey all, thanks for thinking with me.

I m struggling with the following. I have a csv which looks a bit like this

Id, posture, stateXY
"1","postureA","3; 6"
"2","postureB","4; 5"

This should result for each (already existing) node of type Posture in 2 two relationships towards (already existing) nodes type State. For each, the first relationship should have property "X", the second property "Y".

Concretely:

CREATE (:Posture {name: "postureA"})-[psA_3]-> (:State {id:3)
ON CREATE SET psA_3.prop = "X"
CREATE (:Posture {name: "postureA"})-[psA_6]-> (:State {id:6)
ON CREATE SET psA_6.prop = "Y"

CREATE (:Posture {name: "postureB"})-[ps_4]-> (:State {id:4)
ON CREATE SET psB_4.prop = "X"
CREATE (:Posture {name: "postureB"})-[psA_5]-> (:State {id:5)
ON CREATE SET psB_5.prop = "Y"

Now doing this with LOAD CSV, I can create the relationships easily using split().
But how do I assign the properties X and Y in the right sequence?

LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS row
MATCH (p:Posture {name: row.posture)})
WITH p, row
UNWIND split(row.stateXY, '; ') AS stateXY
MERGE (sXY:State {Id: toInteger(stateXY)})
MERGE (p)-[ps:HAS_STATE]->(sXY);

I would use
ON CREATE SET ps.prop = "X"
ON CREATE SET ps.prop = "Y";
but this ofcourse won't work well since I can't reference to the right relationship. How do I reference the right one?

Thanks for you insights!

You should be able to do this using apoc.coll.zip. This will take your list and attach context to it as:

LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS row
[snip]
// The zip returns a result of the format [["X", 123], ["Y", 456]]
UNWIND apoc.coll.zip(["X", "Y"], split("row.stateXY, "; ")) AS stateXY

MERGE (sXY:State {Id: toInteger(stateXY[1])})
MERGE (p)-[ps:HAS_STATE]->(sXY)
    ON CREATE SET ps.prop = stateXY[0]

note: I'm assuming the format of the CSV is always X; Y

1 Like

I think I understand. Would this work. I assumed the nodes existed, as you stated, so the match statements.

LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS row
WITH row.posture as posture, split(row.stateXY, '; ') AS stateXY
MATCH (p:Posture {name: posture})
MATCH (x:State {Id: toInteger(stateXY[0])})
MATCH (y:State {Id: toInteger(stateXY[1])})
MERGE (p)-[r1:HAS_STATE]->(x) set r1.prop = 'X'
MERGE (p)-[r2:HAS_STATE]->(y) set r2.prop = 'Y'

Hey thanks, this works. I had to install the apoc -this won't give problems in aura (I'm transfering it later to aura)
It does give a warning about using the eager operator.

I have a small followup question as well.
Now some of the rows in the excel don't have two numbers. In that case, both the X and Y relationship should still get created, but to the same State
How do i go about this?
Thanks

Hey, thanks. It seems the simplest thing to do, and it doesn't give an error, though doesn't create any relationships either :thinking:. Not sure what s going on in this one.

I have a small followup question as well.
Now some of the rows in the excel don't have two numbers. In that case, both the X and Y relationship should still get created, but to the same State
How do i go about this?
Thanks

The following will create the Y relationship to the same state node, but have the relationship property β€˜prop’ remain β€˜Y’ to distinguish the two relationships. Is this what you want?

LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS row
WITH row.posture as posture, split(row.stateXY, '; ') AS stateXY
MATCH (p:Posture {name: posture})
MATCH (x:State {Id: toInteger(stateXY[0])})
MATCH (y:State {Id: toInteger(coalesce(stateXY[1], stateXY[0]))})
MERGE (p)-[r1:HAS_STATE]->(x) set r1.prop = 'X'
MERGE (p)-[r2:HAS_STATE]->(y) set r2.prop = 'Y'

It sounds cool this coalesce thing :). I'm gonna play with it a bit.
Though it gave me before the error on stateXY[0] returning NULL..

Aura supports apoc that that should be all good.

With regards to not creating the relationship - I suspect you need to OPTIONAL MATCH for the state nodes, so they will be created if they aren't there. If you MATCH and they are missing, the query terminates before it gets to the merge.

With regards to the eager operator, have a look at what the warning says. Usually it explains why it is using eager and where you need to update.

Finally, if you have data that doesn't consistently have two fields you are going to have a bad time. If you can resolve that in the data first that would be easiest. Alternatively look at different load mechanisms like py2neo or whatever your favourite language driver is, and parse the inputs that way.

If you do want two relationships to the same node, but with 'X' and 'Y' property values, then try the following. The previous query will only create one relationship with the property 'Y', as the relationship with property 'X' gets updated with the new property value of 'Y'. The way to handle this is to specify the property value in the match clause.

LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS row
WITH row.posture as posture, split(row.stateXY, ';') AS stateXY
MATCH (p:Posture {name: posture})
MATCH (x:State {Id: toInteger(stateXY[0])})
MATCH (y:State {Id: toInteger(coalesce(stateXY[1], stateXY[0]))})
MERGE (p)-[r1:HAS_STATE {prop: 'X'}]->(x)
MERGE (p)-[r2:HAS_STATE {prop: 'Y'}]->(y)