Trouble with SPLIT and UNWIND syntax

Hey folks - I am struggling a bit with my syntax for SPLIT and UNWIND. I think I am close, but I am getting some weird type errors...

I have a CSV that looks like the following:

rel-vmtods.csv

,VM Name,Datastore
0,KALPWVAPPEDGE01,"BGHVMAX250_01ba, BGHVMAX250_01bb"
1,KALCEXPEXT01,BGHVMAX250_01ba
2,KALTWVINTOAUT02,BGHVMAX250_01ba
3,KALSFTP02,BGHVMAX250_01bb
4,KALPWVAPPEDGE02,"BGHVMAX250_01ba, BGHVMAX250_01bb"
5,VSHPDMDMZ01,BGHVMAX250_01bb
6,VSWEBEXT01,BGHVMAX250_01ba

As you can see, when the CSV was generated (output from Pandas) some of the rows contain a single datastore, but some contain multiples and are quoted. Also Pandas apparently stuck an ID column in there with no header...

Anyways, I am using the following code to try to import the data - objects have already been created, so all I have to do is match them and create a relationship:

LOAD CSV WITH HEADERS FROM 'file:///rel-vmtods.csv' AS vmdsrow
//WITH vmdsrow.`VM Name` AS impVM, vmdsrow.Datastore AS impDS
WITH vmdsrow.`VM Name` AS impVM, SPLIT(vmdsrow[2], ",") AS multiDS
UNWIND multiDS as impDS
MATCH (vm:VM {VMName: impVM})
MATCH (ds:Datastore {DatastoreName:impDS})
MERGE (vm)-[rel:IS_STORED_ON]->(ds)

However I am receiving the following error if I use "SPLIT" -

Expected Long(1) to be a org.neo4j.values.storable.TextValue, but it was a org.neo4j.values.storable.LongValue (Failure when processing file '/Users/username/Library/Application%20Support/com.Neo4j.Relate/Data/dbmss/dbms-ce00a1b5-549e-4941-a09e-1221aac8a74b/import/rel-vmtods.csv' on line 2.)

If I uncomment the second line (and comment out the 3rd) it runs fine - but obviously it doesn't successfully match on any rows with multiple values in column 3, so I only get a fraction of the necessary relationships created.

updated my pandas export - CSV now reads as follows:

"VM Name","Datastore"
"KALPWVAPPEDGE01","BGHVMAX250_01ba, BGHVMAX250_01bb"
"KALCEXPEXT01","BGHVMAX250_01ba"
"KALTWVINTOAUT02","BGHVMAX250_01ba"
"KALCEXPEXT02","BGHVMAX250_01bb"
"VSSKYPEEDGE01","BGHVMAX250_01ba"
"vswebexttst02","BGHVMAX250_01bb"
"KALPWVMFTGW02","BGHVMAX250_01bb"
"VSNWKVPRX08","BGHVMAX250_01bb"
"VSADFSEDGE01","BGHVMAX250_01ba"
"KALPXVUTLNS01","BGHVMAX250_22A"
"KALPWVMFTGW01","BGHVMAX250_01bb"

Both VM and datastore nodes have already been created, and I have a contraint on both as follows:

CREATE CONSTRAINT ON (ds:Datastore) ASSERT ds.DatastoreName IS UNIQUE;
CREATE CONSTRAINT ON (vm:VM) ASSERT vm.VMName IS UNIQUE;

using the following code to import, SPLIT, UNWIND and merge:

LOAD CSV WITH HEADERS FROM 'file:///rel-vmtods.csv' AS vmdsrow
WITH vmdsrow.`VM Name` AS impVM, SPLIT(vmdsrow[1], ",") AS multiDS
UNWIND multiDS as impDS
MATCH (vm:VM {VMName: impVM})
MATCH (ds:Datastore {DatastoreName:impDS})
MERGE (vm)-[rel:IS_STORED_ON]->(ds)

Still getting the following error:

Expected Long(1) to be a org.neo4j.values.storable.TextValue, but it was a org.neo4j.values.storable.LongValue (Failure when processing file '/Users/ttwyman/Library/Application%20Support/com.Neo4j.Relate/Data/dbmss/dbms-ce00a1b5-549e-4941-a09e-1221aac8a74b/import/rel-vmtods.csv' on line 2.)

Hi @tmvt

If NO "WITH HEADERS", you can write SPLIT(vmdsrow[2], ",").
If "WITH HEADERS", you can write SPLIT(vmdsrow.Datastore, ",").

1 Like

Update - modified my code as follows:

LOAD CSV WITH HEADERS FROM 'file:///rel-vmtods.csv' AS vmdsrow
WITH vmdsrow.`VM Name` AS impVM, SPLIT(vmdsrow.Datastore, ",") AS multiDS
UNWIND multiDS as impDS
MATCH (vm:VM {VMName: impVM})
MATCH (ds:Datastore {DatastoreName:impDS})
MERGE (vm)-[rel:IS_STORED_ON]->(ds)

Looks like Cypher prefers the header reference as I have imported a file with headers. However, the first couple times I ran this way it wasn't actually creating the secondary / tertiary relationships.

Then I ran it explicitly calling a specific object as follows:

LOAD CSV WITH HEADERS FROM 'file:///rel-vmtods.csv' AS vmdsrow
WITH vmdsrow.`VM Name` AS impVM, SPLIT(vmdsrow.Datastore, ",") AS multiDS
UNWIND multiDS as impDS
MATCH (vm:VM {VMName: "Cloud_vSphere_Machine_1-mcm402020-105303377304"})
MATCH (ds:Datastore {DatastoreName:impDS})
MERGE (vm)-[rel:IS_STORED_ON]->(ds)

And it successfully created multiple relationships for the (ds) object... I am getting closer!

I think I've got it - Actually Koji's solution above was the answer for the cypher. I had a second problem in that the CSV file created by Pandas inserted a space into the string. I had to TRIM it as follows:

LOAD CSV WITH HEADERS FROM 'file:///rel-vmtods.csv' AS vmdsrow
WITH vmdsrow.`VM Name` AS impVM, SPLIT(vmdsrow.Datastore, ",") AS multiDS
UNWIND multiDS as impDS
MATCH (vm:VM {VMName: impVM})
MATCH (ds:Datastore {DatastoreName:TRIM(impDS)})
MERGE (vm)-[rel:IS_STORED_ON]->(ds)
1 Like

Hi @tmvt

I am glad that the problem has been resolved.
When I use SPLIT(), I often use the TRIM() function.

1 Like