Is it possible to split column values in the apoc.load.csv function?

lingvisa · October 2, 2021, 7:35pm

For example, this is my load query:

CALL apoc.load.csv('test.csv', {nullValues:['','na','NAN',false], sep:'	'}) 
yield map as row
MERGE (m:Test {nid: row.nid})
ON CREATE SET m += row 
ON MATCH SET m += row 
RETURN count(m) as mcount

My data format could be like this:

nid  tag         uid         date
001   c|python|java       1003252452  20210929

The 'tag' column has 3 values and they should be split by '|'. In apoc.load.csv, is it possible to automatically convert and load three records:

nid  tag         uid         date
001   c       1003252452  20210929
002   python       1003252452  20210929
003   java       1003252452  20210929

If I expand each row into 3 lines in the CSV beforehand, the file becomes very large. Is it possible to do the split at load time instead?

koji · October 3, 2021, 2:25pm

Hi @lingvisa

This is my data.

nid tag uid date
001 c|python|java 1003252452 20210929
002 go|c++|javascript 1234567890 20211001

I wasn't sure if "nid" was simply a sequence, so I created the nodes anyway.
6 nodes have been generated from 2 records.

CALL apoc.load.csv('test.csv', {nullValues:['','na','NAN',false], sep:' '}) 
yield map as row
WITH row, split(row.tag, '|') AS tags
UNWIND tags AS onetag
CREATE (m:Test {nid: row.nid})
SET m.tag = onetag
SET m.uid = row.uid
SET m.date = row.date
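
To see what split() plus UNWIND does without touching a file, here is a minimal sketch that can be pasted into Neo4j Browser. The inline map is only a stand-in for one row of test.csv (an assumption for illustration, not part of the actual load), and nothing is written to the database.

```cypher
// Hypothetical stand-in for a single CSV row; the values are copied from the sample data.
WITH {nid: '001', tag: 'c|python|java', uid: '1003252452', date: '20210929'} AS row
// split() turns the tag string into a list; UNWIND emits one row per list element.
UNWIND split(row.tag, '|') AS onetag
RETURN row.nid AS nid, onetag AS tag, row.uid AS uid, row.date AS date
```

This should return three rows, one per tag, each carrying the same nid, uid and date.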

lingvisa · October 5, 2021, 5:02pm

My case is a little more complicated than the example I presented. I will try your approach. And I need to compare the overall speed impact between:

1. Pre-splitting in the CSV file before apoc is called
2. Splitting in the apoc statement

The 2nd approach can reduce the CSV file size a lot, and it may also boost loading speed, but it makes the loading code more complicated.

koji · October 5, 2021, 11:48pm

Hi @lingvisa

How about "LOAD CSV" instead of "apoc.load.csv"?

LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS row
FIELDTERMINATOR ' '
UNWIND split(row.tag, '|') AS onetag
CREATE (m:Test {nid: row.nid})
SET m.tag = onetag
SET m.uid = row.uid
SET m.date = row.date
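
Since the original query used MERGE rather than CREATE, a variant that can be re-run without duplicating nodes might look like the sketch below. It assumes that the pair (nid, tag) identifies a node, which the thread does not confirm, so adjust the MERGE key to whatever is actually unique in your data.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS row
FIELDTERMINATOR ' '
UNWIND split(row.tag, '|') AS onetag
// Assumption: one node per (nid, tag) pair; MERGE matches an existing node or creates it.
MERGE (m:Test {nid: row.nid, tag: onetag})
SET m.uid = row.uid, m.date = row.date
RETURN count(m) AS mcount
```

Rows whose tag is null produce no nodes here, because split() returns null for a null input and UNWIND of null emits no rows.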
