How to avoid error while loading a large CSV

Neo4j Server Community Edition 4.4.6, running on Ubuntu 20.04

Well, I've just solved a big problem: loading quite a large number of transactions using LOAD CSV.

I dumped a table from Postgres, where it works without problems, but when I LOAD it into Neo4j I get errors. Often the cause is just an unpaired ' or " or a stray \n.

Since there are millions of rows, it's really difficult to find where the errors are hidden, especially because Neo4j returns the offending position and not the offending row.

I created a shell script to split the input file into chunks of 10,000 rows, and then load each chunk with CALL { ... } IN TRANSACTIONS OF 5000 ROWS. This way, if I hit an error, I lose only 10,000 rows and not all the millions.
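For anyone who wants to reproduce the splitting step, here is a minimal sketch in Python rather than shell. It is not my actual script: the function name split_csv, the chunk file naming, and the assumption that the first line is a header are all made up for illustration. It splits by physical line, so a quoted field containing a newline could be cut across two chunks.

```python
from pathlib import Path


def split_csv(src, out_dir, rows_per_chunk=10_000):
    """Split src into files of at most rows_per_chunk data lines each.

    Assumption: the first line of src is a header. It is repeated at the
    top of every chunk so each file can be loaded on its own.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    out = None
    # newline="" keeps the original line endings untouched.
    with open(src, newline="", encoding="utf-8") as f:
        header = f.readline()
        for i, line in enumerate(f):
            if i % rows_per_chunk == 0:
                # Start a new chunk file (hypothetical naming scheme).
                if out is not None:
                    out.close()
                path = out_dir / f"chunk_{i // rows_per_chunk:05d}.csv"
                out = open(path, "w", newline="", encoding="utf-8")
                out.write(header)
                chunk_paths.append(path)
            out.write(line)
    if out is not None:
        out.close()
    return chunk_paths
```

Each returned path can then be loaded separately, so a failure only costs the rows of that one chunk.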

I think it would be nice if LOAD CSV could write the offending lines to an error.log without stopping its run.
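Until something like that exists, a pre-filter outside Neo4j can approximate it. The sketch below is only a heuristic and not a Neo4j feature: the function name filter_csv and the file names are hypothetical, and it assumes a header line, double quotes as the quote character, and no legitimate quoted fields that span several lines. A line is rejected if it has an odd number of quote characters or a different number of fields than the header.

```python
import csv


def filter_csv(src, clean_path, reject_path, quote='"', delimiter=","):
    """Copy plausible lines to clean_path and suspicious ones to reject_path.

    Returns a (kept, rejected) pair of line counts. Rejected lines are
    prefixed with their line number in the source file.
    """
    kept = rejected = 0
    with open(src, newline="", encoding="utf-8") as f, \
         open(clean_path, "w", newline="", encoding="utf-8") as clean, \
         open(reject_path, "w", newline="", encoding="utf-8") as rejects:
        header = f.readline()
        clean.write(header)
        header_row = next(csv.reader([header], delimiter=delimiter, quotechar=quote), [])
        expected = len(header_row)
        for line_no, line in enumerate(f, start=2):
            # Heuristic 1: quote characters must come in pairs.
            ok = line.count(quote) % 2 == 0
            if ok:
                # Heuristic 2: same number of fields as the header.
                try:
                    row = next(csv.reader([line], delimiter=delimiter, quotechar=quote), [])
                    ok = len(row) == expected
                except csv.Error:
                    ok = False
            if ok:
                clean.write(line)
                kept += 1
            else:
                rejects.write(f"{line_no}: {line.rstrip(chr(13) + chr(10))}\n")
                rejected += 1
    return kept, rejected
```

For example, filter_csv("dump.csv", "clean.csv", "error.log") would leave a clean.csv to feed to LOAD CSV and an error.log listing the lines to fix by hand.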

Does anyone have an idea?

Thank you

Please try the apoc.load.csv procedure available in Neo4j. To use it, you first need to install the APOC plugin.

CALL apoc.load.csv("MY_CSV_URL", {failOnError:false})
YIELD list, map
WITH list, map
CALL apoc.do.when(list = [], "return 'nothingToList' as list, 'nothingToMap' as map", "return list, map", {list: list, map: map})
YIELD value
RETURN value["list"], value["map"]

For more information, you can refer to the link below.