Skip Error Records while loading data in Neo4j

apoc
load-csv
import
json
(Parvez Hazari) #1

I am trying to load data into Neo4j using apoc.load.json from a file containing streaming JSON (a sequence of JSON objects separated by newline characters). I can load the data quite quickly by combining apoc.periodic.iterate with apoc.load.json. The only issue is that if any single record is incorrect or invalid, the whole process fails. Is there a way to skip the bad records? If so, can I get them back somehow in the response?
If APOC does not offer this, I could ask the upstream application to send the file in CSV format and use LOAD CSV instead, provided LOAD CSV can skip erroneous records and return them in the response.


(Paul Drangeid) #2

When you say "not correct or invalid", do you mean the JSON itself is malformed, or just that some data is not coming across as expected (an array instead of a string, a null where there should be a value, etc.)?

I've used apoc.load.json and had to perform some validation using WITH * WHERE ... (checking that a value is a string, is not null, and so on).
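
A minimal sketch of that kind of filter, in case it helps. The $url parameter, the name field, and the :Item label are placeholders rather than anything from a real import, and the type check assumes APOC's apoc.meta.cypher.type function is available in your install:

```cypher
// Hypothetical names: $url, the `name` field, and the :Item label are placeholders
CALL apoc.load.json($url) YIELD value
// Keep only records whose `name` is present and is a string
WITH * WHERE value.name IS NOT NULL
  AND apoc.meta.cypher.type(value.name) = 'STRING'
MERGE (n:Item {name: value.name})
```

Records that fail the WHERE clause are simply dropped here, so this skips bad field values but does not log them, and it does nothing about JSON that cannot be parsed at all.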


(Parvez Hazari) #3

Thanks for responding, Paul. I need to know both things: whether APOC can handle a format error in one of the records, and how I can achieve validation at the individual field level.

When you did your validation, were you able to log the skipped records somehow?


(Paul Drangeid) #4

I'm not sure whether APOC can give details about malformed JSON. Maybe someone else has some insight on that. When I did field validation I don't think I logged the rejected records, but you could try something like the following:

// Assumes the source location is passed in as the $url query parameter
WITH $url AS url
CALL apoc.load.json(url) YIELD value
UNWIND value.result AS jsonrecord
// Required value present: create or match the node
FOREACH (ignoreMe IN CASE WHEN jsonrecord.myrequiredvalue IS NOT NULL THEN [1] ELSE [] END | MERGE (n:Mynode {name: jsonrecord.myrequiredvalue}))
// Required value missing: write a log node instead of failing the import
FOREACH (ignoreMe IN CASE WHEN jsonrecord.myrequiredvalue IS NULL THEN [1] ELSE [] END | CREATE (cle:Cypherlogentry {date: timestamp(), url: url, source: 'my import procedure', description: 'jsonrecord.myrequiredvalue returned null'}))