Loading csv with date fields and default values

yanivts1 · November 12, 2024, 10:24am

Hi,

I have a csv file with date fields (string in the file)...
not all of them contain values.
Those who have values - the format is dd/MM/yyyy hh:mm:ss AM
I need to import them and convert to datetime with the following conditions:
If it's null or empty, save it as NULL
If it's a string, convert it to datetime and save as datetime
I'm struggling with the load script
I'm using "CALL apoc.periodic.iterate("LOAD CSV....." , MATCH....
Any ideas how to id right?
Thnaks

glilienfield · November 13, 2024, 6:40pm

That date format is not supported by neo4j temporal date time functions. As such, you can parse it and convert it to a datetime value manually. The following script creates a node and then sets the datetime property only when the datetime value in the import is not null.

load csv with headers from "file:///Untitled.csv" as row
merge(n:TestNode{id: toInteger(row.id)})
with n, row
call (n, row) {
    with row.date as date_value
    where date_value is not null
    with datetime({
        day: toInteger(substring(date_value, 0, 2)), 
        month: toInteger(substring(date_value, 3, 2)),
        year: toInteger(substring(date_value, 6, 4)),
        hour: toInteger(substring(date_value, 11, 2)) + CASE WHEN substring(date_value, 20, 2) = "PM" THEN 12 ELSE 0 END,
        minute: toInteger(substring(date_value, 14, 2)),
        second: toInteger(substring(date_value, 17, 2)),
        timezone: 'America/New York'
    }) as converted_date
    set n.dateValue = converted_date
}
return n

Test file:

Result:

You could use an APOC function to convert your date to a neo4j valid format, such that you could use the Neo4j datetime function. Here is an example:

with "10/01/2024 10:15:30 PM" as date_value
return datetime(apoc.date.convertFormat(date_value, "dd/MM/yyyy hh:mm:ss a", "yyyy-MM-dd'T'HH:mm:ss"))

Here is an example using apoc.periodic.iterate:

CALL apoc.periodic.iterate(
"
load csv with headers from 'file:///Untitled.csv' as row
return row
",
"
merge(n:TestNode{id: toInteger(row.id)})
with n, row
call (n, row) {
with row.date as date_value
where date_value is not null
with datetime({
    day: toInteger(substring(date_value, 0, 2)), 
    month: toInteger(substring(date_value, 3, 2)),
    year: toInteger(substring(date_value, 6, 4)),
    hour: toInteger(substring(date_value, 11, 2)) + CASE WHEN substring(date_value, 20, 2) = 'PM' THEN 12 ELSE 0 END,
    minute: toInteger(substring(date_value, 14, 2)),
    second: toInteger(substring(date_value, 17, 2)),
    timezone: 'America/New York'
}) as converted_date
set n.dateValue = converted_date}
",
{}
)

yanivts1 · November 14, 2024, 1:10pm

Hi, Thanks for your answer...
I'm trying something similar to what you wrote and it doesn't work:

CALL apoc.periodic.iterate(

           "LOAD CSV WITH HEADERS FROM 'file:///data/TEST/Relationships.csv' AS row RETURN row",

"MATCH(fact:Factory {factNumber: row.factNumber})

WITH row, fact

MATCH(per:Person {personId: row.person_Id})

WITH per, fact, row, CASE WHEN row.End_Date IS NOT NULL AND row.End_Date <> '' THEN

date({

    day: toInteger(substring(row.End_Date, 0, 2)),

    month: toInteger(substring(row.End_Date, 3, 2)),

    year: toInteger(substring(row.End_Date, 6, 4))

})  ELSE '' END as dateEnd

FOREACH(ignoreMe IN CASE WHEN row.Contact_Type_Id = '39' THEN [1] ELSE END |

MERGE (per)-[r:CEO {

source:per.idNumber

           , target: fact.factNumber

           , relTypeId: row.Contact_Type_Id}]->(fact)

ON CREATE SET

r.factoryNumber = fact.factNumber

, r.dateEnd = dateEnd

)

// More Foreach statements here for other cases row.Contact_Type_Id = '39' ('40', '41'....)

",

{batchSize:5000, parallel:false, iterateList:true, failOnError:false, retries:0} )

           YIELD batch, operations, timeTaken, failedOperations, failedBatches, errorMessages WITH failedOperations AS failedRows UNWIND failedRows AS row RETURN row;

glilienfield · November 14, 2024, 5:33pm

This executes.

CALL apoc.periodic.iterate(
"LOAD CSV WITH HEADERS FROM 'file:///data/TEST/Relationships.csv' AS row RETURN row",
"
    MATCH(fact:Factory {factNumber: row.factNumber})
    MATCH(per:Person {personId: row.person_Id})
    WITH per, fact, row, CASE WHEN row.End_Date IS NOT NULL AND row.End_Date <> '' THEN
    date({
        day: toInteger(substring(row.End_Date, 0, 2)),
        month: toInteger(substring(row.End_Date, 3, 2)),
        year: toInteger(substring(row.End_Date, 6, 4))
    })  ELSE '' END as dateEnd
    FOREACH(ignoreMe IN CASE WHEN row.Contact_Type_Id = '39' THEN [1] ELSE [] END |
        MERGE (per)-[r:CEO {source:per.idNumber, target: fact.factNumber, relTypeId: row.Contact_Type_Id}]->(fact)
        ON CREATE SET r.factoryNumber = fact.factNumber, r.dateEnd = dateEnd
    )
    // More Foreach statements here for other cases row.Contact_Type_Id = '39' ('40', '41'....)
",
{batchSize:5000, parallel:false, iterateList:true, failOnError:false, retries:0} )
YIELD batch, operations, timeTaken, failedOperations, failedBatches, errorMessages WITH failedOperations AS failedRows UNWIND failedRows AS row RETURN row;

BTW- as it is written, the query does not do anything if row.Contact_Type_Id = '39'. As such, you could just add that condition upfront after the load csv to not continue.

Topic		Replies	Views
Transforming data format when using LOAD CSV Desktop load-csv	4	3548	November 27, 2021
Import csv format issue: convert str date to datetime format Import / Export	2	139	June 17, 2024
How-To: Import CSV Data with Neo4j Desktop Neo4j Website	0	1536	August 5, 2020
Neo4j Admin Import Tool w/ epoch column values General	2	771	April 19, 2019
How to change string in to datetime type? Import / Export import	2	2925	April 7, 2020

July Summer Fun!

Loading csv with date fields and default values

Related topics