I'm importing a .csv file with 400 million rows into Neo4j, and I'm getting this error:
"Required array length 2147483639 + 1803406 is too large."
I used apoc.periodic.iterate with a batch size of 10,000 (I also tried 5,000, but got the same error).
My server has 64GB of RAM and 32 cores.
What could be happening?
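For what it's worth, the first number in the error message is just below Java's Integer.MAX_VALUE (2^31 - 1), so it looks like something is trying to grow a single JVM array past the maximum array length. A quick sanity check of the arithmetic (just my own calculation, not Neo4j output):

```python
# Java's Integer.MAX_VALUE, the upper bound on JVM array lengths
MAX_ARRAY = 2**31 - 1
print(MAX_ARRAY)                 # 2147483647

# The allocation the error reports: current length + requested growth
requested = 2147483639 + 1803406
print(requested)                 # 2149287045
print(requested > MAX_ARRAY)     # True -> exceeds the JVM array limit
```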
My query:
CALL apoc.periodic.iterate(
  "LOAD CSV WITH HEADERS FROM 'file:///TELEFONES/TELEFONES.csv' AS row RETURN row",
  "WITH row
   MERGE (phone:Phone {PhoneNumber: row.DDD + row.TELEFONE})
   ON CREATE SET
     phone.AreaCode = row.DDD,
     phone.Number = row.TELEFONE,
     phone.SerasaContatosID = row.CONTATOS_ID,
     phone.SerasaCadastroID = row.CADASTRO_ID,
     phone.SerasaDataAtualizacao = CASE WHEN row.DT_ATUALIZACAO IS NULL THEN '' ELSE row.DT_ATUALIZACAO END,
     phone.SerasaDataInclusao = row.DT_INCLUSAO
   MERGE (person:Person {SerasaContatosID: row.CONTATOS_ID})
   MERGE (person)-[:OWNS {Date: CASE WHEN row.DT_ATUALIZACAO IS NULL THEN '' ELSE row.DT_ATUALIZACAO END}]->(phone)",
  {batchSize: 5000, iterateList: true, parallel: true}
);