@glilienfield @bennu_neo @Rcolinp Just for your info, I loaded 11 million transactions of different kinds, using the latest approach proposed by @glilienfield.
This approach using APOC turned out to be by far the most readable one (after a bit of sprucing up of the query!)
This is the result:
LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row FIELDTERMINATOR ';'
CALL {
  WITH row
  // a second WITH is needed here: WHERE cannot follow the importing WITH directly
  WITH row
  WHERE row.transactionid IS NOT NULL
  MERGE (transaction:Transaction { transactionid: row.transactionid })
  SET transaction = row,
      transaction.uuid = apoc.create.uuid(),
      transaction.consumer_id = NULL // remove the attribute that is no longer used
  WITH transaction, row
  // link to the organisation if it exists, otherwise to the shared Error node
  OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
  CALL apoc.do.when(organisation IS NOT NULL,
    'WITH $tran AS tran, $org AS org
     MERGE (tran)-[h:TRANSACTION_BELONGS_TO_ORGANISATION]->(org)
     RETURN 1',
    'WITH $tran AS tran
     MERGE (error:Error)
     MERGE (tran)-[h:TRANSACTION_HAS_NO_ORGANISATION]->(error)
     RETURN 1',
    {tran: transaction, org: organisation}) YIELD value AS X
  WITH transaction, row
  // same pattern for the bank account
  OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
  CALL apoc.do.when(bankAccount IS NOT NULL,
    'WITH $tran AS tran, $bankAcct AS bankAcct
     MERGE (tran)-[h:TRANSACTION_HAS_BANK_ACCOUNT]->(bankAcct)
     RETURN 1',
    'WITH $tran AS tran
     MERGE (error:Error)
     MERGE (tran)-[h:TRANSACTION_HAS_NO_BANK_ACCOUNT]->(error)
     RETURN 1',
    {tran: transaction, bankAcct: bankAccount}) YIELD value AS Y
  RETURN transaction
} IN TRANSACTIONS OF 5000 ROWS
RETURN count(transaction)
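For anyone who wants to check how many rows ended up in the error branches after a load like this, a query along these lines might do it. This is only a sketch: it assumes the labels and relationship types are exactly the ones used in the load query above, and I have not timed it on the full 11 million transactions.

```cypher
// Count the transactions that were attached to the shared Error node,
// grouped by which lookup failed (organisation or bank account).
MATCH (t:Transaction)-[r:TRANSACTION_HAS_NO_ORGANISATION|TRANSACTION_HAS_NO_BANK_ACCOUNT]->(:Error)
RETURN type(r) AS problem, count(t) AS transactions
ORDER BY transactions DESC
```

A transaction missing both its organisation and its bank account would be counted once under each relationship type.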
The story will continue in a new discussion, but this is a result I wanted to share!
Thank you, everybody!