Why parallel:true can't be used in apoc.load.csv?

lingvisa · November 12, 2022, 4:45pm

CALL apoc.periodic.iterate("
CALL apoc.load.csv('/Users/martin/test/test.csv', {nullValues:['','na','NAN',false], sep:' ' })
yield map as row",
"MERGE (m:label:Generics {nid: row.nid})
ON CREATE SET m += row
ON MATCH SET m += row
RETURN count(m) as mcount", {batchSize:1000, iterateList:true, parallel:true})

This used to work, but now, in neo4j-community-4.4.12, it reports the error below when I use 'parallel:true'. If I change it to "parallel:false", it works fine. Why is that? I got the message when I copied the command into the Browser to test it.

{
"ForsetiClient[transactionId=1121, clientId=2] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=1120, clientId=12]} on NODE(1062), because holders of that lock are waiting for ForsetiClient[transactionId=1121, clientId=2].\n Wait list:ExclusiveLock[\nClient[1120] waits for [ForsetiClient[transactionId=1121, clientId=2]]]": 1
}

busymo16 · November 15, 2022, 10:20am

Hi @lingvisa,

The error shows that you are running into lock conflicts, which can happen when you combine parallelization with a MERGE operation.
What is happening is that two different threads (parallel transactions) are accessing the same node (1062) at the same time, and each one is waiting for a lock that the other holds. That happens when the nid column in your CSV file is not unique.
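
To check whether that is the case here, a query along these lines could list the nid values that appear more than once. It is only a sketch: it reuses the file path and load options from the question, and the aliases `nid` and `occurrences` are illustrative names, not anything from the original import.

```cypher
// Count how often each nid value occurs in the CSV file
CALL apoc.load.csv('/Users/martin/test/test.csv', {nullValues:['','na','NAN',false], sep:' '})
YIELD map AS row
WITH row.nid AS nid, count(*) AS occurrences
// Keep only the values that occur more than once
WHERE occurrences > 1
RETURN nid, occurrences
ORDER BY occurrences DESC
LIMIT 20
```

If this returns any rows, two parallel batches can end up merging the same node, which would be consistent with the lock error above.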

Anyway, it will still retry in the background until it succeeds, but it will show you the error regardless. If it does not succeed, it will stop the operation with errors (maybe your instance is not configured to wait long enough for the transaction to finish).

In short, if a column contains duplicates and you use it with MERGE, it is better not to use parallelization, so that such errors do not occur.
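
As a rough sketch, the same import with parallelization turned off could look like the following. It assumes the node pattern in the question is `(m:label:Generics ...)` and reuses the path and load options from there. The single `SET m += row` is meant to stand in for the identical ON CREATE and ON MATCH clauses, and the YIELD line is added only to make failed batches visible.

```cypher
// Batches run one after another, so two transactions never
// compete for a lock on the same node
CALL apoc.periodic.iterate(
  "CALL apoc.load.csv('/Users/martin/test/test.csv', {nullValues:['','na','NAN',false], sep:' '})
   YIELD map AS row RETURN row",
  "MERGE (m:label:Generics {nid: row.nid})
   SET m += row",
  {batchSize:1000, parallel:false}
)
YIELD batches, total, failedBatches, errorMessages
RETURN batches, total, failedBatches, errorMessages
```

After a run, `failedBatches` and `errorMessages` show whether any batch failed.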

Regards,
