Apoc.periodic.iterate only writing one batch with parallel

Minyall · July 12, 2020, 1:28pm

I've managed to solve my issue, but in a way that doesn't seem particularly efficient.

CALL apoc.periodic.iterate('UNWIND $batch AS row RETURN row',
'MATCH (s:STORY), (t:ISSUE) WHERE s.id = row.id AND t.id = row.cat_id
CREATE (s)-[r:IS_TAGGED_WITH]->(t)',
{batchSize:10000, parallel:false, iterateList:true, params:{batch:$edge_list}})

This query works for around 50K relationships between STORY nodes and ISSUE nodes. I use the Python driver to pass in a list of dicts as the $edge_list parameter. However, if I set parallel:true, the procedure only writes what is probably the first batch, i.e. I only get 10,000 relationships created.

Is this just a quirk of apoc.periodic.iterate, or can I change the query to ensure parallel works as expected?

Many thanks,

mark.needham · July 14, 2020, 3:00pm

Do you see any errors when you're using the parallel version? I'm wondering if you're getting a deadlock exception because it's trying to write two relationships to the same node in parallel...
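
One way to check for this from Python is to look at the columns that apoc.periodic.iterate returns, since failed batches are reported there rather than raised as an exception. The following is only a sketch: the helper name check_iterate_result and the sample values are made up for illustration, and it assumes the record exposes the procedure's batches, total, failedBatches and errorMessages columns as a dict-like object.

```python
def check_iterate_result(record):
    """Return the total row count, or raise if any batch failed.

    `record` is assumed to be a dict-like row returned by
    apoc.periodic.iterate (hypothetical helper, not part of any driver).
    """
    failed_batches = record["failedBatches"]
    if failed_batches:
        # errorMessages maps each distinct error message to how often it occurred.
        messages = "; ".join(
            f"{message} (x{count})"
            for message, count in record["errorMessages"].items()
        )
        raise RuntimeError(
            f"{failed_batches} of {record['batches']} batches failed: {messages}"
        )
    return record["total"]


# Made-up sample values, only to show the shape of the data.
sample_ok = {"batches": 5, "total": 50000, "failedBatches": 0, "errorMessages": {}}
sample_failed = {
    "batches": 5,
    "total": 50000,
    "failedBatches": 4,
    "errorMessages": {"lock acquisition failed (illustrative message)": 4},
}
```

Calling check_iterate_result(sample_ok) returns the row count, while the second sample raises, which makes a silently shortened import visible in the calling script.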

Minyall · July 15, 2020, 7:36am

Yes, after running a toy version in the browser rather than through Python, I saw the errors regarding the lock. Is there a way to rewrite the query that works around that, or is it just the nature of Neo4j?

Many thanks for your reply!

mark.needham · July 28, 2020, 10:37am

I don't think there's a way to work around it by rewriting the query, but you can set the retries parameter, which will retry up to a specified number of times if it runs into problems.

See https://neo4j.com/docs/labs/apoc/current/graph-updates/periodic-execution/#commit-batching for more details.
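
As a rough sketch of what that could look like from the Python driver, based on the query in the original post: the names ITERATE_QUERY, build_parameters and run_import are hypothetical, the retry count of 3 is an arbitrary choice, and `session` is assumed to be an open neo4j driver session. Retrying does not remove the lock contention; it only gives a failed batch further attempts, so the returned failedBatches value is still worth checking.

```python
# Query from the original post with parallel enabled and a retries setting added.
ITERATE_QUERY = """
CALL apoc.periodic.iterate(
  'UNWIND $batch AS row RETURN row',
  'MATCH (s:STORY), (t:ISSUE) WHERE s.id = row.id AND t.id = row.cat_id
   CREATE (s)-[r:IS_TAGGED_WITH]->(t)',
  {batchSize: $batch_size, parallel: true, retries: $retries,
   params: {batch: $edge_list}})
YIELD batches, total, failedBatches, errorMessages
RETURN batches, total, failedBatches, errorMessages
"""


def build_parameters(edge_list, batch_size=10000, retries=3):
    """Build the query parameters; the defaults are illustrative assumptions."""
    return {
        "edge_list": list(edge_list),
        "batch_size": batch_size,
        "retries": retries,
    }


def run_import(session, edge_list, **options):
    """Run the import on an open driver session and return the summary row."""
    result = session.run(ITERATE_QUERY, build_parameters(edge_list, **options))
    return result.single()
```

Here edge_list is the same list of dicts as before, for example [{"id": 1, "cat_id": 7}], and run_import(session, edge_list, retries=5) would request more attempts per batch.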

Minyall · July 29, 2020, 11:55am

Thanks Mark! I'll keep retries in mind.
