Optimize concurrency and batch size with apoc.periodic.iterate()

holdsworthtimmy · July 26, 2019, 2:22am

How does one optimize the concurrency (and batchSize) parameters with apoc.periodic.iterate()? Is there some rule of thumb for parallelizing with respect to the number of cores on my machine or the available RAM?

Any help would be greatly appreciated!

Thanks,
Tim

leonard.panichi · July 26, 2019, 1:20pm

Hi,
This is only my opinion, but using apoc.periodic.iterate is far from easy.
To the best of my knowledge there is no rule of thumb for setting the batch size, because it depends a lot on what your query does.
The batch size has a big impact on how fast your query runs. You want it as large as possible, but if it is too large your database will crash (I crashed mine many times this way).
As far as I know there is no easy way to put a timeout on this procedure, and that can be a real problem: imagine the query has been running for an hour and you have no idea how close it is to finishing. Do you stop it or keep it going?
I found that handling the batches myself, when possible, was far more efficient (I use MATCH with LIMIT).
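
As a rough sketch of what handling the batches by hand with MATCH and LIMIT can look like: the :Person label and the name and processed properties are made up for illustration, and the batch size of 10000 is just a starting point, not a recommendation. Each run is its own transaction, so you can watch progress and stop between batches.

```cypher
// Hypothetical label and properties (:Person, name, processed), for illustration only.
// Run this repeatedly (for example from a client script) until `updated` comes back as 0.
MATCH (p:Person)
WHERE p.processed IS NULL   // only nodes not handled by an earlier run
WITH p LIMIT 10000          // the batch size, chosen by hand
SET p.name = trim(p.name),
    p.processed = true      // marks the node so the next run skips it
RETURN count(p) AS updated;
```

The marker property is one simple way to make sure each run picks up where the previous one stopped; any filter that excludes already-processed nodes would do.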

If you really want to stick with apoc.periodic.iterate, my advice would be:

don't use MERGE in your query.
don't use CREATE in your query.
don't be greedy with your batch size.

I have no idea why, but I noticed that things went more smoothly when I was only MATCHing and SETting in my query.
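
For reference, a minimal sketch of that MATCH-and-SET pattern with apoc.periodic.iterate. The :Person label and the processed property are hypothetical, and the batchSize of 1000 is only a cautious starting value to tune from, not a rule of thumb. As far as I know, the concurrency setting only matters when parallel is true, and parallel batches that touch the same nodes or relationships can run into locking problems, so this sketch keeps parallel off.

```cypher
// Hypothetical label and property (:Person, processed), for illustration only.
CALL apoc.periodic.iterate(
  // First statement: streams the items to process
  "MATCH (p:Person) WHERE p.processed IS NULL RETURN p",
  // Second statement: runs once per item, committed in batches
  "SET p.processed = true",
  {batchSize: 1000, parallel: false}
)
YIELD batches, total, timeTaken, failedBatches, errorMessages
RETURN batches, total, timeTaken, failedBatches, errorMessages;
```

Checking failedBatches and errorMessages after a run shows whether any batch was rolled back, which is a reasonable signal when experimenting with larger batch sizes.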
Hope that helps. If you find a rule of thumb for setting the batch size properly, please share it. I know other people in the same situation.

EDIT: just so you know, I was using periodic.iterate when matching around one million nodes or a few million links.
