What's wrong with apoc.periodic.iterate.sub-batching.cypher example code?

Sun · September 30, 2018, 11:05am

Hello,
I tried the apoc.periodic.iterate.sub-batching.cypher example code from "#neo4j cypher tips & tricks · GitHub", but it failed with the error below.

Failed to invoke procedure apoc.periodic.iterate: Caused by: org.neo4j.cypher.internal.util.v3_4.SyntaxException: Unknown function 'apoc.coll.partition' (line 2, column 7 (offset: 67))

apoc.periodic.iterate.sub-batching.cypher

CALL apoc.periodic.iterate(
"LOAD CSV WITH HEADERS FROM 'FILE:///dropd_noun.csv' AS line
WITH apoc.coll.partition(collect(line),10000) AS batchesOfLines
UNWIND batchesOfLines as batch
RETURN batch",
"UNWIND {batch} AS word
MERGE (w:Word {word: word.sentence_noun})",
{batchSize: 1, parallel: true});

I wrote a Cypher query similar to the apoc.periodic.iterate.sub-batching.cypher code above, and it works. dropd_noun.csv has a single column, sentence_noun.

LOAD CSV WITH HEADERS FROM 'FILE:///dropd_noun.csv' AS line
WITH collect(line) AS nounlists
UNWIND nounlists AS nounlist
CREATE (w:Word {word:nounlist.sentence_noun} )

Thank you,

michael.hunger · September 30, 2018, 12:27pm

Which apoc version do you have?
Does it find the function otherwise?

Why do you do it so complicated? It's all built into periodic iterate.

CALL apoc.periodic.iterate(
"LOAD CSV WITH HEADERS FROM 'FILE:///dropd_noun.csv' AS line
RETURN line",
"MERGE (w:Word {word: line.sentence_noun})",
{batchSize: 10000, iterateList:true, parallel: true});

Sun · September 30, 2018, 6:36pm

Here are the answers.

1. apoc version? The plugins directory contains only one file, apoc-3.4.0.3-all.jar. The Neo4j server version is 3.4.6, running on Ubuntu 18.04.

~/neo4j/plugins$ ls
apoc-3.4.0.3-all.jar

2. Does it find the function otherwise? I don't know how to check whether it finds the function otherwise. Please let me know how to get a debug trace showing whether it finds other functions. I did check that apoc.coll.partition is in apoc-xxx.jar, and calling it directly works:

neo4j> CALL apoc.coll.partition([1,2,3,4,5,6], 5) YIELD value
       RETURN value;
+-----------------+
| value           |
+-----------------+
| [1, 2, 3, 4, 5] |
| [6]             |
+-----------------+

2 rows available after 6 ms, consumed after another 1 ms
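One way to check how a name is registered is to query the built-in listings. This is only a sketch, assuming the Neo4j 3.x built-ins dbms.functions() and dbms.procedures(); if the name appears only in the second result, it has to be invoked with CALL ... YIELD instead of inline in an expression.

```cypher
// Is apoc.coll.partition registered as a user-defined function?
CALL dbms.functions() YIELD name, signature
WHERE name = 'apoc.coll.partition'
RETURN name, signature;

// Is apoc.coll.partition registered as a procedure?
CALL dbms.procedures() YIELD name, signature
WHERE name = 'apoc.coll.partition'
RETURN name, signature;
```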

3. Why do you do it so complicated? It's all built into periodic iterate.
I am testing apoc.periodic.iterate because I read the GitHub comment below.
apoc.periodic.iterate: how to decide the optimized batch size · Issue #714 · neo4j-contrib/neo4j-apoc-procedures · GitHub.

3. Use apoc.periodic.iterate

* the biggest benefit of using iterate() is you don't need a large Heap memory anymore;
* iterate() can split large update into smaller batches to execute and submit, which keeps the Heap memory usage low;
* iterate() can also leverage the CPU power by running updates in parallel (set parallel:true). This works particularly well for SSD, but avoid using it on mechanical HD;

Thank you,
Sun

michael.hunger · October 1, 2018, 11:58am

So partition is a procedure, not a function, so it would have to be called differently in your first example.

But I wouldn't recommend that kind of use anyway, unless you really know why you are using that approach. I suggest using the built-in functionality.
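For reference, here is a sketch of how the first example might look with partition invoked as a procedure (CALL ... YIELD value, as in the cypher-shell session earlier in the thread). It has not been run against the original CSV. The $batch parameter syntax is used in place of {batch} as an assumption, since both forms should be accepted on Neo4j 3.4.

```cypher
CALL apoc.periodic.iterate(
"LOAD CSV WITH HEADERS FROM 'FILE:///dropd_noun.csv' AS line
WITH collect(line) AS lines
// partition is a procedure here, so it is called with CALL ... YIELD
CALL apoc.coll.partition(lines, 10000) YIELD value AS batch
RETURN batch",
"UNWIND $batch AS word
MERGE (w:Word {word: word.sentence_noun})",
{batchSize: 1, parallel: true});
```

Note that collect(line) still loads the whole file into memory before partitioning, which is one reason the built-in batchSize / iterateList options are the simpler choice.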
