Why does apoc.periodic.iterate keep creating new relationships on repeated runs?

lingvisa · October 5, 2021, 11:45pm

My version is neo4j-community-4.0.4. My loading code:

CALL apoc.periodic.iterate("
CALL apoc.load.csv('/data/20210929/test6_keyword_text_keywordOf.csv', {nullValues:['','na',false], sep:'	'}) 
yield map as row ", 
"MATCH (h:keyword{nid: row._start})
MATCH (t:text{nid: row._end})
CALL apoc.merge.relationship(h, row._type, {}, {}, t, {}) yield rel
RETURN rel", {batchSize:1000, iterateList:true, parallel:true})

With parallel:true, every time I rerun the same CSV loading code, the number of relationships increases. By "rerun" I mean that I run the code once, check the results, then restart the program to load again. Neither my code nor the CSV data changes between runs.

With parallel:false, the number of relationships stays constant across runs, as expected.
My relationship data format:

start    end           type

867f9c6f099 589e81c406 keywordOf

When I copy the loading statement for one CSV file into the Neo4j Browser and run it directly, without my loading code, it shows the error message below. If I rerun the statement in the Browser, the error is gone (parallel:true):

{
  "total": 13,
  "committed": 6,
  "failed": 7,
  "errors": {
    "org.neo4j.graphdb.QueryExecutionException: LockClient[22639] can't wait on resource RWLock[NODE(2080), hash=895117327] since => LockClient[22639] <-[:held_by]- rwlock[node(21381), hash="1970823408]" lockclient[22646] rwlock[node(2080), 1, "org.neo4j.graphdb.queryexecutionexception: lockclient[22650] can't wait on resource rwlock[node(186), since> LockClient[22650] <-[:held_by]- rwlock[node(3390), hash="1637932220]" lockclient[22642] rwlock[node(186), 1, "org.neo4j.graphdb.queryexecutionexception: lockclient[22638] can't wait on resource rwlock[node(40), since> LockClient[22638] <-[:held_by]- rwlock[node(343), hash="1225909397]" lockclient[22639] rwlock[node(40), 1, "org.neo4j.graphdb.queryexecutionexception: lockclient[22642] can't wait on resource rwlock[node(3729), since> LockClient[22642] <-[:held_by]- rwlock[node(20676), hash="1845973237]" lockclient[22639] rwlock[node(21381), lockclient[22646] rwlock[node(3729), 1, "org.neo4j.graphdb.queryexecutionexception: lockclient[22640] can't wait on resource rwlock[node(3644), since> LockClient[22640] <-[:held_by]- rwlock[node(5317), hash="1804000373]" lockclient[22638] rwlock[node(3644), 1, "org.neo4j.graphdb.queryexecutionexception: lockclient[22644] can't wait on resource rwlock[node(914), since> LockClient[22644] <-[:held_by]- rwlock[node(3384), hash="1276882950]" lockclient[22646] rwlock[node(914), 1, "org.neo4j.graphdb.queryexecutionexception: lockclient[22643] can't wait on resource rwlock[node(1696), since> LockClient[22643]

The documentation says that parallel:true can be used to speed things up. Also, for debugging purposes, I selected a small number of nodes and relationships; the issue cannot be reproduced with parallel:true on that sample, probably because it has fewer rows than the batch size (1000).

What are the implications of using parallel:true in apoc.periodic.iterate?

lingvisa · October 7, 2021, 11:35pm

Any thoughts on this issue?

michael.hunger · October 8, 2021, 5:38am

The parallel:true speed-up is meant for node creation/updates and property updates.

In versions before 4.3, relationship creation/deletion takes a lock on both the start node and the end node.

I suggest upgrading to 4.3.5 and trying again.

Merging relationships has no uniqueness guarantees, as it doesn't take locks on the rel-type and/or properties.
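
One way to read the remark about uniqueness is as the generic check-then-create race. The sketch below is a toy model in plain Python, not Neo4j's or APOC's implementation: the `merge_relationship` helper and the list-based `store` are hypothetical, and the sample IDs are taken from the data format shown in the question.

```python
# Toy model only: this is NOT how Neo4j works internally. It only shows the
# generic check-then-create race that a merge without a uniqueness lock
# is exposed to when two workers run at the same time.

def merge_relationship(store, snapshot, start, rel_type, end):
    """Create (start)-[rel_type]->(end) unless the snapshot already shows it."""
    key = (start, rel_type, end)
    if key not in snapshot:   # "match" phase: looks only at what this worker sees
        store.append(key)     # "create" phase: writes to the shared store
        return True
    return False


store = []  # committed relationships, shared by all workers

# Two parallel workers handle the same row. Both take their view of the store
# before either has written, so both conclude the relationship is missing.
view_a = list(store)
view_b = list(store)
created_a = merge_relationship(store, view_a, "867f9c6f099", "keywordOf", "589e81c406")
created_b = merge_relationship(store, view_b, "867f9c6f099", "keywordOf", "589e81c406")

# Run serially, the second call sees the first call's write and does nothing.
serial_store = []
merge_relationship(serial_store, serial_store, "867f9c6f099", "keywordOf", "589e81c406")
merge_relationship(serial_store, serial_store, "867f9c6f099", "keywordOf", "589e81c406")

print(len(store), len(serial_store))  # prints "2 1"
```

In this toy model the interleaved run ends with two copies of the relationship, while the serial run ends with one.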

lingvisa · October 8, 2021, 7:25am

I upgraded to 4.3.4 in order to use GDS 1.7, but still got an error, a slightly different one. The documentation for Neo4j 4.3 says the following, so I am not sure Neo4j 4.3.5 will make a difference:

Relationship Chain Locks provide fine-grained locking so that relationships and nodes can be created/updated/deleted concurrently without causing contention – even when nodes have tens or millions of relationships. This means you can now achieve faster transaction throughput, and faster data import.

My driver:

neo4j==4.3.6
neo4j-driver==4.3.6

The page below says that I only need to install "neo4j", but I found that I had to install neo4j-driver as well:
https://neo4j.com/docs/api/python-driver/current/#installation

I used the same query. Through the Python driver, no error message is reported; I just get the wrong number of relationships created. I again copied the query into the Browser, and a similar error message (parallel:true) is shown below:

{
  "ForsetiClient[1] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(111), because holders of that lock are waiting for ForsetiClient[1].\n Wait list:ExclusiveLock[\nClient[8] waits for [1]]": 1,
  "ForsetiClient[6] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(27), because holders of that lock are waiting for ForsetiClient[6].\n Wait list:ExclusiveLock[\nClient[8] waits for [6]]": 1,
  "ForsetiClient[15] can't acquire ExclusiveLock{owner=ForsetiClient[5]} on NODE_RELATIONSHIP_GROUP_DELETE(2027), because holders of that lock are waiting for ForsetiClient[15].\n Wait list:ExclusiveLock[\nClient[5] waits for [15]]": 1,
  "ForsetiClient[14] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(83), because holders of that lock are waiting for ForsetiClient[14].\n Wait list:ExclusiveLock[\nClient[8] waits for [9,14]]": 1,
  "ForsetiClient[10] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(39), because holders of that lock are waiting for ForsetiClient[10].\n Wait list:ExclusiveLock[\nClient[8] waits for [10]]": 1,
  "ForsetiClient[0] can't acquire ExclusiveLock{owner=ForsetiClient[9]} on NODE_RELATIONSHIP_GROUP_DELETE(3703), because holders of that lock are waiting for ForsetiClient[0].\n Wait list:ExclusiveLock[\nClient[9] waits for [0]]": 1,
  "ForsetiClient[11] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(124), because holders of that lock are waiting for ForsetiClient[11].\n Wait list:ExclusiveLock[\nClient[8] waits for [11]]": 1,
  "ForsetiClient[13] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(43), because holders of that lock are waiting for ForsetiClient[13].\n Wait list:ExclusiveLock[\nClient[8] waits for [13]]": 1,
  "ForsetiClient[14] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(171), because holders of that lock are waiting for ForsetiClient[14].\n Wait list:ExclusiveLock[\nClient[8] waits for [5,14]]": 1,
  "ForsetiClient[4] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(149), because holders of that lock are waiting for ForsetiClient[4].\n Wait list:ExclusiveLock[\nClient[8] waits for [4]]": 1,
  "ForsetiClient[14] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(167), because holders of that lock are waiting for ForsetiClient[14].\n Wait list:ExclusiveLock[\nClient[8] waits for [10,14]]": 1,
  "ForsetiClient[16] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(28), because holders of that lock are waiting for ForsetiClient[16].\n Wait list:ExclusiveLock[\nClient[8] waits for [9,16]]": 1,
  "ForsetiClient[9] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(89), because holders of that lock are waiting for ForsetiClient[9].\n Wait list:ExclusiveLock[\nClient[8] waits for [9]]": 1,
  "ForsetiClient[3] can't acquire ExclusiveLock{owner=ForsetiClient[0]} on NODE_RELATIONSHIP_GROUP_DELETE(843), because holders of that lock are waiting for ForsetiClient[3].\n Wait list:ExclusiveLock[\nClient[0] waits for [3]]": 1,
  "ForsetiClient[2] can't acquire ExclusiveLock{owner=ForsetiClient[9]} on NODE_RELATIONSHIP_GROUP_DELETE(4624), because holders of that lock are waiting for ForsetiClient[2].\n Wait list:ExclusiveLock[\nClient[9] waits for [2]]": 1,
  "ForsetiClient[7] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(95), because holders of that lock are waiting for ForsetiClient[7].\n Wait list:ExclusiveLock[\nClient[8] waits for [7]]": 1,
  "ForsetiClient[3] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(56), because holders of that lock are waiting for ForsetiClient[3].\n Wait list:ExclusiveLock[\nClient[8] waits for [3]]": 1,
  "ForsetiClient[5] can't acquire ExclusiveLock{owner=ForsetiClient[8]} on NODE_RELATIONSHIP_GROUP_DELETE(39), because holders of that lock are waiting for ForsetiClient[5].\n Wait list:ExclusiveLock[\nClient[8] waits for [5]]": 1,
  "ForsetiClient[16] can't acquire ExclusiveLock{owner=ForsetiClient[9]} on NODE_RELATIONSHIP_GROUP_DELETE(2050), because holders of that lock are waiting for ForsetiClient[16].\n Wait list:ExclusiveLock[\nClient[9] waits for [16]]": 1
}

I am on macOS Big Sur 11.2.3.
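
The silent failure through the Python driver is probably because apoc.periodic.iterate reports failed batches in its result row instead of raising an exception. A minimal sketch of a check, assuming the result columns `committedOperations`, `failedOperations`, `failedBatches` and `errorMessages`; the helper name `check_iterate_result` is hypothetical:

```python
def check_iterate_result(row):
    """Raise if an apoc.periodic.iterate result row reports failed work.

    `row` is any mapping holding the procedure's result columns, for example
    `record.data()` for a record returned by the Python driver.
    Returns the number of committed operations when nothing failed.
    """
    failed_ops = row.get("failedOperations", 0)
    failed_batches = row.get("failedBatches", 0)
    if failed_ops or failed_batches:
        errors = row.get("errorMessages", {})
        raise RuntimeError(
            f"{failed_ops} operations in {failed_batches} batches failed: {errors}"
        )
    return row.get("committedOperations", 0)


# Example with a hand-written row that mimics a clean run:
print(check_iterate_result(
    {"committedOperations": 13000, "failedOperations": 0, "failedBatches": 0}
))  # prints 13000
```

With a real session, the row would come from something like `session.run(load_query).single().data()`, where `load_query` stands for the loading statement from the first post.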

michael.hunger · October 8, 2021, 8:28am

It could be that your concurrency is too high; you can control that with the concurrency parameter.
Also, those are transient exceptions, so you can use retries.

You can use much larger batch sizes too, e.g. 10k or 100k.
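
A minimal sketch of how these three settings could be passed from Python, assuming the config map is sent as a query parameter. The helper name `build_iterate_call` and the default values (4 workers, 3 retries, 10k batch size) are illustrative assumptions, not recommendations from this thread:

```python
def build_iterate_call(source_query, action_query, batch_size=10000,
                       parallel=True, concurrency=4, retries=3):
    """Return (cypher, params) for an apoc.periodic.iterate call.

    The config keys batchSize, parallel, concurrency and retries are the
    apoc.periodic.iterate options discussed above; the values are examples.
    """
    config = {
        "batchSize": batch_size,
        "parallel": parallel,
        "concurrency": concurrency,
        "retries": retries,
    }
    cypher = "CALL apoc.periodic.iterate($source, $action, $config)"
    params = {"source": source_query, "action": action_query, "config": config}
    return cypher, params


cypher, params = build_iterate_call(
    "CALL apoc.load.csv($file, {sep: 'TAB'}) YIELD map AS row RETURN row",
    "MATCH (h:keyword {nid: row._start}) "
    "MATCH (t:text {nid: row._end}) "
    "CALL apoc.merge.relationship(h, row._type, {}, {}, t, {}) YIELD rel "
    "RETURN count(rel)",
)
print(params["config"])
```

The pair can then be handed to the driver with `session.run(cypher, params)`. The `$file` placeholder inside the source query is left unresolved here and would need to be supplied through the `params` config option or replaced with a literal path.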

lingvisa · October 8, 2021, 5:38pm

I tried concurrency set to 20 or 5 and retries set to 3 or 5, but they don't really help much. What helps is increasing the batch size to a large number, which effectively puts a whole CSV file into one batch, and that does speed things up in my testing. In this case, it doesn't matter whether parallel is true or false; the speed is almost the same.
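
A likely reason parallel makes no difference here is that a batch size larger than the file leaves only one batch, so there is nothing to run concurrently. A quick arithmetic sketch, where the row count of 12,500 is a made-up figure and `batch_count` is a hypothetical helper:

```python
import math


def batch_count(row_count, batch_size):
    """Number of batches needed to cover row_count rows."""
    return math.ceil(row_count / batch_size)


# Hypothetical file of 12,500 rows at the batch sizes mentioned in the thread.
for size in (1000, 10000, 100000):
    print(size, batch_count(12500, size))
# prints:
# 1000 13
# 10000 2
# 100000 1
```

The same reasoning would explain why a small debugging sample under 1000 rows never shows the lock errors.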

lingvisa · October 9, 2021, 4:44pm

It turns out that, in a test on a bigger dataset, batchSize:100000 is two times slower than batchSize:1000 with parallel:false. So it would be great if parallel:true really worked for relationship update operations in the next release.
