Create/Merge getting slower over time

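from neo4j import GraphDatabase

driver = GraphDatabase.driver(..., auth=(...))
session = driver.session()  # opened here but never used or closed below
chunk = chunk.drop_duplicates()
# Send the rows in batches of 500; item[1] holds the Cypher query.
for start in range(0, len(chunk), 500):
    df_temp = chunk[start:start + 500].copy()
    driver.execute_query(item[1], data=df_temp.to_dict('records'))
driver.close()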

Over time, the CREATE and MERGE operations get slower. Does anyone know why? The load starts fast but slows down as it progresses.

@infraplataforma

Which version of Neo4j are you running?

Also, per the MERGE page of the Cypher Manual:

For performance reasons, creating a schema index on the label or property is highly recommended when using MERGE. See Create, show, and delete indexes for more information.

Are there indexes to support the MERGE?
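For reference, a minimal sketch of creating a supporting index or uniqueness constraint from the Python driver. The label `Company` and property `document` are assumed here, and the schema names and the helper `ensure_schema` are made up for illustration. A uniqueness constraint is backed by an index of its own, so in practice only one of the two statements is needed.

```python
# Hypothetical names: adjust the label, property, and schema names to your model.
INDEX_STATEMENT = (
    "CREATE INDEX company_document IF NOT EXISTS "
    "FOR (c:Company) ON (c.document)"
)
CONSTRAINT_STATEMENT = (
    "CREATE CONSTRAINT company_document_unique IF NOT EXISTS "
    "FOR (c:Company) REQUIRE c.document IS UNIQUE"
)


def ensure_schema(driver, unique=True):
    """Run one schema statement through an already-open neo4j driver."""
    statement = CONSTRAINT_STATEMENT if unique else INDEX_STATEMENT
    driver.execute_query(statement)
    return statement
```

Because of `IF NOT EXISTS`, running this again on a database that already has the index or constraint should be a no-op.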

I'm working with AuraDB, and yes, I use constraints and indexes. I tried both MERGE and CREATE, but neither performs well: both get slower over time.

@infraplataforma

Aura, and thus Neo4j v5, I presume?

If you preface the MERGE with PROFILE, do you see the index being used?
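In case it helps, a rough sketch of checking this from the Python driver rather than from Browser. It assumes the 5.x driver, where `execute_query` returns a result whose `summary.profile` is a nested dict for PROFILE queries. The key names `operatorType` and `children`, and the helpers `operator_names`, `uses_index`, and `profile_query`, are assumptions of this sketch, so check them against your driver version.

```python
def operator_names(plan):
    """Flatten a profile dict into a list of operator names, parent first."""
    names = [plan.get("operatorType", "")]
    for child in plan.get("children", []):
        names.extend(operator_names(child))
    return names


def uses_index(plan):
    """True if any operator name in the plan contains 'Index'."""
    return any("Index" in name for name in operator_names(plan))


def profile_query(driver, query, **params):
    """Run the query with PROFILE prepended and return the profile dict."""
    result = driver.execute_query("PROFILE " + query, **params)
    return result.summary.profile
```

Note that PROFILE executes the statement, so a profiled CREATE or MERGE still writes to the database.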

Correct! And yes, the index is being used.

@infraplataforma

Are you able to share the query plan?

Planner COST

Runtime PIPELINED

Runtime version 5.19

Batch size 128

+---------------------+----+--------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator            | Id | Details                                                      | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+---------------------+----+--------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults     |  0 |                                                              |              1 |    0 |       0 |              0 |                        |           |                     |
| |                   +----+--------------------------------------------------------------+----------------+------+---------+----------------+                        |           |                     |
| +EmptyResult        |  1 |                                                              |              1 |    0 |       0 |                |                        |           |                     |
| |                   +----+--------------------------------------------------------------+----------------+------+---------+----------------+                        |           |                     |
| +Create             |  2 | (a)-[r:HAS_PARTNER_RELATION {type: $autoint_2}]->(b)         |              1 |    1 |       3 |                |                        |           |                     |
| |                   +----+--------------------------------------------------------------+----------------+------+---------+----------------+                        |           |                     |
| +MultiNodeIndexSeek |  3 | RANGE INDEX a:Company(document) WHERE document = $autoint_0, |              1 |    0 |       0 |            240 |                    4/5 |    34.648 | Fused in Pipeline 0 |
|                     |    | RANGE INDEX b:Company(document) WHERE document = $autoint_1  |                |      |         |                |                        |           |                     |
+---------------------+----+--------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 3, total allocated memory: 304

I'm working with millions of nodes, but this example uses only two.
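Since a two-node PROFILE cannot show the slowdown, one way to narrow this down is to time each batch during the real load and see whether the slowdown is gradual or starts at a particular point. A minimal sketch, assuming the same 500-row batches as in the original snippet; `timed_batches` is a made-up helper, and `query` stands for whatever Cypher is passed as `item[1]`.

```python
import time


def timed_batches(driver, query, rows, batch_size=500):
    """Send rows in batches and return (start_index, seconds) per batch."""
    timings = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        began = time.perf_counter()
        driver.execute_query(query, data=batch)
        timings.append((start, time.perf_counter() - began))
    return timings
```

It could be called as `timed_batches(driver, item[1], chunk.to_dict('records'))`, with the returned timings printed or plotted against the start index.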