Excessive transaction logging during DETACH DELETE

I'm using Neo4j Enterprise (v4.4.2) on an AWS EC2 instance running CentOS 7, and I'm having trouble cleaning up a database after a runaway query added about 8M excess labeled nodes.

I'm running a "DETACH DELETE" operation that removes about 8M nodes in small batches (10,000 nodes each). I use this small batch size to avoid running out of memory.

Although the query appears to be behaving as desired, Neo4j is filling the disk with a large new transaction log file every minute or so.

Here is an excerpt from /var/log/neo4j/debug.log:

2022-05-10 19:32:35.273+0000 INFO [o.n.k.d.Database] [covid-b/2ffabc51] Rotated to transaction log [/var/lib/neo4j/data/transactions/covid-b/neostore.transaction.db.131] version=130, last transaction in previous log=523146, rotation took 48 millis, started after 71159 millis.
2022-05-10 19:33:09.861+0000 INFO [o.n.k.d.Database] [covid-b/2ffabc51] Rotated to transaction log [/var/lib/neo4j/data/transactions/covid-b/neostore.transaction.db.132] version=131, last transaction in previous log=523176, rotation took 86 millis, started after 34502 millis.
2022-05-10 19:33:44.477+0000 INFO [o.n.k.d.Database] [covid-b/2ffabc51] Rotated to transaction log [/var/lib/neo4j/data/transactions/covid-b/neostore.transaction.db.133] version=132, last transaction in previous log=523206, rotation took 45 millis, started after 34571 millis.
2022-05-10 19:34:16.330+0000 INFO [o.n.k.d.Database] [covid-b/2ffabc51] Rotated to transaction log [/var/lib/neo4j/data/transactions/covid-b/neostore.transaction.db.134] version=133, last transaction in previous log=523236, rotation took 44 millis, started after 31809 millis.
2022-05-10 19:34:49.175+0000 INFO [o.n.k.d.Database] [covid-b/2ffabc51] Rotated to transaction log [/var/lib/neo4j/data/transactions/covid-b/neostore.transaction.db.135] version=134, last transaction in previous log=523266, rotation took 47 millis, started after 32798 millis.

The resulting files in the transactions subdirectory are many and large:

ls -l /var/lib/neo4j/data/transactions/covid-b
total 6341444
-rw-r--r-- 1 root root 176896 May 10 15:36 checkpoint.0
-rw-r--r-- 1 root root 281823579 May 10 15:20 neostore.transaction.db.120
-rw-r--r-- 1 root root 300188833 May 10 15:22 neostore.transaction.db.121
-rw-r--r-- 1 root root 300139578 May 10 15:23 neostore.transaction.db.122
-rw-r--r-- 1 root root 300247329 May 10 15:24 neostore.transaction.db.123
-rw-r--r-- 1 root root 300140025 May 10 15:25 neostore.transaction.db.124
-rw-r--r-- 1 root root 266161319 May 10 15:26 neostore.transaction.db.125
-rw-r--r-- 1 root root 300331081 May 10 15:28 neostore.transaction.db.126
-rw-r--r-- 1 root root 300511449 May 10 15:29 neostore.transaction.db.127
-rw-r--r-- 1 root root 299871860 May 10 15:30 neostore.transaction.db.128
-rw-r--r-- 1 root root 300844707 May 10 15:31 neostore.transaction.db.129
-rw-r--r-- 1 root root 263526666 May 10 15:32 neostore.transaction.db.130
-rw-r--r-- 1 root root 270276758 May 10 15:33 neostore.transaction.db.131
-rw-r--r-- 1 root root 270243754 May 10 15:33 neostore.transaction.db.132
-rw-r--r-- 1 root root 269995516 May 10 15:34 neostore.transaction.db.133
-rw-r--r-- 1 root root 270051214 May 10 15:34 neostore.transaction.db.134
-rw-r--r-- 1 root root 270246701 May 10 15:35 neostore.transaction.db.135
-rw-r--r-- 1 root root 270053812 May 10 15:35 neostore.transaction.db.136
-rw-r--r-- 1 root root 262469905 May 10 15:36 neostore.transaction.db.137
-rw-r--r-- 1 root root 262577377 May 10 15:37 neostore.transaction.db.138
-rw-r--r-- 1 root root 270839820 May 10 15:37 neostore.transaction.db.139
-rw-r--r-- 1 root root 270257299 May 10 15:38 neostore.transaction.db.140
-rw-r--r-- 1 root root 262144000 May 10 15:38 neostore.transaction.db.141

According to du -h, Neo4j filled this directory with more than 6G of logs in just 18 minutes.

What am I doing wrong and what should I do differently?


DETACH DELETE also removes relationships. Do you have dense nodes, i.e. nodes with, for example, 50k relationships? Deleting one of those may look like deleting just 1 node, but it's really deleting 1 node and 50k relationships.
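
One rough way to check for dense nodes is to list the highest-degree nodes. This is only a sketch: it scans every node, the LIMIT of 10 is an arbitrary choice, and the size() pattern form shown is the 4.x syntax.

```cypher
// List the ten nodes with the most relationships (any type, either direction).
MATCH (n)
WITH n, size((n)--()) AS degree
ORDER BY degree DESC
LIMIT 10
RETURN id(n) AS nodeId, labels(n) AS labels, degree
```

If the top degrees are in the thousands or more, each DETACH DELETE of such a node writes far more to the transaction log than a single node removal would suggest.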

You can also influence transaction log retention via dbms.tx_log.rotation.retention_policy (see "Configuration settings" in the Operations Manual).
This can be set dynamically via CALL dbms.setConfigValue(); see https://neo4j.com/docs/operations-manual/current/configuration/dynamic-settings/
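
As a sketch of what that could look like (run as an admin user; the '1G size' value is only an example I picked for illustration, not a recommendation for your setup):

```cypher
// Show the current log rotation/retention settings and whether each is dynamic.
CALL dbms.listConfig('dbms.tx_log.rotation')
YIELD name, value, dynamic
RETURN name, value, dynamic;

// Keep roughly 1G of transaction logs instead of the current retention policy.
// '1G size' is an example value; choose one that fits your disk and backup needs.
CALL dbms.setConfigValue('dbms.tx_log.rotation.retention_policy', '1G size');
```

Note that a value set this way is not persisted across a restart, so put it in neo4j.conf as well if you want to keep it. Old log files are only pruned after a checkpoint, so the disk usage may not drop immediately.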

I don't think I have any "dense" nodes as you describe them.

Each deleted node (Datapoint) has a single labeled :DATASET relationship to an instance of another labeled node (Dataset). There are typically about 3K Datapoint instances bound to each Dataset, although one anomalous Dataset has many more than that.

I'm not familiar with the transaction-related settings and have not tried to change any of them.

I'll read more about "Dynamic settings". I'm attempting to do a one-time patch of two databases.