Disk Usage: Why is the used disk space so different from the sum of the other related metrics?

Hi there,

I'm using Neo4j 3.4.9, and I've been wondering why the reported total disk space (4.17 GB) is correct while the other metrics don't add up to it. There are 2.3M nodes and 5.6M relationships. Most of the nodes hold short texts (support ticket replies).

image

I've been using Kafka Streams to send data back and forth, both within the DB engine and to and from outside systems (other scripts, etc.).

Does anyone have any advice on this?

I guess the delta might be transaction log files. See Transaction log - Operations Manual.

Maybe branched data stores as well if you use HA style clustering.

Hi Elwosto,

The answer is as follows: Total Store Size = the physical data store files (as in the table above) + the size of the transaction log files (neostore.transaction.db.). So if you are doing a lot of write transactions in a short time, the transaction logs can take up a lot of space. By default, the transaction log files are kept for 7 days.
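
If you want to check that split on disk yourself, here is a minimal Python sketch. It assumes you point it at the database directory and that the transaction logs sit there with the neostore.transaction.db. prefix mentioned above; the function name store_size_breakdown and the constant TX_LOG_PREFIX are made up for this example and are not part of Neo4j.

```python
import os

# Prefix of the transaction log files, as named in the reply above.
TX_LOG_PREFIX = "neostore.transaction.db."


def store_size_breakdown(db_dir):
    """Return (store_bytes, tx_log_bytes) for all files under db_dir.

    Files whose name starts with TX_LOG_PREFIX are counted as transaction
    logs; every other file (in any sub-directory) is counted as store data.
    """
    store_bytes = 0
    tx_log_bytes = 0
    for root, _dirs, files in os.walk(db_dir):
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            if name.startswith(TX_LOG_PREFIX):
                tx_log_bytes += size
            else:
                store_bytes += size
    return store_bytes, tx_log_bytes
```

Calling store_size_breakdown on your database directory (the exact path depends on your install) gives two byte counts. If the second one roughly matches the gap you are seeing, the transaction logs are the explanation.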

regards

Thanks guys, that's very likely it! Considering almost every node was created in a new transaction, it makes a lot of sense.

Thanks a lot. I'll keep an eye on it next week.