I am trying to understand some curious behavior in Neo4j that is desirable but perplexing: on occasion, the size of the stored data on disk seems to shrink. In the most recent example, I was loading more nodes/edges via CSV bulk load. During the process, the number of nodes and edges increased, but the size on disk decreased (as reported on Linux by "df -h | grep var/lib/neo4j/data"):
BEFORE: 308 million nodes, 441 million edges, 755 GB on disk
AFTER: 309 million nodes, 453 million edges, 516 GB on disk
Is there some background process that compresses data from time to time? System details are below:
Dell OptiPlex 7010, i7-3770, 24GB RAM
Ubuntu Linux 18.04
Neo4j 4.0.0 (though I observed a similar phenomenon with Neo4j 3.5.x)
Driver: py2neo for Python
OS drive: 250GB SSD
Data drive: 2TB HDD formatted as ext4, mounted at /var/lib/neo4j/data to hold the large amount of data
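Since df only reports a single total for the mount, a per-subdirectory breakdown taken before and after a load might show which part of the data directory is shrinking. Below is a minimal sketch of how that could be logged. The helper names (dir_size_bytes, size_by_subdir) are just illustrative, the path is the mount point from my setup, and the script sums apparent file sizes, so the totals will not match df exactly. It may also need to run as a user with read access to the Neo4j files.

```python
import os


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable between listing and stat.
                pass
    return total


def size_by_subdir(base):
    """Return {entry name: size in bytes} for each immediate child of base."""
    sizes = {}
    for entry in os.scandir(base):
        if entry.is_dir(follow_symlinks=False):
            sizes[entry.name] = dir_size_bytes(entry.path)
        elif entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


if __name__ == "__main__":
    base = "/var/lib/neo4j/data"  # assumption: the mount point described above
    if os.path.isdir(base):
        by_size = sorted(size_by_subdir(base).items(), key=lambda kv: -kv[1])
        for name, size in by_size:
            print(f"{size / 1e9:10.2f} GB  {name}")
```

Running this once before and once after a bulk load and comparing the two listings should at least show whether the drop comes from one subdirectory or is spread across all of them.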
Thanks in advance for any insight.