I recently raised an issue where my database, which has heavily edited edges, was growing rapidly to a size far larger than expected. The recommended resolution was to upgrade the database, but the issue seems to have resurfaced on the new version.
Once again, my database ballooned from 6GB to 130GB. After running neo4j-admin database copy, it returned to 6GB.
Is this a side effect of upgrading from 5.24? Is this amount of fragmentation common? Pinging @dana_canzano since they were very helpful previously.
Otherwise, I feel like this is a persistent bug that was not fixed in 5.26.
Do you have details at the file system level for data/databases/ and data/transactions/ showing where all the space is being used? Are we still seeing a very large data/databases/<databaseName>/block.big_values.db, for example many GB in size?
One of the things neo4j-admin database copy does is effectively delete the current data/transactions/<databaseName>/ (this is expected), so if that path is 10GB before running it, you should expect it to be significantly smaller afterwards, i.e. in the MB range. I don't suspect you had 100GB+ of transaction logs before running neo4j-admin database copy, but without a before/after view of the filesystem under data/ this is not easy to confirm.
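For anyone wanting to capture that before/after view, here is a minimal Python sketch that lists the largest files under a directory. The function names dir_sizes and largest are made up for illustration, and the "data" path is an assumption about where the Neo4j data directory sits relative to where you run it, so adjust as needed.

```python
import os

def dir_sizes(root):
    """Return {path relative to root: size in bytes} for every file under root."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes[os.path.relpath(path, root)] = os.path.getsize(path)
            except OSError:
                # File disappeared or is unreadable; skip it.
                continue
    return sizes

def largest(sizes, n=10):
    """Return the n largest (path, bytes) pairs, biggest first."""
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:n]

if __name__ == "__main__":
    # "data" is an assumed location; point this at your actual data directory.
    for path, size in largest(dir_sizes("data")):
        print(f"{size / 1024**2:12.1f} MB  {path}")
```

Running it once before and once after neo4j-admin database copy, and comparing the two listings, should show whether the space is in the store files or in the transaction logs.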
Yeah, it's all in block.big_values.db. This is the old database, where 97.6% of the neo4jclean db is being taken up by these values. Here is a windirstat image of the whole structure:
Thanks for all this detail. I think I'm able to reproduce this, and if so I will report it to engineering and let you know of its progress.
Besides block.big_values.db being large, block.big_values.db.id is also 'relatively' large. This file, like other files ending in .id, typically records the internal ids which were once used and then freed as a result of a delete or update, making them eligible for reuse. But in this case it appears we are not reusing these ids, which leads to the unexpected growth.
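To illustrate why skipping id reuse causes growth, here is a toy Python model. It is not Neo4j's actual implementation: the IdAllocator class, the simulate function, and the use of high_id as a stand-in for store file size are all assumptions made for illustration.

```python
class IdAllocator:
    """Toy id allocator with an optional free list (hypothetical model)."""

    def __init__(self, reuse_freed=True):
        self.reuse_freed = reuse_freed
        self.high_id = 0    # next never-used id; stand-in for store file size
        self.free_ids = []  # ids released by deletes/updates

    def allocate(self):
        # Prefer a freed id when reuse is enabled, else extend the store.
        if self.reuse_freed and self.free_ids:
            return self.free_ids.pop()
        new_id = self.high_id
        self.high_id += 1
        return new_id

    def free(self, id_):
        self.free_ids.append(id_)

def simulate(updates, reuse_freed):
    """Update one value repeatedly; each update frees the old id and allocates one."""
    alloc = IdAllocator(reuse_freed)
    current = alloc.allocate()
    for _ in range(updates):
        alloc.free(current)
        current = alloc.allocate()
    return alloc.high_id, len(alloc.free_ids)

print(simulate(1000, reuse_freed=True))   # (1, 0)
print(simulate(1000, reuse_freed=False))  # (1001, 1000)
```

In this model, with reuse the store stays at one slot no matter how many updates happen. Without reuse, both the store and the list of freed ids grow with every update, which resembles a large block.big_values.db alongside a large block.big_values.db.id.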