Deleting records does not reduce size of database

After deleting a few records from the database, the size of the database does not decrease. We are using Neo4j Community Edition (desktop) version 3.1. We have also tried the copy-store utils (bash script), but the script gives errors.

Hello, you may want to review this article on understanding database growth.

A summary of the relevant details: deletion does not compact the store. It only marks existing records as deleted, which makes them eligible for reuse later. In addition, the ids of the deleted entries are stored in id files (so that we quickly know which entries are eligible for reuse). Those files take up space as elements are deleted, so you may see some growth there. When newly added data reuses that space, the id files shrink as those ids are reclaimed and the store does not grow, so you may see the database size decrease somewhat when this happens.
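To make the reuse behaviour concrete, here is a toy model in Python. It is only a sketch of the idea described above, not Neo4j's actual store format: the `RecordStore` class and its method names are invented for illustration, and the `free_ids` list merely stands in for an id file.

```python
class RecordStore:
    """Toy model of a record store with id reuse (NOT Neo4j's real on-disk format)."""

    def __init__(self):
        self.records = []   # slot i holds a payload, or None once marked deleted
        self.free_ids = []  # stands in for an id file: ids eligible for reuse

    def add(self, payload):
        if self.free_ids:
            # Reuse a deleted slot: the id list shrinks, the store does not grow.
            rid = self.free_ids.pop()
            self.records[rid] = payload
        else:
            # No reusable slot: the store grows by one record.
            rid = len(self.records)
            self.records.append(payload)
        return rid

    def delete(self, rid):
        if self.records[rid] is None:
            raise ValueError(f"record {rid} is already deleted")
        # Deletion only marks the slot; nothing is compacted.
        self.records[rid] = None
        self.free_ids.append(rid)

    def store_size(self):
        return len(self.records)

    def id_list_size(self):
        return len(self.free_ids)


store = RecordStore()
ids = [store.add(f"node-{i}") for i in range(5)]
store.delete(ids[1])
store.delete(ids[3])
print(store.store_size(), store.id_list_size())  # 5 2

store.add("new-node")
print(store.store_size(), store.id_list_size())  # 5 1
```

After the two deletes the store still holds five slots, and the list of reusable ids has grown. The next insert fills one of the freed slots instead of extending the store.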

You may also want to pay attention to the transaction logs that build up as database changes are committed. By default the last 7 days of logs are kept, so these may be contributing the most to your graph size. You may want to review transaction log handling, as lowering the retention configuration may prune a good amount of logs that are taking up space. Note that you should never modify or delete transaction logs yourself; use the configuration properties instead.
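To see which of these is actually taking up the space, a small read-only script can total the file sizes in the database directory by category. This is a rough sketch under assumptions: the file-name patterns (transaction logs named `neostore.transaction.db.N`, id files ending in `.id`) are what I would expect in a 3.x store directory and should be checked against your own installation, and the function names are made up for this example. It only reads file sizes and never modifies anything.

```python
import os
import sys
from collections import defaultdict


def categorize(filename):
    """Bucket a file by name. The patterns are ASSUMED 3.x naming conventions."""
    if filename.startswith("neostore.transaction.db"):
        return "transaction logs"
    if filename.endswith(".id"):
        return "id files"
    if filename.startswith("neostore"):
        return "store files"
    return "other"  # indexes, logs, anything not matched above


def usage_by_category(db_dir):
    """Return {category: total bytes} for every file under db_dir (read-only)."""
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(db_dir):
        for name in files:
            try:
                totals[categorize(name)] += os.path.getsize(os.path.join(root, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return dict(totals)


if __name__ == "__main__":
    # Default path taken from this thread; pass your own as the first argument.
    db_dir = sys.argv[1] if len(sys.argv) > 1 else "data/databases/graph.db"
    report = usage_by_category(db_dir)
    for category, size in sorted(report.items(), key=lambda kv: -kv[1]):
        print(f"{category:18s} {size / 1024 ** 2:10.1f} MiB")
```

If the transaction-log bucket dominates, the retention setting is the place to look (in Neo4j 3.x this is, as far as I know, `dbms.tx_log.rotation.retention_policy` in neo4j.conf). If the store files dominate, compaction with a copy tool is the more relevant route.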

Thanks Andrew,

My basic problem is that, as the database size increases, the Neo4j database hangs and the desktop application I am using becomes unusable. I have to restart Neo4j in order to work again. So I wanted to remove old data that is of no use, reduce the size of the database, and fix the Neo4j hang. Currently the database size is 15 GB. Is there anything else I can do to fix the hang problem?

I'm not sure you're looking in the right place for a solution to that kind of problem. The size of the db is rarely the direct cause of a hang. We'll need more information about what you're trying to do, including your query and the indexes/constraints you have on the graph.

If you want to clean the whole database, you can just remove the data/databases/graph.db directory and restart.

If you want to keep the data after your deletion but want to compact it, try my

Hello Michael,

Thanks for getting back on this.

We want to keep the data after deletion but compact the database. As you suggested, we are using store-utils for compaction, but we are getting errors. Please see the attached screenshots. We are using Windows 10.

Hi!

Could you please let me know what version of Neo4j you're using, along with what version of the util tool you downloaded? Looking at your screenshots, it looks like you're using Neo4j CE 3.1.1 but 3.4.5 of the utils? Is that correct?

Thanks!

I am using Neo4j version 3.1.1 and utils version 3.4.5.

Hi!

Thanks for that. I suspect the problem you're having is a version mismatch between your Neo4j database and the utils you're using. We recommend that you keep the utils/APOC plugins in sync with the database version that you have. Is it possible to upgrade the version of Neo4j you are currently using?

No, I cannot upgrade the database version. But I have now downloaded store-utils 3.1 from branch 31 of the GitHub repository. Will that work?

Hopefully! I suspect the issues you've been experiencing come from a version mismatch between the store utils and the database.

Please do let us know how you get on.

After downloading version 3.1 from GitHub, I managed to run store-utils successfully and compact the database.

Thanks so much for your help.

One more thing I'd like to ask about: we have a database of around 15 GB which hangs often. Approximately 200 users work on the application. The server where the database is installed is a high-end server with an 8-core processor, 32 GB RAM, and a 500 GB SSD. Can you suggest what can be done to solve this problem? In the Neo4j conf we have 16 GB for the cache and the max heap size.

Pleased to hear it's all working now!

There are many things that could be causing the challenges you're experiencing, and more information would be needed. Are you currently working with a Neo4j rep for your project?

No, I am not working with a Neo4j rep. My application is a desktop application developed in Node.js using Neo4j 3.1 Community Edition. Please let me know what information you need in order to suggest something for the problem I am facing.

Hello Ljubica,

I am having a problem with Neo4j crashing when multiple users start working. Below are the debug.log and neo4j.log. Mostly I am seeing an idletimeout error after 30000 ms. Please suggest what should be done. Is this related to the Neo4j version (3.1.1)? Would upgrading to 3.2 fix this issue?

debug.log

neo4j.log

Hey Michael,

I am using this util (3.5.19) to compact my database, which is huge (in the TB range). My database version is 3.5.20; I also tried lower versions (3.5.17/18). For my database, store-utils gets stuck at 99% every time. I analyzed thread dumps, and they always show RUNNABLE. In some instances it ran for 3 weeks without completing. I am using powerful AWS instances (64 vCPUs, 2 TB memory), but the result is the same. The log says the following (after that nothing happens):

copying of 6764780802 relationship records took 9054 seconds (748263 rec/s). Unused Records 689969492 (10%) Removed Records 0 (0%)