Running neo4j-admin backup from inside docker container on cron interval

Greetings! We are running neo4j using docker in a production environment. We are deriving our own neo4j docker image that encapsulates the plug-ins we are using.

We need to have a backup process using neo4j-admin backup that runs from inside the container on a regular time interval as a cron job.

Ideally, the time at which the backup runs would be configurable from outside the container through environment variables.

I was wondering whether there is any guidance available on how we might set this up.



This post discusses the topic right here:

It provides a Docker container with a sample backup script and explains how to do backups with Docker. The example uses Kubernetes, but the approach without Kubernetes is very similar; feel free to follow up with extra questions.

For the scheduling part -- there are many ways to do this but it matters a lot how you're running docker containers in production. Using kubernetes? Using something like docker cloud? Something else?

The simplest cron solution (but not the most elegant) is to actually use cron inside of a container. You can write a shell script to deploy the backup container, and then make a very tiny extra container to run cron like this:

Within most management frameworks though you have better options.
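To make the in-container cron idea a bit more concrete, here is a minimal sketch of how the schedule could be driven by environment variables. The variable names BACKUP_SCHEDULE and BACKUP_DIR are made up for this example (they are not Neo4j settings), and it assumes a cron daemon has been installed in your derived image and that neo4j-admin is on the PATH cron uses.

```shell
#!/bin/sh
# Sketch only: render a crontab entry for the backup from environment variables.
# BACKUP_SCHEDULE and BACKUP_DIR are hypothetical names chosen for this example.

render_backup_cron() {
    # Default schedule: every day at 02:00, in standard five-field cron syntax.
    schedule="${BACKUP_SCHEDULE:-0 2 * * *}"
    backup_dir="${BACKUP_DIR:-/var/lib/neo4j/backups}"
    # Emit one crontab line: "<schedule> <command>".
    printf '%s neo4j-admin backup --backup-dir=%s --name=graph.db-backup\n' \
        "$schedule" "$backup_dir"
}
```

From an entrypoint script you could then install the entry with something like `render_backup_cron | crontab -` before starting the cron daemon. Keep in mind that cron jobs generally do not inherit the container's environment, so anything the backup needs has to be written into the crontab line or a wrapper script.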

Greetings! We are using Rancher. The current production installation runs Rancher 1.6; for new ones we are moving to Rancher 2, which is Kubernetes-based.

I had looked at the scripting docker container approach - but it seems fragile and inelegant, so was seeking an alternative...

I wonder if you have found a solution for that?

One way I thought of doing this is a cron job from outside the container. This works as long as the container name doesn't change:

docker exec CONTAINER_NAME neo4j-admin backup --backup-dir=/var/lib/neo4j/backups --name=graph.db-backup --check-consistency=true

That approach will certainly work.
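If you go the host-side route, a small wrapper script keeps the crontab entry short and puts the container name in one place. This is only a sketch: NEO4J_CONTAINER and DOCKER_BIN are hypothetical variable names, and the backup flags are simply the ones from the command above.

```shell
#!/bin/sh
# Sketch only: host-side wrapper intended to be called from the host's crontab.
# NEO4J_CONTAINER and DOCKER_BIN are hypothetical names chosen for this example.

run_backup() {
    # Container to exec into; override if the name differs or changes.
    container="${NEO4J_CONTAINER:-neo4j}"
    # DOCKER_BIN can be set to "echo" for a dry run that prints the arguments.
    "${DOCKER_BIN:-docker}" exec "$container" neo4j-admin backup \
        --backup-dir=/var/lib/neo4j/backups \
        --name=graph.db-backup \
        --check-consistency=true
}

# When installing this for cron, call run_backup at the end of the script.
```

Running it with DOCKER_BIN=echo is a cheap way to check what would be executed before wiring it into cron. If the container name is not stable, the name could be looked up first, for example with `docker ps --filter`, instead of being hard-coded.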

In the Kubernetes world there's also this:

I'm afraid the real answer here is that there are a bunch of different ways of doing this, depending on the assumptions about your docker environment, like whether you're doing kubernetes, mesos, running on your local machine, some place remote, whether it's on one big machine or a cluster, etc. etc.

UPDATE: almost 2 years later, the neo4j-helm backup container now has built-in job scheduling. See the jobSchedule argument here: