I've got Neo4j Community successfully deployed into ECS (via AWS Copilot, no less).
To take a backup, I need to be able to stop the Neo4j service. But it seems that neo4j stop also terminates the container (I assume because neo4j is the CMD being executed by the Dockerfile: docker-neo4j-publish/5.17.0/bullseye/community/Dockerfile at 7422ac53238f689a26144d3c1c5aee434a07a325 in neo4j/docker-neo4j-publish on GitHub). Calling neo4j-admin database dump while the service is running also fails because of the lock held by the Neo4j service.
So I'm wondering if there's some mechanism to stop the service and take a backup without killing the container. Because the container is running in ECS, the documentation on Docker operations for offline backups does not apply. Everything else works, and we have had no issues while testing Neo4j for graph RAG, but we'd also like to be able to dump the database for local testing and development.
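For reference, the two attempts described above look roughly like the sketch below when run through copilot svc exec (the manifest sets exec: true). The database name neo4j and the /data/dumps target directory are assumptions for illustration, not something taken from the deployment, and the run_in_container helper is hypothetical. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch of the two attempts described above, run via `copilot svc exec`.
# Assumptions (not from the deployment itself):
#   - the database is the default one, named "neo4j"
#   - /data/dumps is a hypothetical target directory on the EFS volume
# DRY_RUN defaults to 1, so the commands are only printed, not executed.

SERVICE="neo4j-db"        # service name from the manifest
DUMP_DIR="/data/dumps"    # hypothetical dump location

# Hypothetical helper: print the command, or run it inside the service container.
run_in_container() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'copilot svc exec --name %s --command "%s"\n' "$SERVICE" "$1"
  else
    copilot svc exec --name "$SERVICE" --command "$1"
  fi
}

# Attempt 1: stopping Neo4j ends the container's main process, so ECS replaces the task.
run_in_container "neo4j stop"

# Attempt 2: dumping while Neo4j is still running fails on the store lock.
run_in_container "neo4j-admin database dump neo4j --to-path=$DUMP_DIR"
```

Neither command gets me to a dump: the first takes the task down, and the second is rejected while the database is online.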
And for anyone curious, here is the AWS Copilot manifest for this (note the sidecar used for the healthcheck):
# Your service name will be used in naming your resources like log groups, ECS services, etc.
name: neo4j-db
type: Load Balanced Web Service

# We use a sidecar to respond to the healthcheck so we can stop the neo4j instance
sidecars:
  health:
    port: 7470
    image: ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/our-apps/health-sidecar

# Configuration for your containers and service.
image:
  location: docker.io/neo4j:5-community
  # Port exposed through your container to route traffic to it.
  port: 7474
  depends_on:
    health: start

cpu: 1024 # Number of CPU units for the task.
memory: 2048 # Amount of memory in MiB used by the task.
count: 1 # Number of tasks that should be running in your service.
exec: true # Enable running commands in your container.
network:
  connect: true # Enable Service Connect for intra-environment traffic between services.

# See EFS: https://aws.github.io/copilot-cli/docs/developing/storage/#managed-efs
# This is the path inside the container
storage:
  volumes:
    neo4j_data_volume:
      efs:
        uid: 7474 # The UID of the neo4j user via id -u neo4j
        gid: 7474 # The GID of the neo4j user via id -g neo4j
      path: /data
      read_only: false

# This is a workaround; see:
# - https://github.com/aws/copilot-cli/issues/5907
# - https://github.com/aws/copilot-cli/issues/1292
secrets:
  NEO4J_PLUGINS: /copilot/${COPILOT_APPLICATION_NAME}/${COPILOT_ENVIRONMENT_NAME}/secrets/NEO4J_PLUGINS

variables:
  NEO4J_apoc_export_file_enabled: true
  NEO4J_apoc_import_file_enabled: true
  NEO4J_apoc_import_file_use__neo4j__config: true
  #NEO4J_PLUGINS: "['apoc', 'apoc-extended', 'graph-data-science']"
  NEO4J_dbms_security_procedures_unrestricted: apoc.*,gds.*,algo.*,spatial.*

# Cannot add a certificate to the NLB; must manually do it or use CF
nlb:
  port: 7687/tcp
  target_port: 7687
  stickiness: true

# Force recreate since Neo4j is holding a lock on the file system.
deployment:
  rolling: recreate

# Distribute traffic to your service.
http:
  # Import the existing ownit-shared-lb
  alb: arn:aws:elasticloadbalancing:us-east-1:ACCOUNT:loadbalancer/app/shared-beta-lb/RESOURCE_ID
  path: "/"
  deregistration_delay: 5s # Speeds up deploys
  redirect_to_https: true
  alias: "domain.example.com"
  hosted_zone: "ZONE_ID"
  healthcheck:
    path: "/health"
    port: 7470
    healthy_threshold: 2
    unhealthy_threshold: 3
    grace_period: 240s