I'm trying to create a development workflow where I can spin up a (slightly customised*) Neo4j Community Docker container on AWS EC2 Amazon Linux, then use an image of that instance to create other instances.
When I set up an EC2 instance with my install scripts that get Neo4j running in Docker, everything works fine (I can use cypher-shell to execute statements). However, when I create an image of the instance, then create a new instance from the image, Neo4j isn't working on the new instance.
I've seen three different errors during different attempts:
"The client is unauthorized due to authentication failure." when trying to log in with the neo4j user (which I've assigned a new, permanent password in the container config)
Failed to obtain connection towards WRITE server. Known routing table is: Ttl 1588219230458, currentTime 1588218930464, routers AddressSet=, writers AddressSet=, readers AddressSet=, database '<default database>'
As I said, everything works fine on the instance where the installation steps are initially performed. It's only on instances created from an image of that instance (taken while it was shut down) that the errors happen.
I've grabbed the debug.logs from the good instances and a bad one and there's no difference between them.
Is there something about Neo4j (even inside Docker?) that prevents its data from being used after it's moved to another host?
Am I doing something wrong, or is there some condition that would obviously create the above symptoms?
(* The only customisation on top of the official Docker image is to inject my own neo4j.conf)
I'm running multiple neo4j dbs (but light load, dev dbs) under docker on an AWS host. I'm curious about your use case, do you need to do this or just want to learn about modifying and republishing those modified docker images?
I prefer to keep current with the official releases from Neo4j, so thus far I've always made changes on the fly (scripted) after pulling the official release docker image.
My first thought is potential port/resource collisions. Double check for resource conflicts, ports, host mapped directories/files, etc.
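For the port check, a quick probe along these lines might help. It's only a minimal sketch: it assumes Neo4j's default ports (7474 for HTTP, 7687 for Bolt) and that you run it on the EC2 host itself, and the function name `port_in_use` is just something made up for illustration.

```python
import socket

# Assumption: default Neo4j ports (7474 = HTTP, 7687 = Bolt).
# Adjust these if your neo4j.conf or docker port mapping differs.
NEO4J_PORTS = (7474, 7687)


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in NEO4J_PORTS:
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```

If a port shows as in use before the container has started, something else on the host already holds it.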
Are there any other potentially complicating factors? Are you mapping volumes, and if so, do they include the neo4j config or related files that are on the host? Could that be related to the error? (e.g. are you trying to share any neo4j files across docker containers?)
Thanks for the thoughts, Joel. I'm trying to make a Neo4j image that I can use to regularly deploy new instances. My thought was to get Neo4j all set up and running in Docker, then take an image. It's a pretty standard setup. Config is in the container, data/ and logs/ are mapped to directories on the host. That's about it. I have no idea why it wouldn't work when reconstituted from an image. Seems like maybe something unique about the host is used in authentication hashes or something.
I've now changed it to do what you do and have the instance just contain scripts that set up Neo4j on the first startup. It takes a bit longer and may be more error prone, but at least the image works now.
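For anyone wanting to try the same approach, here is a rough sketch of what a first-startup script could look like. It is not my actual script: the marker file, directory paths, password, container name and image tag are all placeholders, and setting the password through the official image's `NEO4J_AUTH` variable is an assumption for illustration (my real setup injects its own neo4j.conf).

```python
import subprocess
from pathlib import Path


def build_run_command(data_dir, logs_dir, password, image="neo4j:latest"):
    """Build the `docker run` argument list for a Neo4j container."""
    return [
        "docker", "run", "--detach", "--name", "neo4j",
        "--publish", "7474:7474",           # HTTP
        "--publish", "7687:7687",           # Bolt
        "--volume", f"{data_dir}:/data",    # host directory for data/
        "--volume", f"{logs_dir}:/logs",    # host directory for logs/
        # Assumption: password set via the official image's NEO4J_AUTH variable
        "--env", f"NEO4J_AUTH=neo4j/{password}",
        image,
    ]


def first_boot(marker, data_dir, logs_dir, password):
    """Set up Neo4j once; do nothing if the marker file already exists."""
    marker = Path(marker)
    if marker.exists():
        return False
    for directory in (data_dir, logs_dir):
        Path(directory).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_run_command(data_dir, logs_dir, password), check=True)
    marker.touch()  # record that setup has already run on this instance
    return True
```

The idea is that the machine image contains only the script and no Neo4j data, so each new instance creates its own fresh data/ and logs/ directories the first time it starts.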