Is Neo4j Docker the recommended approach for production deployment?

lingvisa · November 12, 2020, 6:20pm

I am reading the documentation:

I assume the Neo4j Docker container is the recommended way of deploying it onto multiple containers. As a performance consideration, if my graph is not that big, is it OK to deploy the graph onto a single very powerful machine rather than multiple much less powerful containers? For example, my performance requirement is QPS <= 500. Is that achievable on a single big machine with many clients accessing it concurrently? Read transactions only, no writes.

Also, Neo4j Sandbox is a packaged example of how to use Neo4j Docker, not a product to use or install. It's for demonstration and testing only. Right?

david_allen · November 12, 2020, 6:26pm

Sandbox is for testing and learning, nothing to do with production deploys.

You can choose to use Docker for production, or something else. It's not that Neo4j runs better in one vs. the other; it's ultimately up to what's easiest for you.

You can run Neo4j either with multiple containers as a causal cluster, or on a single larger machine. These are not equivalent ways to run it, though. It has less to do with how much hardware you have and more to do with high availability: you simply can't make a single machine highly available. You need a cluster for that.

Have a look into what clusters do:

lingvisa · November 12, 2020, 7:05pm

@david_allen, so you are suggesting cluster + docker as the preferred method for deployment.

For testing this architecture, can I use one machine with multiple cores, or do I need multiple containers or physical machines? I want to try out Docker + cluster on one machine first. Is that not possible?

Besides, when you say 'you simply can't make a single machine highly available', do you mean that a single machine's resources are insufficient to serve a potentially large number of client requests, or that a single machine may crash and therefore be unable to provide service at all?

david_allen · November 12, 2020, 8:55pm

You can test causal cluster by running many containers on a single machine. It's a good test of the architecture overall, but of course it isn't highly available, since if the single machine crashes, you lose the entire database.

I mean the second one - "high availability" means that the database stays available even if a machine fails.
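To put a number on that, here is a rough Python sketch. It assumes the core servers of a causal cluster need a strict majority of members online to keep accepting writes (the usual rule for Raft-style consensus); the function name is made up for illustration.

```python
def tolerated_core_failures(core_count: int) -> int:
    """Number of core servers that can fail while a strict majority survives."""
    if core_count < 1:
        raise ValueError("a cluster needs at least one core server")
    # A strict majority of n is n // 2 + 1, so (n - 1) // 2 members may be lost.
    return (core_count - 1) // 2


for n in (1, 2, 3, 5):
    print(n, "core(s) tolerate", tolerated_core_failures(n), "failure(s)")
```

Under that assumption, three cores survive the loss of one, but only when each core runs on its own machine. With all containers on one host, a crash of that host takes every core down at once.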

lingvisa · November 13, 2020, 4:48am

@david_allen I also saw the Neo4j Spark connector (Neo4j Connector for Apache Spark v5.0.0 - Neo4j Spark Connector).

How is that project related to the Neo4j Docker architecture? Is it a requirement for using Neo4j Docker?

david_allen · November 13, 2020, 11:44am

The two are not connected - you can use the Neo4j Connector for Apache Spark with Neo4j deployed in docker, and you can also use it with VMs, or wherever you install Neo4j.

The connector doesn't have any restrictions on how you deploy Neo4j, so whatever you pick, they should work together just fine.

shawnngtq · January 25, 2021, 6:45am

@david_allen

I see the local causal cluster tutorial (https://neo4j.com/docs/operations-manual/current/tutorial/local-causal-cluster/#tutorial-local-cluster), which sets everything up on a single machine.

Is there a tutorial for setting up a causal cluster on multiple machines? For example, causal cluster using 4 machines to host 3 core servers and 1 read-replica server?

dominicvivek06 · January 25, 2021, 9:02am

You can change localhost to the address of each respective server:

causal_clustering.initial_discovery_members=localhost:5000,localhost:5001,localhost:5002
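For a multi-machine layout, that value could be generated rather than typed by hand. This is a minimal Python sketch, not an official tool. The hostnames are hypothetical, and it assumes every core server listens for discovery on port 5000 (the first port in the line above). Only core servers are listed, on the assumption that a read replica is given the same list to find them.

```python
DISCOVERY_PORT = 5000  # assumption: same discovery port on every machine


def discovery_members_line(core_hosts, port=DISCOVERY_PORT):
    """Build the initial_discovery_members setting line for neo4j.conf."""
    if not core_hosts:
        raise ValueError("at least one core host is required")
    members = ",".join(f"{host}:{port}" for host in core_hosts)
    return f"causal_clustering.initial_discovery_members={members}"


# Hypothetical hostnames for the three core servers.
cores = ["core1.example.com", "core2.example.com", "core3.example.com"]
print(discovery_members_line(cores))
```

The printed line would then go into neo4j.conf on each server, replacing the localhost version.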

shawnngtq · January 25, 2021, 9:32am

@dominicvivek06

I see that there is (GitHub - neo4j-contrib/neo4j-helm: Helm Charts for running Neo4j on Kubernetes [DEPRECATED]). What I am not sure about is how you assign these servers.

From my limited understanding of Helm (still learning), it's very easy to set up a causal cluster on a single machine. But how do you do that across multiple machines?

Based on (Installation - Neo4j-Helm User Guide), you can create a causal cluster by running:

#  creates a cluster containing 3 core servers and 3 read replicas
helm install my-neo4j \
    --set core.numberOfServers=3,readReplica.numberOfServers=3,acceptLicenseAgreement=yes,neo4jPassword=mySecretPassword .

But this command doesn't specify all the required servers. Should I do that in values.yaml? And will it automatically set up all the Neo4j core / read-replica servers at the specified machine locations?

Can you provide a simple operational step-by-step tutorial on this? Running Neo4j helm across multiple machines. Thanks!

cc @david_allen

david_allen · February 5, 2021, 9:16pm

Installation information for neo4j-helm can be found here: Installation - Neo4j-Helm User Guide

Basically, you probably want to specify a core.numberOfServers setting, and (optionally, possibly) a standalone setting to control the number of servers. Check the documentation.
