Understanding cluster load balancing and high availability

(Henry) #1

Hello. I set up a 3-core cluster in AWS and I'm trying to understand how to configure it properly. Are all cores capable of both reads and writes, or do I assign those roles in the config?

I understand that bolt+routing handles load balancing, but do I point it at only one core? What happens if that core becomes unavailable? Does this mean there's no need for AWS ELB? What about an auto-scaling group? Is one even needed, or is there a mechanism I can set up in the config?

(M. David Allen) #2

In a 3-node cluster you get 1 leader and 2 followers. Only the leader may accept writes. That is, as you add cores you're adding redundancy/safety so that your data is less likely to be lost, and you're scaling your ability to do read queries, but your writes will be limited by the leader.

Load balancers such as AWS ELB can often introduce problems, because they work against the way the bolt+routing protocol works. Load balancers tend to treat all connections as equal, and they don't know, for example, whether your bolt connection is going to issue a write query. They'll just route the connection to one of the three cores, and the query may often fail if, for example, a write ends up on a follower.
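To make the failure mode concrete, here is a minimal toy simulation in plain Python. It is not Neo4j or ELB code: the `Core` class and `round_robin_balancer` function are hypothetical names invented for this sketch, and the balancer is assumed to be a simple round-robin that knows nothing about cluster roles.

```python
from itertools import cycle


class Core:
    """Hypothetical stand-in for one cluster member."""

    def __init__(self, name, role):
        self.name = name
        self.role = role  # "LEADER" or "FOLLOWER"

    def run(self, query_kind):
        # Only the leader may accept writes; followers reject them.
        if query_kind == "write" and self.role != "LEADER":
            raise RuntimeError(f"{self.name} is a follower and cannot accept writes")
        return f"{query_kind} ok on {self.name}"


def round_robin_balancer(cores):
    """Assumed role-unaware balancer: hands each request to the next core in turn."""
    pool = cycle(cores)

    def send(query_kind):
        return next(pool).run(query_kind)

    return send


cores = [Core("A", "LEADER"), Core("B", "FOLLOWER"), Core("C", "FOLLOWER")]
send = round_robin_balancer(cores)

# Send six writes through the balancer and record what happens to each.
results = []
for _ in range(6):
    try:
        results.append(send("write"))
    except RuntimeError as exc:
        results.append(f"FAILED: {exc}")

failed = sum(r.startswith("FAILED") for r in results)
print(f"{failed} of {len(results)} writes failed")
```

In this toy setup only the writes that happen to land on the leader succeed; the rest hit a follower and are rejected, which is the kind of intermittent failure described above.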

What you can do is use a client with bolt+routing and point it at the IP of any of the 3 cores. It will bootstrap a routing table, and as nodes fail or get added to or removed from the cluster topology, the client will keep that routing table up to date. If you connect to machine A in a cluster of A, B, and C, it will discover all three; later, even if A is removed, it will know to keep talking to B, C, and possibly a new D.
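The routing-table idea can be sketched with a small toy model in plain Python. This is an illustration only, not the driver's actual implementation: `Cluster`, `RoutingClient`, `routing_table`, and `target_for` are hypothetical names, and the sketch refreshes the table on every request for simplicity, whereas a real driver would refresh far less often.

```python
class Cluster:
    """Hypothetical stand-in for a cluster: maps member name -> role."""

    def __init__(self, members):
        self.members = dict(members)

    def routing_table(self, asked_member):
        # Any live member can describe the whole topology.
        if asked_member not in self.members:
            raise ConnectionError(f"{asked_member} is unreachable")
        return {
            "writers": [m for m, r in self.members.items() if r == "LEADER"],
            "readers": [m for m, r in self.members.items() if r == "FOLLOWER"],
            "routers": list(self.members),
        }


class RoutingClient:
    """Toy client that bootstraps from one seed and then tracks the topology."""

    def __init__(self, cluster, seed):
        self.cluster = cluster
        # Bootstrap: ask the single seed address for the full routing table.
        self.table = cluster.routing_table(seed)

    def refresh(self):
        # Try every router we already know about until one answers.
        for router in self.table["routers"]:
            try:
                self.table = self.cluster.routing_table(router)
                return
            except ConnectionError:
                continue
        raise ConnectionError("no known cluster member is reachable")

    def target_for(self, access_mode):
        # Simplification: refresh before every request.
        self.refresh()
        key = "writers" if access_mode == "write" else "readers"
        return self.table[key][0]


cluster = Cluster({"A": "LEADER", "B": "FOLLOWER", "C": "FOLLOWER"})
client = RoutingClient(cluster, seed="A")
first_writer = client.target_for("write")

# Simulate A being removed, B becoming leader, and a new member D joining.
del cluster.members["A"]
cluster.members["B"] = "LEADER"
cluster.members["D"] = "FOLLOWER"

second_writer = client.target_for("write")
print(first_writer, second_writer, client.table["readers"])
```

The point of the sketch is that the seed address is only needed for the first contact. Once the client knows the other members, losing the seed does not strand it, and writes follow the leader wherever it moves. Note that the client has to know whether a request is a read or a write in order to pick a target, which is exactly the information a generic load balancer lacks.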

As for auto-scaling groups, these can be helpful insofar as they guarantee the availability of a set of instances.

To learn more about these topics, I'd recommend this link: https://neo4j.com/docs/operations-manual/current/clustering/introduction/

(Henry) #3

Thanks David. This is a very clear explanation.