Changing number of core servers required to acknowledge

Hi, I wanted to know whether it is possible to reduce the number of core servers required to acknowledge a transaction before it is committed. I know that we need N/2 + 1 to acknowledge, but can we increase or decrease this number?

No, this is an intrinsic part of the Raft protocol, and is needed to support data durability.

Since commits, cluster member addition, and cluster member removal all require majority quorum, this safeguards your data. Commits also cannot go out of order, so a commit (including on member change events) guarantees that all previous commit operations have been performed by that node.

As an example, let's say you had a 5 node cluster. Quorum is 3 of the 5 nodes, and this is needed for commit operations and cluster member changes. If we lose 2 nodes, we are guaranteed that at least one of the remaining nodes still has the latest data, and due to the nature of Raft, the node that is most caught up will become leader. This guarantees that the leader always has the latest commits, and that data cannot be lost.
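To make the arithmetic concrete, here is a minimal sketch of the majority calculation. The function names `quorum_size` and `tolerated_failures` are made up for illustration and are not part of any Neo4j or Raft API.

```python
def quorum_size(cluster_size: int) -> int:
    """Majority quorum: more than half of the members (N/2 + 1, integer division)."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many members can be offline while a majority is still reachable."""
    return cluster_size - quorum_size(cluster_size)


# The 5 node example from above: quorum is 3, so 2 nodes can be lost.
print(quorum_size(5))         # 3
print(tolerated_failures(5))  # 2
```

Note that an even-sized cluster tolerates no more failures than the next smaller odd-sized one: under this formula a 4 node cluster needs 3 for quorum and can lose only 1 node.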

If we lose 3 nodes instead, then because we've lost quorum, we cannot guarantee that one of the remaining online nodes still has the latest data. It is entirely possible that the 3 offline nodes are the ones that participated in one of the latest commits, and it is entirely possible that some commits have not been replicated/acknowledged by the remaining 2 nodes. The Raft protocol is designed to understand this, and so there can be no leader, and no write capability, since we do not know whether the remaining nodes have all the latest commits. At least one of those offline nodes must be able to come online again with its current store and state to regain quorum and write capability.

If we allowed a node to become leader in this scenario (when we've lost quorum) and for writes to resume, then we would introduce the possibility of a branched store, where the commits start to diverge, and may even conflict with what was committed before. This would also require additional logic to recognize this event, and cope by merging and attempting to reconcile any conflicts, once any of the offline nodes comes back online.

So if we allowed variation in how many cores were required to participate in a commit, it would break the durability guarantee of Raft. We could no longer assume that a majority quorum has the latest commits, and that introduces scenarios where data could be lost, where commits could branch, and where conflicts between branched commits could arise, requiring resolution.
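The overlap argument can be checked by brute force. The sketch below is only an illustration, not how Raft or Neo4j is implemented, and `overlap_guaranteed` is a hypothetical helper: it enumerates every possible set of nodes that acknowledged a commit and every possible set of surviving nodes, and reports whether at least one survivor always holds the commit.

```python
from itertools import combinations


def overlap_guaranteed(cluster_size: int, ack_count: int, survivor_count: int) -> bool:
    """True if every set of `ack_count` acknowledging nodes shares at least
    one member with every set of `survivor_count` surviving nodes."""
    nodes = range(cluster_size)
    for ackers in combinations(nodes, ack_count):
        for survivors in combinations(nodes, survivor_count):
            if not set(ackers) & set(survivors):
                # Found a case where no survivor has the commit.
                return False
    return True


# 5 nodes, majority acks (3), 3 survivors: some survivor always has the commit.
print(overlap_guaranteed(5, 3, 3))  # True

# 5 nodes, majority acks (3), only 2 survivors: the commit may be missing.
print(overlap_guaranteed(5, 3, 2))  # False

# 5 nodes, acks reduced to 2, 3 survivors: the guarantee is lost as well.
print(overlap_guaranteed(5, 2, 3))  # False
```

The last case is the one the question asks about: with fewer than a majority acknowledging, even a surviving majority might not contain any node that holds the latest commit.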
