Leadership election issue

Dear Members,

I built a cluster of 3 Core instances with Neo4j 3.4.10 Enterprise and run all of them on my PC on localhost, with a different port for each instance.

1- In the 1st part of the image, all 3 instances are running.
2- The 2nd part of the image shows what happens when I stop the "LEADER" instance: a new election takes place and a new LEADER appears.
3- The 3rd part of the image shows what happens when I stop the new "LEADER" instance from step (2): the remaining instance's role stays "FOLLOWER" and it never becomes a "LEADER".

So, how can I solve this issue?

Regards,
Ahmad

This is expected behavior; everything is working normally.

Here is the important part of the docs that discusses this:

Core Servers' main responsibility is to safeguard data. The Core Servers do so by replicating all transactions using the Raft protocol. Raft ensures that the data is safely durable before confirming transaction commit to the end user application. In practice this means once a majority of Core Servers in a cluster ( N/2+1 ) have accepted the transaction, it is safe to acknowledge the commit to the end user application.

  • In order to tolerate two failed Core Servers we would need to deploy a cluster of five Cores.
  • The smallest fault tolerant cluster, a cluster that can tolerate one fault, must have three Cores.
  • It is also possible to create a Causal Cluster consisting of only two Cores. However, that cluster will not be fault-tolerant; if one of the two servers fails, the remaining server will become read-only.

Note that should the Core Server cluster suffer enough failures that it can no longer process writes, it will become read-only to preserve safety.
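The majority rule quoted above can be checked with a little arithmetic. The following is a minimal sketch, not Neo4j code: the function names are made up for illustration, and it assumes the cluster's voting size stays fixed at the number of Cores originally deployed.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of Cores that forms a majority (N/2 + 1, integer division)."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many Cores can fail while a majority is still available."""
    return cluster_size - quorum(cluster_size)


def can_elect_leader(cluster_size: int, running: int) -> bool:
    """A leader can only be elected while a majority of Cores is running."""
    return running >= quorum(cluster_size)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(f"{n} Cores: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
    # The scenario from the question: 3 Cores deployed, only 1 still running.
    print("3 Cores, 1 running, leader possible:", can_elect_leader(3, 1))
```

Under these assumptions, a 3-Core cluster needs 2 running members to elect a leader, which matches the bullets above: three Cores tolerate one fault, five tolerate two, and two tolerate none.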

When you stopped two of the three nodes, you simulated 2 failures in your cluster. The one remaining member can never form a majority of 3, so no new leader can be elected. Your database is read-only because the only member left is a follower.

Why does it work this way? Consider what would happen if a single node could become leader by itself. Databases have a failure mode called "split brain". Suppose you had 2 nodes left of a 3 node cluster, and they didn't crash, but there was a network partition (they can't talk to one another). Each would think the other had crashed. If each one then elected itself leader and accepted writes, you would end up with a "split brain": your database would keep taking writes down two separate paths, leaving two diverging histories that you would not be able to merge later.
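To see why the majority rule prevents split brain, here is a small sketch (again hypothetical helper names, not a Neo4j API) that checks which sides of a network partition would be allowed to elect a leader. Because the sides together contain at most N members, no two of them can both reach a majority of N.

```python
from typing import List


def sides_that_may_elect(cluster_size: int, partition_sizes: List[int]) -> List[bool]:
    """For each side of a partition, report whether it holds a majority of the full cluster."""
    if sum(partition_sizes) > cluster_size:
        raise ValueError("partition sides cannot contain more members than the cluster")
    majority = cluster_size // 2 + 1
    return [size >= majority for size in partition_sizes]


if __name__ == "__main__":
    # 3-Core cluster, one Core stopped, the other two cut off from each other:
    # neither side has a majority, so neither may elect a leader.
    print(sides_that_may_elect(3, [1, 1]))
    # 3-Core cluster split 2 / 1: only the side with two members may elect a leader.
    print(sides_that_may_elect(3, [2, 1]))
```

In the first case both sides stay followers and the database becomes read-only, which is the safe outcome described above; in the second, only one side can ever accept writes.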

Relevant documentation: https://neo4j.com/docs/operations-manual/current/clustering/introduction/

Hi David,

I appreciate your effort and support; your explanation is very clear. I understand your points and will run the test again with that in mind.

Thank you for your time and consideration.

Respectfully,
Ahmad
