Leadership election issue

a.alshafey74 · May 14, 2019, 12:56pm

Dear Members,

I built 3 Core instances with Neo4j 3.4.10 Enterprise. All of them run on my PC on localhost, with a different port for each instance.

1- The 1st part of the image shows all 3 instances running.
2- The 2nd part of the image shows what happens when I stop the "LEADER" instance: a new election takes place and a new LEADER appears.
3- The 3rd part of the image shows what happens when I stop the new "LEADER" instance from step (2): the remaining instance keeps the "FOLLOWER" role and never becomes a "LEADER".

How can I solve this issue?

Regards,
Ahmad

david_allen · May 14, 2019, 4:56pm

This is expected behavior; everything is working normally.

Here is the important part of the docs that discusses this:

Core Servers' main responsibility is to safeguard data. The Core Servers do so by replicating all transactions using the Raft protocol. Raft ensures that the data is safely durable before confirming transaction commit to the end user application. In practice this means once a majority of Core Servers in a cluster (N/2+1) have accepted the transaction, it is safe to acknowledge the commit to the end user application.

In order to tolerate two failed Core Servers we would need to deploy a cluster of five Cores.

The smallest fault tolerant cluster, a cluster that can tolerate one fault, must have three Cores.

It is also possible to create a Causal Cluster consisting of only two Cores. However, that cluster will not be fault-tolerant; if one of the two servers fails, the remaining server will become read-only.

Note that should the Core Server cluster suffer enough failures that it can no longer process writes, it will become read-only to preserve safety.

When you stopped two of the three nodes, you simulated 2 failures in your cluster. The one remaining member can never be a majority of 3 (a majority of 3 is 2), so a leader re-election is not possible. Your database is read-only because the only member left is a follower.
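The majority arithmetic from the docs can be checked with a few lines of Python. This is only a sketch of the N/2+1 rule quoted above, not Neo4j code; the function names `quorum`, `tolerated_failures` and `can_elect_leader` are made up for illustration.

```python
def quorum(n_cores: int) -> int:
    """Smallest majority of n_cores members: floor(N/2) + 1."""
    return n_cores // 2 + 1


def tolerated_failures(n_cores: int) -> int:
    """How many Cores can fail while a majority is still reachable."""
    return n_cores - quorum(n_cores)


def can_elect_leader(configured_cores: int, running_cores: int) -> bool:
    """A leader needs votes from a majority of the configured Cores."""
    return running_cores >= quorum(configured_cores)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(f"{n} cores: quorum={quorum(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
    # 2 cores: quorum=2, tolerated failures=0
    # 3 cores: quorum=2, tolerated failures=1
    # 5 cores: quorum=3, tolerated failures=2

    # The test from this thread: 3 Cores configured, instances stopped one by one.
    print(can_elect_leader(3, 2))  # True  -> a new LEADER is elected
    print(can_elect_leader(3, 1))  # False -> the last member stays FOLLOWER
```

The last two lines correspond to steps (2) and (3) of the original test: with two of three Cores up an election succeeds, with one it cannot.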

Why does it work this way? The usual question is: what if a single node could become a leader by itself? In databases there's a failure mode called "split brain". Suppose you had 2 nodes left of a 3 node cluster, and neither of them crashed, but there was a network partition (they can't talk to one another). Each would think the other had crashed. In this case, if each one elected itself leader and accepted writes, you would end up with a "split brain". Your database would keep taking writes down two separate paths, and you'd end up in a very confusing situation that you would not be able to merge later.
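To make the split-brain argument concrete, here is a small sketch that takes a network partition and lists which sides would be allowed to elect a leader. It is a toy model, not how Neo4j is implemented; the function `sides_allowed_to_lead`, its `votes_needed` parameter, and the node names are all hypothetical.

```python
from typing import List, Optional


def sides_allowed_to_lead(cluster_size: int,
                          partition: List[List[str]],
                          votes_needed: Optional[int] = None) -> List[List[str]]:
    """Return the sides of a partition that are big enough to elect a leader.

    By default a side needs a majority of the whole cluster (N // 2 + 1).
    Pass votes_needed=1 to model the unsafe "anyone may lead" rule.
    """
    if votes_needed is None:
        votes_needed = cluster_size // 2 + 1
    return [side for side in partition if len(side) >= votes_needed]


if __name__ == "__main__":
    # 3-node cluster, core1 is down, core2 and core3 cannot see each other.
    split = [["core2"], ["core3"]]

    # Majority rule: nobody can lead, so the cluster becomes read-only.
    print(sides_allowed_to_lead(3, split))
    # []

    # Unsafe rule: both sides elect themselves, which is a split brain.
    print(sides_allowed_to_lead(3, split, votes_needed=1))
    # [['core2'], ['core3']]

    # Majority rule with a 2/1 split: only the larger side may lead.
    print(sides_allowed_to_lead(3, [["core1", "core2"], ["core3"]]))
    # [['core1', 'core2']]
```

Because two disjoint sides can never both hold more than half of the members, the majority rule guarantees at most one leader, at the cost of going read-only when no side is large enough.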

Relevant documentation: Introduction - Operations Manual

a.alshafey74 · May 15, 2019, 7:55am

Hi David,

I appreciate your effort and support; your explanation is very clear. I understand your points and will run the test again with them in mind.

Thank you for your time and consideration.

Respectfully,
Ahmad
