Closed
Description
ra:transfer_leadership appears to reduce the quorum count by 1 during a leadership transfer. This leads to cluster failure if you ever call ra:transfer_leadership on a cluster that is one server away from losing quorum. The easiest way to reproduce the behaviour is to run a cluster of 2 nodes and attempt to transfer the leadership, but the same behaviour occurs whenever the cluster is on the cusp of being inquorate (2 servers up out of 3, 3 out of 4, 3 out of 5...).
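To make the "on the cusp" cases concrete, here is a minimal sketch that can be pasted into the Erlang shell. It assumes the usual Raft majority rule (more than half of the configured members) and is not taken from Ra's source; the names Quorum, Spare and Cases are made up for illustration.

```erlang
%% Assumption: standard Raft majority rule, not copied from Ra's implementation.
Quorum = fun(N) -> N div 2 + 1 end.

%% Spare = how many more servers could be lost while a majority is still up.
Spare = fun(Up, N) -> Up - Quorum(N) end.

%% The cases named above, as {ServersUp, ClusterSize}.
Cases = [{2, 2}, {2, 3}, {3, 4}, {3, 5}].

%% Each result is {ServersUp, ClusterSize, QuorumSize, Spare}.
[{Up, N, Quorum(N), Spare(Up, N)} || {Up, N} <- Cases].
```

Every case listed has a spare of 0, so if the transfer effectively takes one voter out of the count, no majority is left until it comes back.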
Repro: note the timeouts when calling ra:members/1 after the call to ra:transfer_leadership/2, and that once the cluster becomes operational again, ra1 is still the leader:
Eshell V12.1.3 (abort with ^G)
(runner@macbookM1)1> ErlangNodes = ['ra1@macbookM1', 'ra2@macbookM1'].
[ra1@macbookM1,ra2@macbookM1]
(runner@macbookM1)2> [io:format("Attempting to communicate with node ~s, response: ~s~n", [N, net_adm:ping(N)]) || N <- ErlangNodes].
Attempting to communicate with node ra1@macbookM1, response: pong
Attempting to communicate with node ra2@macbookM1, response: pong
[ok,ok]
(runner@macbookM1)3> [rpc:call(N, ra, start, []) || N <- ErlangNodes].
[ok,ok]
(runner@macbookM1)4> ServerIds = [{quick_start, N} || N <- ErlangNodes].
[{quick_start,ra1@macbookM1},{quick_start,ra2@macbookM1}]
(runner@macbookM1)5> ClusterName = quick_start.
quick_start
(runner@macbookM1)6> Machine = {simple, fun erlang:'+'/2, 0}.
{simple,fun erlang:'+'/2,0}
(runner@macbookM1)7> {ok, ServersStarted, _ServersNotStarted} = ra:start_cluster(default, ClusterName, Machine, ServerIds).
{ok,[{quick_start,ra2@macbookM1},
{quick_start,ra1@macbookM1}],
[]}
(runner@macbookM1)8> {ok, StateMachineResult, LeaderId} = ra:process_command(hd(ServersStarted), 5).
{ok,5,{quick_start,ra1@macbookM1}}
(runner@macbookM1)9> {ok, 12, LeaderId1} = ra:process_command(LeaderId, 7).
{ok,12,{quick_start,ra1@macbookM1}}
(runner@macbookM1)10> ra:members({quick_start,ra1@macbookM1}).
{ok,[{quick_start,ra1@macbookM1},
{quick_start,ra2@macbookM1}],
{quick_start,ra1@macbookM1}}
(runner@macbookM1)11> ra:transfer_leadership({quick_start,ra1@macbookM1}, {quick_start,ra2@macbookM1}).
ok
(runner@macbookM1)12> ra:members({quick_start,ra1@macbookM1}).
{timeout,{quick_start,ra1@macbookM1}}
(runner@macbookM1)13> ra:members({quick_start,ra1@macbookM1}).
{timeout,{quick_start,ra1@macbookM1}}
(runner@macbookM1)14> ra:members({quick_start,ra1@macbookM1}).
{timeout,{quick_start,ra1@macbookM1}}
(runner@macbookM1)15> ra:members({quick_start,ra1@macbookM1}).
{ok,[{quick_start,ra1@macbookM1},
{quick_start,ra2@macbookM1}],
{quick_start,ra1@macbookM1}}
(runner@macbookM1)16>