Solves the situation where the tie-breaker node dies in a 2-node cluster.
Because the code contained two bugs, the other node got a NACK instead of
an ACK.
- The algo timer is not a stack, so calling abort and schedule in the timer
callback without setting reschedule is a no-op.
- It's necessary to check not only what the current node thinks about
membership, but also what the other nodes think. If the views diverge -> wait.
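The second fix can be pictured with a minimal sketch. It is not the qnetd code: struct member_view, views_equal and check_views are hypothetical names, and each view is assumed to hold unique node ids.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified membership view reported by one node. */
struct member_view {
	const uint32_t *members;	/* node ids this node currently sees */
	size_t n_members;
};

static bool
view_contains(const struct member_view *v, uint32_t node_id)
{
	for (size_t i = 0; i < v->n_members; i++) {
		if (v->members[i] == node_id) {
			return (true);
		}
	}
	return (false);
}

/* True when both views hold the same set of ids (ids assumed unique). */
static bool
views_equal(const struct member_view *a, const struct member_view *b)
{
	if (a->n_members != b->n_members) {
		return (false);
	}
	for (size_t i = 0; i < a->n_members; i++) {
		if (!view_contains(b, a->members[i])) {
			return (false);
		}
	}
	return (true);
}

enum decision { DECISION_WAIT, DECISION_DECIDE };

/* Decide only when every other node reports the same view as we do. */
static enum decision
check_views(const struct member_view *mine,
    const struct member_view *others, size_t n_others)
{
	for (size_t i = 0; i < n_others; i++) {
		if (!views_equal(mine, &others[i])) {
			return (DECISION_WAIT);	/* views diverge -> wait */
		}
	}
	return (DECISION_DECIDE);
}
```

With views {1, 2} and {2, 1} the sketch decides; with {1, 2} against {1} it waits.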
Thanks to Christine Caulfield <ccaulfie@redhat.com> for fixing the English
in the comments somewhat.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
To prevent receiving a vote from an old membership, the ring id is sent to
the server during init and echoed back to the client in every node list
reply, ask for vote reply and vote info message.
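A minimal sketch of the idea, assuming a ring id is a (node id, sequence number) pair. The struct layout and the names ring_id_eq and vote_is_current are illustrative assumptions, not taken from the source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of a ring id: id of the representative node plus a sequence. */
struct ring_id {
	uint32_t node_id;
	uint64_t seq;
};

static bool
ring_id_eq(const struct ring_id *a, const struct ring_id *b)
{
	return (a->node_id == b->node_id && a->seq == b->seq);
}

/*
 * The client remembers the ring id it last sent to the server. A vote is
 * accepted only if the reply echoes that same ring id; anything else
 * belongs to an older membership and is ignored.
 */
static bool
vote_is_current(const struct ring_id *last_sent, const struct ring_id *echoed)
{
	return (ring_id_eq(last_sent, echoed));
}
```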
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Use the new timers to get a better response from LMS when the network
splits. This also closes a race where the remote side could go inquorate
before we confirmed the vote.
Add client-side (qdevice-net) code to cope with a detached qnetd if we
are quorate and have wait_for_all enabled. That situation will now
keep quorum.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Algo timer is a simplified timer designed for qnetd algorithms. Unlike a
full timer, only one can exist per client. The workflow is:
- In one of the algorithm callbacks, qnetd_client_algo_timer_schedule is
called
- On timeout, .timer_callback is called (for example
qnetd_algo_test_timer_callback). It's possible to set send_vote and
result_vote to send vote info to the client
- It's possible to discard the timer by calling
qnetd_client_algo_timer_abort
The timer is automatically deleted on client disconnect.
To make all this possible, the qnetd main loop now has support for
a timer list (main_timer_list). To be able to handle an error and
disconnect the client from the timer callback, the client has
schedule_disconnect. If this is set to 1, the client is disconnected
during the current pass of the poll loop.
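The workflow above can be sketched as a single timer slot per client. This is an illustration only: the struct fields, the algo_timer_* functions, the callback signature and the VOTE_ACK value are hypothetical stand-ins, and the clock is passed in by hand instead of coming from a real timer list.

```c
#include <stdbool.h>
#include <stdint.h>

#define VOTE_ACK 1	/* hypothetical vote value */

struct client;

/* Callback returns 0 on success, non-zero on error. */
typedef int (*algo_timer_cb)(struct client *c, bool *reschedule,
    bool *send_vote, int *result_vote);

struct client {
	bool timer_active;		/* the single timer slot */
	uint64_t timer_expires_ms;
	uint32_t timer_interval_ms;
	algo_timer_cb timer_cb;
	bool schedule_disconnect;	/* disconnect on this poll pass */
	int last_vote_sent;
};

/* Fails if the client already has a timer: only one can exist. */
static int
algo_timer_schedule(struct client *c, uint64_t now_ms, uint32_t interval_ms,
    algo_timer_cb cb)
{
	if (c->timer_active) {
		return (-1);
	}
	c->timer_active = true;
	c->timer_interval_ms = interval_ms;
	c->timer_expires_ms = now_ms + interval_ms;
	c->timer_cb = cb;
	return (0);
}

static void
algo_timer_abort(struct client *c)
{
	c->timer_active = false;
}

/* Called from the main loop; fires the callback once the timer is due. */
static void
algo_timer_poll(struct client *c, uint64_t now_ms)
{
	bool reschedule = false, send_vote = false;
	int result_vote = 0;

	if (!c->timer_active || now_ms < c->timer_expires_ms) {
		return;
	}
	if (c->timer_cb(c, &reschedule, &send_vote, &result_vote) != 0) {
		c->timer_active = false;
		c->schedule_disconnect = true;
		return;
	}
	if (send_vote) {
		c->last_vote_sent = result_vote;	/* stands in for sending vote info */
	}
	if (reschedule) {
		c->timer_expires_ms = now_ms + c->timer_interval_ms;
	} else {
		c->timer_active = false;
	}
}

/* Example callback: send an ACK once and let the timer expire. */
static int
example_timer_callback(struct client *c, bool *reschedule, bool *send_vote,
    int *result_vote)
{
	(void)c;
	*reschedule = false;
	*send_vote = true;
	*result_vote = VOTE_ACK;
	return (0);
}
```

In this sketch the poll step decides the fate of the slot from the reschedule flag alone, so a callback that wants the timer to keep running has to set it.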
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
ring_id should only be copied into the client structure after the
algorithm has run (so the last one is also available), so fix the
algorithms to use the passed-in ring_id where available.
Also tidy some debug logging in algo-lms.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
- Add support for cmap node list configuration change
- Add client side algorithms
- Check that the ring id received in a membership message
equals the last sent ring id
- Send config node list only if config node list really changes and not
after every reload
- Add tlv_ring_id_eq (replacing qnetd_algo_rings_eq) so it's usable in
client
- Move debug logs from algo-test into qnetd-log-debug.c and call them in
proper places (= logs are now algorithm independent)
- Fix memory leak in msg
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Move several commonly used routines into their own
qnetd-algo-utils.[ch] files and change over to using
the ring_id held in the client structure rather than
managing it ourselves.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
This state is used for informative-only callbacks (quorum node list) and
possibly informative-only callbacks (configuration node list). The client
doesn't change the cast vote timer state.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
We were looking for ourselves in the other nodes' node lists, rather
than for the other nodes in our own node list.
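The corrected direction of the lookup, as a rough sketch. The function names and the flat arrays of node ids are hypothetical simplifications, not the actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool
node_list_contains(const uint32_t *list, size_t n, uint32_t node_id)
{
	for (size_t i = 0; i < n; i++) {
		if (list[i] == node_id) {
			return (true);
		}
	}
	return (false);
}

/*
 * For every other connected node, check whether its id appears in OUR
 * node list (not whether our id appears in theirs), and count the hits.
 */
static size_t
count_others_in_our_list(const uint32_t *our_list, size_t n_ours,
    const uint32_t *other_ids, size_t n_others)
{
	size_t found = 0;

	for (size_t i = 0; i < n_others; i++) {
		if (node_list_contains(our_list, n_ours, other_ids[i])) {
			found++;
		}
	}
	return (found);
}
```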
Also, remove debug print in votequorum.c
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>