Previously, the dead peer detection timer was scheduled every
dpd_interval. It added dpd_interval to every client's timestamp and, if
the timestamp was larger than the client heartbeat interval * 1.2,
checked whether the client had sent some message. If so, the flag was
reset.
This method was the source of a number of problems, so a different
method is now used.
Every client now has its own timer with a timeout equal to the
(configurable) dpd_interval_coefficient multiplied by the client
heartbeat timeout. When a message is received from the client, the timer
is rescheduled. When the timer callback is called (= the client didn't
send a message during the timeout), the client is disconnected.
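A minimal sketch of the per-client timer idea, written in Python for
illustration only. The class name, its methods, and the millisecond
bookkeeping are hypothetical and are not taken from the qnetd sources;
only dpd_interval_coefficient and the heartbeat timeout come from the
description above.

```python
from dataclasses import dataclass


@dataclass
class ClientDpdTimer:
    """Hypothetical per-client dead peer detection timer (illustration only)."""

    heartbeat_timeout_ms: int          # client heartbeat timeout
    dpd_interval_coefficient: float    # configurable multiplier
    deadline_ms: float = 0.0
    connected: bool = True

    def timeout_ms(self) -> float:
        # Timeout = coefficient multiplied by the client heartbeat timeout.
        return self.dpd_interval_coefficient * self.heartbeat_timeout_ms

    def schedule(self, now_ms: float) -> None:
        # (Re)arm the timer so that it fires one full timeout from now.
        self.deadline_ms = now_ms + self.timeout_ms()

    def on_message(self, now_ms: float) -> None:
        # Any message from the client pushes the deadline forward.
        if self.connected:
            self.schedule(now_ms)

    def poll(self, now_ms: float) -> None:
        # Stand-in for the timer callback: reaching the deadline means the
        # client was silent for the whole timeout, so it is disconnected.
        if self.connected and now_ms >= self.deadline_ms:
            self.connected = False
```

For example, with an assumed coefficient of 1.5 and a 10000 ms heartbeat
timeout, a timer armed at time 0 and refreshed by a message at 9000 ms
would not disconnect the client before 24000 ms.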
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
When a tie happens, prefer the partition that contains members of the
previously active (quorate) partition. This is hard-coded behavior
of the LMS algorithm, so this setting affects only the FFSplit
algorithm. By default it is disabled for backwards
compatibility.
This solves a problem with FFSplit where node A (with the lowest id) is
killed, node B gets the vote, and then node A starts up, creates a
single-node membership, and gets the vote.
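The scenario above can be sketched as follows. This is an illustrative
Python model, not the qnetd implementation: the function name, the
prefer_active_partition parameter, and the "lowest node id wins"
fallback are assumptions inferred from the description.

```python
from typing import FrozenSet, Optional, Sequence

Partition = FrozenSet[int]


def pick_tie_winner(
    partitions: Sequence[Partition],
    previously_quorate: Optional[Partition],
    prefer_active_partition: bool = False,
) -> Partition:
    """Pick the winner among equally sized partitions (illustration only)."""
    if prefer_active_partition and previously_quorate:
        # Prefer a partition that still contains members of the partition
        # that was quorate before the tie.
        for partition in partitions:
            if partition & previously_quorate:
                return partition
    # Assumed fallback: the partition holding the lowest node id wins.
    return min(partitions, key=min)
```

With node A as id 1 and node B as id 2, and B holding the vote after A
was killed, the fallback alone hands the vote back to A once it returns,
while the new setting keeps it with B.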
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
With the default config of running the dpd timer every 10 seconds,
waiting for 2 * client_timeout to clear the message received flag and
then waiting another 2 * client_timeout without a message received, it
was possible that a client was marked as dead only after more than
40 seconds. This made qdevice stop sending the votequorum heartbeat for
too long, so corosync lost the votes from qdevice.
This patch is a simpler solution which just changes the default dpd
timer to 1 second and the timeout to 1.2 * client_timeout.
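A rough estimate of the detection delay, sketched in Python. The model is
an assumption: it treats detection as two waiting phases of
multiplier * client_timeout each (clear the flag, then wait again), plus
at most one timer interval of granularity. The 10 second client_timeout
is inferred from the "more than 40 seconds" figure and is not stated
explicitly.

```python
def worst_case_detection_s(
    dpd_interval_s: float,
    multiplier: float,
    client_timeout_s: float,
) -> float:
    """Rough upper estimate of dead client detection time (assumed model)."""
    # Two phases: one to clear the message received flag, one with no
    # message at all. Each lasts multiplier * client_timeout.
    two_phases = 2 * multiplier * client_timeout_s
    # The timer only looks at clients once per interval, so detection can
    # lag by up to one interval.
    return two_phases + dpd_interval_s


# Old defaults: 10 s timer, 2 * client_timeout.
old_estimate = worst_case_detection_s(10, 2, 10)
# New defaults: 1 s timer, 1.2 * client_timeout.
new_estimate = worst_case_detection_s(1, 1.2, 10)
```

Under this model the old defaults land above 40 seconds, and the new
defaults cut the estimate roughly in half.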
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
This is not needed, at least for cert8 -> cert9, but it's still nice to
have it documented. Also document the NSS_IGNORE_SYSTEM_POLICY=1
workaround.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>