mirror of
https://git.proxmox.com/git/mirror_corosync
synced 2026-01-13 11:01:47 +00:00
This patch handles the situation where the leader node (the node with the lowest node_id) crashes and is started again before the token timeout of the rest of the cluster expires. The newly restarted node restores the ringid of the old ring from stable storage, so it has the same ringid as the rest of the nodes, but its ARU is zero. If the node is able to create a singleton membership before receiving the joinlist from the rest of the cluster, everything works as expected, because the ring id gets increased correctly. But if the node receives a joinlist from another cluster node before its own joinlist, then it continues as if it had never left the cluster. This is not correct, because the new node should always create a singleton configuration first. During the recovery phase, ARUs are compared and, because they differ (the ARU of the old leader node is 0), the other nodes try to send all of their previous messages. This is impossible (even if it were correct), because the other nodes have already freed most of those messages. The implementation uses an assert to limit the maximum number of messages sent during recovery (we could fix this, but it's not really the point). The solution here is to increase the ring_id sequence number by 1 after loading it from storage. During creation of the commit token it is always increased by 4, so it will not collide with an existing sequence. Thanks to Christine Caulfield <ccaulfie@redhat.com> for clarifying the commit message. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> |
||
|---|---|---|
| .gitignore | ||
| apidef.c | ||
| apidef.h | ||
| cfg.c | ||
| cmap.c | ||
| coroparse.c | ||
| cpg.c | ||
| cs_queue.h | ||
| fsm.h | ||
| icmap.c | ||
| ipc_glue.c | ||
| ipcs_stats.h | ||
| logconfig.c | ||
| logconfig.h | ||
| logsys.c | ||
| main.c | ||
| main.h | ||
| Makefile.am | ||
| mon.c | ||
| pload.c | ||
| quorum.c | ||
| quorum.h | ||
| schedwrk.c | ||
| schedwrk.h | ||
| service.c | ||
| service.h | ||
| stats.c | ||
| stats.h | ||
| sync.c | ||
| sync.h | ||
| timer.c | ||
| timer.h | ||
| totemconfig.c | ||
| totemconfig.h | ||
| totemip.c | ||
| totemknet.c | ||
| totemknet.h | ||
| totemnet.c | ||
| totemnet.h | ||
| totempg.c | ||
| totemsrp.c | ||
| totemsrp.h | ||
| totemudp.c | ||
| totemudp.h | ||
| totemudpu.c | ||
| totemudpu.h | ||
| util.c | ||
| util.h | ||
| votequorum.c | ||
| votequorum.h | ||
| vsf_quorum.c | ||
| vsf_ykd.c | ||
| vsf_ykd.h | ||
| vsf.h | ||
| wd.c | ||
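
The ring id handling described in the commit message above can be illustrated with a small sketch. This is not the corosync code: the struct, the function names, and the `SKETCH_` constant are hypothetical and exist only for this example. The only facts taken from the commit message are that the sequence number is increased by 1 after it is loaded from storage and by 4 when a commit token is created.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a ring id (not the corosync struct). */
struct sketch_ring_id {
	uint32_t rep;	/* node id of the ring representative (leader) */
	uint64_t seq;	/* ring sequence number */
};

/* Step applied when a commit token is created; 4 per the commit message. */
#define SKETCH_COMMIT_TOKEN_SEQ_STEP 4

/*
 * Restore the ring id after a restart. The stored sequence number is
 * increased by 1, so the restarted node no longer carries the same ring id
 * as the nodes that are still running on the old ring.
 */
static struct sketch_ring_id
sketch_restore_ring_id(uint32_t node_id, uint64_t stored_seq)
{
	struct sketch_ring_id id = { node_id, stored_seq + 1 };

	return id;
}

/* Ring id chosen when a new ring is formed (commit token creation). */
static struct sketch_ring_id
sketch_next_ring_id(struct sketch_ring_id current, uint32_t rep)
{
	struct sketch_ring_id id = {
		rep, current.seq + SKETCH_COMMIT_TOKEN_SEQ_STEP
	};

	return id;
}

/* Two ring ids are equal only if both representative and sequence match. */
static bool
sketch_ring_id_equal(struct sketch_ring_id a, struct sketch_ring_id b)
{
	return a.rep == b.rep && a.seq == b.seq;
}
```

In this sketch, if the old ring had sequence number 40, the restarted leader restores 41, while the next ring formed by the surviving nodes would use 44. The restored id therefore equals neither the old ring id nor the next one, so the restarted node cannot be mistaken for a member that never left.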