ha-manager: add section for recovery after fencing

Describe how and why nodes get selected on a recovery of a fenced service

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-08-13 16:32:59 +00:00 · 2016-06-14 16:57:45 +02:00 · 2016-06-14 16:57:45 +02:00 · 2957ef8041
commit 2957ef8041
parent a3189ad1f3
1 changed files with 18 additions and 0 deletions
--- a/ha-manager.adoc
+++ b/ha-manager.adoc
@@ -350,6 +350,24 @@ If you have a hardware watchdog available remove its kernel module from the
 blacklist, load it with insmod and restart the 'watchdog-mux' service or reboot
 the node.
 Recover Fenced Services
 ~~~~~~~~~~~~~~~~~~~~~~~
 After a node has failed and been fenced successfully, we start to recover its
 services on other available nodes and restart them there so that they can
 provide service again.
 The selection of the node on which a service gets recovered is influenced by
 the user's group settings, the currently active nodes and their respective
 active service count.
 First we build a set from the intersection of the user-selected nodes and the
 available nodes. From this set, the subset of nodes with the highest priority
 is chosen as the candidates for recovery. Among those, we select the node with
 the currently lowest active service count as the new node for the service.
 This minimizes the possibility of an overload, which could otherwise cause an
 unresponsive node and, as a result, a chain reaction of node failures in the
 cluster.
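The selection steps described above can be illustrated with a minimal sketch. It is not the ha-manager implementation: the function name `select_recovery_node`, the dictionary layout, the rule that a larger number means a higher priority, and the tie-break by node name are all assumptions made for this example.

```python
from typing import Dict, Iterable, Optional


def select_recovery_node(
    group_priorities: Dict[str, int],
    online_nodes: Iterable[str],
    active_services: Dict[str, int],
) -> Optional[str]:
    """Pick a recovery node (hypothetical helper, for illustration only).

    group_priorities: user-selected nodes mapped to a priority
                      (assumption: a larger number means a higher priority).
    online_nodes:     nodes that are currently available.
    active_services:  number of active services per node.
    """
    # Step 1: intersection of user-selected nodes and available nodes.
    candidates = set(group_priorities) & set(online_nodes)
    if not candidates:
        return None

    # Step 2: keep only the candidates with the highest priority.
    top = max(group_priorities[node] for node in candidates)
    best = [node for node in candidates if group_priorities[node] == top]

    # Step 3: choose the node with the lowest active service count.
    # Sorting by name first makes ties deterministic (an assumption).
    return min(sorted(best), key=lambda node: active_services.get(node, 0))


# node1 has been fenced, so it is missing from the online list.
group = {"node1": 2, "node2": 2, "node3": 1, "node4": 2}
online = ["node2", "node3", "node4"]
active = {"node2": 5, "node3": 0, "node4": 3}

print(select_recovery_node(group, online, active))  # prints "node4"
```

In this example node3 is idle but has a lower priority, so it is dropped in the second step. Of the remaining nodes, node4 runs fewer services than node2 and is therefore selected.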
 Groups
 ------