ha-manager: add section for recovery after fencing

Describe how and why nodes get selected on a recovery of a fenced
service

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht 2016-06-14 16:57:45 +02:00 committed by Dietmar Maurer
parent a3189ad1f3
commit 2957ef8041

@@ -350,6 +350,24 @@ If you have a hardware watchdog available remove its kernel module from the
blacklist, load it with insmod and restart the 'watchdog-mux' service or reboot
the node.
Recover Fenced Services
~~~~~~~~~~~~~~~~~~~~~~~
After a node has failed and its fencing has succeeded, we start to recover its
services on other available nodes and restart them there so that they can
provide service again.
The selection of the node on which a service gets recovered is influenced
by the user's group settings, the currently active nodes, and their respective
active service counts.
First we build the intersection of the user-selected nodes and the available
nodes. From that set, the nodes with the highest priority become the
candidates for recovery. Among those candidates we select the node with the
lowest number of currently active services as the new node for the service.
This minimizes the risk of overloading a node, which could otherwise make it
unresponsive and, as a result, trigger a chain reaction of node failures in
the cluster.
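The selection steps described above can be sketched roughly as follows. This
is only an illustration in Python, not the actual ha-manager code: the function
name `select_recovery_node`, its parameters and the data layout are hypothetical,
and breaking ties by node name is an assumption the text does not specify.

```python
def select_recovery_node(group_nodes, online_nodes, service_count):
    """Pick a recovery node for one fenced service (illustrative sketch).

    group_nodes   -- dict: node name -> priority (higher is preferred),
                     standing in for the user's group settings (assumed layout)
    online_nodes  -- set of currently available node names
    service_count -- dict: node name -> number of active services
    """
    # Step 1: intersection of user-selected nodes and available nodes.
    candidates = {n: p for n, p in group_nodes.items() if n in online_nodes}
    if not candidates:
        return None  # no node is eligible for recovery

    # Step 2: keep only the nodes with the highest priority.
    top_priority = max(candidates.values())
    best = [n for n, p in candidates.items() if p == top_priority]

    # Step 3: choose the node with the lowest active service count.
    # Ties are broken by node name (an assumption, for determinism only).
    return min(best, key=lambda n: (service_count.get(n, 0), n))


group = {"node1": 2, "node2": 2, "node3": 1, "node4": 2}
online = {"node2", "node3", "node4"}          # node1 was fenced
counts = {"node2": 5, "node3": 0, "node4": 3}
print(select_recovery_node(group, online, counts))  # node4
```

Note that `node3` is idle but is not chosen, because priority is evaluated
before the service count; only nodes in the highest-priority subset compete on
load.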
Groups
------