ha-manager.adoc: improve section Recover Fenced Services

2025-07-09 08:42:11 +00:00 · 2016-11-21 11:37:50 +01:00 · 2016-11-21 11:37:50 +01:00 · 480e67e158
commit 480e67e158
parent a472fde8cd
1 changed files with 16 additions and 13 deletions
--- a/ha-manager.adoc
+++ b/ha-manager.adoc
@@ -575,20 +575,23 @@ the specified module at startup.
 Recover Fenced Services
 ~~~~~~~~~~~~~~~~~~~~~~~
-After a node failed and its fencing was successful we start to recover services
-to other available nodes and restart them there so that they can provide service
+After a node failed and its fencing was successful, the CRM tries to
+move services from the failed node to nodes which are still online.
 again.
-The selection of the node on which the services gets recovered is influenced
-by the users group settings, the currently active nodes and their respective
-active service count.
-First we build a set out of the intersection between user selected nodes and
-available nodes. Then the subset with the highest priority of those nodes
-gets chosen as possible nodes for recovery. We select the node with the
-currently lowest active service count as a new node for the service.
-That minimizes the possibility of an overload, which else could cause an
-unresponsive node and as a result a chain reaction of node failures in the
-cluster.
+The selection of nodes, on which those services gets recovered, is
+influenced by the resource `group` settings, the list of currently active
+nodes, and their respective active service count.
+
+The CRM first builds a set out of the intersection between user selected
+nodes (from `group` setting) and available nodes. It then choose the
+subset of nodes with the highest priority, and finally select the node
+with the lowest active service count. This minimizes the possibility
+of an overloaded node.
+
 CAUTION: On node failure, the CRM distributes services to the
 remaining nodes. This increase the service count on those nodes, and
 can lead to high load, especially on small clusters. Please design
 your cluster so that it can handle such worst case scenarios.
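
Reviewer note: the three-step selection described in the added paragraph (intersect group nodes with available nodes, keep the highest-priority subset, pick the node with the lowest active service count) can be sketched roughly as below. This is only an illustrative Python sketch, not the CRM's actual code. The function name `select_recovery_node`, the dictionary-based inputs, the reading that a larger number means a higher priority, the tie-break by node name, and returning `None` for an empty intersection are all assumptions made for the example.

```python
from typing import Dict, Optional, Set


def select_recovery_node(
    group_priorities: Dict[str, int],
    online_nodes: Set[str],
    active_service_count: Dict[str, int],
) -> Optional[str]:
    """Pick a recovery node following the three steps in the text.

    group_priorities maps each user-selected node to its priority
    (assumption: a larger number means a higher priority).
    """
    # Step 1: intersection of user-selected nodes and available nodes.
    candidates = {
        node: prio
        for node, prio in group_priorities.items()
        if node in online_nodes
    }
    if not candidates:
        # Assumption: the sketch simply reports that no node qualifies.
        return None

    # Step 2: keep only the subset with the highest priority.
    top_priority = max(candidates.values())
    best = [node for node, prio in candidates.items() if prio == top_priority]

    # Step 3: choose the node with the lowest active service count.
    # Ties are broken by node name (an assumption, for determinism).
    return min(best, key=lambda node: (active_service_count.get(node, 0), node))


# Hypothetical cluster: node1 has failed and was fenced.
groups = {"node1": 2, "node2": 2, "node3": 1, "node4": 2}
online = {"node2", "node3", "node4"}
counts = {"node2": 5, "node3": 0, "node4": 3}
print(select_recovery_node(groups, online, counts))  # node4
```

In this example `node3` is idle but has a lower priority, so it is excluded in step 2, and `node4` wins over `node2` in step 3 because it runs fewer services. This also illustrates the CAUTION above: every recovered service raises the count on one of the few remaining nodes.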
 [[ha_manager_start_failure_policy]]