mirror of https://git.proxmox.com/git/pve-docs (synced 2025-07-09 08:42:11 +00:00)
ha-manager.adoc: improve section Recover Fenced Services
parent a472fde8cd
commit 480e67e158
@@ -575,20 +575,23 @@ the specified module at startup.
 Recover Fenced Services
 ~~~~~~~~~~~~~~~~~~~~~~~
 
-After a node failed and its fencing was successful we start to recover services
-to other available nodes and restart them there so that they can provide service
-again.
+After a node failed and its fencing was successful, the CRM tries to
+move services from the failed node to nodes which are still online.
 
-The selection of the node on which the services gets recovered is influenced
-by the users group settings, the currently active nodes and their respective
-active service count.
-First we build a set out of the intersection between user selected nodes and
-available nodes. Then the subset with the highest priority of those nodes
-gets chosen as possible nodes for recovery. We select the node with the
-currently lowest active service count as a new node for the service.
-That minimizes the possibility of an overload, which else could cause an
-unresponsive node and as a result a chain reaction of node failures in the
-cluster.
+The selection of nodes, on which those services gets recovered, is
+influenced by the resource `group` settings, the list of currently active
+nodes, and their respective active service count.
+
+The CRM first builds a set out of the intersection between user selected
+nodes (from `group` setting) and available nodes. It then choose the
+subset of nodes with the highest priority, and finally select the node
+with the lowest active service count. This minimizes the possibility
+of an overloaded node.
+
+CAUTION: On node failure, the CRM distributes services to the
+remaining nodes. This increase the service count on those nodes, and
+can lead to high load, especially on small clusters. Please design
+your cluster so that it can handle such worst case scenarios.
 
 
 [[ha_manager_start_failure_policy]]
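The three-step node selection described in the new text (intersect, filter by highest priority, pick the lowest active service count) can be sketched in a few lines. This is only an illustration of the documented procedure, not the actual CRM code. The function name `select_recovery_node`, the dictionary-based inputs, the assumption that a larger number means a higher priority, and the tie-break by node name are all hypothetical choices made for this example.

```python
from typing import Optional


def select_recovery_node(
    group_priorities: dict[str, int],
    online_nodes: set[str],
    active_service_count: dict[str, int],
) -> Optional[str]:
    """Pick a recovery node for one service (illustrative sketch only)."""
    # Step 1: intersect the user-selected group nodes with the online nodes.
    candidates = {
        node: prio
        for node, prio in group_priorities.items()
        if node in online_nodes
    }
    if not candidates:
        return None  # no eligible node is currently available

    # Step 2: keep only the subset with the highest priority.
    # Assumption: a larger number means a higher priority.
    top = max(candidates.values())
    best = [node for node, prio in candidates.items() if prio == top]

    # Step 3: choose the node with the lowest active service count.
    # Ties are broken by node name (an assumption) to stay deterministic.
    return min(best, key=lambda node: (active_service_count.get(node, 0), node))


# Hypothetical cluster: node1 has failed and was fenced.
group = {"node1": 2, "node2": 2, "node3": 1, "node4": 2}
online = {"node2", "node3", "node4"}
counts = {"node2": 5, "node3": 0, "node4": 3}

chosen = select_recovery_node(group, online, counts)
print(chosen)  # node4
```

Here `node3` is idle but is skipped because it sits in a lower priority class. Among the remaining top-priority nodes, `node4` runs fewer services than `node2`, so it receives the recovered service.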