ha-manager.adoc: improve fencing docs

Dietmar Maurer 2016-11-21 10:19:23 +01:00
parent 0d42707747
commit 61972f5533

@@ -523,25 +523,36 @@ multiple mounts.
 How {pve} Fences
 ~~~~~~~~~~~~~~~~

-There are different methods to fence a node, for example fence devices which
-cut off the power from the node or disable their communication completely.
-
-Those are often quite expensive and bring additional critical components in
-a system, because if they fail you cannot recover any service.
+There are different methods to fence a node, for example, fence
+devices which cut off the power from the node or disable their
+communication completely. Those are often quite expensive and bring
+additional critical components into a system, because if they fail you
+cannot recover any service.

-We thus wanted to integrate a simpler method in the HA Manager first, namely
-self fencing with watchdogs.
+We thus wanted to integrate a simpler fencing method, which does not
+require additional external hardware. This can be done using
+watchdog timers.
+
+.Possible Fencing Methods
+- external power switches
+- isolate nodes by disabling complete network traffic on the switch
+- self fencing using watchdog timers

-Watchdogs are widely used in critical and dependable systems since the
-beginning of micro controllers, they are often independent and simple
-integrated circuit which programs can use to watch them. After opening they need to
-report periodically. If, for whatever reason, a program becomes unable to do
-so the watchdogs triggers a reset of the whole server.
+Watchdog timers are widely used in critical and dependable systems
+since the beginning of micro controllers. They are often independent
+and simple integrated circuits which are used to detect and recover
+from computer malfunctions.
+
+During normal operation, `ha-manager` regularly resets the watchdog
+timer to prevent it from elapsing. If, due to a hardware fault or
+program error, the computer fails to reset the watchdog, the timer
+will elapse and triggers a reset of the whole server (reboot).

-Server motherboards often already include such hardware watchdogs, these need
-to be configured. If no watchdog is available or configured we fall back to the
-Linux Kernel softdog while still reliable it is not independent of the servers
-Hardware and thus has a lower reliability then a hardware watchdog.
+Recent server motherboards often include such hardware watchdogs, but
+these need to be configured. If no watchdog is available or
+configured, we fall back to the Linux Kernel 'softdog'. While still
+reliable, it is not independent of the servers hardware, and thus has
+a lower reliability than a hardware watchdog.

 Configure Hardware Watchdog
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~