add basic cluster management docs

commit 3ea67bfee1 (parent ab7d0ac946)
@@ -31,6 +31,7 @@ include::pmg-mail-filter.adoc[]

include::pmgbackup.adoc[]

include::pmgcm.adoc[]

// Return to normal title levels.

:leveloffset: 0

@@ -110,6 +111,12 @@ Command Line Interface

include::pmgbackup.1-synopsis.adoc[]


*pmgcm* - Proxmox Mail Gateway Cluster Management Toolkit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

include::pmgcm.1-synopsis.adoc[]


*pmgsh* - API Shell
~~~~~~~~~~~~~~~~~~~


pmgcm.adoc (206 changed lines)
@@ -20,14 +20,212 @@ DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Cluster Management Toolkit
==========================
Cluster Management
==================
:pmg-toplevel:
endif::manvolnum[]

Toolkit to simplify cluster management tasks.

We are living in a world where email is becoming more and more
important - failures in email systems are simply not acceptable. To
meet these requirements, we developed the Proxmox HA (High
Availability) Cluster.

The {pmg} HA Cluster consists of a master and several slave nodes
(minimum one node). Configuration is done on the master. Configuration
and data are synchronized to all cluster nodes over a VPN tunnel. This
provides the following advantages:

* centralized configuration management

* fully redundant data storage

* high availability

* high performance

We use a unique application-level clustering scheme, which provides
extremely good performance. Special care was taken to make management
as easy as possible. A complete cluster setup is done within minutes,
and nodes automatically reintegrate after temporary failures without
any operator interaction.

image::images/pmg-ha-cluster.png[]


Hardware requirements
---------------------

There are no special hardware requirements, although it is highly
recommended to use fast and reliable servers with redundant disks on
all cluster nodes (hardware RAID with BBU and write cache enabled).

The HA Cluster can also run in virtualized environments.


Subscriptions
-------------

Each host in a cluster has its own subscription. If you want support
for a cluster, each cluster node needs to have a valid subscription.
All nodes must have the same subscription level.


Load balancing
--------------

You can use one of the mechanisms described in chapter 9 if you want
to distribute mail traffic among the cluster nodes. Please note that
this is not always required, because it is also reasonable to use only
one node to handle SMTP traffic. The second node can then be used as
quarantine host (providing the web interface to the user quarantine).
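
For illustration, the following is a minimal sketch of one such
mechanism: distributing inbound mail over two cluster nodes with
equal-preference MX records. The zone name, hostnames and addresses
(`example.com`, `pmg1`, `pmg2`, `192.168.2.127/128`) are placeholders,
not part of the official setup instructions.

----
; example.com zone file - two MX records with the same preference,
; so sending servers pick either cluster node for inbound mail
example.com.        IN  MX  10  pmg1.example.com.
example.com.        IN  MX  10  pmg2.example.com.
pmg1.example.com.   IN  A       192.168.2.127
pmg2.example.com.   IN  A       192.168.2.128
----

If only one node should receive SMTP traffic, a single MX record
pointing to that node is enough, while users can still reach the
quarantine interface on another node.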


Cluster administration
----------------------

Cluster administration is done with a single command line utility
called `pmgcm`. So you need to log in via SSH to manage the cluster
setup.

NOTE: Always set up the IP configuration before adding a node to the
cluster. IP address, network mask, gateway address and hostname can't
be changed later.
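
As a hedged illustration of that note, you can double-check a node's
network identity with standard Debian tools before running any `pmgcm`
command. The address `192.168.2.128` below is just a placeholder, and
the exact configuration files may differ depending on how the node was
installed.

----
# log in to the node that will join the cluster
ssh root@192.168.2.128

# verify hostname, addresses and default route
hostname --fqdn
ip address show
ip route show
cat /etc/hosts
----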


Creating a Cluster
~~~~~~~~~~~~~~~~~~

You can create a cluster from any existing Proxmox host. All data is
preserved.

* make sure you have the right IP configuration
(IP/MASK/GATEWAY/HOSTNAME), because you cannot change that later

* run the cluster creation command:
+
----
pmgcm create
----


List Cluster Status
~~~~~~~~~~~~~~~~~~~

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)              192.168.2.127    master A           1 day 21:18   0.30    80%    41%
----
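
Each node that has joined the cluster shows up as an additional line
in this listing. The following is a purely hypothetical example with a
second node `pmg6` (CID 2) at `192.168.2.128`; names, role and state
abbreviations, and values are illustrative and will differ on a real
setup.

----
pmgcm status
--NAME(CID)--------------IPADDRESS----ROLE-STATE---------UPTIME---LOAD----MEM---DISK
pmg5(1)              192.168.2.127    master A           1 day 21:18   0.30    80%    41%
pmg6(2)              192.168.2.128    node   A                21:18   0.10    62%    35%
----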


Adding Cluster Nodes
~~~~~~~~~~~~~~~~~~~~

When you add a new node to a cluster (join), all data on that node is
destroyed. The whole database is initialized with the cluster data
from the master.

* make sure you have the right IP configuration

* run the cluster join command (on the new node):
+
----
pmgcm join <master_ip>
----

You need to enter the root password of the master host when asked for
a password.

CAUTION: Node initialization deletes all existing databases, then
stops and restarts all services accessing the database. So do not add
nodes which are already active and receiving mail.

Also, joining a cluster can take several minutes, because the new node
needs to synchronize all data from the master (although this is done
in the background).

NOTE: If you join a new node, existing quarantined items from the
other nodes are not synchronized to the new node.
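
Putting the above together, a join sketched with placeholder addresses
(`192.168.2.128` for the new node, `192.168.2.127` for the master)
might look like this:

----
# on the freshly installed node that should join the cluster
ssh root@192.168.2.128

pmgcm join 192.168.2.127
# when prompted for a password, enter the root password of the master,
# then wait while the node synchronizes its database in the background
----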


Deleting Nodes
~~~~~~~~~~~~~~

Please detach nodes from the cluster network before removing them
from the cluster configuration. Then run the following command on
the master node:

----
pmgcm delete <cid>
----

Parameter `<cid>` is the unique cluster node ID, as listed with `pmgcm status`.
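
For example, assuming the node you want to remove has cluster ID `2`
(as shown in the CID column of `pmgcm status`):

----
pmgcm status          # look up the CID of the node you want to remove
pmgcm delete 2        # remove the node with CID 2 from the cluster configuration
----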


Disaster Recovery
~~~~~~~~~~~~~~~~~

It is highly recommended to use redundant disks on all cluster nodes
(RAID). So in almost all circumstances you just need to replace the
damaged hardware or disk. {pmg} uses an asynchronous
clustering algorithm, so you just need to reboot the repaired node,
and everything will work again transparently.

The following scenarios only apply when you really lose the contents
of the hard disk.


Single Node Failure
^^^^^^^^^^^^^^^^^^^

* delete the failed node on the master
+
----
pmgcm delete <cid>
----

* add (re-join) a new node
+
----
pmgcm join <master_ip>
----


Master Failure
^^^^^^^^^^^^^^

* force another node to be master
+
----
pmgcm promote
----

* tell the other nodes that the master has changed
+
----
pmgcm sync --master_ip <master_ip>
----
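
As a concrete sketch, assume the old master `pmg5` (192.168.2.127) is
lost and the surviving node `pmg6` at `192.168.2.128` should take over
(all names and addresses are placeholders):

----
# on the node that should become the new master (pmg6)
pmgcm promote

# on every other surviving node, point it at the new master
pmgcm sync --master_ip 192.168.2.128
----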


Total Cluster Failure
^^^^^^^^^^^^^^^^^^^^^

* restore the backup (cluster and node information is not restored;
you have to recreate master and nodes)

* tell the restored host to become master
+
----
pmgcm create
----

* install new nodes

* add those new nodes to the cluster
+
----
pmgcm join <master_ip>
----


ifdef::manvolnum[]
include::pmg-copyright.adoc[]
endif::manvolnum[]