Mirror of https://git.proxmox.com/git/pve-docs (synced 2025-08-10 07:56:02 +00:00)

Update pvecm documentation for corosync 3

Parts about multicast and RRP have been removed entirely. Instead, a new
section 'Corosync Redundancy' has been added explaining the concept of links
and link priorities.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>

parent: 3254bfddb3
commit: a9e7c3aa23

pvecm.adoc | 446 lines changed
@@ -56,13 +56,8 @@ Grouping nodes into a cluster has the following advantages:
Requirements
------------

* All nodes must be in the same network as `corosync` uses IP Multicast
to communicate between nodes (also see
http://www.corosync.org[Corosync Cluster Engine]). Corosync uses UDP
ports 5404 and 5405 for cluster communication.
+
NOTE: Some switches do not support IP multicast by default and must be
manually enabled first.
* All nodes must be able to connect to each other via UDP ports 5404 and 5405
for corosync to work.

* Date and time have to be synchronized.

@@ -84,6 +79,11 @@ NOTE: While it's possible for {pve} 4.4 and {pve} 5.0 this is not supported as
production configuration and should only used temporarily during upgrading the
whole cluster from one to another major version.

NOTE: Running a cluster of {pve} 6.x with earlier versions is not possible. The
cluster protocol (corosync) between {pve} 6.x and earlier versions changed
fundamentally. The corosync 3 packages for {pve} 5.4 are only intended for the
upgrade procedure to {pve} 6.0.


Preparing Nodes
---------------
@@ -96,10 +96,13 @@ Currently the cluster creation can either be done on the console (login via
`ssh`) or the API, which we have a GUI implementation for (__Datacenter ->
Cluster__).

While it's often common use to reference all other nodenames in `/etc/hosts`
with their IP this is not strictly necessary for a cluster, which normally uses
multicast, to work. It maybe useful as you then can connect from one node to
the other with SSH through the easier to remember node name.
While it's common to reference all nodenames and their IPs in `/etc/hosts` (or
make their names resolvable through other means), this is not necessary for a
cluster to work. It may be useful however, as you can then connect from one node
to the other with SSH via the easier to remember node name (see also
xref:pvecm_corosync_addresses[Link Address Types]). Note that we always
recommend to reference nodes by their IP addresses in the cluster configuration.


[[pvecm_create_cluster]]
Create the Cluster
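As a side note to the `/etc/hosts` paragraph in the hunk above, this is a minimal sketch of how such entries resolve a node name; the node names and addresses are illustrative examples, not taken from the commit, and a local sample file stands in for the real `/etc/hosts`:

```shell
# Illustrative /etc/hosts-style entries in a local sample file
# (names and IPs are hypothetical examples)
cat > hosts.sample <<'EOF'
10.10.10.1 hp1
10.10.10.2 hp2
EOF
# look up a node name the way the resolver's "files" backend would
awk '$2 == "hp2" {print $1}' hosts.sample
```

This prints the address mapped to `hp2`, which is why SSH by node name works once the entries exist.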
@@ -113,10 +116,10 @@ node names.
hp1# pvecm create CLUSTERNAME
----

CAUTION: The cluster name is used to compute the default multicast address.
Please use unique cluster names if you run more than one cluster inside your
network. To avoid human confusion, it is also recommended to choose different
names even if clusters do not share the cluster network.
NOTE: It is possible to create multiple clusters in the same physical or logical
network. Use unique cluster names if you do so. To avoid human confusion, it is
also recommended to choose different names even if clusters do not share the
cluster network.

To check the state of your cluster use:

@@ -124,20 +127,6 @@ To check the state of your cluster use:
hp1# pvecm status
----

Multiple Clusters In Same Network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is possible to create multiple clusters in the same physical or logical
network. Each cluster must have a unique name, which is used to generate the
cluster's multicast group address. As long as no duplicate cluster names are
configured in one network segment, the different clusters won't interfere with
each other.

If multiple clusters operate in a single network it may be beneficial to setup
an IGMP querier and enable IGMP Snooping in said network. This may reduce the
load of the network significantly because multicast packets are only delivered
to endpoints of the respective member nodes.


[[pvecm_join_node_to_cluster]]
Adding Nodes to the Cluster
@@ -150,7 +139,7 @@ Login via `ssh` to the node you want to add.
----

For `IP-ADDRESS-CLUSTER` use the IP or hostname of an existing cluster node.
An IP address is recommended (see xref:pvecm_corosync_addresses[Ring Address Types]).
An IP address is recommended (see xref:pvecm_corosync_addresses[Link Address Types]).

CAUTION: A new node cannot hold any VMs, because you would get
conflicts about identical VM IDs. Also, all existing configuration in
@@ -158,7 +147,7 @@ conflicts about identical VM IDs. Also, all existing configuration in
workaround, use `vzdump` to backup and restore to a different VMID after
adding the node to the cluster.

To check the state of cluster:
To check the state of the cluster use:

----
# pvecm status
@@ -173,7 +162,7 @@ Date: Mon Apr 20 12:30:13 2015
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000001
Ring ID: 1928
Ring ID: 1/8
Quorate: Yes

Votequorum information
@@ -217,15 +206,15 @@ Adding Nodes With Separated Cluster Network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When adding a node to a cluster with a separated cluster network you need to
use the 'ringX_addr' parameters to set the nodes address on those networks:
use the 'link0' parameter to set the nodes address on that network:

[source,bash]
----
pvecm add IP-ADDRESS-CLUSTER -ring0_addr IP-ADDRESS-RING0
pvecm add IP-ADDRESS-CLUSTER -link0 LOCAL-IP-ADDRESS-LINK0
----

If you want to use the Redundant Ring Protocol you will also want to pass the
'ring1_addr' parameter.
If you want to use the built-in xref:pvecm_redundancy[redundancy] of the
kronosnet transport layer, also use the 'link1' parameter.


Remove a Cluster Node
@@ -283,7 +272,7 @@ Date: Mon Apr 20 12:44:28 2015
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000001
Ring ID: 1992
Ring ID: 1/8
Quorate: Yes

Votequorum information
@@ -302,8 +291,8 @@ Membership information
0x00000003 1 192.168.15.92
----

If, for whatever reason, you want that this server joins the same
cluster again, you have to
If, for whatever reason, you want this server to join the same cluster again,
you have to

* reinstall {pve} on it from scratch

@@ -329,14 +318,14 @@ storage with another cluster, as storage locking doesn't work over cluster
boundary. Further, it may also lead to VMID conflicts.

Its suggested that you create a new storage where only the node which you want
to separate has access. This can be an new export on your NFS or a new Ceph
to separate has access. This can be a new export on your NFS or a new Ceph
pool, to name a few examples. Its just important that the exact same storage
does not gets accessed by multiple clusters. After setting this storage up move
all data from the node and its VMs to it. Then you are ready to separate the
node from the cluster.

WARNING: Ensure all shared resources are cleanly separated! You will run into
conflicts and problems else.
WARNING: Ensure all shared resources are cleanly separated! Otherwise you will
run into conflicts and problems.

First stop the corosync and the pve-cluster services on the node:
[source,bash]
@@ -400,6 +389,7 @@ the nodes can still connect to each other with public key authentication. This
should be fixed by removing the respective keys from the
'/etc/pve/priv/authorized_keys' file.


Quorum
------

@@ -419,12 +409,13 @@ if it loses quorum.

NOTE: {pve} assigns a single vote to each node by default.


Cluster Network
---------------

The cluster network is the core of a cluster. All messages sent over it have to
be delivered reliable to all nodes in their respective order. In {pve} this
part is done by corosync, an implementation of a high performance low overhead
be delivered reliably to all nodes in their respective order. In {pve} this
part is done by corosync, an implementation of a high performance, low overhead
high availability development toolkit. It serves our decentralized
configuration file system (`pmxcfs`).

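As a quick illustration of the majority rule behind quorum mentioned in the hunk above, here is a sketch of the arithmetic (not a pvecm command; with one vote per node, strictly more than half the votes are required):

```shell
# minimum votes needed for quorum with one vote per node (a sketch)
nodes=5
needed=$(( nodes / 2 + 1 ))
echo "$needed"
```

For a 5-node cluster this prints 3, which is why losing two nodes keeps the cluster quorate but losing three does not.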
@@ -432,75 +423,57 @@ configuration file system (`pmxcfs`).
Network Requirements
~~~~~~~~~~~~~~~~~~~~
This needs a reliable network with latencies under 2 milliseconds (LAN
performance) to work properly. While corosync can also use unicast for
communication between nodes its **highly recommended** to have a multicast
capable network. The network should not be used heavily by other members,
ideally corosync runs on its own network.
*never* share it with network where storage communicates too.
performance) to work properly. The network should not be used heavily by other
members, ideally corosync runs on its own network. Do not use a shared network
for corosync and storage (except as a potential low-priority fallback in a
xref:pvecm_redundancy[redundant] configuration).

Before setting up a cluster it is good practice to check if the network is fit
for that purpose.
Before setting up a cluster, it is good practice to check if the network is fit
for that purpose. To make sure the nodes can connect to each other on the
cluster network, you can test the connectivity between them with the `ping`
tool.

* Ensure that all nodes are in the same subnet. This must only be true for the
network interfaces used for cluster communication (corosync).
If the {pve} firewall is enabled, ACCEPT rules for corosync will automatically
be generated - no manual action is required.

* Ensure all nodes can reach each other over those interfaces, using `ping` is
enough for a basic test.
NOTE: Corosync used Multicast before version 3.0 (introduced in {pve} 6.0).
Modern versions rely on https://kronosnet.org/[Kronosnet] for cluster
communication, which, for now, only supports regular UDP unicast.

* Ensure that multicast works in general and a high package rates. This can be
done with the `omping` tool. The final "%loss" number should be < 1%.
+
[source,bash]
----
omping -c 10000 -i 0.001 -F -q NODE1-IP NODE2-IP ...
----

* Ensure that multicast communication works over an extended period of time.
This uncovers problems where IGMP snooping is activated on the network but
no multicast querier is active. This test has a duration of around 10
minutes.
+
[source,bash]
----
omping -c 600 -i 1 -q NODE1-IP NODE2-IP ...
----

Your network is not ready for clustering if any of these test fails. Recheck
your network configuration. Especially switches are notorious for having
multicast disabled by default or IGMP snooping enabled with no IGMP querier
active.

In smaller cluster its also an option to use unicast if you really cannot get
multicast to work.
CAUTION: You can still enable Multicast or legacy unicast by setting your
transport to `udp` or `udpu` in your xref:pvecm_edit_corosync_conf[corosync.conf],
but keep in mind that this will disable all cryptography and redundancy support.
This is therefore not recommended.

Separate Cluster Network
~~~~~~~~~~~~~~~~~~~~~~~~

When creating a cluster without any parameters the cluster network is generally
shared with the Web UI and the VMs and its traffic. Depending on your setup
even storage traffic may get sent over the same network. Its recommended to
change that, as corosync is a time critical real time application.
When creating a cluster without any parameters the corosync cluster network is
generally shared with the Web UI and the VMs and their traffic. Depending on
your setup, even storage traffic may get sent over the same network. Its
recommended to change that, as corosync is a time critical real time
application.

Setting Up A New Network
^^^^^^^^^^^^^^^^^^^^^^^^

First you have to setup a new network interface. It should be on a physical
First you have to set up a new network interface. It should be on a physically
separate network. Ensure that your network fulfills the
xref:pvecm_cluster_network_requirements[cluster network requirements].

Separate On Cluster Creation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is possible through the 'ring0_addr' and 'bindnet0_addr' parameter of
the 'pvecm create' command used for creating a new cluster.
This is possible via the 'linkX' parameters of the 'pvecm create'
command used for creating a new cluster.

If you have setup an additional NIC with a static address on 10.10.10.1/25
and want to send and receive all cluster communication over this interface
If you have set up an additional NIC with a static address on 10.10.10.1/25,
and want to send and receive all cluster communication over this interface,
you would execute:

[source,bash]
----
pvecm create test --ring0_addr 10.10.10.1 --bindnet0_addr 10.10.10.0
pvecm create test --link0 10.10.10.1
----

To check if everything is working properly execute:
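Both the old and new wording in the hunk above require latencies under 2 milliseconds. A small sketch of how one could check that against `ping` output; the summary line below is a canned example standing in for real output of `ping -c 10 <node-ip>`:

```shell
# parse the average round-trip time from a captured `ping` summary line
# (sample line; on a real cluster, run `ping -c 10 <node-ip>` instead)
line='rtt min/avg/max/mdev = 0.123/0.456/0.789/0.100 ms'
avg=$(printf '%s\n' "$line" | awk -F'/' '{print $5}')
echo "$avg"
```

The printed average (in ms) should be well under 2 for a network that is fit for corosync.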
@@ -509,20 +482,20 @@ To check if everything is working properly execute:
systemctl status corosync
----

Afterwards, proceed as descripted in the section to
Afterwards, proceed as described above to
xref:pvecm_adding_nodes_with_separated_cluster_network[add nodes with a separated cluster network].

[[pvecm_separate_cluster_net_after_creation]]
Separate After Cluster Creation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can do this also if you have already created a cluster and want to switch
You can do this if you have already created a cluster and want to switch
its communication to another network, without rebuilding the whole cluster.
This change may lead to short durations of quorum loss in the cluster, as nodes
have to restart corosync and come up one after the other on the new network.

Check how to xref:pvecm_edit_corosync_conf[edit the corosync.conf file] first.
The open it and you should see a file similar to:
Then, open it and you should see a file similar to:

----
logging {
@@ -560,37 +533,41 @@ quorum {
}

totem {
  cluster_name: thomas-testcluster
  cluster_name: testcluster
  config_version: 3
  ip_version: ipv4
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    bindnetaddr: 192.168.30.50
    ringnumber: 0
    linknumber: 0
  }

}
----

The first you want to do is add the 'name' properties in the node entries if
you do not see them already. Those *must* match the node name.
NOTE: `ringX_addr` actually specifies a corosync *link address*, the name "ring"
is a remnant of older corosync versions that is kept for backwards
compatibility.

Then replace the address from the 'ring0_addr' properties with the new
addresses. You may use plain IP addresses or also hostnames here. If you use
The first thing you want to do is add the 'name' properties in the node entries
if you do not see them already. Those *must* match the node name.

Then replace all addresses from the 'ring0_addr' properties of all nodes with
the new addresses. You may use plain IP addresses or hostnames here. If you use
hostnames ensure that they are resolvable from all nodes. (see also
xref:pvecm_corosync_addresses[Ring Address Types])
xref:pvecm_corosync_addresses[Link Address Types])

In my example I want to switch my cluster communication to the 10.10.10.1/25
network. So I replace all 'ring0_addr' respectively. I also set the bindnetaddr
in the totem section of the config to an address of the new network. It can be
any address from the subnet configured on the new network interface.
In this example, we want to switch the cluster communication to the
10.10.10.1/25 network. So we replace all 'ring0_addr' respectively.

After you increased the 'config_version' property the new configuration file
NOTE: The exact same procedure can be used to change other 'ringX_addr' values
as well, although we recommend to not change multiple addresses at once, to make
it easier to recover if something goes wrong.

After we increase the 'config_version' property, the new configuration file
should look like:

----

logging {
  debug: off
  to_syslog: yes
@@ -626,26 +603,28 @@ quorum {
}

totem {
  cluster_name: thomas-testcluster
  cluster_name: testcluster
  config_version: 4
  ip_version: ipv4
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    bindnetaddr: 10.10.10.1
    ringnumber: 0
    linknumber: 0
  }

}
----

Now after a final check whether all changed information is correct we save it
and see again the xref:pvecm_edit_corosync_conf[edit corosync.conf file] section to
learn how to bring it in effect.
Then, after a final check if all changed information is correct, we save it and
once again follow the xref:pvecm_edit_corosync_conf[edit corosync.conf file]
section to bring it into effect.

As our change cannot be enforced live from corosync we have to do an restart.
The changes will be applied live, so restarting corosync is not strictly
necessary. If you changed other settings as well, or notice corosync
complaining, you can optionally trigger a restart.

On a single node execute:

[source,bash]
----
systemctl restart corosync
@@ -665,7 +644,8 @@ They will then join the cluster membership one by one on the new network.
Corosync addresses
~~~~~~~~~~~~~~~~~~

A corosync link or ring address can be specified in two ways:
A corosync link address (for backwards compatibility denoted by 'ringX_addr' in
`corosync.conf`) can be specified in two ways:

* **IPv4/v6 addresses** will be used directly. They are recommended, since they
are static and usually not changed carelessly.
@@ -691,104 +671,132 @@ Nodes that joined the cluster on earlier versions likely still use their
unresolved hostname in `corosync.conf`. It might be a good idea to replace
them with IPs or a seperate hostname, as mentioned above.

[[pvecm_rrp]]
Redundant Ring Protocol
~~~~~~~~~~~~~~~~~~~~~~~
To avoid a single point of failure you should implement counter measurements.
This can be on the hardware and operating system level through network bonding.

Corosync itself offers also a possibility to add redundancy through the so
called 'Redundant Ring Protocol'. This protocol allows running a second totem
ring on another network, this network should be physically separated from the
other rings network to actually increase availability.
[[pvecm_redundancy]]
Corosync Redundancy
-------------------

RRP On Cluster Creation
~~~~~~~~~~~~~~~~~~~~~~~
Corosync supports redundant networking via its integrated kronosnet layer by
default (it is not supported on the legacy udp/udpu transports). It can be
enabled by specifying more than one link address, either via the '--linkX'
parameters of `pvecm` (while creating a cluster or adding a new node) or by
specifying more than one 'ringX_addr' in `corosync.conf`.

The 'pvecm create' command provides the additional parameters 'bindnetX_addr',
'ringX_addr' and 'rrp_mode', can be used for RRP configuration.
NOTE: To provide useful failover, every link should be on its own
physical network connection.

NOTE: See the xref:pvecm_corosync_conf_glossary[glossary] if you do not know what each parameter means.

So if you have two networks, one on the 10.10.10.1/24 and the other on the
10.10.20.1/24 subnet you would execute:

[source,bash]
----
pvecm create CLUSTERNAME -bindnet0_addr 10.10.10.1 -ring0_addr 10.10.10.1 \
-bindnet1_addr 10.10.20.1 -ring1_addr 10.10.20.1
----

RRP On Existing Clusters
~~~~~~~~~~~~~~~~~~~~~~~~

You will take similar steps as described in
xref:pvecm_separate_cluster_net_after_creation[separating the cluster network] to
enable RRP on an already running cluster. The single difference is, that you
will add `ring1` and use it instead of `ring0`.

First add a new `interface` subsection in the `totem` section, set its
`ringnumber` property to `1`. Set the interfaces `bindnetaddr` property to an
address of the subnet you have configured for your new ring.
Further set the `rrp_mode` to `passive`, this is the only stable mode.

Then add to each node entry in the `nodelist` section its new `ring1_addr`
property with the nodes additional ring address.

So if you have two networks, one on the 10.10.10.1/24 and the other on the
10.10.20.1/24 subnet, the final configuration file should look like:
Links are used according to a priority setting. You can configure this priority
by setting 'knet_link_priority' in the corresponding interface section in
`corosync.conf`, or, preferrably, using the 'priority' parameter when creating
your cluster with `pvecm`:

----
totem {
  cluster_name: tweak
  config_version: 9
  ip_version: ipv4
  rrp_mode: passive
  secauth: on
  version: 2
  interface {
    bindnetaddr: 10.10.10.1
    ringnumber: 0
  }
  interface {
    bindnetaddr: 10.10.20.1
    ringnumber: 1
  }
# pvecm create CLUSTERNAME --link0 10.10.10.1,priority=20 --link1 10.20.20.1,priority=15
----

This would cause 'link1' to be used first, since it has the lower priority.

If no priorities are configured manually (or two links have the same priority),
links will be used in order of their number, with the lower number having higher
priority.

Even if all links are working, only the one with the highest priority will see
corosync traffic. Link priorities cannot be mixed, i.e. links with different
priorities will not be able to communicate with each other.

Since lower priority links will not see traffic unless all higher priorities
have failed, it becomes a useful strategy to specify even networks used for
other tasks (VMs, storage, etc...) as low-priority links. If worst comes to
worst, a higher-latency or more congested connection might be better than no
connection at all.

Adding Redundant Links To An Existing Cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add a new link to a running configuration, first check how to
xref:pvecm_edit_corosync_conf[edit the corosync.conf file].

Then, add a new 'ringX_addr' to every node in the `nodelist` section. Make
sure that your 'X' is the same for every node you add it to, and that it is
unique for each node.

Lastly, add a new 'interface', as shown below, to your `totem`
section, replacing 'X' with your link number chosen above.

Assuming you added a link with number 1, the new configuration file could look
like this:

----
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pvecm1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1
    ring1_addr: 10.10.20.1
  }

  node {
    name: pvecm2
  node {
    name: due
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.10.2
    ring1_addr: 10.10.20.2
    ring1_addr: 10.20.20.2
  }

  node {
    name: tre
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.10.10.3
    ring1_addr: 10.20.20.3
  }

  node {
    name: uno
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1
    ring1_addr: 10.20.20.1
  }

  [...] # other cluster nodes here
}

[...] # other remaining config sections here
quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testcluster
  config_version: 4
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
}
----

Bring it in effect like described in the
xref:pvecm_edit_corosync_conf[edit the corosync.conf file] section.
The new link will be enabled as soon as you follow the last steps to
xref:pvecm_edit_corosync_conf[edit the corosync.conf file]. A restart should not
be necessary. You can check that corosync loaded the new link using:

This is a change which cannot take live in effect and needs at least a restart
of corosync. Recommended is a restart of the whole cluster.
----
journalctl -b -u corosync
----

It might be a good idea to test the new link by temporarily disconnecting the
old link on one node and making sure that its status remains online while
disconnected:

----
pvecm status
----

If you see a healthy cluster state, it means that your new link is being used.

If you cannot reboot the whole cluster ensure no High Availability services are
configured and the stop the corosync service on all nodes. After corosync is
stopped on all nodes start it one after the other again.

Corosync External Vote Support
------------------------------
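The link selection rule stated in the added text above (the lowest priority value wins; with equal priorities, the lowest link number wins) can be sketched as a tiny helper. This models the rule as this version of the text describes it, not a corosync API; the `select_link` function and its "linknumber,priority" argument format are hypothetical:

```shell
# sketch of the stated selection rule (assumed semantics, not a corosync API):
# among "linknumber,priority" pairs, the lowest priority value wins,
# ties falling back to the lowest link number
select_link() {
  printf '%s\n' "$@" | sort -t, -k2,2n -k1,1n | head -n1 | cut -d, -f1
}
select_link "0,20" "1,15"
```

With the priorities from the `pvecm create` example above (link0 at 20, link1 at 15), this selects link 1, matching "link1 to be used first, since it has the lower priority".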
@@ -832,10 +840,8 @@ for Debian based hosts, other Linux distributions should also have a package
available through their respective package manager.

NOTE: In contrast to corosync itself, a QDevice connects to the cluster over
TCP/IP and thus does not need a multicast capable network between itself and
the cluster. In fact the daemon may run outside of the LAN and can have
longer latencies than 2 ms.

TCP/IP. The daemon may even run outside of the clusters LAN and can have longer
latencies than 2 ms.

Supported Setups
~~~~~~~~~~~~~~~~
@@ -871,7 +877,6 @@ There are two drawbacks with this:
If you understand the drawbacks and implications you can decide yourself if
you should use this technology in an odd numbered cluster setup.


QDevice-Net Setup
~~~~~~~~~~~~~~~~~

@@ -923,7 +928,6 @@ Membership information

which means the QDevice is set up.


Frequently Asked Questions
~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -961,15 +965,15 @@ pve# pvecm qdevice remove

//Still TODO
//^^^^^^^^^^
//There ist still stuff to add here
//There is still stuff to add here


Corosync Configuration
----------------------

The `/etc/pve/corosync.conf` file plays a central role in {pve} cluster. It
controls the cluster member ship and its network.
For reading more about it check the corosync.conf man page:
The `/etc/pve/corosync.conf` file plays a central role in a {pve} cluster. It
controls the cluster membership and its network.
For further information about it, check the corosync.conf man page:
[source,bash]
----
man corosync.conf
@@ -983,23 +987,23 @@ Here are a few best practice tips for doing this.

Edit corosync.conf
~~~~~~~~~~~~~~~~~~

Editing the corosync.conf file can be not always straight forward. There are
two on each cluster, one in `/etc/pve/corosync.conf` and the other in
Editing the corosync.conf file is not always very straightforward. There are
two on each cluster node, one in `/etc/pve/corosync.conf` and the other in
`/etc/corosync/corosync.conf`. Editing the one in our cluster file system will
propagate the changes to the local one, but not vice versa.

The configuration will get updated automatically as soon as the file changes.
This means changes which can be integrated in a running corosync will take
instantly effect. So you should always make a copy and edit that instead, to
avoid triggering some unwanted changes by an in between safe.
effect immediately. So you should always make a copy and edit that instead, to
avoid triggering some unwanted changes by an in-between safe.

[source,bash]
----
cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new
----

Then open the Config file with your favorite editor, `nano` and `vim.tiny` are
preinstalled on {pve} for example.
Then open the config file with your favorite editor, `nano` and `vim.tiny` are
preinstalled on any {pve} node for example.

NOTE: Always increment the 'config_version' number on configuration changes,
omitting this can lead to problems.
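The copy-and-edit workflow in the hunk above (copy the file, edit the copy, bump 'config_version') can be sketched on a local scratch file; the file content here is a minimal stand-in, and a real edit would start from a copy of `/etc/pve/corosync.conf`:

```shell
# scratch copy standing in for /etc/pve/corosync.conf.new (sample content)
cat > corosync.conf.new <<'EOF'
totem {
  cluster_name: testcluster
  config_version: 3
}
EOF
# always increment config_version before activating the edited file
sed -i 's/config_version: 3/config_version: 4/' corosync.conf.new
grep 'config_version' corosync.conf.new
```

The final `grep` confirms the bumped version before the file would be moved back over `/etc/pve/corosync.conf`.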
@@ -1026,7 +1030,7 @@ systemctl status corosync
journalctl -b -u corosync
----

If the change could applied automatically. If not you may have to restart the
If the change could be applied automatically. If not you may have to restart the
corosync service via:
[source,bash]
----
@@ -1054,7 +1058,6 @@ corosync[1647]: [SERV ] Service engine 'corosync_quorum' failed to load for re

It means that the hostname you set for corosync 'ringX_addr' in the
configuration could not be resolved.


Write Configuration When Not Quorate
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -1080,19 +1083,8 @@ Corosync Configuration Glossary
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ringX_addr::
This names the different ring addresses for the corosync totem rings used for
the cluster communication.

bindnetaddr::
Defines to which interface the ring should bind to. It may be any address of
the subnet configured on the interface we want to use. In general its the
recommended to just use an address a node uses on this interface.

rrp_mode::
Specifies the mode of the redundant ring protocol and may be passive, active or
none. Note that use of active is highly experimental and not official
supported. Passive is the preferred mode, it may double the cluster
communication throughput and increases availability.
This names the different link addresses for the kronosnet connections between
nodes.


Cluster Cold Start
@@ -1127,10 +1119,10 @@ It makes a difference if a Guest is online or offline, or if it has
local resources (like a local disk).

For Details about Virtual Machine Migration see the
xref:qm_migration[QEMU/KVM Migration Chapter]
xref:qm_migration[QEMU/KVM Migration Chapter].

For Details about Container Migration see the
xref:pct_migration[Container Migration Chapter]
xref:pct_migration[Container Migration Chapter].

Migration Type
~~~~~~~~~~~~~~
@@ -1155,7 +1147,6 @@ modern systems is lower because they implement AES encryption in
hardware. The performance impact is particularly evident in fast
networks where you can transfer 10 Gbps or more.


Migration Network
~~~~~~~~~~~~~~~~~

@@ -1175,7 +1166,6 @@ destination node from the network specified in the CIDR form. To
enable this, the network must be specified so that each node has one,
but only one IP in the respective network.


Example
^^^^^^^
