mirror_corosync-qdevice/man/corosync-qdevice.8
Jan Friesse 9a1955a7d6 Initial import from corosync codebase
Used the code from corosync master
(31ddba64a2726bcedf81eb84df2e2da4846832f7)

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2018-01-23 14:24:36 +01:00

462 lines
14 KiB
Groff

.\"/*
.\" * Copyright (C) 2016-2017 Red Hat, Inc.
.\" *
.\" * All rights reserved.
.\" *
.\" * Author: Jan Friesse <jfriesse@redhat.com>
.\" *
.\" * This software licensed under BSD license, the text of which follows:
.\" *
.\" * Redistribution and use in source and binary forms, with or without
.\" * modification, are permitted provided that the following conditions are met:
.\" *
.\" * - Redistributions of source code must retain the above copyright notice,
.\" * this list of conditions and the following disclaimer.
.\" * - Redistributions in binary form must reproduce the above copyright notice,
.\" * this list of conditions and the following disclaimer in the documentation
.\" * and/or other materials provided with the distribution.
.\" * - Neither the name of Red Hat, Inc. nor the names of its
.\" * contributors may be used to endorse or promote products derived from this
.\" * software without specific prior written permission.
.\" *
.\" * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
.\" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
.\" * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
.\" * THE POSSIBILITY OF SUCH DAMAGE.
.\" */
.TH COROSYNC-QDEVICE 8 2017-10-17
.SH NAME
corosync-qdevice \- QDevice daemon
.SH SYNOPSIS
.B "corosync-qdevice [-dfh] [-S option=value[,option2=value2,...]]"
.SH DESCRIPTION
.B corosync-qdevice
is a daemon running on each node of a cluster. It provides a configured
number of votes to the
quorum subsystem based on a third-party arbitrator's decision. Its primary use
is to allow a cluster to sustain more node failures than standard quorum rules allow.
It is recommended for clusters with an even number of nodes and highly recommended
for 2 node clusters.
.SH OPTIONS
.TP
.B -d
Forcefully turn on debug information without the need to change corosync.conf.
.TP
.B -f
Do not daemonize, run in the foreground.
.TP
.B -h
Show short help text
.TP
.B -S
Set advanced settings described in its own section below. This option
shouldn't be generally used because most of the options are
not safe to change.
.SH CONFIGURATION
.B corosync-qdevice
reads its configuration from corosync.conf file.
The main configuration is within
.B quorum.device
sub-key. Each model also has its own configuration within a
similarly named sub-key.
.TP
.B model
Specifies the model to be used. This parameter is required.
.B corosync-qdevice
is modular and is able to support multiple different models. The model basically
defines what type of arbitrator is used. Currently only
.I net
is supported.
.TP
.B timeout
Specifies how often
.B corosync-qdevice
should call the votequorum_poll function. It is also used by the
.I net
model to adjust
its hearbeat timeout. It is recommended that you don't change this value.
Default is
.IR 10000 .
.TP
.B sync_timeout
Specifies how often
.B corosync-qdevice
should call the votequorum_poll function during a sync phase. It is recommended that you don't change this value.
Default is
.IR 30000 .
.TP
.B votes
The number of votes provided to the cluster by qdevice. Default is (number_of_nodes - 1) or generally
sum(votes_per_node) - 1.
.PP
.B quorum.device.heuristics
subkey holds the configuration of the heuristics. Heuristics are set of commands executed locally on
startup, cluster membership change, successful connect to
.B corosync-qnetd
and optionally also at regular times. Commands are executed in parallel.
When all commands finish successfully
(their return error code is zero) on time,
heuristics have passed, otherwise they have failed. The heuristics result is sent to
.B corosync-qnetd
and there it's used in calculations to determine which partition should be quorate.
.TP
.B timeout
Specifies maximum time in milliseconds how long
.B corosync-qdevice
waits till the heuristics commands finish. If some command doesn't finish before the timeout, it's
killed and heuristics fail. This timeout is used for heuristics executed at regular times.
Default value is half of the
.BR quorum.device.timeout ", so"
.IR 5000 .
.TP
.B sync_timeout
Similar to quorum.device.heuristics.timeout but used during membership changes. Default
value is half of the
.BR quorum.device.sync_timeout ", so"
.IR 15000 .
.TP
.B interval
Specifies interval between two regular heuristics execution. Default value is
3 *
.BR quorum.device.timeout ", so"
.IR 30000 .
.TP
.B mode
Can be one of
.IR on ", " sync " or " off
and specifies mode of operation of heuristics. Default is
.IR off ,
which means heuristics are disabled. When
.I sync
is set, heuristics are executed only during startup, membership change and when connection
to
.B corosync-qnetd
is established. When heuristics should be running also on regular basis, this option
should be set to
.I on
value.
.TP
.B exec_NAME
defines executables.
.I NAME
can be arbitrary valid cmap key name string and it has no special meaning.
The value of this variable must contain a command to execute. The value is parsed (split)
into arguments similarly as Bourne shell would do. Quoting is possible by
using backslash and double quotes.
.PP
.B quorum.device.net
subkey holds the configuration for
.B model
.IR net .
.TP
.B tls
Can be one of
.IR on ", " off " or " required
and specifies if tls should be used.
.I on
means a connection with TLS is attempted first, but if the server doesn't advertise TLS support
then non-TLS will be used.
.I off
is used then TLS is not required and it's then not even tried. This mode is the
only one which doesn't need a properly initialized NSS database.
.I required
means TLS is required and if the server doesn't support TLS, qdevice will
exit with error message. Default is
.IR on .
.TP
.B host
Specifies the IP address or host name of the qnetd server to be used. This parameter
is required.
.TP
.B port
Specifies TCP port of qnetd server. Default is
.IR 5403 .
.TP
.B algorithm
Decision algorithm. Can be one of the
.I ffsplit
or
.IR lms .
(actually there are also
.I test
and
.IR 2nodelms ,
both of which are mainly for developers and shouldn't be used for production clusters).
For a description of what each algorithm means and how the algorithms differ see their
individual sections.
Default value is
.IR ffsplit .
.TP
.B tie_breaker
can be one of
.IR lowest ", " highest
or valid_node_id (number) values. It's used as a fallback if qdevice has to decide between two or more
equal partitions.
.I lowest
means the partition with the lowest node id is chosen.
.I highest
means the partition with highest node id is chosen. And valid_node_id means that the partition
containing the node with the given node id is chosen.
Default is
.IR lowest .
.TP
.B connect_timeout
Timeout when
.B corosync-qdevice
is trying to connect to
.B corosync-qnetd
host. Default is 0.8 *
.BR quorum.sync_timeout .
.TP
.B force_ip_version
can be one of
.I 0|4|6
and forces the software to use the given IP version.
.I 0
(default value) means IPv6 is preferred and IPv4 should be used as a fallback.
.PP
Logging configuration is within the
.B logging
directive.
.B corosync-qdevice
parses and supports most of the options with exception of
.BR to_logfile ", " logfile
and
.BR logfile_priority .
The
.B logger_subsys
sub-directive can be also used if
.B subsys
is set to
.IR QDEVICE .
.PP
For
.B corosync-qdevice
to work correctly, the
.B nodelist
directive has to be used and properly configured. Also the
.I net
model requires that
.B totem.cluster_name
option is set.
.SH MODEL NET TLS CONFIGURATION
For
.B model
.I net
to work using TLS, it's necessary to create the NSS database, import Qnetd
CA certificate, and get/distribute a valid client certificate.
If pcs is used (recommended) the following steps are not needed because pcs does them automatically.
.B corosync-qdevice-net-certutil
is the tool to perform required actions semi-automatically. Please consult the help output of
it and its man page. For a first time configuration it may make sense to start with the
.B -Q
option.
If TLS is not required just edit corosync.conf file and set
.B quorum.device.net.tls
to
.IR off .
.SH MODEL NET ALGORITHMS
Algorithms are used to change behavior of how
.B corosync-qnetd
provides votes to a given node/partition. Currently there are two algorithms supported.
.TP
.B ffsplit
This one makes sense only for clusters with an even number of nodes. It provides exactly one
vote to the partition with the highest number of active nodes. If there are two exactly
similar partitions,
it provides its vote to the partition with higher score. The score is computed
as (number_of_connected_nodes +
number_of_connected_nodes_with_passed_heuristics - number_of_connected_nodes_with_failed_heuristics)
If the scores are equal, the vote is provided to partition with the most clients connected to the qnetd
server. If this number is also equal, then the tie_breaker is used. It is able to transition
its vote if the currently active partition becomes partitioned and a non-active partition
still has at least 50% of the active nodes. Because of this, a vote is not provided
if the qnetd connection is not active.
To use this algorithm it's required to set the number of votes per node to 1 (default)
and the qdevice number of votes has to be also 1. This is achieved by setting
.B quorum.device.votes
key in corosync.conf file to 1.
.TP
.B lms
Last-man-standing. If the node is the only one left in the cluster that can see the
qnetd server then we return a vote.
If more than one node can see the qnetd server but some nodes can't
see each other then the cluster is divided up into 'partitions' based on
their ring_id and this algorithm returns a vote to the partition with highest
heuristics score (computed the same way as for the
.B ffsplit
algorithm), or if there is more than 1 partition with equal scores,
the largest active partition or,
if there is more than 1 equal partition, the partition that contains the tie_breaker
node (lowest, highest, etc). For LMS to work, the number
of qdevice votes has to be set to default (so just delete
.B quorum.device.votes
key from corosync.conf).
.SH ADVANCED SETTINGS
Set by using
.B -S
option. The default value is shown in parentheses) Options
beginning with
.B net_
prefix are specific to
.B model
.IR net .
.TP
.B lock_file
Lock file location. (/var/run/corosync-qdevice/corosync-qdevice.pid)
.TP
.B local_socket_file
Internal IPC socket file location. (/var/run/corosync-qdevice/corosync-qdevice.sock)
.TP
.B local_socket_backlog
Parameter passed to listen syscall. (10)
.TP
.B max_cs_try_again
How many times to retry the call to a corosync function which has returned CS_ERR_TRY_AGAIN. (10)
.TP
.B votequorum_device_name
Name used for qdevice registration. (Qdevice)
.TP
.B ipc_max_clients
Maximum allowed simultaneous IPC clients. (10)
.TP
.B ipc_max_receive_size
Maximum size of a message received by IPC client. (4096)
.TP
.B ipc_max_send_size
Maximum size of a message allowed to be sent to an IPC client. (65536)
.TP
.B master_wins
Force enable/disable master wins. (default is model)
.TP
.B heuristics_ipc_max_send_buffers
Maximum number of heuristics worker send buffers. (128)
.TP
.B heuristics_ipc_max_send_receive_size
Maximum size of a message allowed to be send to, or received from heuristics worker. (4096)
.TP
.B heuristics_min_timeout
Minimum heuristics timeout accepted by client in ms. (1000)
.TP
.B heuristics_max_timeout
Maximum heuristics timeout accepted by client in ms. (120000)
.TP
.B heuristics_min_interval
Minimum heuristics interval accepted by client in ms. (1000)
.TP
.B heuristics_max_interval
Maximum heuristics interval accepted by client in ms. (3600000)
.TP
.B heuristics_max_execs
Maximum number of exec_ commands. (32)
.TP
.B heuristics_use_execvp
Use execvp instead of execv for executing commands. (off)
.TP
.B heuristics_max_processes
Maximum number of processes running at one time. (160)
.TP
.B heuristics_kill_list_interval
Interval between status is gathered and eventually signal is sent
to processes which didn't finished on time in ms. (5000)
.TP
.B net_nss_db_dir
NSS database directory. (/etc/corosync/qdevice/net/nssdb)
.TP
.B net_initial_msg_receive_size
Initial (used during connection parameters negotiation)
maximum size of the receive buffer for message (maximum
allowed message size received from qnetd). (32768)
.TP
.B net_initial_msg_send_size
Initial (used during connection parameter negotiation)
maximum size of one send buffer (message) to be sent to server. (32768)
.TP
.B net_min_msg_send_size
Minimum required size of one send buffer (message) to be sent to server. (32768)
.TP
.B net_max_msg_receive_size
Maximum allowed size of receive buffer for a message sent by server. (16777216)
.TP
.B net_max_send_buffers
Maximum number of send buffers. (10)
.TP
.B net_nss_qnetd_cn
Canonical name of qnetd server certificate. (Qnetd Server)
.TP
.B net_nss_client_cert_nickname
NSS nickname of qdevice client certificate. (Cluster Cert)
.TP
.B net_heartbeat_interval_min
Minimum heartbeat timeout accepted by client in ms. (1000)
.TP
.B net_heartbeat_interval_max
Maximum heartbeat timeout accepted by client in ms. (120000)
.TP
.B net_min_connect_timeout
Minimum connection timeout accepted by client in ms. (1000)
.TP
.B net_max_connect_timeout
Maximum connection timeout accepted by client in ms. (120000)
.TP
.B net_test_algorithm_enabled
Enable test algorithm. (if built with --enable-debug on, otherwise off)
.SH EXAMPLE
Define qdevice with
.I net
model connecting to qnetd running on qnetd.example.org host, using
.I ffsplit
algorithm.
Heuristics is set to
.I sync
mode and executes two commands.
.nf
quorum {
provider: corosync_votequorum
device {
votes: 1
model: net
net {
tls: on
host: qnetd.example.org
algorithm: ffsplit
}
heuristics {
mode: sync
exec_ping: /bin/ping -q -c 1 "www.example.org"
exec_test_txt_exists: /usr/bin/test -f /tmp/test.txt
}
}
.fi
.SH SEE ALSO
.BR corosync-qdevice-tool (8)
.BR corosync-qdevice-net-certutil (8)
.BR corosync-qnetd (8)
.BR corosync.conf (5)
.SH AUTHOR
Jan Friesse
.PP