Commit Graph

4062 Commits

Author SHA1 Message Date
Jan Friesse
d4d48d9268 totemip: Use res in totemip_sa_equal
Setting res to -1 did not entirely follow the semantics of an "equal"
operation. Setting it to 0 and returning it when the families differ
also keeps the compiler happy.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-06-12 15:40:50 +02:00
Jan Friesse
299c9c5b70 totemconfig: ipaddr_equal use switch
The compiler may have a problem understanding the relation between addr1p
and addrlen. A small change makes the code a little more readable and
keeps the compiler happy.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-06-12 15:40:50 +02:00
Jan Friesse
45d19a2d90 configure: Fix GDB_CFLAGS typo
GDB_FLAGS (without C) is the correct name of the variable
to print in the summary.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-06-10 10:57:16 +02:00
Jan Friesse
8ced7e545d man: Add vqsim man page into distributed tarball
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2019-06-10 08:20:52 +02:00
Jan Friesse
5cced85cd5 spec: Add support for user-flags configure option
Passing -ggdb3 (or -g3) during compilation may result in corrupted
debuginfo files (bug in debugedit - for Fedora filed as
https://bugzilla.redhat.com/show_bug.cgi?id=1708786). Until the bug is
fixed it's possible to either change configure to add -ggdb2/-g2 or use
the already existing --enable-user-flags option and rely on the
environment set by rpmbuild.

Patch implements second option so RPM distros without broken debugedit
are not affected.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-06-07 11:12:51 +02:00
Jan Friesse
d775f1425d man: Enhance block_unlisted_ips description
Thanks to Christine Caulfield <ccaulfie@redhat.com> for
polishing the English and refining the description.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-06-07 11:11:00 +02:00
Jan Friesse
183d5da5eb man: Enhance corosync.conf man page a bit
Fix issues found by Ulrich Windl <Ulrich.Windl@rz.uni-regensburg.de>

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-05-31 12:51:26 +02:00
Fabian Grünbichler
ef2569d323 cfgtool: Fix link status display
instead of the nodeid, this displayed arbitrary values (usually '1')
from other cmap keys under nodelist.node.XX.

sscanf returns the number of conversions even on mismatch, e.g. it also
returns 1 for

nodelist.node.2.quorum_votes
nodelist.node.2.ring0_addr
nodelist.node.2.name
...

instead of just

nodelist.node.2.nodeid

which leads to the value of (at least) quorum_votes being stored in
nodeid_list in addition to the actual nodeid.

storing the returned int in a cs_error_t enum also potentially masks
errors, so just compare the result with the expectation directly.

Fixes: c0d14485c3

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-05-30 15:23:37 +02:00
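The sscanf pitfall described in commit ef2569d323 above can be reproduced in isolation. The following is a minimal sketch, not the code from the commit: `parse_nodeid_key` and `naive_match` are hypothetical helper names, and using `%n` to confirm that the whole key matched is one possible way to reject keys such as `nodelist.node.2.quorum_votes`; the actual fix may use a different check.

```c
#include <stdio.h>
#include <string.h>

/*
 * The naive check: sscanf() returns the number of successful conversions,
 * so it reports 1 as soon as %u is converted, even if the trailing
 * literal ".nodeid" does not match the input.
 */
int naive_match(const char *key, unsigned int *node_pos)
{
	return (sscanf(key, "nodelist.node.%u.nodeid", node_pos) == 1);
}

/*
 * Hypothetical stricter helper (an assumption, not corosync code):
 * returns 1 only when key is exactly "nodelist.node.<N>.nodeid".
 */
int parse_nodeid_key(const char *key, unsigned int *node_pos)
{
	int consumed = -1;

	/*
	 * %n stores the number of characters consumed so far, and it is
	 * reached only if every literal character before it matched.
	 */
	if (sscanf(key, "nodelist.node.%u.nodeid%n", node_pos, &consumed) != 1) {
		return 0;
	}

	/* Reject partial matches and keys with trailing characters. */
	return (consumed >= 0 && (size_t)consumed == strlen(key));
}
```

With these helpers, `naive_match("nodelist.node.2.quorum_votes", &pos)` still reports a match, while `parse_nodeid_key` accepts only the `nodeid` key.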
Jan Friesse
9bba026bcd knet: Use block_unlisted_ips
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-05-29 16:30:18 +02:00
Jan Friesse
72737d3929 udpu: Drop packets from unlisted IPs
This feature allows corosync to block packets received from unknown
nodes (nodes with IP address which is not in the nodelist). This is
mainly for situations when "forgotten" node is booted and tries to join
cluster which already removed such node from configuration. Another use
case is to allow atomic reconfiguration and rejoin of two separate
clusters.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-05-29 16:30:10 +02:00
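The idea behind commit 72737d3929 above, dropping packets whose source address is not in the nodelist, can be illustrated with a small membership check. This is only a sketch under stated assumptions: `is_listed_ipv4` is a hypothetical name, the nodelist is modelled as a plain array, and only IPv4 is handled; the real udpu code is organised differently.

```c
#include <stddef.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: returns 1 when src equals one of the count
 * addresses in nodelist, 0 otherwise. A receiver would drop the
 * packet when this returns 0.
 */
int is_listed_ipv4(const struct in_addr *src,
    const struct in_addr *nodelist, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		/* Both sides are in network byte order, so compare directly. */
		if (nodelist[i].s_addr == src->s_addr) {
			return 1;
		}
	}

	return 0;
}
```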
Christine Caulfield
482df5d67b knet: Fix initialising of knet access lists.
It needs to be done at both reload and initialize time.
Also disable access lists if the config key is removed.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-05-29 16:29:56 +02:00
Fabio M. Di Nitto
5c9a2b1c06 knet: allow corosync to use knet access lists
currently knet acl are only available on master
but they might be backported
to stable1 as they don't break onwire protocol.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-05-29 16:29:35 +02:00
yuan ren
3e8b525e32 man: Enhance token_retransmit description
Signed-off-by: yuan ren <yren@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-05-28 09:55:49 +02:00
yuan ren
2a4cd3c4af totemconfig: Fix minimum limit for hold timeout
Make sure the retransmit timeout has the lower limit
`MINIMUM_TIMEOUT`. As a result, the lower limit of the hold timeout
should be recalculated.

Also the token timeout and the retransmits count should
keep a relational expression.

Signed-off-by: yuan ren <yren@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-05-15 16:28:43 +02:00
Christine Caulfield
c3d69712c6 vqsim: Enhance vqsim
1. Enable scripting of vqsim and add man page

I've added a 'sleep' command to help with scripting as well as
documentation on how to do it.

2. Make 'sync' operation much more robust and useful

Refactored a lot of code to make sure that in sync mode the
prompt appears at the 'right' time. What we do is wait for all
of the nodes in all partitions to have the same ring_id. If this
doesn't happen then the timeout will fire as before.

3. Rename binary to corosync-vqsim and add a sub-package for it

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-05-13 11:04:24 +02:00
Christine Caulfield
01ce5a96ef knet: Fix a couple of errors when adding a new link
When adding a new link for the first time you will often see:
1) knet_link_set_ping_timers for nodeid 1, link 1 failed: Invalid
argument (22)
2) New config has different knet transport for link 1. Internal value
was NOT changed. To reconfigure an interface it must be deleted and
recreated. A working interface needs to be available to corosync at all
times

1) is caused by setting the ping timers twice, once in
totemknet_member_add() and once in totemknet_refresh_config().
The first time we don't know the value,
so it's zero and an error is displayed. For this we simply check
for the zero and skip the knet API call. It's not ideal, but
totemconfig needs a lot of reconfiguring itself before we can
make this more sane.

2) was caused by simply comparing an unconfigured link with
a configured one, so OF COURSE, they are going to be different!

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-05-02 16:42:03 +02:00
yuan ren
70cda5d55f totemconfig: fix autogen mcastaddr for ipv6-4
When UDP is used as the transport and totem.ip_version is set to
ipv6-4, but no IPv6 address is specified in /etc/hosts, the error
"Multicast address family does not match bind address family"
occurs, because the mcastaddr (if not specified) is generated
only according to totem.ip_version.

Solution is to use bindnetaddr (configured or generated from
nodelist) addr family.

Signed-off-by: yuan ren <yren@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-05-02 10:53:54 +02:00
Jan Friesse
3172a76d12 totemconfig: Ensure nodeid is specified for IPv6
Thanks Yuan Ren <yren@suse.com> for finding this problem.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-04-25 17:11:12 +02:00
Christine Caulfield
e65d7b5d98 vqsim: Fix vqsim for corosync 3.0
A couple of small internal changes in corosync 3.0 broke vqsim.
1) The way the custom config file is specified (no longer an env variable)
2) votequorum now needs to know ouZ_node_pos

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-04-25 16:50:51 +02:00
Jan Friesse
e287a7c1ef vqsim: Make vqsim compile
Also add vqsim binary to .gitignore.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-04-24 14:34:17 +02:00
Jan Friesse
d05c1593a1 totemconfig: ipaddr_equal check just addr part
Checking the whole structure is fine for IPv4, but IPv6 also contains
a scope id, which may be a problem for a local address. It's possible
to use a zone index, but because it's not required when a host name is
used, it shouldn't be needed when an IPv6 address is used.

Example configuration snip which fails without patch:

...
nodelist {
  node {
    nodeid: 1
    ring0_addr: fe80::1234:5678:9abc:def1
  }
}
...

(the example succeeds when %eth0 is used).

With patch, zone index is not needed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-04-23 16:17:42 +02:00
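The comparison described in commit d05c1593a1 above (compare only the address part, not the whole structure) can be sketched as follows. This is an illustration rather than the corosync implementation: `addr_part_equal` is a hypothetical name, and the switch on the address family mirrors the approach mentioned in the later "ipaddr_equal use switch" commit.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: returns 1 when both addresses have the same
 * family and the same address bytes. Ports and the IPv6 scope id
 * (zone index) are deliberately ignored.
 */
int addr_part_equal(const struct sockaddr_storage *a,
    const struct sockaddr_storage *b)
{
	if (a->ss_family != b->ss_family) {
		return 0;
	}

	switch (a->ss_family) {
	case AF_INET:
		return (memcmp(&((const struct sockaddr_in *)a)->sin_addr,
		    &((const struct sockaddr_in *)b)->sin_addr,
		    sizeof(struct in_addr)) == 0);
	case AF_INET6:
		/* sin6_scope_id is not part of the comparison. */
		return (memcmp(&((const struct sockaddr_in6 *)a)->sin6_addr,
		    &((const struct sockaddr_in6 *)b)->sin6_addr,
		    sizeof(struct in6_addr)) == 0);
	default:
		/* Unknown family: treat as not equal. */
		return 0;
	}
}
```

Two link-local addresses that differ only in `sin6_scope_id` compare as equal here, whereas a `memcmp` over the whole structure would report a difference.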
Jan Friesse
41f9e966bb cpg: Add CPG_REASON_UNDEFINED
Previously the reason field for the member_list items
in cpg_totem_confchg_fn was unset, which may be a little confusing.

Solution is to add a special value CPG_REASON_UNDEFINED and use it for
the member_list items.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-04-16 14:49:10 +02:00
Fabian Grünbichler
b97ca8e9f0 crypto: re-introduce secauth parameter
with the following semantics:
- default off
- implies crypto_hash SHA256 and crypto_cipher AES256
- crypto_* have higher precedence
- only applicable for knet, like crypto_*

this should make upgrading from Corosync 2.x less painful for users that
have an explicit secauth=on in their configuration.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-04-15 13:29:41 +02:00
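The precedence rules listed in commit b97ca8e9f0 above can be read as a small resolution function. The sketch below is an assumption-laden illustration, not the parser code: `struct crypto_choice` and `resolve_crypto` are hypothetical names, a NULL argument stands for "option not present in the configuration", and "none" is assumed as the fallback when nothing is configured.

```c
#include <stddef.h>

/* Hypothetical result type for the effective crypto settings. */
struct crypto_choice {
	const char *cipher;
	const char *hash;
};

/*
 * Hypothetical resolver following the semantics in the commit message:
 * secauth is off by default, secauth=on implies aes256/sha256, and an
 * explicit crypto_cipher or crypto_hash always wins.
 */
struct crypto_choice resolve_crypto(int secauth_on,
    const char *crypto_cipher, const char *crypto_hash)
{
	/* Assumed fallback when neither secauth nor crypto_* is set. */
	struct crypto_choice c = { "none", "none" };

	if (secauth_on) {
		c.cipher = "aes256";
		c.hash = "sha256";
	}

	/* crypto_* have higher precedence than secauth. */
	if (crypto_cipher != NULL) {
		c.cipher = crypto_cipher;
	}
	if (crypto_hash != NULL) {
		c.hash = crypto_hash;
	}

	return c;
}
```

For example, `resolve_crypto(1, NULL, "sha512")` yields cipher `aes256` with hash `sha512`. Validation of the resulting combination is left out of this sketch.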
Jan Friesse
d05636b738 totemconfig: Remove support for 3des
Triple DES has been considered a "weak cipher" since 2016, so there is
really no need to support it in corosync. Thanks to a bug in
Corosync/Knet/NSS which caused 3des to not work at all,
no matter what library was used, we can just remove support for 3des
without breaking compatibility.

Also fix coroparse so:
- totem.crypto_type is removed (this is a 1.x construct which was not
used even in 2.x)
- Checking of totem.crypto_model is added.
- Possible values are enumerated in the crypto_model, crypto_cipher and
crypto_hash error messages

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-04-11 15:15:38 +02:00
Jan Friesse
c260bce45b keygen: Reflect change in knet
Knet commit 1cb36f0cffd4559971826ca4774a88c5b05882fb reduced minimal
key length to 1024-bit. Keygen should keep compatibility with already
released 3.0.[0-1] so default key length should be 2048 bits. It's
possible to use -s argument to generate shorter key - keygen respects
minimum/maximum as defined by knet.

Also fix man page to reflect this change.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-04-11 15:14:06 +02:00
Fabian Grünbichler
03fba21503 set totem.keyfile and totem.key to RO
so that we get the nice log message when attempting to modify them at
runtime, just like for totem.crypto_* and co.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-04-05 17:07:03 +02:00
Jan Friesse
527e30a8d0 Revert "init: Enable StopWhenUnneeded"
This reverts commit 03d9321bc8.

Reverted because when the corosync service is not enabled and corosync
is executed by "systemctl start corosync", it is then immediately
shut down because of "Unit not needed anymore. Stopping.".

This is really not expected behavior.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-04-04 11:51:23 +02:00
yuan ren
24a72e9780 totemsrp: Word spelling mistake
Signed-off-by: yuan ren <reyren179@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-04-01 08:20:46 +02:00
Jan Friesse
7c825173de coroparse: Fix compiler warning
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-02-26 13:28:40 +01:00
Jan Friesse
83dc407f55 configure: Do not autodetect nozzle
Nozzle is part of kronosnet but it is an independent library. Enabling it
when detected, without the ability to turn it off, is not in line with
other libraries.

Solution is to use same method as for other libraries - add
--enable-nozzle to configure script and add support for this option into
spec file.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-02-26 13:12:00 +01:00
Christine Caulfield
eab55e7384 nozzle: Add support for libnozzle devices
A nozzle device is a pseudo ethernet device that routes network
traffic through a channel on the corosync knet network (NOT cpg or any
corosync internal service) to other nodes in the cluster. It allows
applications to take advantage of knet features such as multipathing,
automatic failover, link switching etc.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-02-26 13:11:35 +01:00
Jan Friesse
db38e3958c quorumtool: Fix exit status codes
1. Use EXIT_SUCCESS and EXIT_FAILURE when possible
2. For -s option return EXIT_SUCCESS when no problem appeared and node
   is quorate, EXIT_FAILURE if problem appeared and exit code 2
   (EXIT_NOT_QUORATE) when no problem appeared but node is not quorate.
3. Document exit codes in the man page

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-02-15 17:12:59 +01:00
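The exit status rules for the -s option in commit db38e3958c above amount to a three-way mapping. A minimal sketch, assuming a hypothetical helper `status_exit_code` with two boolean inputs; `EXIT_NOT_QUORATE` is defined locally here with the value 2 stated in the commit message.

```c
#include <stdlib.h>

/* Value taken from the commit message; defined here for the sketch. */
#define EXIT_NOT_QUORATE 2

/*
 * Hypothetical helper: problem is non-zero when querying the status
 * failed, quorate is non-zero when the node is quorate.
 */
int status_exit_code(int problem, int quorate)
{
	if (problem) {
		return EXIT_FAILURE;
	}

	return (quorate ? EXIT_SUCCESS : EXIT_NOT_QUORATE);
}
```

A caller such as a shell script can then distinguish "not quorate" (2) from a real failure (`EXIT_FAILURE`).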
Jan Friesse
4f9e46e7a8 corosync-cfgtool: Fix -i matching
Previously it was required to use link id together with IP address (ex.
"0 127.0.0.1") as a -i parameter.

This was reported as not very user friendly. Solution is to split
returned interface name and try match link id and ip address
separately.

Also fix typo in description of parameter -s.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-02-14 13:46:57 +01:00
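The matching approach from commit 4f9e46e7a8 above (split the returned interface name and compare the link id and the IP address separately) might look like the sketch below. It is not the cfgtool code: `iface_matches` is a hypothetical name, the interface name is assumed to have the form "<link id> <IP address>" as in the commit's example, and still accepting the combined form is an assumption of this sketch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 when filter equals the link id, the
 * IP address, or the whole interface name (e.g. "0 127.0.0.1").
 */
int iface_matches(const char *iface_name, const char *filter)
{
	char link_id[16];
	char ip[64];

	/* Split "<link id> <ip>" into its two whitespace-separated parts. */
	if (sscanf(iface_name, "%15s %63s", link_id, ip) != 2) {
		/* Unexpected format: fall back to a whole-string comparison. */
		return (strcmp(iface_name, filter) == 0);
	}

	return (strcmp(filter, link_id) == 0 ||
	    strcmp(filter, ip) == 0 ||
	    strcmp(filter, iface_name) == 0);
}
```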
Ferenc Wágner
b09b96fe6f build: Use the AWK variable provided by configure
Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-02-06 16:05:11 +01:00
Ferenc Wágner
6a476017b9 build: Use the SED variable provided by configure
Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-02-06 16:04:55 +01:00
Ferenc Wágner
b9cc5be3a2 configure.ac: AC_PROG_SED is already present
Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-02-06 16:04:41 +01:00
Ferenc Wágner
4d0e764310 corosync.conf.5: typography fixes
Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-02-06 16:04:07 +01:00
Ferenc Wágner
059c22a154 corosync.conf.5: fix grammar
Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-02-06 16:03:48 +01:00
Christine Caulfield
c0d14485c3 cfgtool: Improve link status display
Now show the nodeids properly, rather than node indexes which were
annoying and unhelpful.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-02-04 08:06:01 +01:00
Jan Friesse
ce29717491 doc: Update INSTALL file
- Add LibQB and Knet links
- Remove old (pre udpu) config file example
- Change corosync.conf man page to contain useful information about
token timeout

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-01-18 15:06:18 +01:00
Jan Pokorný
03d9321bc8 init: Enable StopWhenUnneeded
It shall be a rule of thumb not to combine "application stack"
components run under particular init/supervision mechanism and
run by whatever other means (without transitive relationships
like when corosync's client runs from other pacemaker that is
itself started through systemd) when there's a directed graph
of reliance between them (sans constrained corner cases like
when one of such components is a kernel module).

And corosync on its own is just a service provider that only
appears useful when utilized as a basic building block for
application specific distributed environments.

Therefore, we may assume whenever corosync gets started by the
means of systemd, it's because of a mechanized attempt to satisfy
declared dependency of some such corosync's client that is about
to be started under the service manager realms (directly or, by
induction, through the same triggering mechanism indirectly).
Hence, when there's no such client around anymore (unless
this dependant is being restarted at the moment, see below)
corosync shall rather shutdown as well.

In the past, there was an issue with systemd regarding said
inflicted restart of the dependant/client, but that's resolved
as of v236:
https://github.com/systemd/systemd/commit/deb4e7080db9dcd2a1d51ccf7c357f88ea863e54

Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-01-11 09:49:35 +01:00
Jan Friesse
2ab4d41886 totemip: Use AF_UNSPEC for ipv4-6 and ipv6-4
AF_UNSPEC returns different results than AF_INET/AF_INET6, because the
nsswitch.conf modules are searched in order and the search stops asking
other modules once the current module succeeds.

Example of difference between previous and new code when ipv6-4 is used:
- /etc/hosts contains test_name with an ipv4
- previous code called AF_INET6 where /etc/hosts failed so other methods
were used which may return IPv6 addr -> result was either failure or
IPv6 address.
- new code calls AF_UNSPEC returning IPv4 defined in /etc/hosts ->
result is IPv4 address

New code behavior should solve problems caused by nss-myhostname.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2019-01-11 09:37:30 +01:00
Fabio M. Di Nitto
ff7ace7655 [totemknet] update for libknet.so.2.0.0 init API
more changes are to be expected on this front as the API evolves in
knet master.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-01-03 10:10:38 +01:00
Ferenc Wágner
fcf5733ccb Config version must be specified
Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-01-03 09:50:57 +01:00
Ferenc Wágner
13b070f0c8 Don't declare success early
Here we're very far from entering the main loop, even farther from
sending the READY notification to systemd.  This sounded awkward:

systemd[1]: Starting Corosync Cluster Engine...
corosync[827]:   [MAIN  ] Corosync Cluster Engine ('2.99.5'):
  started and ready to provide service.
corosync[827]:   [MAIN  ] Corosync built-in features: dbus monitoring
  watchdog augeas systemd xmlconf snmp pie relro bindnow
corosync[827]:   [MAIN  ] parse error in config: No interfaces defined
corosync[827]:   [MAIN  ] Corosync Cluster Engine exiting with status 8
  at main.c:1378.
systemd[1]: corosync.service: Main process exited, code=exited,
  status=8/n/a
systemd[1]: corosync.service: Failed with result 'exit-code'.
systemd[1]: Failed to start Corosync Cluster Engine.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2019-01-03 09:50:36 +01:00
Ferenc Wágner
ba24bef8bd More natural error messages
Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2019-01-03 09:48:51 +01:00
Jan Friesse
0ee7fd0c2f main: Rename run_dir to state_dir
system.run_dir was a slightly unfortunate and confusing name. Renaming
it to state_dir makes the content of this directory more evident. To
keep the setting consistent with the code, get_run_dir is changed to
get_state_dir.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-12-14 13:48:33 +01:00
Jan Friesse
a84ade701c totemconfig: Enhance totem.ip_version
Originally totem.ip_version was used to force ip version used by totem.
With Knet this variable didn't make too much sense so it was not used.

Sadly, relying only on the DNS resolver order doesn't always work (the
RFC is quite complicated, but if IPv6 is not configured then IPv4 is
preferred), which we tried to solve by forcing IPv6 and using IPv4 only
if that fails.

Sadly this collides with nss_myhostname, which is able to return every
local address, and today's systems usually have at least one
autogenerated link-local IPv6 address, so it is able to "overwrite"
/etc/hosts.

Solution is to enhance totem.ip_version and use it also for Knet.
totem.ip_version is now just a flag for resolver and can have four
states: ipv4 (only IPv4 is used), ipv6 (only IPv6 is used), ipv4-6 (ask
IPv4 first and if it fails ask for IPv6) and ipv6-4 (ask IPv6 first and
if it fails ask for IPv4). Default for Knet and UDPU transports is
ipv6-4, for UDP it's ipv4, because autogenerated mcast addr doesn't play
too well with ipv6-4.

So wherever nss_myhostname becomes a problem, it is enough
to set totem.ip_version to ipv4-6.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-12-14 10:56:06 +01:00
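The four totem.ip_version states described in commit a84ade701c above map naturally onto an ordered list of address families to try. The sketch below only illustrates that mapping: `ip_version_to_families` is a hypothetical name and the actual resolver call (for example getaddrinfo) is left out. Note that the later commit 2ab4d41886 in this log switches the two mixed modes to AF_UNSPEC, so this reflects the behaviour as described in this commit message only.

```c
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fills families with the address families to ask
 * for, in order, and returns how many entries were written (1 or 2).
 * Returns -1 for an unknown ip_version value.
 */
int ip_version_to_families(const char *ip_version, int families[2])
{
	if (strcmp(ip_version, "ipv4") == 0) {
		families[0] = AF_INET;
		return 1;
	}
	if (strcmp(ip_version, "ipv6") == 0) {
		families[0] = AF_INET6;
		return 1;
	}
	if (strcmp(ip_version, "ipv4-6") == 0) {
		/* Ask for IPv4 first, fall back to IPv6. */
		families[0] = AF_INET;
		families[1] = AF_INET6;
		return 2;
	}
	if (strcmp(ip_version, "ipv6-4") == 0) {
		/* Ask for IPv6 first, fall back to IPv4. */
		families[0] = AF_INET6;
		families[1] = AF_INET;
		return 2;
	}

	return -1;
}
```

A caller would loop over the returned families and stop at the first one for which name resolution succeeds.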
Jan Friesse
aa7daf8c77 totemip: Add debug information to totemip_parse
It's required to create TOTEM logsys subsys before totemip_parse is used
(so before totem_config_read). Logsys is not yet fully initialized, but
it's good enough.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-12-13 15:25:04 +01:00
Jan Friesse
e17e3f4b81 totemconfig: Add IPs to family mismatch error
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-12-13 15:24:37 +01:00