Commit Graph

156 Commits

Author SHA1 Message Date
Christine Caulfield
268cde6ee4 totem: Add Kronosnet transport.
This is a big update that removes RRP & MRP from the codebase
and makes knet the default transport for corosync. UDP & UDPU
are still (currently) supported but are deprecated. Also crypto
and mutiple interfaces are only supported over knet.

To compile this codebase you will need to install libknet from
https://github.com/fabbione/kronosnet

The corosync.conf(5) man page has been updated with info on the new
options. Older config files should still work but many options
have changed because of the knet implementation so configs should
be checked carefully. In particular any cluster using using RRP
over UDP or UDPU will not start as RRP is no longer present. If you
need multiple interface support then you should be using the knet transport.

Knet brings many benefits to the corosync codebase, it provides support
for more interfaces than RRP (up to 8), will be more reliable in the event
of network outages and allows dynamic reconfiguration of interfaces.
It also fixes the ifup/ifdown and 127.0.0.1 binding problems that have
plagued corosync/openais from day 1

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
2016-10-11 10:09:42 +01:00
Jan Friesse
1925074909 Fix few bugs found by coverity
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2016-06-28 13:58:43 +02:00
Jan Friesse
44df76a7ee config: get_cluster_mcast_addr error is not fatal
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2016-06-28 13:57:14 +02:00
Jan Friesse
60565b7da7 totemconfig: Explicitly pass IP version
If resolver was set to prefer IPv6 (almost always) and interface section
was not defined (almost all config files created by pcs), IP version was
set to mcast_addr.family. Because mcast_addr.family was unset (reset to
zero), IPv6 address was returned causing failure in totemsrp.
Solution is to pass correct IP version stored in
totem_config->ip_version.

Patch also simplifies get_cluster_mcast_addr. It was using mix of
explicitly passed IP version and bindnet IP version.

Also return value of get_cluster_mcast_addr is now properly checked.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2016-04-07 14:45:05 +02:00
Christine Caulfield
997074cc3e totemconfig: Check for duplicate nodeids
Having duplicate nodeids in corosync.conf can play havoc with a cluster,
so (as suggested by someone on this list) here is some code to check
that all nodeids are unique. Even if a nodeid is not specified it will
check to be sure that the ID generated from the IP address (ipv4 only)
does not clash with one that is provided.

It logs all non-unique nodeids to syslog, but only the last is reported
on the command-line to the user which should be enough to get them to
check further. At startup this will cause corosync to fail to start.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
2015-04-10 14:22:07 +01:00
Jan Friesse
d77cec24d0 Handle adding and removing UDPU members atomically
When config file is reloaded with removed UDPU member, internal icmap
index of nodelist.node can change. This can result in removal and then
adding back node. This, with UDPU alive filtering (where member is by
default considered as not a member) makes corosync not sending messages
to such members resulting in new membership creation.

Solution is to properly test which members were really deleted and added
(instead of relying on internal and dynamic naming of icmap hash table
key name).

Also trully dynamic add and remove node (via cmap) is now handled by
same function so totem_config->interfaces is now updated properly.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2015-01-21 16:37:26 +01:00
Jan Friesse
6449bea835 config: Ensure mcast address/port differs for rrp
When using multiple interfaces, it's necessary to use different
multicast address/port pair for each interface to make
rrp work correctly. This is now checked in parser.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-11-24 11:55:37 +01:00
Jan Friesse
70bd35fc06 config: Process broadcast option consistently
Broadcast option is global but in config set in interface section. When
more interfaces are defined, only broadcast from last section was used.

Solution is to use broadcast whenever at least one interface use
broadcast.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-11-24 11:55:37 +01:00
Jan Friesse
6c028d4d9c config: Make sure user doesn't mix IPv6 and IPv4
Checking code was there, sadly not correct, so it was possible to enter
one bindnet addr as IPv4 and second as IPv6. Fix is trivial.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-11-24 11:55:37 +01:00
Jan Friesse
bb52fc2774 Store configuration values used by totem to cmap
Some totem configuration values (like token, consensus, ...) are ether
computed or default value is used. It's hard to find out, what
value is really used.

Solution is to store values in cmap.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-10-13 11:59:06 +02:00
Christine Caulfield
88dbb9f722 totemconfig: Make sure join timeout is less than consensus
The thesis contains this paragraph:

" The Join timeout is shorter than the Consensus timeout and is used to
  increase the probability that Join messages from all currently
  working processors are received during a single round of consensus."

Empirically I can confirm that making join less than consensus can cause
havoc with a cluster so I think we should enforce this.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-07-25 08:24:02 +01:00
Christine Caulfield
3b8365e806 config: Fix typos
Fix several places where 'then' is used instead of 'than' in error
messages and a comment.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-07-24 10:27:45 +01:00
Jan Friesse
63bf09776f totemconfig: refactor nodelist_to_interface func
Move finding of bindaddr in nodelist to generally usable function
totem_config_find_local_addr_in_nodelist and refactor
config_convert_nodelist_to_interface function to use it.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2014-07-22 14:59:31 +02:00
Jan Friesse
10c80f454e totemconfig: totem_config_get_ip_version
Add totem_config_get_ip_version to get user configured ip version.
Make totem_config_read use this newly introduced function.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2014-07-22 14:59:27 +02:00
Jan Friesse
dc35bfae62 totemconfig: Free ifaddrs list
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2014-07-22 14:59:20 +02:00
Jan Friesse
72cf15af27 votequorum: Do not process events during reload
During reload, local_node_pos is deleted and reinstation is handled in
totemconfig after reload is finished. votequorum handles this events and
tries to reload it's configuration. This led to logging a little scary
messages (even nothing bad is happening, because after local_node_pos
reinstation everything back to normal).

Solution is to stop processing events during reload. Sadly, simple
tracking of config.reload_in_progress doesn't work because LibQB events
triggering order is undefined so votequorum reload handler can be called
before totemconfig (and before local_node_pos is reinstatied).

So new config.totemconfig_reload_in_progress key is defined with very
similar semanthic as config.reload_in_progress but set inside
totem_reload_notify function. Votequorum then use this new key.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-06-27 11:40:21 +02:00
Jan Friesse
7557fdec48 config: Allow dynamic change of token_coefficient
token_coefficient change in cmap didn't triggered change. So only way
how to change token_coefficient was editing config file and reload.

Patch let's key totem.token_coefficient to be processed so
token_coefficient can be dynamically changed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-05-07 15:55:26 +02:00
Jan Friesse
58176d6779 Add token_coefficient option
Token coefficient is used only when nodelist is specified and contains
at least 3 nodes. If so, real token timeout is then computed as
token + (number_of_nodes - 2) * token_coefficient. This allows cluster
to scale without manually changing token timeout every time new
node is added. This value can be set to 0 resulting in effective
removal of this feature.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-03-25 15:29:17 +01:00
Jan Friesse
9a8de87c34 totemconfig: Log errors on key change and reload
When volatile key was changed (cmap set or reload) and checks fails,
nothing was logged.

Values are now checked and error string is logged on problems.

Also totem_config is dumped to log (DEBUG level) after every
volatile key change and every reload.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-03-25 15:29:14 +01:00
Jan Friesse
b95ebd640e totemconfig: Key change process dependencies
When key with dependency was changed, dependant keys were not recomputed.
Nice example is consensus timeout. If token timout was changed,
consensus timeout was not recomputed correctly (nether via cmap change
of key nor via cfg reload).

Solution is almost complete refactor of handling volatile defaults.

totem_volatile_config_read now handles not only storing cmap key to
totem_config structure, but also checking of existence, comparing with
zero value and properly storing defaults.

totem_set_volatile_defaults is gone. It's function was splitted into
totem_volatile_config_read and totem_volatile_config_validate functions.

Reload callback and change of key callback are now mostly same functions
and both calls totem_volatile_config_read.

Patch also fixes small memory leak. totem.vsftype key is not used for
long time and original totem_volatile_config_read wasn't freeing
allocated memory returned by icmap_get_string. Whole reading of
totem.vsftype is removed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-03-25 15:29:12 +01:00
Jan Friesse
eeb2384157 Really clear totemconfig nodes on reload
When reload was called nodes were constantly added to totemconfig
nodelist.

So simple corosync-cfgtool -R resulted very quickly in filling whole
array and segfault.

Solution is to clear member_count.

Clearing is also moved directly to put_nodelist_members_to_config to
make sure it's always processed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-03-25 15:29:09 +01:00
Jan Friesse
2f0cad20a9 config: Handle totem_set_volatile_defaults errors
When totem_set_volatile_defaults is called from totem_config_validate
return code is unchecked.

It's then perfectly possible to set (for example) join timeout to very
small value (1) and consensus value is then set to 0 making corosync
unable to create membership.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-03-17 10:04:00 +01:00
Jan Friesse
5c54f941ac Fix cppchecks warning
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-14 11:24:29 +01:00
Christine Caulfield
c0bfd48928 Reload: Add atomic reload to totemconfig
When a reload is in progress, wait until the whole thing has
finished before setting parameters

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-12 16:09:55 +01:00
Jan Friesse
d6dd2e455d totemconfig: Prevent leak of cluster_name str
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-06-21 11:21:33 +02:00
Jan Friesse
421de34972 totemconfig: Check length of rrp_mode string
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-18 14:35:15 +02:00
Jan Friesse
92b900da67 Initialize node_found in nodelist_to_interface fun
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:57 +02:00
Jan Friesse
90f8a68a2b Use proper totem_ip_address size in memset
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:56 +02:00
Jan Friesse
96a89a0085 Check icmap str get for clustername
Even this check is really not needed, it's nice to have it and on fault
ensure that cluster_name is really NULL.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:55 +02:00
Fabio M. Di Nitto
55dc09ea23 totemconfig: enforce hmac config when crypto is enabled
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-01-14 12:31:47 +01:00
Fabio M. Di Nitto
ed6bca3293 crypto: drop < 2.3 protocols and onwire compat
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-01-14 11:49:32 +01:00
Jan Friesse
dd588d004e Add option to specify ip version
Default is ipv4.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-12-03 14:02:32 +01:00
Fabio M. Di Nitto
220d659b38 totemcrypto: implement crypto packet format 2.2 and crypto_compat: config opt
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-11-22 11:13:30 +01:00
Fabio M. Di Nitto
20c5871525 totemcrypto: add support for different encryption methods
(backport from nsscrypto kronosnet code)

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-10-15 10:00:16 +02:00
Jan Friesse
373ded0652 Don't access invalid mem in totemconfig interfaces
When ringnumber in config file was set to value bigger or equal to
INTERFACE_MAX, we are using this big value as index to totemconfig
interfaces array, resulting to access to invalid memory and segfault.

Instead of that, ringnumber is now checked and proper error message is
printed if value is too big.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-09-27 13:54:39 +02:00
Fabio M. Di Nitto
fa92e4068a totemconfig: drop unnecessary includes
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-09-07 09:04:06 +02:00
Tim Beale
6129ce5b59 Remove redundant default-config code
We were checking 'hold_timeout == 0' in 3 different places when setting up
the default totem config.

Signed-off-by: Tim Beale <tlbeale@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-21 14:26:50 +02:00
Jan Friesse
e57b5b9e6d crypto: Remove sha224 and add md5 hash
SHA224 is not supported on RHEL6 and also it's kind of weird. Instead of
that, md5 can now be configured.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-15 17:36:56 +01:00
Fabio M. Di Nitto
4a2d503643 crypto: add new hashing methods and fix config defaults
add support for sha224/256/384/512

change config defaults to match coroparse and totemconfig

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-15 10:55:32 +01:00
Fabio M. Di Nitto
0a6a6bbcfa crypto: drop secauth and make crypto none work again
keep totem.secauth config key for compatibility

if the key is NOT set, crypto will default to aes256/sha1
if the key is set to "off", crypto is disabled.
this reflects pretty much old behavior

keywords totem.crypto_cipher and totem.crypto_hash can
override secauth individually.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-14 11:28:36 +01:00
Jan Friesse
ab1675f0fe Parse and use hash and crypto from config file
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 17:38:59 +01:00
Fabio M. Di Nitto
55e8476697 crypto: mask the crypto operations from totem packet size management
totem doesn't need to understand what crypto does.

totem needs to be able to tell crypto: "those are data, play with them"
and crypto needs to return: "here are your scrambled data and the new size"

similar to decrypt/verify.

this way we add enough dynamic within crypto to change header size and all
at any given time (for different hash algorithm for example) without
affecting on wire compat.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-13 15:50:58 +01:00
Jan Friesse
8cdd2fc493 Remove libtomcrypt
Tomcrypt in corosync is for long time not updated. Because we have
support for libnss, libtomcrypt can be removed.

Also few leftovers (AES is 256 bits, not 128, ...) are removed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-13 09:19:47 +01:00
Fabio M. Di Nitto
142ce8c3a1 totem: drop crypt_accept: concept/option
this was another old onwire compat mode that is not useful anylonger.

we can safely move the new model by default.

According to Honza (real hardware 1 node testing) there are no
performance impact.

My tests (8 nodes VM cluster), there is up to 10/12% performance
improvements up to 1M packet size where old and new models are equal.

As a side note, nss still shows to be a performance loss on both
real and virtual hw (without any kind of nss hw acceleration).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-10 07:08:30 +01:00
Jan Friesse
c30c088597 Tweak nodeid warning
Nodeid warning now appears only when both totem.nodeid and nodelist
nodeid exists. When nodelist nodeid is not defined, totem.nodeid is
used.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-21 16:33:56 +01:00
Jan Friesse
88ae75d6c2 Allow autoconfiguration of interface section
Thanks to totemip_getifaddrs infrastructure it's now possible to use
nodelist informations to autoconfigure interface bindnetaddr. Together
with cluster_name, interface section can be completely omitted.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
ba13537471 totemconfig: ensure suffix for ringX_addr
Patch makes sure, that ringX_addr key has really _addr suffix.
Previously, it was possible to enter ringXanything and it was
interpreted as ringX_addr.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Steven Dake
2ad0cdc832 Update copyright header dates in exec directory
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-02-13 17:05:04 -07:00
Jan Friesse
0c2e3c8408 Make local_node ring0 address read-only
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:09:37 +01:00
Jan Friesse
16007acbef Use nodeid provided in nodelist
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:35 +01:00