Commit Graph

4160 Commits

Author SHA1 Message Date
liangxin1300
f0e1eaff2d totemconfig: validate totem.transport value
Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-09-03 16:00:31 +02:00
liangxin1300
7f64a1dc0f cmapctl: return error on no result of print prefix
Return EXIT_FAILURE if ACTION_PRINT_PREFIX prints no result.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-21 11:22:09 +02:00
liangxin1300
ec889e89c6 cmapctl: check NULL for key type and value for -p
To avoid segmentation fault.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-21 11:20:21 +02:00
liangxin1300
56f9f19154 quorumtool: strict check for -o option
Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-20 14:28:19 +02:00
liangxin1300
303c869259 quorumtool: Help shouldn't require running service
Do not require corosync running when usage is requested.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-19 17:29:34 +02:00
liangxin1300
c02a69a988 cfgtool: Return error when -i doesn't match
Give error message and EXIT_FAILURE return code when -i
option doesn't match.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-18 18:15:48 +02:00
liangxin1300
d3224df77c man: update output of -s and -b for cfgtool
Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-17 15:06:07 +02:00
liangxin1300
9105d94a80 cmapctl: return EXIT_FAILURE on failure
For -g and -d option return EXIT_FAILURE when error occurs (most often
because key does not exist).

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-17 14:59:13 +02:00
liangxin1300
fb5e0fae92 tools: use util_strtonum for options checking
The atoi function is not safe because it performs no validation.
The strtol function is better, but empty strings and overflows still
need to be handled.
The util_strtonum function is a safer wrapper around strtoll.

Use util_strtonum to check the nodeid option and make the checking
conditions stricter.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-12 16:56:34 +02:00
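The difference between lenient and strict option parsing described in this commit can be illustrated with a minimal Python sketch. It is an analogue only, not corosync's C code: `parse_strict_int` and its parameters are hypothetical names, and the real signature of `util_strtonum` is not shown in the log.

```python
def parse_strict_int(text, min_value, max_value):
    """Return the integer in text, or None if text is not a plain
    base-10 integer within [min_value, max_value].

    Hypothetical helper for illustration; not corosync's util_strtonum.
    """
    if not text:
        # An empty string is rejected instead of silently becoming 0.
        return None
    digits = text[1:] if text[0] in "+-" else text
    if not digits.isascii() or not digits.isdigit():
        # Trailing garbage, whitespace or a bare sign is rejected.
        return None
    value = int(text)
    if value < min_value or value > max_value:
        # Out-of-range values are rejected instead of wrapping around.
        return None
    return value
```

A caller would treat `None` as an invalid option value and exit with an error, rather than continuing with a number the user never typed.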
liangxin1300
e741f6a612 cfgtool: enhancement -a option
* Add return code
* Give error message when nodeid does not exist

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-11 09:45:10 +02:00
liangxin1300
06d530dbf5 cfgtool: output error messages to stderr
... and standardize the return code

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-08-07 11:39:52 +02:00
Jan Friesse
464945a3e1 configure: Use default systemd path with prefix
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2020-07-16 17:15:57 +02:00
Jan Friesse
a1a86ebfdf build: Use git-version-gen during specfile build
Instead of copying parts of git-version-gen for spec target use
git-version-gen directly and parse final version into components
(rpmver, alphatag, numcomm) and use them.

Main reason is to simplify code a bit (sed scripts are a bit repetitive
though), reuse the code and also allow building of RPM from dist tarball
generated from non-tagged commit or dirty git (not very useful).

The code relies on the fact that a hyphen is never used in a tagged
release name.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2020-07-16 17:15:51 +02:00
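Why the hyphen matters can be shown with a rough Python sketch of splitting a version string into the three components named above. The input format is an assumption (a tagged release such as `3.0.4`, or `3.0.4.15-abcd` with the commit count as the last dotted field and the short hash after the hyphen); `split_version` is a hypothetical name and the build system's actual sed scripts may differ.

```python
def split_version(version):
    """Split a version string into rpmver, numcomm and alphatag.

    Assumed input formats (not confirmed by the commit log):
      "3.0.4"          tagged release, no hyphen
      "3.0.4.15-abcd"  15 commits after tag 3.0.4, short hash abcd
    """
    base, sep, suffix = version.partition("-")
    if not sep:
        # No hyphen: a tagged release, so the whole string is the version.
        return {"rpmver": base, "numcomm": "0", "alphatag": ""}
    # With a hyphen, the last dotted field of the left part is the
    # number of commits since the tag.
    rpmver, _, numcomm = base.rpartition(".")
    return {"rpmver": rpmver, "numcomm": numcomm, "alphatag": suffix}
```

If a tagged release name itself contained a hyphen, the first branch would never be taken for it and the split would be wrong, which is the dependency the commit message calls out.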
Jan Friesse
ce4746d68c build: Update git-version-gen
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2020-07-16 17:15:42 +02:00
Jan Friesse
1602a8c055 spec: Require at least knet 1.18 for crypto reload
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2020-07-16 17:15:28 +02:00
Christine Caulfield
5f71445be0 config: Allow reconfiguration of crypto options
Needs new knet crypto API.

If it's not available, then fall back to the old
API and forbid changing crypto while running.

To avoid us being dependent on the leader node, each
node sends its own crypto_reconfig_phase messages so
we can guarantee that the reconfiguration always completes
on each node.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-07-09 16:54:16 +02:00
Christine Caulfield
05023ee2e9 test: Fix cpgtest
... to cope with the max number of group members.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-05-18 14:54:13 +02:00
Christine Caulfield
f8b63083e1 config: Fix crash when a reload fails twice
Have string values stored in char arrays in totem_config
so we don't get into a mess with the pointers.

Also remove vsftype (which hasn't been used since corosync 1).

Use strncpy even though we know the string is fine, to keep covscan happy.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-04-24 16:27:18 +02:00
Christine Caulfield
4ddc96cd4e config: Don't free pointers used by transports
Reload failed for UDP[U] because the transports had saved pointers
to the interfaces[] array, so memcpy into that array rather than
re-allocating it.

Also, move the check for different IP address families so
it also gets run at reload time.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-04-24 16:27:09 +02:00
Christine Caulfield
7cb539e2e3 config: don't reload vquorum if reload fails
Fix an 'error: success' style message by propagating error_string
back down the stack.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-04-24 16:27:01 +02:00
Christine Caulfield
600072ef38 cfg: Improve error return to cfgtool -R
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-04-24 16:26:54 +02:00
Christine Caulfield
f078fff6eb config: Reorganise the config system
To be more reliable & maintainable

The basic plan here is to fix reloads to be more stable
using read/parse/verify/build/commit stages, so that any errors
will not leave corosync in an unstable state. This should
also make the code more maintainable as currently the verify/commit
stages are horribly intertwined.

Also:
- Fix local_node_pos not being updated in the new map during validation
  (broke adding and removing new nodes in the middle of the list).
- Fix reconfiguration so that nodes are indexed by nodeid and not their
  position in the list. This is an old bug that's just been carried
  over.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-04-24 16:26:44 +02:00
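The staged reload idea in this commit (read/parse/verify/build/commit, with nodes keyed by nodeid) can be sketched in Python as follows. This is a simplified illustration, not the corosync implementation: the `nodeid=address` input format, the validation rules and all function names are assumptions made for the example.

```python
class ReloadError(Exception):
    """Raised when a candidate configuration is rejected."""


def parse_nodes(text):
    """Parse 'nodeid=address' lines into a dict keyed by nodeid."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, address = line.partition("=")
        key = key.strip()
        if not sep or not (key.isascii() and key.isdigit()):
            raise ReloadError(f"cannot parse line: {line!r}")
        nodeid = int(key)
        if nodeid in nodes:
            raise ReloadError(f"duplicate nodeid {nodeid}")
        nodes[nodeid] = address.strip()
    return nodes


def verify_nodes(nodes):
    """Check the candidate map without touching any live state."""
    if not nodes:
        raise ReloadError("node list is empty")
    for nodeid, address in nodes.items():
        if nodeid == 0:
            raise ReloadError("nodeid 0 is not allowed")
        if not address:
            raise ReloadError(f"node {nodeid} has no address")


def reload_config(live_nodes, text):
    """Parse and verify a candidate, then commit it in one final step."""
    candidate = parse_nodes(text)   # parse: may raise, live state untouched
    verify_nodes(candidate)         # verify: may raise, live state untouched
    live_nodes.clear()              # commit: reached only if all checks pass
    live_nodes.update(candidate)
```

Because the live map is modified only in the last two lines, a failed reload leaves the previous configuration in place, and because entries are keyed by nodeid, inserting or removing a node in the middle of the input does not shift the identity of the others.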
Jan Friesse
1777d9992c Revert "totemip: compare sin6_scope_id and interface_num"
This reverts commit efd34df531 to make
master compile after revert of 934c47ed43.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2020-04-22 13:30:36 +02:00
Jan Friesse
cd6cc90a6f Revert "totemip: Add support for sin6_scope_id"
This reverts commit 934c47ed43 which is
causing protocol incompatibility in needle. Master seems to be not
affected, but it needs more checking.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2020-04-22 13:30:19 +02:00
Hideo Yamauchi
0d0febbc94 cfgtool: Fix error code as described in MP
If all links are connected 0 is returned to the shell, otherwise it's
error code 1.

Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-03-30 18:05:56 +02:00
Christine Caulfield
c631951ef5 icmap: icmap_init_r() leaks if trie_create() fails
Thanks to Coverity for finding this

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-03-26 14:42:41 +01:00
Jan Friesse
ca320beac2 votequorum: set wfa status only on startup
Previously, reloading the configuration with wait_for_all enabled set
wait_for_all_status, which set cluster_is_quorate to 0 but didn't
inform the quorum service, so votequorum and quorum information could
get out of sync.

An example is a 1-node cluster which is extended to 3 nodes. The quorum
service reports the cluster as quorate (incorrect) and votequorum as
not quorate (correct). Similar behavior happens when extending a cluster
in general, but some configurations are less incorrect (3->4).

Discussed solution was to inform quorum service but that would mean
every reload would cause loss of quorum until all nodes would be seen
again.

Such behaviour is consistent but seems to be a bit too strict.

Proposed solution sets wait_for_all_status only on startup and
doesn't touch it during reload.

This solution fulfills requirement of "cluster will be quorate for
the first time only after all nodes have been visible at least
once at the same time." because node clears wait_for_all_status only
after it sees all other nodes or joins cluster which is quorate. It also
solves problem with extending cluster, because when cluster becomes
unquorate (1->3) wait_for_all_status is set.

The added assert only ensures that I haven't missed any case where a
quorate cluster may become unquorate.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-03-24 14:13:32 +01:00
Jan Friesse
5f543465bb quorumtool: exit on invalid expected votes
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-03-13 09:14:56 +01:00
Jan Friesse
0c16442f2d votequorum: Change check of expected_votes
Previously the new expected_votes value was checked so that the newly
computed quorum value was in the interval <total_votes / 2, total_votes>.
The upper bound prevented the cluster from becoming unquorate, but the
lower bound check was almost useless because it allowed expected_votes
to be changed to a value smaller than total_votes.

Solution is to check that expected_votes is greater than or equal to
total_votes and, for a quorate cluster only, to check that the cluster
doesn't become unquorate (for an unquorate cluster one can set the upper
range freely, as is perfectly possible when using the config file).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-03-13 09:06:55 +01:00
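The new check can be read as two conditions, sketched here in Python. This is one interpretation of the commit message, not the votequorum source: the quorum formula is assumed to be a simple majority of expected_votes, and both function names are hypothetical.

```python
def quorum_for(expected_votes):
    """Simple majority; an assumption, the real formula may differ."""
    return expected_votes // 2 + 1


def expected_votes_allowed(new_expected, total_votes, quorate):
    """Return True if new_expected passes both checks from the commit."""
    if new_expected < total_votes:
        # expected_votes must not be smaller than the votes present.
        return False
    if quorate and total_votes < quorum_for(new_expected):
        # A quorate cluster must not lose quorum through this change.
        return False
    # An unquorate cluster may raise expected_votes freely.
    return True
```

With 3 votes present in a quorate cluster, 5 is accepted (quorum becomes 3) while 7 is rejected (quorum would become 4); the same value 7 is accepted when the cluster is already unquorate.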
Jan Friesse
15c25a286d cfgtool: Simplify output a bit for link status
Display words connected/disconnected instead of 1/0 and show enabled
status only when link is not enabled (shouldn't happen).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-03-04 15:19:13 +01:00
Jan Friesse
d9eaab7823 man: Enhance link_mode priority description
Some users found description of priority for passive link_mode
confusing (probably because "priority" word is too
overloaded) so add some redundancy to make description
unambiguous.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-02-27 08:38:58 +01:00
Jan Friesse
35662dd0ec main: Add schedmiss timestamp into message
This is useful for matching schedmiss event in stats map with logged
event.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-02-27 08:37:35 +01:00
liangxin1300
efd34df531 totemip: compare sin6_scope_id and interface_num
When a user configures a specific interface, such as a VLAN,
with the same IPv6 link-local address, Corosync should
compare sin6_scope_id with interface_num to make sure it gets
the right interface to bind.

Signed-off-by: liangxin1300 <XLiang@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-02-21 15:46:22 +01:00
Jan Friesse
98448d4ebc totemip: Really remove totemip_copy_endian_convert
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2020-02-17 17:54:23 +01:00
Jan Friesse
38d1d10d39 totemip: Remove unused totemip_copy_endian_convert
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-02-17 17:31:55 +01:00
Jan Friesse
934c47ed43 totemip: Add support for sin6_scope_id
sin6_scope_id was not present in the totemip structure, making it
impossible to use link-local IPv6 addresses.

Patch adds sin6_scope_id and changes convert/copy functions to use it
(formally also comparator functions should be changed, but it seems to
cause more harm and it is not really needed).

This makes corosync work with link-local addresses fine for both UDPU
and Knet transport as long as interface specification is used (so
fe80::xxxx:xxxx:xxxx:xxxx%eth0).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-02-17 17:31:42 +01:00
Jan Friesse
720a892751 cfgtool: Improve link status display
Totemknet is enhanced to use the 'n' character for localhost and not add
status, because it is safe to expect that the localhost link is always
connected. corosync-cfgtool is enhanced to properly decode the 'n', '?'
and 'd' characters and display their meaning for extended status. Special
characters are also documented in the man page.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-02-12 13:08:25 +01:00
Hideo Yamauchi
0143ee9a2f totemknet: Change the initial value of the status
Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-02-10 16:41:22 +01:00
Jan Friesse
ebd05fa008 stats: Use nanoseconds from epoch for schedmiss
Monotonic time does not work here because it does not have to match
time from epoch.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-01-23 17:58:41 +01:00
Christine Caulfield
48b6894ef4 stats: Add stats for scheduler misses
This patch adds a stats.schedmiss.* set of entries that
are a record of the last 10 times corosync was not scheduled
in time.

These entries are kept in reverse order (so stats.schedmiss.0.* is
always the latest one kept) and the values, including the timestamp,
are in milliseconds.

It's also possible to use a cmap tracker to follow these events, which
might be useful.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-01-22 17:06:10 +01:00
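The "last 10 events, newest at index 0" layout can be sketched with a bounded deque in Python. This is an illustration of the data structure only: the class name and the `timestamp` and `delay` key suffixes are assumptions, since the log only shows the `stats.schedmiss.0.*` prefix.

```python
from collections import deque


class SchedMissLog:
    """Keeps the most recent scheduler misses, newest first."""

    def __init__(self, capacity=10):
        # A bounded deque drops the oldest entry from the far end
        # once capacity is reached.
        self._events = deque(maxlen=capacity)

    def record(self, timestamp_ms, delay_ms):
        # appendleft puts the newest event at index 0.
        self._events.appendleft((timestamp_ms, delay_ms))

    def as_map(self):
        """Flatten into stats.schedmiss.<index>.<field> keys.

        The field names are hypothetical.
        """
        out = {}
        for index, (timestamp_ms, delay_ms) in enumerate(self._events):
            out[f"stats.schedmiss.{index}.timestamp"] = timestamp_ms
            out[f"stats.schedmiss.{index}.delay"] = delay_ms
        return out
```

Recording an eleventh event pushes every existing entry one index higher and discards the oldest, so index 0 always refers to the latest miss.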
Jan Friesse
8ce65bf951 votequorum: Reflect runtime change of 2Node to WFA
When 2Node mode is set, WFA is also set unless WFA is configured
explicitly. This behavior was not reflected on runtime change, so
restarted corosync behavior was different (WFA not set). Also when
cluster is reduced from 3 nodes to 2 nodes during runtime, WFA was not
set, which may result in two quorate partitions.

Solution is to set WFA depending on 2Node when WFA
is not explicitly configured.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2020-01-21 16:19:49 +01:00
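The rule in this commit amounts to a default that follows 2Node unless WFA is given explicitly. A minimal Python sketch, with a hypothetical function name and `None` standing for "not configured":

```python
def effective_wait_for_all(two_node, wait_for_all=None):
    """Return the WFA setting that should be in force.

    wait_for_all is None when the option is not configured explicitly;
    in that case WFA follows the 2Node setting.
    """
    if wait_for_all is not None:
        return wait_for_all
    return two_node
```

Evaluating the same function at startup and on every runtime change gives the consistency the commit is after: a cluster reduced from 3 to 2 nodes ends up with the same WFA value as one that was restarted with the 2-node configuration.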
Hideo Yamauchi
9fda4dc6ac cpg: Change downlist log level
Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-01-09 12:40:32 +01:00
Ferenc Wágner
f1d36307e5 man: move cmap_keys man page from section 8 to 7
Section 8 is for "System administration commands", 7 is "Miscellaneous".

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2020-01-07 08:56:58 +01:00
Jan Friesse
89b0d62f8b stats: Check return code of stats_map_get
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2019-11-28 09:44:45 +01:00
Jan Friesse
56ee850301 quorumtool: Assert copied string length
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2019-11-28 09:44:45 +01:00
Jan Friesse
1fb095b0af notifyd: Check cmap_track_add result
And assert length of key_name to strcpy.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2019-11-28 09:44:45 +01:00
Jan Friesse
8ff7760ce5 cmapctl: Free bin_value on error
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2019-11-28 09:44:45 +01:00
Jan Friesse
21e1c71169 cfgtool: Remove unused callbacks
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2019-11-28 09:44:45 +01:00
Jan Friesse
ee38d93cc7 cpghum: Remove unused time variables and functions
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2019-11-28 09:44:44 +01:00
Jan Friesse
35c312f810 votequorum: Assert copied strings length
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2019-11-28 09:44:44 +01:00