When 2Node mode is set, WFA is also set unless WFA is configured
explicitly. This behavior was not reflected on runtime change, so
restarted corosync behavior was different (WFA not set). Also when
cluster is reduced from 3 nodes to 2 nodes during runtime, WFA was not
set, what may result in two quorate partitions.
Solution is to set WFA depending on 2Node when WFA
is not explicitly configured.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Section 8 is for "System administration commands", 7 is "Miscellaneous".
Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Express intention to ignore icmap_get_* return
value and rely on default behavior of not changing the output
parameter on error.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Make the icmap*_r functions read from the specified map rather
than the global map.
Also include icmap_get_string_r() which seems to have been missed out.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
To make sure libqb dependency is visible across all libraries.
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Some functions allocated memory on stack without clearing memory and
then send them on wire. This is not an issue, but valgrind reports this
as a problem so it is easy to miss real problem then.
Solution is to clear stack memory.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
commit 029b8ebad6 changed the default
of the KNET_PONG_COUNT from the kronosnet default of 5 to 2, as
corosync bring up was deemed to slow.
The documentation, and the comment stating that the totem config
default values match the knet ones were not updated, and thus now out
of date.
Fixhis by noting the correct default of 2 for KNET_PONG_COUNT and
note that all but that one are in sync with the korosync defaults in
the comment.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Messages sent during recovery phase are encapsulated so such message has
extra size of mcast structure. This is not so big problem for UDPU,
because most of the switches are able to fragment and defragment packet
but it is problem for knet, because totempg is using maximum packet size
(65536 bytes) and when another header is added during retransmition,
then packet is too large.
Solution is to reduce mtu by 2 * sizeof (struct mcast).
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
--with-sanitizers= option is stricly meant for runtime debugging
purposes. Do NOT use in production.
Please check gcc/clang man pages on how to use ASAN/UBSAN/TSAN.
Also allow users to specificy SANITIZERS_CFLAGS and SANITIZERS_LDFLAGS
for advanced use.
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Knet callbacks may be called from different thread than main thread. If
this happens, log messages may be lost. Most prominent example is when
link goes up (logged by main thread) and host_change_callback_fn is
called.
Implemented solution is adding mutex for every log call in totemknet.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
This patch handles the situation where the leader
node (the node with lowest node_id) crashes and is started again
before token timeout of the rest of the cluster.
The newly restarted node restores the ringid of the old ring from
stable storage, so it has the same ringid as rest of the nodes,
but ARU is zero. If the node is able to create a singleton membership
before receiving the joinlist from rest of the cluster,
everything works as expected, because the ring id gets increased
correctly.
But if the node receives a joinlist from another cluster node before
its own joinlist, then it continues as it would had it never left
the cluster. This is not correct, because the new node should always
create a singleton configuration first.
During the recovery phase, ARUs are compared and because they differ
(the ARU of the old leader node is 0), the other nodes
try to sent all of their previous messages. This is impossible
(even if it was correct), because other nodes have already freed most
of those messages. The implementation uses an assert to limit maximum
number of messages sent during recovery (we could fix this,
but it's not really the point).
The solution here is to increase the ring_id sequence number by 1 after
loading it from storage. During creation of the commit token it is
always increased by 4, so it will not collide with an existing
sequence.
Thanks Christine Caulfield <ccaulfie@redhat.com> for clarify commit
message.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Init script used to use corosync-cfgtool -s to wait till
corosync accepts ipc connection. Problem with this approach
is that error code is returned not only if ipc cannot be initialized,
but also when one of the ring is marked as failed, making corosync
service not to start. Corosync with one failed ring can work just
fine and there is no need to fail startup.
Patch is changing call of corosync-cfgtool to corosync-cpgtool. Also to
make spotting of broken ring easier, corosync-cfgtool -s is called after
successful return of the cpgtool, and warning is issued if cfgtool
fails.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
time_t is platform dependent real type which is usually long int on
64-bit platform, but only int on 32-bit platform and printing it with
%ld generated warning.
Solution seems to be ether retype time_t to long int or use functions
which works with time_t. Later option is used in this patch, which uses
localtime and strftime to print time_t value.
Also code is refactored to remove duplicate calls and add _cs_snmp
prefix to prevent snmp_ prefix collision.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
corosync_cfg_ring_status_get returns string status, which is always OK
for UDP(U) and detailed status for Knet transport. Previously also
FAULTY status was returned for UDP(U) and cfgtool used to return error
code back to shell when one of the interfaces was faulty.
Because FAULTY is now not returned, it's not needed to have code for
handling it.
Also man page was misleading, so it is fixed too.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Previously node id was logged ether as a %d (most often), %u, %x or
PRI.32 and ring id ether as %lld, %llx with various separators (., :, /)
between rep nodeid and seq. This seems to cause confusion.
This patch adds macros CS_PRI_NODE_ID, CS_PRI_RING_ID and
CS_PRI_RING_ID_SEQ (CS prefix = corosync, PRI modeled in spirit of
inttypes.h PRIx32) and makes code use them.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Disabling forwarding will make knet flush the messages (especially
LEAVE one).
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Compiler is unable to understand relation between members and
num_configured and warns about uninitialized members. Instead of
initializing members to 0 and (potentially after some code
refactor) let code fall to display error message, more explicit method
of assert is used.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
configured links may not come in order in the interfaces array, which
holds an entry for _all_ possible links, not just configured ones.
So iterate through all interfaces, but skip those which are not
configured. This allows to start corosync with a configuration where
link 0 is currently not mentioned, as else it was checked but had
member_count = 0 from it's default initialization, which then made
this code report a false positive for the "Not all nodes have the
same number of links" check even on a correct config.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
As totem_config->interfaces entries are _all_ possible links and not
only the configured ones we cannot trust that interface[0] is
configured at the time of checking, and thus has a valid
member_count. So set the members variable to the member_count entry
from an actually configured interface and loop over that one.
This fixes a case where the check for the name property on all nodes
for multi links was skipped if link 0 was not configured, as then its
member_count was 0.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
New gcc warn about passing posibly unaligned pointer from packed
structure. This shouldn't be problem for x86.
Implemented solution is to let compiler do its job (compiler knows if
pointer is aligned so accessing structure field is safe) and
use it together with support for asigning and returning of structure
(not a pointer to the structure).
- srp_addr_copy is removed and replaced by simple assignment
- srp_addr_copy_endian_convert is removed and replaced by
srp_addr_endian_convert function which takes srp_addr structure and
returns endian converted srp_addr structure
- functions which accepts srp_addr array are not changed because
(luckily) non-aligned pointer is always just one item array and
such item is always used as a source pointer so it's possible to use
temporary variable
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
And make handling of left_list more generic. Also free skiplist
allocated by joinlist_inform_clients function. Last (but not least)
remove czechlish founded (should have been pp of "find").
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
using a similar approach to
43bead3645
"Send one confchg event per CPG group to CPG client"
which did the same for leave events on a network partition.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
... and remove unused nodes_in_partition function.
Also replace TAILQ_FOREACH with goto to while cycle.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>