mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2026-01-25 17:24:29 +00:00

Author	SHA1	Message	Date
Jan Friesse	8ce65bf951	votequorum: Reflect runtime change of 2Node to WFA When 2Node mode is set, WFA is also set unless WFA is configured explicitly. This behavior was not reflected on runtime change, so restarted corosync behavior was different (WFA not set). Also, when the cluster is reduced from 3 nodes to 2 nodes during runtime, WFA was not set, which may result in two quorate partitions. Solution is to set WFA depending on 2Node when WFA is not explicitly configured. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-01-21 16:19:49 +01:00
Hideo Yamauchi	9fda4dc6ac	cpg: Change downlist log level Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-01-09 12:40:32 +01:00
Ferenc Wágner	f1d36307e5	man: move cmap_keys man page from section 8 to 7 Section 8 is for "System administration commands", 7 is "Miscellaneous". Signed-off-by: Ferenc Wágner <wferi@debian.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-01-07 08:56:58 +01:00
Jan Friesse	89b0d62f8b	stats: Check return code of stats_map_get Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:45 +01:00
Jan Friesse	56ee850301	quorumtool: Assert copied string length Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:45 +01:00
Jan Friesse	1fb095b0af	notifyd: Check cmap_track_add result And assert length of key_name to strcpy. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:45 +01:00
Jan Friesse	8ff7760ce5	cmapctl: Free bin_value on error Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:45 +01:00
Jan Friesse	21e1c71169	cfgtool: Remove unused callbacks Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:45 +01:00
Jan Friesse	ee38d93cc7	cpghum: Remove unused time variables and functions Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	35c312f810	votequorum: Assert copied strings length Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	29109683cf	totemknet: Assert strcpy length Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	0c118d8ff4	totemknet: Check result of fcntl O_NONBLOCK call Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	a24cbad590	totemconfig: Initialize warnings variable Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	74eed54a7f	sync: Assert sync_callbacks.name length Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	380b744ec8	totemknet: Don't mix corosync and knet error codes And use correct return code in stats.c. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	624b6a4707	stats: Assert value_len when value is needed Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	f31a31f91a	cmap: Assert copied string length Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	e925e389a2	totemconfig: Reuse already fetched pointer Make code a bit more readable and easier to process for coverity. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	09f6d34aaa	logconfig: Remove double free of value Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	cddd62f972	votequorum: Ignore the icmap_get_* return value Express intention to ignore icmap_get_* return value and rely on default behavior of not changing the output parameter on error. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:44 +01:00
Jan Friesse	efe48120e2	totemconfig: Free leaks found by coverity Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-11-28 09:44:43 +01:00
Christine Caulfield	1ba03a3816	icmap: fix the icmap_get__r functions Make the icmap_r functions read from the specified map rather than the global map. Also include icmap_get_string_r() which seems to have been missed out. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-11-18 16:29:57 +01:00
Fabio M. Di Nitto	1eb12d30a0	pkgconfig: Add libqb dependency To make sure libqb dependency is visible across all libraries. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-11-18 09:57:40 +01:00
Jan Friesse	6ba9870f69	Initialize stack allocated memory Some functions allocated memory on stack without clearing memory and then send them on wire. This is not an issue, but valgrind reports this as a problem so it is easy to miss real problem then. Solution is to clear stack memory. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-11-08 11:20:18 +01:00
Thomas Lamprecht	721c5d4b5b	man: Fix corosync.conf knet pong count default Commit `029b8ebad6` changed the default of the KNET_PONG_COUNT from the kronosnet default of 5 to 2, as corosync bring up was deemed too slow. The documentation, and the comment stating that the totem config default values match the knet ones, were not updated and are thus now out of date. Fix this by noting the correct default of 2 for KNET_PONG_COUNT and note in the comment that all but that one are in sync with the kronosnet defaults. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-10-17 08:27:07 +02:00
Jan Friesse	ee8b8993d9	totemsrp: Reduce MTU to leave room for second mcast Messages sent during recovery phase are encapsulated, so such a message has the extra size of an mcast structure. This is not so big a problem for UDPU, because most of the switches are able to fragment and defragment packets, but it is a problem for knet, because totempg is using maximum packet size (65536 bytes) and when another header is added during retransmission, the packet is too large. Solution is to reduce mtu by 2 * sizeof (struct mcast). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2019-10-09 11:48:43 +02:00
Jan Friesse	bd11a3380c	totempg: Check sanity (length) of received message Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2019-10-09 11:48:17 +02:00
Fabio M. Di Nitto	671720490a	build: add option for enabling sanitizer builds --with-sanitizers= option is strictly meant for runtime debugging purposes. Do NOT use in production. Please check gcc/clang man pages on how to use ASAN/UBSAN/TSAN. Also allow users to specify SANITIZERS_CFLAGS and SANITIZERS_LDFLAGS for advanced use. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-10-09 11:39:43 +02:00
Jan Friesse	1cf1558fe7	totemknet: Add locking for log call Knet callbacks may be called from different thread than main thread. If this happens, log messages may be lost. Most prominent example is when link goes up (logged by main thread) and host_change_callback_fn is called. Implemented solution is adding mutex for every log call in totemknet. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2019-09-10 11:29:54 +02:00
Jan Friesse	0a323ff2ed	man: Fix link_mode priority description ... to match knet source code. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2019-08-27 07:48:15 +02:00
Jan Friesse	41a7b18ded	notifyd: Don't dereference NULL key_name This problem shouldn't really happen, but better safe than sorry. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-07-30 14:24:32 +02:00
Jan Friesse	3675daceee	totem: Increase ring_id seq after load This patch handles the situation where the leader node (the node with lowest node_id) crashes and is started again before token timeout of the rest of the cluster. The newly restarted node restores the ringid of the old ring from stable storage, so it has the same ringid as rest of the nodes, but ARU is zero. If the node is able to create a singleton membership before receiving the joinlist from rest of the cluster, everything works as expected, because the ring id gets increased correctly. But if the node receives a joinlist from another cluster node before its own joinlist, then it continues as it would had it never left the cluster. This is not correct, because the new node should always create a singleton configuration first. During the recovery phase, ARUs are compared and because they differ (the ARU of the old leader node is 0), the other nodes try to send all of their previous messages. This is impossible (even if it was correct), because other nodes have already freed most of those messages. The implementation uses an assert to limit maximum number of messages sent during recovery (we could fix this, but it's not really the point). The solution here is to increase the ring_id sequence number by 1 after loading it from storage. During creation of the commit token it is always increased by 4, so it will not collide with an existing sequence. Thanks Christine Caulfield <ccaulfie@redhat.com> for clarifying the commit message. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-07-15 16:39:32 +02:00
Jan Friesse	0d1a1b1329	init: Use cpgtool instead of cfgtool Init script used to use corosync-cfgtool -s to wait till corosync accepts ipc connection. Problem with this approach is that error code is returned not only if ipc cannot be initialized, but also when one of the rings is marked as failed, preventing the corosync service from starting. Corosync with one failed ring can work just fine and there is no need to fail startup. Patch is changing call of corosync-cfgtool to corosync-cpgtool. Also to make spotting of broken ring easier, corosync-cfgtool -s is called after successful return of the cpgtool, and warning is issued if cfgtool fails. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-07-08 16:00:39 +02:00
Jan Friesse	257a4fd377	notifyd: Fix warning produced by 32-bit compiler time_t is platform dependent real type which is usually long int on 64-bit platform, but only int on 32-bit platform and printing it with %ld generated warning. Solution seems to be either retype time_t to long int or use functions which work with time_t. Latter option is used in this patch, which uses localtime and strftime to print time_t value. Also code is refactored to remove duplicate calls and add _cs_snmp prefix to prevent snmp_ prefix collision. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-07-08 15:13:45 +02:00
Jan Friesse	d7f5478b32	cfgtool: Remove unused code corosync_cfg_ring_status_get returns string status, which is always OK for UDP(U) and detailed status for Knet transport. Previously also FAULTY status was returned for UDP(U) and cfgtool used to return error code back to shell when one of the interfaces was faulty. Because FAULTY is now not returned, it's not needed to have code for handling it. Also man page was misleading, so it is fixed too. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-07-08 13:00:58 +02:00
Jan Friesse	5731af2782	logging: Add CS_PRI_NODE_ID and CS_PRI_RING_ID Previously node id was logged either as a %d (most often), %u, %x or PRI.32 and ring id either as %lld, %llx with various separators (., :, /) between rep nodeid and seq. This seems to cause confusion. This patch adds macros CS_PRI_NODE_ID, CS_PRI_RING_ID and CS_PRI_RING_ID_SEQ (CS prefix = corosync, PRI modeled in spirit of inttypes.h PRIx32) and makes code use them. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-07-03 10:53:52 +02:00
Jan Friesse	49641b9a8f	vqsim: Fix gitignore Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2019-07-02 10:34:08 +02:00
Jan Friesse	d59a18d4a1	totemknet: Disable forwarding on shutdown Disabling forwarding will make knet flush the messages (especially LEAVE one). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-06-28 08:27:18 +02:00
Jan Friesse	51fbd7bafe	totemconfig: Fix compiler warning Compiler is unable to understand relation between members and num_configured and warns about uninitialized members. Instead of initializing members to 0 and (potentially after some code refactor) let code fall to display error message, more explicit method of assert is used. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-06-17 17:44:10 +02:00
Thomas Lamprecht	816324c94c	totem: fix check if all nodes have same number of links Configured links may not come in order in the interfaces array, which holds an entry for _all_ possible links, not just configured ones. So iterate through all interfaces, but skip those which are not configured. This allows corosync to start with a configuration where link 0 is currently not mentioned, as otherwise it was checked but had member_count = 0 from its default initialization, which then made this code report a false positive for the "Not all nodes have the same number of links" check even on a correct config. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-06-17 12:29:30 +02:00
Thomas Lamprecht	7ada508a82	totem: fix check if all nodes have name attrs in multi-link setups As totem_config->interfaces entries are _all_ possible links and not only the configured ones we cannot trust that interface[0] is configured at the time of checking, and thus has a valid member_count. So set the members variable to the member_count entry from an actually configured interface and loop over that one. This fixes a case where the check for the name property on all nodes for multi links was skipped if link 0 was not configured, as then its member_count was 0. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-06-17 12:29:09 +02:00
dkutergin	2183b9aa4a	corosync-notifyd: Add option to disable DNS lookup New configuration option -n is added. Signed-off-by: dkutergin <dmytro.kutergin@harmonicinc.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-06-14 10:56:13 +02:00
Jan Friesse	b0c24ec665	totemsrp: Fix warnings produced by gcc 9.1 New gcc warns about passing a possibly unaligned pointer from a packed structure. This shouldn't be a problem for x86. Implemented solution is to let compiler do its job (compiler knows if pointer is aligned so accessing structure field is safe) and use it together with support for assigning and returning of structure (not a pointer to the structure). - srp_addr_copy is removed and replaced by simple assignment - srp_addr_copy_endian_convert is removed and replaced by srp_addr_endian_convert function which takes srp_addr structure and returns endian converted srp_addr structure - functions which accept srp_addr array are not changed because (luckily) non-aligned pointer is always just one item array and such item is always used as a source pointer so it's possible to use temporary variable Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-06-14 10:03:31 +02:00
Jan Friesse	3c7f19a02f	cpg: Move filling of member_list to subfunction Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>	2019-06-13 15:16:31 +02:00
Jan Friesse	1e2df0ba0c	cpg: Add more comments to notify_lib_joinlist And make handling of left_list more generic. Also free skiplist allocated by joinlist_inform_clients function. Last (but not least) remove czechlish founded (should have been pp of "find"). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>	2019-06-13 15:16:13 +02:00
Fabian Grünbichler	7fb2470966	cpg: send single confchg event per group on joinlist using a similar approach to `43bead3645` "Send one confchg event per CPG group to CPG client" which did the same for leave events on a network partition. Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-06-13 15:15:32 +02:00
Fabian Grünbichler	c16abe515f	cpg: notify_lib_joinlist: drop conn parameter since it is always set to NULL. Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-06-13 15:14:53 +02:00
Jan Friesse	0390200dd4	vqsim: Check length of copied optarg Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-06-12 15:40:53 +02:00
Jan Friesse	1d8c1a4c97	vqsim: Check result of icmap_set_uint32 Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-06-12 15:40:52 +02:00
Jan Friesse	ef9b931b7e	vqsim: Remove unused total_nodes ... and remove unused nodes_in_partition function. Also replace TAILQ_FOREACH with goto to while cycle. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-06-12 15:40:52 +02:00
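
Note on 257a4fd377 (notifyd: Fix warning produced by 32-bit compiler): the commit message says the fix prints a time_t through localtime and strftime instead of a %ld format. A minimal sketch of that approach follows. The helper name format_timestamp, its signature, and the format string are assumptions made for illustration and are not taken from the corosync source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper (not the actual notifyd code): format a time_t
 * without assuming its underlying integer type, so no %d/%ld mismatch
 * can occur on 32-bit or 64-bit platforms.
 *
 * Returns the number of characters written (excluding the terminating
 * NUL), or 0 if the conversion failed or the buffer was too small.
 * On failure buf is set to an empty string.
 */
static size_t format_timestamp(time_t when, char *buf, size_t buf_len)
{
	struct tm *tm_res;
	size_t written;

	assert(buf != NULL && buf_len > 0);

	/* localtime returns a pointer to static storage, or NULL on error. */
	tm_res = localtime(&when);
	if (tm_res == NULL) {
		buf[0] = '\0';
		return 0;
	}

	/* strftime returns 0 when the result does not fit into buf. */
	written = strftime(buf, buf_len, "%Y-%m-%d %H:%M:%S", tm_res);
	if (written == 0) {
		buf[0] = '\0';
	}

	return written;
}
```

A caller would pass a buffer of at least 20 bytes for this format, for example `char ts[32]; format_timestamp(time(NULL), ts, sizeof(ts));`.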


4120 Commits
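
Note on 8ce65bf951 (votequorum: Reflect runtime change of 2Node to WFA): the commit message describes the rule that WFA follows 2Node unless WFA is configured explicitly. The sketch below only restates that rule as code. The struct quorum_opts type, its fields, and the function effective_wait_for_all are hypothetical names for illustration and do not come from votequorum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the relevant quorum options (illustration only). */
struct quorum_opts {
	bool two_node;        /* 2Node mode requested */
	bool wfa_configured;  /* true if WFA was set explicitly in the config */
	bool wfa_value;       /* explicit WFA value, meaningful only if configured */
};

/*
 * Decide the effective WFA setting.
 * An explicit WFA setting always wins; otherwise WFA mirrors 2Node.
 * Evaluating this on every (re)load is what keeps a runtime change,
 * such as shrinking from 3 nodes to 2, consistent with a fresh start.
 */
static bool effective_wait_for_all(const struct quorum_opts *opts)
{
	assert(opts != NULL);

	if (opts->wfa_configured) {
		return opts->wfa_value;
	}

	return opts->two_node;
}
```

Under this rule a two-node cluster without an explicit WFA setting gets WFA enabled, while an explicit setting of 0 keeps it disabled even in 2Node mode.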