mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-10-24 12:01:54 +00:00

Author	SHA1	Message	Date
Jan Friesse	06504c0f6f	build: Remove NSS dependencies Complete removal of NSS from corosync tree. Most of the changes are in build system and cpgverify had to be rewritten to use crc32 instead of sha1. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-09-17 10:26:05 +02:00
Jan Friesse	e3989c2b56	coroparse: Fix newly introduced warning Small fix for a problem introduced by "coroparse: Use key_name for error message" patch. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2018-09-07 16:53:08 +02:00
Chris Walker	51989b4a0a	Add option to force cluster into GATHER state Signed-off-by: Chris Walker <cwalker@cray.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-09-07 13:27:36 +02:00
Jan Friesse	0ac659608d	coroparse: Use key_name for error message Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-09-06 13:02:03 +02:00
Jan Friesse	f6262e5755	coroparse: Add file name and line to error message It's just much easier to find out what is happening when message like parser error: /etc/corosync/corosync.conf:39: Unexpected closing brace is logged instead of parser error: Unexpected closing brace Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-09-06 13:01:56 +02:00
Jan Friesse	80701845ab	coroparse: Be more strict in what is parsed Corosync parser is not very clever, but it is able to detect more errors without too much code. 1. Check if section name is not empty (just '{' character) 2. Check if there is no extra characters after opening bracket '{' 3. Check if there is no extra characters after or before closing bracket '}' 4. Check if line is opening section, closing section or key/value So following examples are reported as error: totem { version: 2 }}}}}}}}}} Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-09-06 13:01:35 +02:00
Jan Friesse	7a4725f9da	coroparse: Fix remove_whitespace end condition When remove_whitespace function parameter is single character string with whitespaces (like a:) then colon is not removed. Reason is end condition end != start, which is valid for empty string, but invalid in case described above. Solution is to check if *end is '\0'. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-09-06 13:01:20 +02:00
Jan Friesse	ffb759cd7d	coroparse: Check icmap_set results Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-09-06 13:01:12 +02:00
Jan Friesse	20bd68b3fb	coroparse: Return error if config line is too long Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-09-06 13:01:01 +02:00
Jan Friesse	c9e5d6db13	Remove libcgroup Libcgroup is deprecated and not shipping with new distributions (OpenSuSE is one example). Solution is to have a partial implementation of required functionality of libcgroup in the corosync code. Patch uses hardcoded cgroup mount point, because most of the systems are now systemd and systemd is also using hardcoded mountpoint (see https://github.com/systemd/systemd/blob/master/src/core/mount-setup.c) Configuration option --enable-cgroup is gone, because it's not needed any longer. Big thanks to Christine Caulfield <ccaulfie@redhat.com> for example of simplified implementation of cgroup management code primitives. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-08-14 14:54:28 +02:00
Chris Walker	3f7d2cf6aa	Add token_warning configuration option Token_warning is used to present information about when the token was last received. Signed-off-by: Chris Walker <cwalker@cray.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-08-14 10:34:49 +02:00
Jan Friesse	f60541513e	totemsrp: Add assert into memb_lowest_in_config Add assert when there are no members in token_memb structure so non-existing member is not accessed (token should always have at least one member). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-08-13 09:00:47 +02:00
Jan Friesse	1d2c6e4696	totemconfig: Enlarge error_string_response ... so error_reason can be fully included into parse error message. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-08-13 09:00:44 +02:00
Jan Friesse	0095b9a3cb	ipc_glue: Fix strncpy in pid_to_name function Trailing zero is always added so there is no need to have a warning about unterminated destination string. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-08-13 09:00:43 +02:00
Jan Friesse	f576ad6388	util: Fix strncpy in setcs_name_t function Trailing zero is always added so there is no need to have a warning about unterminated destination string. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-08-13 09:00:39 +02:00
Jan Friesse	844a76e775	totemknet: Free instance on failure exit Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-08-13 09:00:35 +02:00
Jan Friesse	31268cc744	totemudpu: Pass correct param to totemip_nosigpipe Fixes compilation on (at least) FreeBSD. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2018-07-12 16:29:15 +02:00
Bin Liu	96b4bd1660	totemudpu: Add local loop support This patch intends to solve the long-standing ifdown corosync problem. The idea is to use a local socket for sending both unicast and multicast messages if the interface is down. Together with testing what the current bind state is, it's possible to keep pretending the old IP address exists instead of rebinding to localhost, which breaks a lot of things badly. Heavily based on Yu, Zou <zouyu@shiqichuban.com> work and it's basically a port of the UDP patch created by Jan Friesse <jfriesse@redhat.com>. (ported from needle `96354fba72`) Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-07-12 15:43:03 +02:00
Christine Caulfield	a471bab798	config: Fail config validation if not all nodes have all links KNET requires that all links be full-mesh (this may change in the future but almost certainly not before knet 2.0), so enforce this in the config. Also avoid a potential div-by-0 error if the local node is not fully configured either. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-07-03 12:38:02 +02:00
Christine Caulfield	d1db8c2851	config: Enforce use of 'name' node attribute in multi-link clusters If the local host does not have a 'name' attribute and the cluster has more than one link then fail the validation test. I'm open to the idea of checking all of the nodes in the nodelist if necessary. It seems overkill as each node will check its own entry though. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-07-03 12:37:45 +02:00
Christine Caulfield	429209f4aa	totemconfig: Check for things that cannot be changed on the fly There are a few things in the interface that cannot be changed on the fly. Warn about them and tell the user that these things need to be done in two steps and why. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-07-02 09:54:31 +02:00
Jan Friesse	cc81696ff5	Fix snprintf warnings Compiler shows warnings about possible not large enough buffer, so check snprintf return value properly. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-07-02 08:08:33 +02:00
Christine Caulfield	137b31397c	knet: Don't try to create loopback interface twice It wasn't harmful, but it generated an annoying message. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-07-02 08:00:36 +02:00
Christine Caulfield	5dda71ae29	knet: Fix knet log buffer size knet sends log messages as struct knet_log_msg, not a string of KNET_MAX_LOG_MSG_SIZE (which is only part of that structure). So we were both losing and corrupting messages. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-07-02 08:00:15 +02:00
Jan Friesse	23e17953fe	cpg: Inform clients about left nodes during pause Patch tries to fix incorrect behaviour during following test-case: - 3 nodes - Node 1 is paused - Node 2 and 3 detects node 1 as failed and informs CPG clients - Node 1 is unpaused - Node 1 clients are informed about new membership, but not about Node 1 being paused, so from Node 1 point-of-view, Node 2 and 3 failure Solution is to: - Remove downlist master choose and always choose local node downlist. For Node 1 in example above, downlist contains Node 2 and 3. - Keep code which informs clients about left nodes - Use joinlist as an authoritative source of nodes/clients which exists in membership This patch doesn't break backwards compatibility. I've walked thru all the patches which changed behavior of cpg to ensure patch does not break CPG behavior. Most important were: - `058f50314c` - Base. Code was significantly changed to handle double free by split group_info into two structures cpg_pd (local node clients) and process_info (all clients). Joinlist was - `97c28ea756` - This patch removed confchg_fn and made CPG sync correct - `feff0e8542` - I've tested described behavior without any issues - `6bbbfcb6b4` - Added idea of using heuristics to choose same downlist on all nodes. Sadly this idea was beginning of the problems described in `040fda8872`, `ac1d79ea7c`, `559d4083ed`, `02c5dffa5b`, `64d0e5ace0` and `b55f32fe2e` - `02c5dffa5b` - Made joinlist as authoritative source of nodes/clients but left downlist_master_choose as a source of information about left nodes Long story made short. This patch basically reverts idea of using heuristics to choose same downlist on all nodes. (ported from needle `9c2a97f4f9`) Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-30 14:37:20 +02:00
Jan Friesse	e45bbcc92a	totemsrp: Fix leave message regression Leave message in totem is just join message where leaving member is excluded from member list and included in fail list. It also contains special nodeid in header.nodeid and system_from.nodeid fields. Before "totem: Use nodeid ONLY in srp_addr" fix, most of the functions were using system_from addresses and not nodeid, which was used only in one specific case for memb_consensus_set function. After the patch, addresses are gone and only nodeid is used. Result is, that leaving node nodeid is not added into local fail list (my_faillist) so node is unable to reach consensus till token timeout, which starts new gather process. Solution is to send valid leaving node nodeid in system_from.nodeid and handle specific case for memb_consensus_set in memb_join_process. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-23 17:46:05 +02:00
Jan Friesse	dc590159f5	totemsrp: Log proc/fail lists in memb_join_process This information is useful, and at trace log level it should not be too irritating. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-23 17:45:51 +02:00
Jan Friesse	9b3782e48e	totemsrp: Fix srp_addr_compare There is a regression caused by "totem: Use nodeid ONLY in srp_addr" patch in srp_addr_compare function. This function should be usable with qsort, so it should return values less than, equal to or greater than zero. It was however returning only zero or negation of a zero. Final results were unable to reach consensus in following test case: - 3 node cluster - start nodes 1, 2, 3 - shutdown node 3 - start node 3 - shutdown node 2 - start node 2 - shutdown node 1 After these steps, node 2 and 3 were unable to reach consensus. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-23 17:45:29 +02:00
Ferenc Wágner	baece74c39	Fix typo: sucesfully -> successfully Signed-off-by: Ferenc Wágner <wferi@debian.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-04-20 12:04:49 +02:00
Jan Friesse	ccb2290f84	totemsrp: Check join and leave msg length If number of proc_list, failed_list or active members is too high it may be impossible to put them into message, which is allocated on the stack, which results in stack corruption. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-12 15:25:38 +02:00
Jan Friesse	c139255669	totemsrp: Implement sanity checks of received msgs Sanity checkers are used to prevent crashing because of accessing unallocated memory. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-12 15:25:33 +02:00
Jan Friesse	69857efb5b	totem: Display IP of sender To make finding victim of incompatible messages easier, IP of sender is logged. Propagating IP in layers makes patch slightly larger. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-03-16 13:58:15 +01:00
Jan Friesse	0c509a25a7	totemsrp: Add magic and version into header Magic number (0xC070) together with version in every packet is used for detecting that other node is really Corosync 3.x. Endian_detector field is removed and magic number is now used instead. If received packet magic number differs, guessing is used to show more about the source (Corosync 2.3+, 2.2 are quite reliable, Knet and unencrypted Corosync 2.1/2.0/1.x/OpenAIS are semi-reliable and encrypted Corosync 2.1/2.0/1.x/OpenAIS are quite unreliable). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-03-16 13:57:55 +01:00
Christine Caulfield	066525efd3	knet: Fix display of links with unconfigured link0 Because totemknet always configures link0 as loopback even if it's not known to corosync, we need to filter it out when returning the link status, as things get misaligned in cfg. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-16 13:11:13 +01:00
Jan Friesse	b3f3a1df26	main: Set errno before calling of strtol Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-03-02 17:29:22 +01:00
Christine Caulfield	2c20590d16	knet: Always use link0 for loopback Even if it's not used for anything else. Also, make cfgtool show the correct link ID when links are not contiguous Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:23:20 +01:00
Christine Caulfield	111bfbc11d	totem: Fix debug warnings printed by knet Fix crash introduced a couple of commits ago in iface_get Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:22:22 +01:00
Christine Caulfield	f5871c6b4c	config: Allow use of ring0_addr Allow ring0_addr to be used in place of 'name' for backwards compatibility Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:21:37 +01:00
Christine Caulfield	7a639d1b62	config: Update message when local host isn't found Make the message more representative of what's going on. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:20:00 +01:00
Christine Caulfield	386d710ed1	cfg: Fix cfg_get_node_addrs so that DLM works Also update copyright dates Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:19:45 +01:00
Christine Caulfield	f5b690bd96	totem: Return interface count correctly Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:19:12 +01:00
Christine Caulfield	fc8580bdbf	totem: Use nodeid ONLY in srp_addr This shrinks the srp_addr (and consequently every packet sent by corosync) so that instead of containing loads of IP addresses to identify a node, it just sends the nodeid. This then allows us to make ring0 optional and replaceable when running knet. It also means that we need some other way of identifying the local node in corosync.conf, so the nodelist.node.name entry is now mandatory and is mapped to the local host using the same algorithm as used in cman. This code needs LOTS of testing as it touches a huge amount of totemsrp and totemconfig. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:18:51 +01:00
Rytis Karpuška	105f3ae98c	totempg: Fix corrupted messages Commit `899cb29983` changed copy_len to iovec[i].iov_len, assuming copy_len is always the same as iovec[i].iov_len under those circumstances, but it missed the possibility of small message being partly put at the end of packet, which cuts this message in two parts and therefore making copy_len not equal to iovec[i].iov_len. This is revert of `899cb29983` Signed-off-by: Rytis Karpuška <rytisk@neurotechnology.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-02-09 17:38:05 +01:00
Rytis Karpuška	899cb29983	totempg: use iovec[i].iov_len instead of copy_len To be more explicit that we are copying whole message. Related to `0ebae6b47d`. Signed-off-by: Rytis Karpuška <rytisk@neurotechnology.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-02-08 09:30:07 +01:00
Rytis Karpuška	0ebae6b47d	totempg: Fix fragmentation segfault The problem was that two or more messages were concatenated together during fragmentation in mcast_msg() function. In specific case, message of just short of 1MB was provided for mcast_msg() and it happened so, that the remainder (212 bytes to be exact) left some free space in packet, therefore branch if ((copy_len + fragment_size) < (max_packet_size - sizeof (unsigned short))) { ... was selected and this was the last message in provided iovec. Then, on the second call, came another big message (about 300KB) and during fragmentation mcast.fragmented was set to 1. On the other end, while receiving messages, due to missing mcast.fragmentation==0 those two messages were concatenated and therefore assembly->data array overflowed overwriting linked list pointers and offset (which happened to be set to 0 and that 300KB message was being copied from the beginning again). After whole 300KB message has been sent, mcast.fragmentation==0 arrived and totempg_deliver_fn() tried to move assembly structure to assembly_list_free list, but as linked list pointers had been overwritten, segfault occurred. Signed-off-by: Rytis Karpuška <rytisk@neurotechnology.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-02-08 09:29:22 +01:00
Fabio M. Di Nitto	1411608a81	[build] fix build with non-standard knet location Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-02-05 15:57:12 +01:00
Jan Friesse	11fa527ed4	logging: Close before and open blackbox after fork Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-01-30 13:21:52 +01:00
Jan Friesse	79dba9c51f	logging: Make blackbox configurable Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-01-30 13:21:48 +01:00
Jan Friesse	1fba1b83aa	build: Replace -lknet with autoconf generated vars Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-01-25 16:08:09 +01:00
Jan Friesse	589ed92505	build: Remove rdma/ibverbs Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-01-25 16:08:07 +01:00
Christine Caulfield	31ddba64a2	config: Don't fudge port numbers When I was adding knet I wanted the port numbers to default to the base port number + the linknumber. However I seem to have messed this up such that any port number specified in the config file has the link number added to it. Which is almost certainly not what people would expect. This patch sets it right. If a port number is not specified then 5405+linknumber is used. If a port number IS specified then that actual number is used. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-01-18 16:31:24 +01:00
Christine Caulfield	22ae4cacda	knet: Allow ping_timers to be auto-configured knet ping_timers are auto-configured according to token value. This patch also fixes some knet config bugs that resulted in defaults not being applied when values were removed from corosync.conf. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-01-15 15:08:19 +01:00
yuskiida	e7734fab70	build: Add the headers necessary for RPM build Signed-off-by: yuskiida <yusk.iida@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-01-11 14:47:46 +01:00
Christine Caulfield	236032f7b5	config: if local node addr is wrong, fail with a sensible message If no valid local address is found in corosync.conf then corosync exits with: "parse error in config: No multicast port specified" This is because of the config change for knet that always populates the interfaces. The old error of "no interfaces found" was only slightly better anyway IMHO. This patch adds an explicit check that local_node_pos has been set in icmap and uses that to determine if a valid local address has been found. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-01-09 17:50:12 +01:00
Jan Friesse	96cb977880	totemknet: Drop truncated packets on receive This is backport of part of "totemudpu: Scale receive buffer" patch. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-01-09 17:46:31 +01:00
Jan Friesse	0f1813adff	totemudp: Make use of UDP_RECEIVE_FRAME_SIZE_MAX Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-01-09 17:46:28 +01:00
Jan Friesse	32535b842c	totemudpu: Export and rename UDPU_FRAME_SIZE_MAX Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-01-09 17:46:25 +01:00
Jan Friesse	3982f795d5	totemconfig: Fix UDP autogeneration of mcast addr Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-01-09 17:46:21 +01:00
Jan Friesse	155c0d4052	totemudpu: Scale receive buffer Receive buffer should be based on PROCESSOR_COUNT_MAX and not static buffer. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-01-09 17:46:04 +01:00
Christine Caulfield	98bb0c78c8	config: Allow selection of crypto_model KNET has options for nss or openssl crypto libraries, make this available to corosync. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-01-05 15:25:17 +01:00
Christine Caulfield	2a6a571c06	config: Allow links to have different ip_versions knet allows links to have different IP versions - provided they all match per link. So don't force them all to be the same. I've added a check here to make sure that all nodes on the same link are using the same IP version. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-12-22 17:15:19 +01:00
Bin Liu	b1d3eca448	wd: fix snprintf warnings When running ./configure --enable-watchdog, gcc 7.2.1 will report warnings for snprintf. This patch fixes the warnings. Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-12-01 17:23:54 +01:00
Christine Caulfield	1ca72a1154	totemsrp: Revert totemsrp_get_ifaces() changes In my enthusiasm for removing code while integrating knet I also deleted the correct code for returning IP address for a node, so that only the IP address of the local node was ever returned. This commit restores the previous code. Also, because we always return INTERFACE_MAX interfaces now (they don't have to be contiguous) set ss_family to zero if that interface is not in use so that downstream apps know and don't display a lot of 0.0.0.0 Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-11-30 16:59:05 +01:00
Bin Liu	af21baf0ff	totemconfig: remove duplicate aes256 test Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-11-29 18:18:52 +01:00
Jan Friesse	154895dfbe	sync: Call sync_init of all services at once This patch solves situation which can happen very rarely: - Node B is running - Node A is started and tries to create singleton membership. It also initializes service S which tries to send message during initialization - Just before node A finished move to operational state, it gets Node B multicast message so moves to gather state - Node A and B creates membership and moves to operational state and sync is started - Node A and B receives message sent by node A during initialization of service S - Node A exits before sync of service is finished In this situation, node B may never execute sync_init for service S. So node B service S is not aware of existence of node A but it received message from it. Similar situation can theoretically also happen during merge. Solution is to change flow of sync, so now it looks like: - Build service_list - Call sync_init for all local services - Send service_list - Receive service_list from all members and send barrier - For all services: - Receive barrier - Call sync_activate if this is not first service - Call sync_process for next service or finish sync if the previous service was the last one - Send barrier Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2017-11-16 15:22:19 +01:00
Jan Friesse	499eaac80f	sync: Remove unneeded determine sync code Code was used for compatibility with old sync v1 (in needle this was deleted and previous version 2 became v1), and it's no longer needed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2017-11-16 15:22:14 +01:00
Christine Caulfield	1df7eca5ad	stats: Add some missing knet stats Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-11-16 08:35:50 +01:00
Ferenc Wágner	09b0123d58	Send corosync startup notification to systemd This enables starting the daemon directly in the service file, because dependent units won't be started until initialization is complete. Signed-off-by: Ferenc Wágner <wferi@debian.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-11-09 09:49:18 +01:00
Jan Friesse	f05d1c9293	coroparse: Do not convert empty uid, gid to 0 When uid (or gid) value was empty string it was incorrectly converted to 0. Solution is to check input string emptiness. Thanks Bin Liu <bliu@suse.com> for reporting the bug. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Bin Liu <bliu@suse.com>	2017-11-06 09:37:54 +01:00
Christine Caulfield	45fe19ed86	stats: Don't display errors when reading knet stat Only add the knet handle stat keys if we are actually running knet. This prevents errors occurring when iterating through all of the stats keys Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-11-03 13:40:41 +01:00
Christine Caulfield	d9dfd41e4e	stats: Add cmap key to clear the various stats. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-31 17:39:14 +01:00
Bin Liu	cf339c20c3	totemconfig: generate mcast icmap items for UDP Generating mcastaddr and mcastport in icmap make sense only for UDP transport. Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-30 14:14:48 +01:00
Bin Liu	99567f0e65	totemconfig: add nodeid check for knet Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-30 13:02:03 +01:00
Christine Caulfield	396bca4739	config: Fix memory leak totem_volatile_config_set_string_value was not properly freeing memory. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-23 17:31:14 +02:00
Christine Caulfield	16f616b65d	knet: Add support for knet compression Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-23 17:30:25 +02:00
Jan Friesse	165d748c04	cmap: Remove noop highest config version check Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2017-10-11 17:11:33 +02:00
Jonathan Davies	2d0e8114ba	cmap: don't shutdown highest config_version node Scenario: 1. node A starts corosync with config_version = 2, nodelist = {A, B} 2. node B starts corosync with config_version = 1, nodelist = {A, B} corosync.conf(5) says the config_version option is "used to prevent joining old nodes with not up-to-date configuration." So expected outcome is: * corosync on node A remains alive * corosync on node B exits Actual outcome is: * corosync on node A exits * corosync on node B exits Explanation of actual behaviour: * Host A will have cmap_my_config_version = 2 but cmap_highest_config_version_received = 1, so will shutdown in cmap_sync_activate because these are not equal. * Host B will have cmap_my_config_version = 1 but cmap_highest_config_version_received = 2, so will shutdown in cmap_sync_activate because these are not equal. Instead, node A should consider its own config_version in the calculation of the highest config_version, i.e. cmap_highest_config_version_received = 2, and so not shutdown in cmap_sync_activate. Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-11 17:07:35 +02:00
Kazunori INOUE	576a493d1e	totemudp: Remove memb_join discarding This is already implemented in totemsrp in much cleaner way (added by commit `ab8942f626`). Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-02 11:33:58 +02:00
Edwin Torok	15383b3eb3	votequorum: make atb consistent on nodelist reload When the cluster changes from even sized to odd sized corosync disables auto-tie-breaker if wait_for_all is not enabled. However when changing from odd sized to even sized it doesn't reenable it, causing auto_tie_breaker to be inconsistent across the cluster: the newly added node and any nodes that restart corosync will have it, but all the previously running nodes won't. Signed-off-by: Edwin Torok <edvin.torok@citrix.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-26 18:05:17 +02:00
Fabio M. Di Nitto	76591baa4a	totem: Remove unnecessary NSS headers Also fix corosync.spec.in to depend on libknet. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-22 10:27:01 +02:00
Christine Caulfield	294a629fb5	config: Allow dynamic link configuration Now we are using knet, it's possible to dynamically add, remove and reconfigure links on the fly. Also print 'n' for non-existent knet links. This will show up only on loopback links >0. But it looks better than 'status =' Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-21 17:16:21 +02:00
Masse Nicolas	5b38aa721a	totemudp: Retry if bind fails If bind call fails it's retried for BIND_MAX_RETRIES. If it's still unsuccessful, corosync exits instead of working incorrectly. Slightly modified by reviewer. Signed-off-by: Masse Nicolas <nicolas.masse@stormshield.eu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-19 12:44:26 +02:00
Ferenc Wágner	b7b318b86f	wd: default to not using a watchdog Signed-off-by: Ferenc Wágner <wferi@debian.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-14 17:40:48 +02:00
Ferenc Wágner	151ed9dfe5	wd: remove extra capitalization typo Signed-off-by: Ferenc Wágner <wferi@debian.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-12 14:23:04 +02:00
Jonathan Davies	3296a0d41a	totemknet: fix debug message typo Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-11 11:51:16 +02:00
Ferenc Wágner	0f33464531	wd: fix typo Signed-off-by: Ferenc Wágner <wferi@debian.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-11 11:40:12 +02:00
Christine Caulfield	ed235edfe3	stats: add knet 'handle' stats knet handle stats show compression and crypto statistics. With these you can see the effectiveness of compression and the overheads of both crypto and compression. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-08-23 14:18:59 +02:00
Christine Caulfield	01495f650c	main: use syslog & printf directly for early log messages libqb seems funny about logging things before it's fully configured. This corosync commit didn't help either: `8b6bd86a55` So to make sure that messages about the config file not being opened get delivered to the user/syslog we send them directly. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-08-22 09:51:09 +01:00
Christine Caulfield	9898fc8760	totempg: Allow space for incoming overflow totempg needs to store the current message + any overflow for the next message which can be up to (nearly) the MTU size. in knet that's large, but for UDP it's just 1500. The reason we've never seen it before is because the actual max message size is 1024 less than 1MB and after all the headers are stripped out the overflow is usually 1024 bytes or less. The 1024*1024 size of the assembly buffer is large enough to hold a max message (1047552) + 1024 bytes of a new UDP message. So we never saw any problems. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-08-14 14:04:31 +01:00
Chrissie Caulfield	f4a7e54d45	totemknet: Use knet's LOOPBACK transport (#236) knet now has a built-in LOOPBACK transport so use that rather than special-casing it for ourself. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-08-04 12:59:16 +01:00
Christine Caulfield	9da89f32c2	CFG: Remove ring-reenable code RRP doesn't exist any more so all the ring re-enable code is redundant. I've removed it from the library and all the code that does anything, but I've left the hole in the IPC just in case old libraries are hanging around. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-08-03 14:32:02 +02:00
Jan Friesse	9a50628fd1	main: Add support for libcgroup When corosync is started in environment where it ends in cgroup without properly set rt_runtime_us it's impossible to get RT priority. Already implemented workaround is to use higher non-RT priority. This patch implements another solution. It moves corosync into root cpu cgroup. Root cpu cgroup hopefully has enough RT budget. Another solution was mentioned on ML https://lists.freedesktop.org/archives/systemd-devel/2017-July/039353.html but this means to generate some "random" values. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> (cherry picked from commit `c56086c701`)	2017-08-01 14:32:53 +02:00
Christine Caulfield	55c3dcb76d	stats: Add map with on-demand statistics Icmap is factored out so it's possible to add other maps for cmap. API call to switch maps from application end is added. Corosync-cmapctl is enhanced with -m option. Stats contains all statistics previously found in runtime.connections, runtime.services and runtime.totem prefixes together with new knet related. All stats are read only. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-07-27 15:53:04 +02:00
Christine Caulfield	876910d8ff	ipc: Check for the libraries sending invalid message IDs If the library sent an invalid (ie too high) message ID to corosync, then it could cause the daemon to crash. Now we check the message ID before indexing the function array Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-07-14 14:06:49 +01:00
Jan Friesse	9627d7350b	main: Add option to set priority Option -P takes numeric value with same meaning as nice or values min / max, meaning maximal / minimal priority (so minimal / maximal nice value). Scheduler / priority setting is moved in code so it is now executed after logsys is configured so errors are logged. Setting maximal priority is also used as fallback when realtime scheduling is requested and sched_setscheduler fails. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> (cherry picked from commit `a008448efb`)	2017-07-10 16:40:39 +02:00
Jan Friesse	2c17832fa6	totemknet: Prevent dead-loop in log_flush_messages Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2017-07-03 15:41:08 +02:00
Jan Friesse	abc1fa5626	totemknet: Flush knet log messages When initialization fails knet logs messages into pipe. Previously they were never processed. Solution is to add log_flush_messages which takes care to call log_deliver_fn. Call of log_flush_messages is also added to totemknet_finalize because this removes log pipe fd from qb_loop so similar problem can happen. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2017-07-03 13:19:11 +02:00
Jan Friesse	cf18736d52	totemconfig: Make crypto work again Knet needs longer key and supports various key lengths. Split TOTEM_PRIVATE_KEY_LEN into TOTEM_PRIVATE_KEY_LEN_MIN and TOTEM_PRIVATE_KEY_LEN_MAX (both using KNET_*_KEY_LEN). Fix incorrect "Could only read..." message. Make sure key is properly initialized/zeroed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2017-07-03 13:19:02 +02:00
Christine Caulfield	b7df8fa46f	knet: Compile with latest knet API extra parameter added to knet_link_get_status() Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-06-29 10:02:21 +01:00
Jan Friesse	564b4bf7d4	totem: Propagate totem initialization failure Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2017-06-15 11:07:33 +02:00
Christine Caulfield	fa37b6073c	totemknet: Use new knet_link_set_config() API TC_PRIO_INTERACTIVE is now a link option in knet, so we have to provide it at link config time. This needs the latest knet git to compile as this is an updated API. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-06-09 13:28:46 +01:00
Michael Jones	afd97d7884	coroapi: Use size_t for private_data_size Unsigned int and size_t represent two different concepts. Same problem was present in ipc_glue. Signed-off-by: Michael Jones <jonesmz@jonesmz.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-05-29 17:23:37 +02:00
Christine Caulfield	5b1df51aa6	votequorum: Report errors from votequorum_exec_send_reconfigure If votequorum_exec_send_reconfigure() returns an error (ie the packet could not be sent) then we should either return it to the sender (for a library call) or, for an internal call, log it. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-05-26 16:18:33 +02:00
Jan Friesse	95b91e4ae7	main: Display reason why cluster cannot be formed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2017-05-18 17:15:55 +02:00
Christine Caulfield	571f499e0a	knet: Allow space for encapsulated messages Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-05-09 09:05:12 +01:00
Andrew Price	86012ebb45	Main: Call mlockall after fork Man page of mlockall is clear: Memory locks are not inherited by a child created via fork(2) and are automatically removed (unlocked) during an execve(2) or when the process terminates. So calling mlockall before corosync_tty_detach is noop when corosync is executed as a daemon (corosync -f was not affected). This regression is caused by `ed7d054e55` (setprio for logsys/qblog was correct, mlockall was not). Solution is to move corosync_mlockall call on correct place. Signed-off-by: Andrew Price <anprice@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-25 14:50:04 +02:00
Bin Liu	c83e6c7ed9	coroparse: Use readdir instead of readdir_r readdir_r is deprecated in glibc 2.24 in favor of readdir (which became thread safe). Also, because corosync never calls read_uidgid_files_into_icmap in multiple threads, no problem should appear even with a libc where readdir is not thread-safe. Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-20 08:53:54 +02:00
Bin Liu	725f9039e9	totemknet: Handle logpipe creation failure Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-20 08:53:49 +02:00
Bin Liu	be3e166249	wd: Report error when close of wd fails Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-20 08:53:45 +02:00
Christine Caulfield	44afff227d	totemknet: Go back to recvmsg() from recvmmsg() The kernel team have recommended us not to use recvmmsg and as it confers no particular speed advantage (especially given the extra memory consumption) I'm going back to single message recvmsg() again. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-04-11 13:44:08 +01:00
Bin Liu	0462b5e609	totemconfig: Prefer nodelist over bindnetaddr In a two-node cluster, I have one node configured with Open vSwitch: 5: br-fixed: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default inet 192.168.124.88/24 scope global br-fixed inet 192.168.124.87/24 scope global secondary br-fixed inet 192.168.124.83/24 brd 192.168.124.255 scope global secondary tentative br-fixed inet 192.168.124.89/24 scope global secondary br-fixed while I use 192.168.124.83 in node list of corosync.conf with udpu, and the bind_addr is 192.168.124.0. After upgrading corosync on this node, it uses 192.168.124.88 instead of 192.168.124.83. As we can see: corosync-cfgtool -s Printing ring status. Local node ID 1084783704 corosync-quorumtool -s Membership information: Nodeid Votes Name 1084783697 1 d52-54-77-77-01-02 1084783699 1 d52-54-77-77-01-01 (local) while the other node can only see itself: corosync-cfgtool -s Printing ring status. Local node ID 1084783697 RING ID 0 id = 192.168.124.81 status = ring 0 active with no faults corosync-quorumtool -s Membership information: Nodeid Votes Name 1084783697 1 d52-54-77-77-01-02.virtual.cloud.suse.de (local) This patch checks whether both nodelist and bindnetaddr are present and, if so, displays a warning and uses the nodelist information. Signed-off-by: Bin Liu <bliu@suse.com> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-11 11:19:31 +02:00
Christine Caulfield	6076e840f5	knet: Close libknet down cleanly at shutdown By tidily shutting down knet in totemknet_finalize we make sure all the links are cleanly taken down and, more importantly for us, the corosync LEAVE message gets sent so we don't get fenced on a clean exit. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-04-11 09:03:26 +01:00
Bin Liu	d2a5e1442e	logconfig: Do not overwrite logger_subsys priority logfile_priority and syslog_priority could be modified by logging.logger_subsys.{logfile_priority|syslog_priority}, which could lead to the following output (all at notice level): corosync[21419]: [QUORUM] Using quorum provider corosync_votequorum corosync[21419]: [QUORUM] Members[1]: 1084777643 corosync[21419]: [QUORUM] This node is within the primary component and will provide service. corosync[21419]: [QUORUM] Members[3]: 1084777563 1084777584 1084777643 even though syslog_priority is warning. This patch avoids the overwrite. Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-03-10 09:09:42 +01:00
Christine Caulfield	16770a4153	totem: Fix buffer sizes knet needs buffers to be KNET_MAX_PACKET_SIZE or messages will get lost or corrupted. UDPU packets shouldn't be that big so I introduced UDP_FRAME_SIZE_MAX for that transport. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-03-02 14:57:39 +00:00
Christine Caulfield	30771a39a8	main: Don't ask libqb to handle segv, it doesn't work segv should be handled by corosync, libqb is not the place to be handling emergency signals. This currently requires the head of libqb git tree to generate a blackbox & coredump in the event of a segfault, but it's better than the write() spin that currently happens. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-02-27 15:14:41 +00:00
Jan Friesse	8b6bd86a55	Logsys: Change logsys syslog_priority priority LibQB adds default "*" syslog filter so we have to set syslog_priority as low as possible so filters applied later in _logsys_config_apply_per_file takes effect. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2017-02-24 16:23:50 +01:00
Fabio M. Di Nitto	36ef2af5a7	knet: improve logging messages by adding knet subsystem Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2017-02-24 09:41:35 +01:00
Fabio M. Di Nitto	19232f6052	knet: Change nodeids to knet_node_id_t for new knet compatibility after some feedback on github, people prefer to have the option to support up to 64K node_id's. libknet added knet_node_id_t to mask the size and type, currently set to uint16_t. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2017-02-14 06:08:45 +01:00
Christine Caulfield	c0f1d576d6	knet: Fix MTU sizes & allow transport config in corosync.conf Corosync layers don't need to know the knet MTU size - this way corosync fragments buffers only when they get larger than the KNET buffer size (64K) and knet fragments below that based on the actual MTU and transport considerations. It is also now possible to configure knet to use UDP or SCTP transports in corosync.conf. This is currently done per-link so if you have more than 1 link you need several interface{} stanzas inside totem{} to make it use other than the default of UDP. if it's useful I might add the option of a global default. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-02-13 16:54:30 +00:00
Fabio M. Di Nitto	970549ddfc	knet: PMTUd data_mtu already accounts for IP and knet header overheads provide some more space for data and small (+1% perf boost) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2017-02-11 06:41:38 +01:00
Fabio M. Di Nitto	18fef0ae7f	knet: switch from write to sendto() this provides another 9.6% performance boost on 2 node clusters Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2017-02-11 06:24:12 +01:00
Christine Caulfield	d9df98ceba	knet: Change nodeids to 8 bit for new knet compatibility I've also put an assert in totemknet_member_add() to check for invalid nodeids. Later on we need to fix the rest of the corosync code to only use 8bit nodeids (or force people to use UDPU if they want large nodeids). Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-02-03 09:38:32 +00:00
Christine Caulfield	2d478505e5	knet: Fix member_remove to shut down existing links first Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-01-16 13:16:15 +00:00
Christine Caulfield	029b8ebad6	knet: Reduce default pong count to 2 for faster startup The default PONG_COUNT of 5 made corosync slow to connect to other nodes. This helps.	2017-01-03 13:30:26 +00:00
Christine Caulfield	950cca886e	totemknet: Make it compile with kronosnet git master Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-12-22 10:25:11 +00:00
Takeshi MIZUTA	4939c75629	Remove redundant header file inclusion Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-12-05 09:59:08 +01:00
Bin Liu	819d66ca1c	Totempg: remove duplicate memcpy in mcast_msg func In function mcast_msg of totempg.c, line 923, there is a memcpy call in "else" branch, and also another memcpy out of the "else" branch, while the two calls have the same parameters. It is possible to remove the memcpy in "else" branch. Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-12-05 09:40:55 +01:00
Takeshi MIZUTA	034553c080	man: Modify man-page according to command usage Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-12-01 16:32:42 +01:00
Takeshi MIZUTA	9c5b39d438	totempg: totempg_groups_join return valid error totempg_groups_join() is called by sync_init(). sync_init() judges that totempg_groups_join() failed if its return code is -1. Therefore, totempg_groups_join() should return -1 when it fails. Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-11-23 09:22:21 +01:00
Christine Caulfield	401f483cce	knet: Support reload of link parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-11-17 11:41:54 +00:00
Takeshi MIZUTA	f5dcc4a5f2	list: Unify the list processing with qb_list func Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-11-15 12:19:13 +01:00
Christine Caulfield	7cec6a131d	knet: Allow configuration of more params knet_pmtud_interval & knet_pong_count Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-11-15 09:32:09 +00:00
Chrissie Caulfield	65219a6300	knet: Don't lose log messages when knet gets busy (#165) Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-11-14 15:01:34 +00:00
Jan Friesse	1f90c31ba7	list: Replace for_each by safe version where need Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-27 14:56:52 +02:00
Michael Jones	b4c06e52f3	list: Replace uses of list.h with qblist.h Signed-off-by: Michael Jones <jonesmz@jonesmz.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-10-27 14:56:52 +02:00
Christine Caulfield	86de6ce1e6	totem: add totemknet.[ch] it seems git is better at deleting files than adding them Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-13 08:46:34 +01:00
Michael Jones	a24d26c46a	cfg: Prevents use of uninitialized buffer Signed-off-by: Michael Jones <jonesmz@jonesmz.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-10-12 16:19:05 +02:00
Christine Caulfield	268cde6ee4	totem: Add Kronosnet transport. This is a big update that removes RRP & MRP from the codebase and makes knet the default transport for corosync. UDP & UDPU are still (currently) supported but are deprecated. Also crypto and multiple interfaces are only supported over knet. To compile this codebase you will need to install libknet from https://github.com/fabbione/kronosnet The corosync.conf(5) man page has been updated with info on the new options. Older config files should still work but many options have changed because of the knet implementation so configs should be checked carefully. In particular any cluster using RRP over UDP or UDPU will not start as RRP is no longer present. If you need multiple interface support then you should be using the knet transport. Knet brings many benefits to the corosync codebase, it provides support for more interfaces than RRP (up to 8), will be more reliable in the event of network outages and allows dynamic reconfiguration of interfaces. It also fixes the ifup/ifdown and 127.0.0.1 binding problems that have plagued corosync/openais from day 1 Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-11 10:09:42 +01:00
HideoYamauchi	f1ffe31ce5	coroparse: Set a poll_period value for wd monitor Signed-off-by: HideoYamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-10-06 15:48:38 +02:00
Christine Caulfield	c4683be9b0	votequorum: simplify reconfigure message handling As we now have update_node_expected_votes(), we can use that when receiving a new EXPECTED_VOTES value from another node rather than having our own loop. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-09-13 15:55:58 +01:00
Christine Caulfield	bd2e6b5d9d	votequorum: Don't update expected_votes display if value is too high If expected_votes was set via the library but the calculation decides it's too high, then an error is correctly returned but the value is still set in the nodes' expected_votes field and turns up in the corosync-quorumtool display. This patch separates out the quorum calculation from the updating of expected_votes per node to prevent this from happening. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-09-13 14:28:56 +01:00
Ferenc Wágner	cf10a754e9	Fix various typos occured -> occurred parantheses -> parentheses configuraton -> configuration aquire -> acquire retrive -> retrieve prefered -> preferred Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-09-12 09:50:11 +02:00
Jan Friesse	f837f95dfe	Config: Flag config uidgid entries Uidgid entries parsed from configuration files now have a prefix (uidgid.config.) so they are distinguishable from dynamically added entries. Entries added from the config file are pruned on reload if they no longer exist in the config file (dynamic ones stay unaffected). Also the whole uidgid.config. prefix is made read only. This makes PCMK work again after configuration reload is called. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-08-04 16:13:48 +02:00
HideoYamauchi	71c9035c27	Low: totemsrp: Addition of the log. Signed-off-by: HideoYamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-08-01 10:11:45 +02:00
Jan Friesse	1925074909	Fix few bugs found by coverity Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2016-06-28 13:58:43 +02:00
Christine Caulfield	0665aca9e1	quorum: revert patch that adds qdevice (node 0) to quorum callback Revert patch 9f54f0a1fad7dad42c55562a50dfb9d773e6a660 as it causes more trouble than it solves. Code that uses the quorum nodelist to get a list of actual nodes in the cluster for communication breaks with this patch, as does the display from corosync-quorumtool. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:43 +02:00
Christine Caulfield	c9c6d9e30f	quorum: Return qdevice nodeid in the quorum callbacks (if active). Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:41 +02:00
Christine Caulfield	e41b256c67	votequorum: Allow wait_for_all with qdevice Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:39 +02:00
Christine Caulfield	98548e1880	qnetd: lms: Fix search for node/ring_id check We were looking for us in other node lists, rather than others in our nodelist. Also, remove debug print in votequorum.c Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:39 +02:00
Christine Caulfield	3a5d51fca7	votequorum: Fix up quorum/nodelist callbacks This patch tidies the two state change callbacks and explains them in the man page: The difference between votequorum_nodelist_notification_t and votequorum_quorum_notification_t is subtle but important. The 'nodelist' callback is sent at the start of a cluster state transition and contains the new ring_id and only the list of nodes that are included in the sync state - ie only active nodes. No quorum information is included in this callback because it is not available at that time. The 'quorum' callback is sent after the cluster state transition has completed and does contain quorum information. In addition, the nodelist contains a list of all nodes known to votequorum (whether up or down) and their state as well as information about the quorum device attached (if any). quorum callbacks will not be sent for qdevice up and down events unless they affect quorum. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:39 +02:00
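
Note on commit 2d0e8114ba (cmap: don't shutdown highest config_version node): the scenario in that message can be sketched in a few lines of C. This is only an illustration of the reasoning, not the corosync source. The helper names highest_config_version() and must_shutdown() are hypothetical, and the use of uint64_t for config_version is an assumption. The one point taken from the commit message is that the local node's own config_version seeds the maximum, so the node holding the highest version no longer shuts itself down.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the corosync tree): return the highest
 * config_version known after sync. The maximum starts from the local
 * node's own version, which is the change the commit message describes.
 */
static uint64_t highest_config_version(uint64_t my_version,
    const uint64_t *received, size_t n_received)
{
	uint64_t highest = my_version;

	for (size_t i = 0; i < n_received; i++) {
		if (received[i] > highest) {
			highest = received[i];
		}
	}
	return highest;
}

/*
 * Hypothetical helper: a node should exit only when some other node
 * reported a newer config_version than its own.
 */
static bool must_shutdown(uint64_t my_version,
    const uint64_t *received, size_t n_received)
{
	return my_version !=
	    highest_config_version(my_version, received, n_received);
}
```

With node A at config_version 2 receiving {1} from node B, must_shutdown() is false for A; with node B at version 1 receiving {2}, it is true for B. That matches the expected outcome listed in the commit message (A remains alive, B exits).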

2100 Commits