mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-08-15 15:23:48 +00:00

Author	SHA1	Message	Date
Christine Caulfield	14a5e6f361	totemsrp: Fix orf_token stats Previously, orf_token_tx was only incremented on initial send, this is obviously wrong and resulted in the TX count being significantly lower than any RX count. Now we increment it every time the ORF token is sent or resent. As a quick test, on a single node system the RX and TX stats will now match. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2024-10-31 10:58:02 +01:00
Jan Friesse	749f1cb9a5	totem: Use uint64_t type and QB_TIME_NS_IN_MSEC Function message_handler_orf_token contains extra debug info enabled by defining GIVEINFO. Insted of using long long unsigned int use better suited uint64_t and make use of QB_TIME_NS_IN_MSEC constant instead of hardcoded number. Also compile tv_old conditionally so it is not used by accident. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2024-10-23 16:03:14 +02:00
Jan Friesse	55a6f657f4	totem: Use proper timestamp type for token warning Timestamp diff is very unlikely to be larger than 32-bit integer but it is still worth to use 64-bit. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2024-10-23 16:03:01 +02:00
Jan Friesse	3785829935	stats: Store token rx and tx timestamps as 64-bit Token rx and tx timestamps were computed and stored as 32-bit unsigned integer but substracted in other parts of code from 64-bit integer. Result was, that node with uptime larger than 49.71 days (2^32/(1000*60*60*24)) reported wrong numbers for stats.srp.time_since_token_last_received and in log message during long pause (function timer_function_orf_token_warning). Solution is to store rx and tx data as 64-bit integer. Fixes #761 Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2024-10-23 16:02:50 +02:00
Jan Friesse	c01fd757a0	totem: Fix reference links Link Corosync project archived copy of Yair Amir's PhD thesis and paper about totem protocol. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2024-03-12 17:22:42 +01:00
Christine Caulfield	33fa5dcb85	config: Fail to start if ping timers are invalid This required adding a lot of return values to two previously 'void' functions. I did two rather than just the one that was needed because it seemed to make sense to do them both together. Although these functions now return errors, they are probably still ignored higher up. this really needs a comprehensive audit. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2023-10-05 15:53:55 +02:00
Jan Friesse	e7a82370a7	totemsrp: Switch totempg buffers at the right time Commit `92e0f9c7bb` added switching of totempg buffers in sync phase. But because buffers got switch too early there was a problem when delivering recovered messages (messages got corrupted and/or lost). Solution is to switch buffers after recovered messages got delivered. I think it is worth to describe complete history with reproducers so it doesn't get lost. It all started with `402638929e` (more info about original problem is described in https://bugzilla.redhat.com/show_bug.cgi?id=820821). This patch solves problem which is way to be reproduced with following reproducer: - 2 nodes - Both nodes running corosync and testcpg - Pause node 1 (SIGSTOP of corosync) - On node 1, send some messages by testcpg (it's not answering but this doesn't matter). Simply hit ENTER key few times is enough) - Wait till node 2 detects that node 1 left - Unpause node 1 (SIGCONT of corosync) and on node 1 newly mcasted cpg messages got sent before sync barrier, so node 2 logs "Unknown node -> we will not deliver message". Solution was to add switch of totemsrp new messages buffer. This patch was not enough so new one (`92e0f9c7bb`) was created. Reproducer of problem was similar, just cpgverify was used instead of testcpg. Occasionally when node 1 was unpaused it hang in sync phase because there was a partial message in totempg buffers. New sync message had different frag cont so it was thrown away and never delivered. After many years problem was found which is solved by this patch (original issue describe in https://github.com/corosync/corosync/issues/660). 
Reproducer is more complex:
- 2 nodes
- Node 1 is rate-limited (used script on the hypervisor side):
```
iface=tapXXXX
# ~0.1MB/s in bit/s
rate=838856
# 1mb/s
burst=1048576
tc qdisc add dev $iface root handle 1: htb default 1
tc class add dev $iface parent 1: classid 1:1 htb rate ${rate}bps \
  burst ${burst}b
tc qdisc add dev $iface handle ffff: ingress
tc filter add dev $iface parent ffff: prio 50 basic police rate \
  ${rate}bps burst ${burst}b mtu 64kb "drop"
```
- Node 2 is running corosync and cpgverify
- Node 1 keeps restarting of corosync and running cpgverify in cycle
  - Console 1: while true; do corosync; sleep 20; \
      kill $(pidof corosync); sleep 20; done
  - Console 2: while true; do ./cpgverify;done
And from time to time (reproduced usually in less than 5 minutes) cpgverify reports corrupted message. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2021-11-03 10:19:44 +01:00
Jan Friesse	cdf72925db	totem: Add cancel_hold_on_retransmit config option Previously, existence of retransmit messages canceled holding of token (and never allowed representative to enter token hold state). This makes token rotating maximum speed and keeps processor resending messages over and over again - overloading network and reducing chance to successfully deliver the messages. Also there were reports of various Antivirus / IPS / IDS which slows down delivery of packets with certain sizes (packets bigger than token) what make Corosync retransmit messages over and over again. Proposed solution is to allow representative to enter token hold state when there are only retransmit messages. This allows network to handle overload and/or gives Antivirus/IPS/IDS enough time scan and deliver packets without corosync entering "FAILED TO RECEIVE" state and adding more load to network. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2021-08-20 16:55:48 +02:00
Christine Caulfield	9e7f62d27d	cfg: New API to get extended node/link infomation Current we horribly over-use totempg_ifaces_get() to retrieve information about knet interfaces. This is an attempt to improve on that. All transports are supported (so not only Knet but also UDP(U)). This patch builds best against the "onwire-upgrade" branch of knet as that's what sparked my interest in getting more information out. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-11-26 16:15:50 +01:00
Aleksei Burlakov	98bfd9988b	totemsrp: More informative messages ... when token and consensus timeouts pop. Signed-off-by: Aleksei Burlakov <aburlakov@suse.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-10-15 16:46:51 +02:00
Jan Friesse	40d636e9ef	totemsrp: Move token received callback Trigger token received callback only for valid token. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-09-29 15:51:49 +02:00
Christine Caulfield	5f71445be0	config: Allow reconfiguration of crypto options Needs new knet crypto API. If it's not available, then fall back to the old API and forbid changing crypto while running. To avoid us being dependant on the leader node, each node sends its own crypto_reconfig_phase messages so we can guarantee that the reconfiguration always completes on each node. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-07-09 16:54:16 +02:00
Christine Caulfield	f078fff6eb	config: Reorganise the config system To be more reliable & maintainable The basic plan here is to fix reloads to be more stable using read/parse/verify/build/commit stages, so that any errors will not leave corosync in an unstable state. This should also make the code more maintainable as currently the verify/commit stages are horribly intertwined. Also: - Fix local_node_pos not being updated in the new map during validation (broke adding and removing new nodes in the middle of the list). - Fix reconfiguration so that nodes are indexed by nodeid and not their position in the list. This is an old bug that's just been carried over Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-04-24 16:26:44 +02:00
Jan Friesse	6ba9870f69	Initialize stack allocated memory Some functions allocated memory on stack without clearing memory and then send them on wire. This is not an issue, but valgrind reports this as a problem so it is easy to miss real problem then. Solution is to clear stack memory. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-11-08 11:20:18 +01:00
Jan Friesse	ee8b8993d9	totemsrp: Reduce MTU to left room second mcast Messages sent during recovery phase are encapsulated so such message has extra size of mcast structure. This is not so big problem for UDPU, because most of the switches are able to fragment and defragment packet but it is problem for knet, because totempg is using maximum packet size (65536 bytes) and when another header is added during retransmition, then packet is too large. Solution is to reduce mtu by 2 * sizeof (struct mcast). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2019-10-09 11:48:43 +02:00
Jan Friesse	3675daceee	totem: Increase ring_id seq after load This patch handles the situation where the leader node (the node with lowest node_id) crashes and is started again before token timeout of the rest of the cluster. The newly restarted node restores the ringid of the old ring from stable storage, so it has the same ringid as rest of the nodes, but ARU is zero. If the node is able to create a singleton membership before receiving the joinlist from rest of the cluster, everything works as expected, because the ring id gets increased correctly. But if the node receives a joinlist from another cluster node before its own joinlist, then it continues as it would had it never left the cluster. This is not correct, because the new node should always create a singleton configuration first. During the recovery phase, ARUs are compared and because they differ (the ARU of the old leader node is 0), the other nodes try to sent all of their previous messages. This is impossible (even if it was correct), because other nodes have already freed most of those messages. The implementation uses an assert to limit maximum number of messages sent during recovery (we could fix this, but it's not really the point). The solution here is to increase the ring_id sequence number by 1 after loading it from storage. During creation of the commit token it is always increased by 4, so it will not collide with an existing sequence. Thanks Christine Caulfield <ccaulfie@redhat.com> for clarify commit message. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-07-15 16:39:32 +02:00
Jan Friesse	5731af2782	logging: Add CS_PRI_NODE_ID and CS_PRI_RING_ID Previously node id was logged ether as a %d (most often), %u, %x or PRI.32 and ring id ether as %lld, %llx with various separators (., :, /) between rep nodeid and seq. This seems to cause confusion. This patch adds macros CS_PRI_NODE_ID, CS_PRI_RING_ID and CS_PRI_RING_ID_SEQ (CS prefix = corosync, PRI modeled in spirit of inttypes.h PRIx32) and makes code use them. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-07-03 10:53:52 +02:00
Jan Friesse	b0c24ec665	totemsrp: Fix warnings produced by gcc 9.1 New gcc warn about passing posibly unaligned pointer from packed structure. This shouldn't be problem for x86. Implemented solution is to let compiler do its job (compiler knows if pointer is aligned so accessing structure field is safe) and use it together with support for asigning and returning of structure (not a pointer to the structure). - srp_addr_copy is removed and replaced by simple assignment - srp_addr_copy_endian_convert is removed and replaced by srp_addr_endian_convert function which takes srp_addr structure and returns endian converted srp_addr structure - functions which accepts srp_addr array are not changed because (luckily) non-aligned pointer is always just one item array and such item is always used as a source pointer so it's possible to use temporary variable Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2019-06-14 10:03:31 +02:00
yuan ren	24a72e9780	totemsrp: Word spelling mistake Signed-off-by: yuan ren <reyren179@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2019-04-01 08:20:46 +02:00
Jan Friesse	06504c0f6f	build: Remove NSS dependencies Complete removal of NSS from corosync tree. Most of the changes are in build system and cpgverify had to be rewritten to use crc32 instead of sha1. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-09-17 10:26:05 +02:00
Chris Walker	51989b4a0a	Add option to force cluster into GATHER state Signed-off-by: Chris Walker <cwalker@cray.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-09-07 13:27:36 +02:00
Chris Walker	3f7d2cf6aa	Add token_warning configuration option Token_warning is used to present information about when the token was last received. Signed-off-by: Chris Walker <cwalker@cray.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-08-14 10:34:49 +02:00
Jan Friesse	f60541513e	totemsrp: Add assert into memb_lowest_in_config Add assert when there are no members in token_memb structure so non-existing member is not accessed (token should always have at least one member). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-08-13 09:00:47 +02:00
Jan Friesse	e45bbcc92a	totemsrp: Fix leave message regression Leave message in totem is just join message where leaving member is excluded from member list and included in fail list. It also contains special nodeid in header.nodeid and system_from.nodeid fields. Before "totem: Use nodeid ONLY in srp_addr" fix, most of the functions were using system_from addresses and not nodeid, which was used only in one specific case for memb_consensus_set function. After the patch, addresses are gone and only nodeid is used. Result is, that leaving node nodeid is not added into local fail list (my_faillist) so node is unable to reach consensus till token timeout, which starts new gather process. Solution is to send valid leaving node nodeid in system_from.nodeid and handle specific case for memb_consensus_set in memb_join_process. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-23 17:46:05 +02:00
Jan Friesse	dc590159f5	totemsrp: Log proc/fail lists in memb_join_process These information are useful and with trace log level they should not be too much irritating. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-23 17:45:51 +02:00
Jan Friesse	9b3782e48e	totemsrp: Fix srp_addr_compare There is regression caused by "totem: Use nodeid ONLY in srp_addr" patch in srp_addr_compare function. This function should be usable with qsort, so it should return values less than, equal to or greater than zero. It was however returning only zero or negation of a zero. Final results were unable to reach consensus in following test case: - 3 node cluster - start nodes 1, 2, 3 - shutdown node 3 - start node 3 - shutdown node 2 - start node 2 - shutdown node 1 After this steps, node 2 and 3 were unable to reach consensus. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-23 17:45:29 +02:00
Jan Friesse	ccb2290f84	totemsrp: Check join and leave msg length If number of proc_list, failed_list or active members is too high it may be impossible to put them into message, which is allocated on the stack what results in stack corruption. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-12 15:25:38 +02:00
Jan Friesse	c139255669	totemsrp: Implement sanity checks of received msgs Sanity checkers are used to prevent crashing because of accessing unallocated memory. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-12 15:25:33 +02:00
Jan Friesse	69857efb5b	totem: Display IP of sender To make finding victim of incompatible messages easier, IP of sender is logged. Propagating IP in layers makes patch slightly larger. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-03-16 13:58:15 +01:00
Jan Friesse	0c509a25a7	totemsrp: Add magic and version into header Magic number (0xC070) together with version in every packet is used for detecting that other node is really Corosync 3.x. Endian_detector field is removed and magic number is now used instead. If received packet magic number differs, guessing is used to show more about the source (Corosync 2.3+, 2.2 are quite reliable, Knet and unencrypted Corosync 2.1/2.0/1.x/OpenAIS are semi-reliable and encrypted Corosync 2.1/2.0/1.x/OpenAIS are quite unreliable). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-03-16 13:57:55 +01:00
Christine Caulfield	2c20590d16	knet: Always use link0 for loopback Even if it's not used for anything else. Also, make cfgtool show the correct link ID when links are not contiguous Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:23:20 +01:00
Christine Caulfield	111bfbc11d	totem: Fix debug warnings printed by knet Fix crash introduced a couple of commits ago in iface_get Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:22:22 +01:00
Christine Caulfield	386d710ed1	cfg: Fix cfg_get_node_addrs so that DLM works Also update copyright dates Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:19:45 +01:00
Christine Caulfield	f5b690bd96	totem: Return interface count correctly Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:19:12 +01:00
Christine Caulfield	fc8580bdbf	totem: Use nodeid ONLY in srp_addr This shrinks the srp_addr (and consequently every packet sent by corosync) so that instead of containing loads of IP addresses to identify a node, it just sends the nodeid. This then allows us to make ring0 optional and replaceable when running knet. It also means that we need some other way of identifying the local node in corosync.conf, so the nodelist.node.name entry is now mandatory and is mapped to the local host using the same algorithm as used in cman. This code needs LOTS of testing as it touches a huge amount of totemsrp and totemconfig. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:18:51 +01:00
Christine Caulfield	1ca72a1154	totemsrp: Revert totemsrp_get_ifaces() changes In my enthusiasm for removing code while integrating knet I also deleted the correct code for returning IP address for a node, so that only the IP addres of the local node was ever returned. This commit restores the the previous code. Also, because we always return INTERFACE_MAX interfaces now (they don't have to be contiguous) set ss_family to zero if that interface is not in use so that downstream apps know and don't display a lot of 0.0.0.0 Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-11-30 16:59:05 +01:00
Christine Caulfield	d9dfd41e4e	stats: Add cmap key to clear the various stats. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-31 17:39:14 +01:00
Christine Caulfield	16f616b65d	knet: Add support for knet compression Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-23 17:30:25 +02:00
Christine Caulfield	294a629fb5	config: Allow dynamic link configuration Now we are using knet, it's possible to dynamically add, remove and reconfigure links on the fly. Also print 'n' for non-existant knet links. This will show up only on loopback links >0. But it looks better than 'status =' Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-21 17:16:21 +02:00
Christine Caulfield	9da89f32c2	CFG: Remove ring-reenable code RRP doesn't exist any more so all the ring re-enable code is redundant. I've removed it from the library and all the code that does anything, but I've left the hole in the IPC just in case old libraries are hanging around. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-08-03 14:32:02 +02:00
Jan Friesse	564b4bf7d4	totem: Propagate totem initialization failure Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2017-06-15 11:07:33 +02:00
Jan Friesse	1f90c31ba7	list: Replace for_each by safe version where need Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-27 14:56:52 +02:00
Michael Jones	b4c06e52f3	list: Replace uses of list.h with qblist.h Signed-off-by: Michael Jones <jonesmz@jonesmz.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-10-27 14:56:52 +02:00
Christine Caulfield	268cde6ee4	totem: Add Kronosnet transport. This is a big update that removes RRP & MRP from the codebase and makes knet the default transport for corosync. UDP & UDPU are still (currently) supported but are deprecated. Also crypto and mutiple interfaces are only supported over knet. To compile this codebase you will need to install libknet from https://github.com/fabbione/kronosnet The corosync.conf(5) man page has been updated with info on the new options. Older config files should still work but many options have changed because of the knet implementation so configs should be checked carefully. In particular any cluster using using RRP over UDP or UDPU will not start as RRP is no longer present. If you need multiple interface support then you should be using the knet transport. Knet brings many benefits to the corosync codebase, it provides support for more interfaces than RRP (up to 8), will be more reliable in the event of network outages and allows dynamic reconfiguration of interfaces. It also fixes the ifup/ifdown and 127.0.0.1 binding problems that have plagued corosync/openais from day 1 Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-11 10:09:42 +01:00
HideoYamauchi	71c9035c27	Low: totemsrp: Addition of the log. Signed-off-by: HideoYamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-08-01 10:11:45 +02:00
Ferenc Wágner	c76ee39f61	Fix typo: Diabled -> disabled Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-06-22 14:26:48 +02:00
Ruben Kerkhof	37f092bbed	totemsrp: Fix clang warning (tautological compare) gsfrom is always >= 0 Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-01-04 17:28:14 +01:00
Ferenc Wágner	73910bd66e	totmesrp: Fix typo in log message Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-08-26 09:26:26 +02:00
Christine Caulfield	ab8942f626	totemsrp: Improve logging of left/down nodes This patch from Hideo Yamauchi improves the logging of whether nodes leave the cluster cleanly or uncleanly, making it easier to determine if a node ws shut down by the operator. There is also the possibility that a LEAVE message could get missed (due to the node being in flush state) so this can also make that clearer. The modifications are as follows. Change 1) I added the list which maintained LEAVE node to totemsrp. Change 2) I added registration, a search, the handling of to clear LEAVE node. Change 3) I added the output to log. Change 4) I changed an output level of the log. Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-06-12 16:16:45 +01:00
Andrey N. Groshev	5d9acc5604	totemsrp: Format member list log as unsigned int Signed-off-by: Andrey N. Groshev <greenx@yandex.ru> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-05 16:34:07 +01:00
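
Commit 3785829935 above describes token timestamps stored as 32-bit values but subtracted from 64-bit ones, which goes wrong after about 49.71 days of uptime. The sketch below reproduces that arithmetic and the failure pattern in isolation. It assumes millisecond timestamps (implied by the divisor in the commit message), and every name with an `ex_` or `EX_` prefix is hypothetical and not taken from the corosync sources.

```c
#include <stdint.h>

/* Milliseconds in one day: 1000 ms * 60 s * 60 min * 24 h. */
#define EX_MS_PER_DAY (1000ULL * 60 * 60 * 24)

/* Days until a 32-bit millisecond counter wraps: 2^32 / EX_MS_PER_DAY. */
double ex_days_until_wrap32(void)
{
	return (double)(UINT32_MAX + 1ULL) / (double)EX_MS_PER_DAY;
}

/*
 * Failure pattern (illustrative reconstruction): the stamp is kept in a
 * 32-bit variable, so its high bits are lost, yet it is subtracted from
 * a 64-bit "now".
 */
uint64_t ex_elapsed_truncated(uint64_t now_ms, uint64_t stamp_ms)
{
	uint32_t stored = (uint32_t)stamp_ms; /* truncation happens here */

	return now_ms - stored;
}

/* Fixed pattern: the stamp is kept as a 64-bit value end to end. */
uint64_t ex_elapsed_full(uint64_t now_ms, uint64_t stamp_ms)
{
	uint64_t stored = stamp_ms;

	return now_ms - stored;
}
```

Before the counter passes 2^32 ms both helpers agree. Once the stamp exceeds 2^32, the truncated variant reports an elapsed time that is too large by a multiple of 2^32 ms, which matches the kind of wrong statistic the commit message describes.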


281 Commits
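
Commit 9b3782e48e in the list above notes that a comparison function used with `qsort` has to return a value less than, equal to, or greater than zero, and that returning only zero or non-zero broke consensus. As a minimal sketch of that contract, the following compares node IDs both ways. The struct and function names are hypothetical, and the "equality only" variant is a reconstruction from the commit message, not the actual corosync code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an address that carries only a node ID. */
struct ex_addr {
	uint32_t nodeid;
};

/*
 * Equality-only comparison: returns 0 or 1. It can tell whether two
 * entries match, but it gives qsort no ordering, so sorted output is
 * not well defined.
 */
int ex_compare_equality_only(const void *a, const void *b)
{
	const struct ex_addr *x = a;
	const struct ex_addr *y = b;

	return x->nodeid != y->nodeid;
}

/* Three-way comparison: negative, zero, or positive, as qsort expects. */
int ex_compare_three_way(const void *a, const void *b)
{
	const struct ex_addr *x = a;
	const struct ex_addr *y = b;

	/* Avoids subtraction, which could overflow for unsigned IDs. */
	return (x->nodeid > y->nodeid) - (x->nodeid < y->nodeid);
}

/* Sort a list of addresses by ascending node ID. */
void ex_sort_addrs(struct ex_addr *list, size_t count)
{
	qsort(list, count, sizeof(list[0]), ex_compare_three_way);
}
```

With the three-way variant, every node sorting the same set of IDs ends up with the same order, which is the property that membership lists compared across nodes depend on.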