mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-10-24 19:12:30 +00:00

Author	SHA1	Message	Date
Jan Friesse	e45bbcc92a	totemsrp: Fix leave message regression Leave message in totem is just join message where leaving member is excluded from member list and included in fail list. It also contains special nodeid in header.nodeid and system_from.nodeid fields. Before "totem: Use nodeid ONLY in srp_addr" fix, most of the functions were using system_from addresses and not nodeid, which was used only in one specific case for memb_consensus_set function. After the patch, addresses are gone and only nodeid is used. Result is, that leaving node nodeid is not added into local fail list (my_faillist) so node is unable to reach consensus till token timeout, which starts new gather process. Solution is to send valid leaving node nodeid in system_from.nodeid and handle specific case for memb_consensus_set in memb_join_process. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-23 17:46:05 +02:00
Jan Friesse	dc590159f5	totemsrp: Log proc/fail lists in memb_join_process These information are useful and with trace log level they should not be too much irritating. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-23 17:45:51 +02:00
Jan Friesse	9b3782e48e	totemsrp: Fix srp_addr_compare There is regression caused by "totem: Use nodeid ONLY in srp_addr" patch in srp_addr_compare function. This function should be usable with qsort, so it should return values less than, equal to or greater than zero. It was however returning only zero or negation of a zero. Final results were unable to reach consensus in following test case: - 3 node cluster - start nodes 1, 2, 3 - shutdown node 3 - start node 3 - shutdown node 2 - start node 2 - shutdown node 1 After this steps, node 2 and 3 were unable to reach consensus. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-23 17:45:29 +02:00
Jan Friesse	ccb2290f84	totemsrp: Check join and leave msg length If number of proc_list, failed_list or active members is too high it may be impossible to put them into message, which is allocated on the stack what results in stack corruption. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-12 15:25:38 +02:00
Jan Friesse	c139255669	totemsrp: Implement sanity checks of received msgs Sanity checkers are used to prevent crashing because of accessing unallocated memory. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-04-12 15:25:33 +02:00
Jan Friesse	69857efb5b	totem: Display IP of sender To make finding victim of incompatible messages easier, IP of sender is logged. Propagating IP in layers makes patch slightly larger. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-03-16 13:58:15 +01:00
Jan Friesse	0c509a25a7	totemsrp: Add magic and version into header Magic number (0xC070) together with version in every packet is used for detecting that other node is really Corosync 3.x. Endian_detector field is removed and magic number is now used instead. If received packet magic number differs, guessing is used to show more about the source (Corosync 2.3+, 2.2 are quite reliable, Knet and unencrypted Corosync 2.1/2.0/1.x/OpenAIS are semi-reliable and encrypted Corosync 2.1/2.0/1.x/OpenAIS are quite unreliable). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2018-03-16 13:57:55 +01:00
Christine Caulfield	2c20590d16	knet: Always use link0 for loopback Even if it's not used for anything else. Also, make cfgtool show the correct link ID when links are not contiguous Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:23:20 +01:00
Christine Caulfield	111bfbc11d	totem: Fix debug warnings printed by knet Fix crash introduced a couple of commits ago in iface_get Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:22:22 +01:00
Christine Caulfield	386d710ed1	cfg: Fix cfg_get_node_addrs so that DLM works Also update copyright dates Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:19:45 +01:00
Christine Caulfield	f5b690bd96	totem: Return interface count correctly Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:19:12 +01:00
Christine Caulfield	fc8580bdbf	totem: Use nodeid ONLY in srp_addr This shrinks the srp_addr (and consequently every packet sent by corosync) so that instead of containing loads of IP addresses to identify a node, it just sends the nodeid. This then allows us to make ring0 optional and replaceable when running knet. It also means that we need some other way of identifying the local node in corosync.conf, so the nodelist.node.name entry is now mandatory and is mapped to the local host using the same algorithm as used in cman. This code needs LOTS of testing as it touches a huge amount of totemsrp and totemconfig. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2018-03-01 14:18:51 +01:00
Christine Caulfield	1ca72a1154	totemsrp: Revert totemsrp_get_ifaces() changes In my enthusiasm for removing code while integrating knet I also deleted the correct code for returning IP address for a node, so that only the IP addres of the local node was ever returned. This commit restores the the previous code. Also, because we always return INTERFACE_MAX interfaces now (they don't have to be contiguous) set ss_family to zero if that interface is not in use so that downstream apps know and don't display a lot of 0.0.0.0 Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-11-30 16:59:05 +01:00
Christine Caulfield	d9dfd41e4e	stats: Add cmap key to clear the various stats. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-31 17:39:14 +01:00
Christine Caulfield	16f616b65d	knet: Add support for knet compression Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-10-23 17:30:25 +02:00
Christine Caulfield	294a629fb5	config: Allow dynamic link configuration Now we are using knet, it's possible to dynamically add, remove and reconfigure links on the fly. Also print 'n' for non-existant knet links. This will show up only on loopback links >0. But it looks better than 'status =' Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-09-21 17:16:21 +02:00
Christine Caulfield	9da89f32c2	CFG: Remove ring-reenable code RRP doesn't exist any more so all the ring re-enable code is redundant. I've removed it from the library and all the code that does anything, but I've left the hole in the IPC just in case old libraries are hanging around. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-08-03 14:32:02 +02:00
Jan Friesse	564b4bf7d4	totem: Propagate totem initialization failure Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2017-06-15 11:07:33 +02:00
Jan Friesse	1f90c31ba7	list: Replace for_each by safe version where need Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-27 14:56:52 +02:00
Michael Jones	b4c06e52f3	list: Replace uses of list.h with qblist.h Signed-off-by: Michael Jones <jonesmz@jonesmz.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-10-27 14:56:52 +02:00
Christine Caulfield	268cde6ee4	totem: Add Kronosnet transport. This is a big update that removes RRP & MRP from the codebase and makes knet the default transport for corosync. UDP & UDPU are still (currently) supported but are deprecated. Also crypto and mutiple interfaces are only supported over knet. To compile this codebase you will need to install libknet from https://github.com/fabbione/kronosnet The corosync.conf(5) man page has been updated with info on the new options. Older config files should still work but many options have changed because of the knet implementation so configs should be checked carefully. In particular any cluster using using RRP over UDP or UDPU will not start as RRP is no longer present. If you need multiple interface support then you should be using the knet transport. Knet brings many benefits to the corosync codebase, it provides support for more interfaces than RRP (up to 8), will be more reliable in the event of network outages and allows dynamic reconfiguration of interfaces. It also fixes the ifup/ifdown and 127.0.0.1 binding problems that have plagued corosync/openais from day 1 Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-11 10:09:42 +01:00
HideoYamauchi	71c9035c27	Low: totemsrp: Addition of the log. Signed-off-by: HideoYamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-08-01 10:11:45 +02:00
Ferenc Wágner	c76ee39f61	Fix typo: Diabled -> disabled Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-06-22 14:26:48 +02:00
Ruben Kerkhof	37f092bbed	totemsrp: Fix clang warning (tautological compare) gsfrom is always >= 0 Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-01-04 17:28:14 +01:00
Ferenc Wágner	73910bd66e	totmesrp: Fix typo in log message Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-08-26 09:26:26 +02:00
Christine Caulfield	ab8942f626	totemsrp: Improve logging of left/down nodes This patch from Hideo Yamauchi improves the logging of whether nodes leave the cluster cleanly or uncleanly, making it easier to determine if a node ws shut down by the operator. There is also the possibility that a LEAVE message could get missed (due to the node being in flush state) so this can also make that clearer. The modifications are as follows. Change 1) I added the list which maintained LEAVE node to totemsrp. Change 2) I added registration, a search, the handling of to clear LEAVE node. Change 3) I added the output to log. Change 4) I changed an output level of the log. Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-06-12 16:16:45 +01:00
Andrey N. Groshev	5d9acc5604	totemsrp: Format member list log as unsigned int Signed-off-by: Andrey N. Groshev <greenx@yandex.ru> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-05 16:34:07 +01:00
Jason	4ee84c51fa	totem: Ignore duplicated commit tokens in recovery In active rrp mode, commit tokens are treated as mcast data messages, thus, rrp directly delivers them to srp layer by active_mcast_recv(). This will result in duplicated commit tokens being received by srp from different heartbeat links. If node is in recovery state and has already sent out the initial orf token, those duplicated commit tokens will cause message_handler_memb_commit_token() to send initial orf token again! This is wrong because it resets the orf token content in instance->orf_token_retransmit, which breaks the token retransmission state. Furthermore, by sending those initial orf tokens again and again, it may lead active_token_recv() to drop some subsequent orf tokens. It is OK for rrp because srp will do token retransmission, but as said above, srp retransmission state has already been broken, so finally we meet a "token lost in recovery state" condition caused by software. If token timeout value is large, then it will takes long time to create a new ring. This can be reproduced by having two noded set to active rrp mode, with two heartbeat links. Then with one node always on, let the other one do stop/start again and again. It has a low probability to reproduce. In theory, I think, the more heartbeat links used, the more easily it can be reproduced. This problem can be resolved by letting message_handler_memb_commit_token() to ignore duplicated commit tokens in recovery state if node (the ring representation) has already sent out the initial orf token. Different from prev take, this version do not depends on stored token data but uses originated_orf_token in totemsrp_instance to remember if initial orf token has been already originated for current membership. Signed-off-by: Jason <huzhijiang@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-01-15 17:33:04 +01:00
Jan Friesse	acb55cdb03	totem: Inform RRP about membership changes Services are informed about membership changes, but if same information is needed inside totemrrp or totemnet, it's impossible to gather this information. Patch makes this possible for now only for RRP with empty callbacks. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-26 15:35:56 +02:00
Jason HU	f135b68096	Cancel token holding while in retransmition When there is no other activty on ring but only retransmition, and token is in hold mode, the retransmition will become slow. More over, if the retransmition is always fail but token rotation works well, then it takes quite a lone time (fail_to_recv_const * token_hold = 2500 * 180ms = 450sec) for the retransmit requester to meet the "FAILED TO RECEIVE" condition to re-construct a new ring. This problem can be solved by checking if retransmits are present before going into hold. If a node is the retransmit requester or the resender, it set my_token_held to 0 to speed up retransmition and omit further unnecessary sending of token_hold_cancel signal. Signed-off-by: Jason HU <huzhijiang@gmail.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-12 09:28:04 +02:00
Jan Friesse	da46ecfc30	Move ringid store and load from totem library Functions for storing and loading ring id was in the totem library. This causes problem, what to do when it's impossible to load or store ring id. Easy solution seemed to be assert, but sadly this makes hard for user to find out what happened (because corosync was just aborted and logsys didn't flush) Solution is to move these functions to main.c, where is much easier to handle error. This also makes libtotem free of any file system operations. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:54:57 +02:00
Jan Friesse	d310b251c3	Introduce get_run_dir function Run dir (LOCALSTATEDIR/lib/corosync) was hardcoded thru whole codebase. Totemsrp was trying to create and chdir into it, but also takes into account environment variable COROSYNC_RUN_DIR creating inconsistency. get_run_dir correctly returns COROSYNC_RUN_DIR (when set) or LOCALSTATEDIR/lib/corosync. This is now used by all functions instead of hardcoded string. All occurrences of mkdir/chdir are removed from totemsrp and chdir is now called in main function. Mkdir call is completely removed, because it was not used anyway (check in main.c was called before totemsrp init, so mkdir was never called) and also make install and/or package system should take care of creating this directory with correct permissions/context. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:53:18 +02:00
Jan Friesse	38c04d9a66	totemsrp: Fix typo with cont gather Patch `f3ffd3da5c` introduced named states of state-machine, but sadly contains logical problem causing stats.continuous_gather increasing even when it shouldn't. Problem is not critical, because continuous_gather is set to 0 on successful membership creation. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-18 16:12:57 +01:00
Jason	cfbb021e13	totem: Drop invalid join msg in operational state According to the totem paper, if a processor receives a join message in the operational state and if the receivers identifier is in the join messages fail list, then join message should be ignored. By applying this validation of join messages, we can avoid unnecessary switching from operational state to gather state(or even lead to rings can not be merged) like the following to happen. 1. Initially, there is only one ring contains three nodes, say ring(A,B,C). 2. A and B network partition, "in the same time", C is down. 3. Node A sends join message with proclist:A,B,C. faillist:NULL. Node B sends join message with proclist:A,B,C. faillist:NULL. 4. Both A and B consensus timeout due to network partition. 5. A and B network remerged. 6. Node A sends join message with proclist:A,B,C. faillist:B,C. and create ring(A). Node B sends join message with proclist:A,B,C. faillist:A,C. and create ring(B). 7. Say join message with proclist:A,B,C. faillist:A,C which sent by node B is received by node A because network remerged. 8. Node A shifts to gather state and send out a modified join message with proclist:A,B,C. faillist:B. Such join message will prevent both A and B from merging. 9. Node A consensus timeout (caused by waiting node C) and sends join message with proclist:A,B,C. faillist:B,C again. Same thing happens on node B, so A and B will dead loop forever in step 7, 8 and 9. As the paper also said: "If a processor receives a join message in the operational state and if the sender's identifier is in the receiver's my_proclist and the join message's ring_seq is less than the receiver's ring sequence number, then it ignores the join message too." So these patch applying these validations of join messages altogether. Signed-off-by: Jason <huzhijiang@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-01-13 14:46:13 +01:00
Masatake YAMATO	f3ffd3da5c	totemsrp: Show English message when memb_state_gather_enter is called The reason why memb_state_gather_enter is invoked was printed in integer code. This patch introduces human readable English messages for the code. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-10-24 16:46:17 +02:00
Christine Caulfield	074e57910e	The corosync message "A processor joined or left the membership" is vague and unhelpful. People have to look for the following quorum message and try to deduce which nodes have joined or left from that and past membership messages, even though the routine printing the message already has this information to hand. This patch fixes that message so that it prints the nodeids of the nodes that have joined/left the cluster. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-By: Jan Friesse <jfriesse@redhat.com>	2013-06-27 14:44:46 +01:00
Jan Friesse	92e0f9c7bb	Add waiting_trans_ack also to fragmentation layer Patch for support waiting_trans_ack may fail if there is synchronization happening between delivery of fragmented message. In such situation, fragmentation layer is waiting for message with correct number, but it will never arrive. Solution is to handle (callback) change of waiting_trans_ack and use different queue. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:48:12 +01:00
Jan Friesse	2d4e7bebb5	Handle segfault in backlog_get If instance->memb_state is not OPERATION or RECOVERY, we was passing NULL to cs_queue_used call. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:48:07 +01:00
Steven Dake	402638929e	Fix problem with sync operations under very rare circumstances This patch creates a special message queue for synchronization messages. This prevents a situation in which messages are queued in the new_message_queue but have not yet been originated from corrupting the synchronization process. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:47:57 +01:00
Jan Friesse	d4db2ea535	If failed_to_recv is set, consensus can be empty If failed_to_recv is set (node detect itself not able to receive message), we can end up with assert, because my_failed_list and my_member_list are same list. This is happening because we are not following specification and we allow to mark node itself as failed. Because if failed_to_recv is set and we reached consensus across nodes, single node membership is created (ignoring both fail list and member_list), we can skip assert. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-05 15:16:25 +01:00
Jan Friesse	d042671369	Move "Totem is unable to form..." message to main Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:33 +02:00
Jan Friesse	5ce59f49ba	Move some totem and cpg messages to trace level Messages which are flow messages, rather then lifecycle are now logged in trace level. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-19 11:03:16 +02:00
Jan Friesse	932829bfca	Add header files when needed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:31 +02:00
Tim Beale	77ea036c72	Remove unused structure Nowhere in the corosync codebase references this structure. Signed-off-by: Tim Beale <tlbeale@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-21 14:11:48 +02:00
Jan Friesse	0791f44c41	Include ringid in processor joined log message This should help correlate syslog entires with their blackbox counterparts. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Andrew Beekhof <andrew@beekhof.net>	2012-05-17 14:58:04 +02:00
Jan Friesse	e925f42165	Make ifaces_get work with dynamic no_rings Commit which added number of addresses to srp_address structure didn't count with totemsrp_ifaces_get where whole structure was copied instead of addresses only. This is now fixed. Also to make API totempg forward compatible, size of interfaces array must be passed to ifaces_get like functions to prevent memory overwrite. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-26 11:54:26 +02:00
Jan Friesse	124ff4339c	Add no_addrs field in srp_addr structure This should allow us future change to dynamic number of rings without breaking wire compatibility. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-22 14:03:38 +01:00
Jan Friesse	3b7c2f0588	Update crypto_set API Also few leftovers from cfg is removed and version of totempg is increased to 5 to reflect all changes we made Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-15 17:33:53 +01:00
Jan Friesse	8cdd2fc493	Remove libtomcrypt Tomcrypt in corosync is for long time not updated. Because we have support for libnss, libtomcrypt can be removed. Also few leftovers (AES is 256 bits, not 128, ...) are removed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-13 09:19:47 +01:00
Angus Salkeld	3131601ce2	Remove all unneccessary "\n" from log messages These look ugly, are inconsistently done and just have to be removed later in libqb before calling syslog. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-23 13:08:23 +11:00

1 2 3 4 5 ...

258 Commits