mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-10-28 00:47:42 +00:00

Author	SHA1	Message	Date
Jan Friesse	d042671369	Move "Totem is unable to form..." message to main Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:33 +02:00
Jan Friesse	5ce59f49ba	Move some totem and cpg messages to trace level Messages which are flow messages, rather than lifecycle, are now logged in trace level. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-19 11:03:16 +02:00
Jan Friesse	932829bfca	Add header files when needed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:31 +02:00
Tim Beale	77ea036c72	Remove unused structure Nowhere in the corosync codebase references this structure. Signed-off-by: Tim Beale <tlbeale@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-21 14:11:48 +02:00
Jan Friesse	0791f44c41	Include ringid in processor joined log message This should help correlate syslog entries with their blackbox counterparts. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Andrew Beekhof <andrew@beekhof.net>	2012-05-17 14:58:04 +02:00
Jan Friesse	e925f42165	Make ifaces_get work with dynamic no_rings Commit which added number of addresses to srp_address structure didn't count with totemsrp_ifaces_get where whole structure was copied instead of addresses only. This is now fixed. Also to make API totempg forward compatible, size of interfaces array must be passed to ifaces_get like functions to prevent memory overwrite. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-26 11:54:26 +02:00
Jan Friesse	124ff4339c	Add no_addrs field in srp_addr structure This should allow us future change to dynamic number of rings without breaking wire compatibility. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-22 14:03:38 +01:00
Jan Friesse	3b7c2f0588	Update crypto_set API Also few leftovers from cfg is removed and version of totempg is increased to 5 to reflect all changes we made Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-15 17:33:53 +01:00
Jan Friesse	8cdd2fc493	Remove libtomcrypt Tomcrypt in corosync is for long time not updated. Because we have support for libnss, libtomcrypt can be removed. Also few leftovers (AES is 256 bits, not 128, ...) are removed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-13 09:19:47 +01:00
Angus Salkeld	3131601ce2	Remove all unnecessary "\n" from log messages These look ugly, are inconsistently done and just have to be removed later in libqb before calling syslog. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-23 13:08:23 +11:00
Jan Friesse	bb6bbd01e6	Store rrp faulty status of ring in cmap New key with faulty status of ring is created in cmap as name runtime.totem.pg.mrp.rrp.$ring_number.faulty Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-11 14:12:06 +01:00
Steven Dake	8ad583a54c	Move logsys.c into corosync binary instead of a shared object Our preferred shared logging system is exported via the libqb library. As a result, the corosync project no longer needs to export logsys.so and the code can be directly included in the binary. The header file can also be removed. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-01-06 18:19:59 -07:00
Yunkai Zhang	232ac5a7fe	Correct nodeid in memb_state_commit_token_send function Signed-off-by: Yunkai Zhang <qiushu.zyk@taobao.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-30 11:21:22 -07:00
Steven Dake	e48ddf99a6	From: Yunkai Zhang: Today, I have observed one of the reasons that corosync runs into the FAILED TO RECEIVE state. There were five nodes (A,B,C,D,E) in my testing, and I limited the UDP transmission rate of node C with iptables commands: iptables -A INPUT -i eth0 -p udp -m limit --limit 10000/s --limit-burst 1 -j ACCEPT iptables -A INPUT -i eth0 -p udp -j DROP One hour later, node C had been missing some MCAST messages, and its state was as follows: ==state of C node== my_aru:0x805 my_high_seq_received:0xC2C my_aru_count:7 =>received MCAST message with seq:806 from node B =>enter message_handler_mcast =>add this message to regular_sort_queue ... =>enter update_aru function => range = (my_high_seq_received - my_aru) = (0xC2C - 0x805) = 1063 => if range>1024, do nothing and return directly. ==END== According to this logic, once (my_high_seq_received-my_aru)>1024, my_aru will not be updated even though corosync can receive MCAST messages retransmitted by other nodes. But at that time, my_aru_count was only 7. So corosync on node C would stay in this state until my_aru_count increased to fail_to_recv_const (the default value is 2500). This was a long time for corosync, but we wasted it. To solve this issue, maybe we can enlarge the range condition in the update_aru function? Or we could just ignore the check of the range value; it seems harmless, because we have been using fail_to_recv_const to control things. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-11-29 10:59:11 -07:00
Yunkai Zhang	19652c3d7c	Correct nodeid of token when we retransmit it Although incorrect nodeid will not affect program's logic, but it will make us confused when we add some logs to record the transmission path of token in debug mode. Signed-off-by: Yunkai Zhang <qiushu.zyk@taobao.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-28 05:56:28 -07:00
Yunkai Zhang	d991400372	Fixed bug when corosync receives JoinMSG in OPERATIONAL state According to the totem protocol, a node should enter GATHER state when it receives JoinMSG in OPERATIONAL state. If we discard it in OPERATIONAL state, the nodes sending this JoinMSG could not receive the response until other nodes reach token lost timeout. This bug will cause nodes having entered GATHER state to spend more time rejoining the ring, and then it will make nodes reach token expired timeout more easily. Signed-off-by: Yunkai Zhang <qiushu.zyk@taobao.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-26 08:52:26 -07:00
Angus Salkeld	92ca91fa66	TOTEM: better clean up on exit Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-11 09:08:04 +11:00
Steven Dake	2ec4ddb039	Deliver all messages from my_high_seq_received to the last gap This patch passes two test cases: ------- Test #1 ------- Two node cluster - run cpgbench on each node modify totemsrp with following defines: Two test cases: ------- Test #2 ------- 5 node cluster start 5 nodes randomly at about same time, wait 10 seconds and attempt to send a message. If message blocks on "TRY_AGAIN" likely a message loss has occurred. Wait a few minutes without cycling the nodes and see if the TRY_AGAIN state becomes unblocked. If it doesn't the test case has failed Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-09-22 10:21:37 +02:00
Jan Friesse	752239eaa1	rrp: Higher threshold in passive mode for mcast There were too many false positives with passive mode rrp when high number of messages were received. Patch adds new configurable variable rrp_problem_count_mcast_threshold which is by default 10 times rrp_problem_count_threshold and this is used as threshold for multicast packets in passive mode. Variable is unused in active mode. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-09-01 11:21:09 +02:00
Steven Dake	32f11337b1	Remove hdb.h header includes from unnecessary files The files in this patch do not use the hdb.h header. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-08-23 22:28:40 -07:00
Steven Dake	71f044bfe7	Add totempg_threaded_mode_enable() api This API allows totem to operate as a multithreaded library. Performance is better without threads but some library users may only have multithreaded systems. In the corosync case where we have removed threads, this reduces cpu utilization by ~10% by removing about 50% of the mutex lock and unlock calls that occur during typical operation. Since the latest corosync is nearly thread free, there is no need for mutex operations. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-08-22 19:31:52 -07:00
Steven Dake	9f36a892a8	Move cs_queue.h from include directory to exec directory This file is only used by totemsrp.c. Move out of general include directory. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-08-22 19:31:33 -07:00
Tim Beale	370d9bcecf	Display ring-ID consistently in debug Ring ID was being displayed both as hex and decimal in places. Update so it's displayed consistently (I chose hex) to make debugging easier. Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-08-17 12:15:16 +10:00
Tim Beale	5a724a9c39	Add code comment mapping for message handler defines As a corosync-newbie it can be hard to bridge the gap between where a particular message is sent and where the receive handler processes it, and vice versa. Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-08-17 11:52:25 +10:00
Angus Salkeld	37e17e7a94	libqb: logging & trace Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-08-09 10:37:16 +10:00
Angus Salkeld	b5afc9283d	libqb: change pause_timestamp to uint64_t Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-08-09 10:37:15 +10:00
Angus Salkeld	78e06739b7	libqb: remove worker thread - keep to one thread. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-08-09 10:37:15 +10:00
Angus Salkeld	f717bc60e1	libqb: make timer api a wrapper around qb_loop timers. - change timeout value to nano seconds - fix timer handles (don't alloc on stack) Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-08-09 10:37:14 +10:00
Angus Salkeld	fce8a3c3b6	libqb: convert coropoll calls to qb_loop calls. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-08-09 10:37:14 +10:00
Jan Friesse	ddb5214c2c	Revert "totemsrp: Remove recv_flush code" This reverts commit `1a7b7a39f4`. Reversion is needed to remove overflow of receive buffers and dropping messages. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2011-07-26 10:05:55 +02:00
MORITA Kazutaka	1d9f444fec	totemsrp: fix buffer overflows for large clusters (> 100 nodes) Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-07-24 13:33:26 -07:00
Steven Dake	a3d98f1652	Fix problem where corosync will segfault if there are gaps in recovery queue Fixes a problem where there are gaps in the recovery queue. Example my_aru = 5, but there are messages at 7,8. 8 = my_high_seq_received which results in data slots taken up in new message queue. What should really happen is these last messages should be delivered after a transitional configuration to maintain SAFE agreement. We don't have support for SAFE atm, so it is probably safe just to throw these messages away. Without this change, the new message queue on a new configuration change is out of sync. Signed-off-by: Steven Dake <sdake@redhat.com> Tested-by: Tim Beale <tlbeale@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-07-15 10:39:57 -07:00
Jiaju Zhang	5dc33c2824	RRP: redundant ring automatic recovery This patch automatically recovers redundant ring failures. Please note that this patch introduced rrp_autorecovery_check_timeout in totem config hence breaks internal ABI. The internal ABI users of totem.h need to rebuild their binaries. Signed-off-by: Jiaju Zhang <jjzhang@suse.de> Signed-off-by: Steven Dake <sdake@redhat.com> Tested-by: Jan Friesse <jfriesse@redhat.com> Tested-by: Florian Haas <florian.haas@linbit.com> Tested-by: Jiaju Zhang <jjzhang@suse.de>	2011-07-05 09:13:48 -07:00
Jerome Flesch	00434a4f10	Fix usage of strerror_r()/perror() Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-06-28 09:56:58 +02:00
Jiaju Zhang	c6bfc6b5d6	RRP: Fix ring initialization issue for UDPU mode Redundant ring has some problem in the UDP unicast mode. The problem is the second ring has not been successfully initialized, that is, the second time iface_changes happens, the member list for that interface has not been added, which results in that ring cannot transmit normal message. So the second ring cannot take over the work if the first ring is down. This patch fixes this issue. comments from review: More work is needed probably in totemnet where totemnet maintains the node list and an iterator for them, and totemudpu_member_add adds state information to a context for the iteration. In any regard, that is somewhat difficult to test, so I'll merge this patch for now - keep in mind interface changes on the bindnetaddr will cause problems with udpu after this patch has been committed. Signed-off-by: Jiaju Zhang <jjzhang@suse.de> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-06-16 17:23:36 -07:00
Jan Friesse	61d83cd719	totemsrp: Enhance mcast failure detection memb_state_gather_enter increase stats.continuous_gather only if previous state was gather also. This should happen only if multicast is not working properly (local firewall in most cases) and not if many nodes joins at one time. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-05-05 11:00:26 +02:00
Zane Bitter	6365150ae2	Provide better checking of the message type A negative value for the message type (on systems where char is signed) would cause a crash. This is highly probable if the cluster is, for example, misconfigured to have encryption enabled on some nodes but not others. Signed-off-by: Zane Bitter <zane.bitter@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-04-12 13:09:39 -07:00
Zane Bitter	6e990d202f	Fix uninitialised memory errors found by valgrind Signed-off-by: Zane Bitter <zane.bitter@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-04-08 09:13:12 -07:00
Steven Dake	7d5e588931	totemsrp: free messages originated in recovery rather than rely on messages_free Relying on messages_free may seem like it should work, but it leads to a situation where every node has released the messages, yet some nodes think messages are missing. The output then looks like "Retransmit: #" in repetition. This patch frees those messages immediately during the transition to the OPERATIONAL state and sets the internal variables totemsrp depends upon to the proper values. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-03-24 09:25:15 -07:00
Steven Dake	ef05817ce5	totemsrp: Only restore old ring id information one time The current code stores the current ring information every time a commit token is generated. This causes the old ring id used for comparison purposes to increase if a token is lost in commit or recovery, resulting in failure of totem. This patch changes the behavior to only store the old ring id one time when the commit token is received, and then further commit token ring id saves are not done until OPERATIONAL is reached. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-03-24 09:22:34 -07:00
Steven Dake	1a7b7a39f4	totemsrp: Remove recv_flush code The recv_flush code is no longer necessary because of the miss_count_count addition. It can in some cases lead to register corruption because of interactions with -fstack-protector, the recursive nature of how this code works, and interactions with the optimizer in some versions of gcc. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-03-24 09:21:27 -07:00
Steven Dake	d99fba72e6	Resolve abort during simultaneous stopping of at least 4 nodes consider 5 nodes. node 3,4 stopped (by random stopping) node 1,2,5 form new configuration and during recovery node 1 and node 2 are stopped (via service corosync stop). This causes 5 never to finish recovery within the timeout period, triggering a token loss in recovery. Bug #623176 resolved an assert which happens because the full ring id was being restored. The resolution to Bug #623176 was to not restore the full ring id, and instead operate (according to specifications) the new ring id. Unfortunately this exposes a problem whereby the restarting of nodes 1-4 generate the same ring id. This ring id gets to the recovery failed node 5 which is now in gather, and triggers a condition not accounted for in the original totem specification. It appears later work from Dr. Agarwal's PhD dissertation considers this scenario. That solution entails rejecting the regular token in the above condition. Since the ring id is also used to make decisions for commit token acceptance, we must also take care to reject the regular token in all cases after transitioning from OPERATIONAL. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-21 09:26:35 -07:00
Angus Salkeld	0ad2494ae7	Fix some "set but not used" warnings [-Wunused-but-set-variable] Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-16 07:13:42 +11:00
Zane Bitter	dddaeef21c	Allocate packet buffers in the transport drivers This change paves the way for eliminating a copy within the Infiniband driver in the future by transferring responsibility for allocating and freeing message buffers to the transport driver layer. Tested under valgrind on a single-node cluster. Signed-off-by: Zane Bitter <zane.bitter@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-11 20:38:28 -07:00
Steven Dake	6aa47fde95	Fix abort when token is lost in RECOVERY state A commit token should be rejected when a token is lost in the recovery state. This occurs naturally because the ring id increases by 4 for every new ring. Prior to this patch, if the token was lost, the old ring id information was restored, causing a commit token to be accepted when it should be rejected. This erroneously accepted commit token would lead to an assertion which is fixed by this patch. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-03-07 17:15:05 -07:00
Steven Dake	7471c88346	Don't assert when ring id file is less than 8 bytes If the ring id file for the processor is less than 8 bytes, totemsrp would assert. Our speculation is that this condition happens during a fencing operation or local filesystem corruption. With this patch, Corosync will create fresh ring id file data when the incorrect number of bytes are read from the ring id. Amend to use sizeof the strerror string length and PATH_MAX for the path length. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-02-24 15:34:39 -07:00
Steven Dake	6646a864b4	Handle delayed multicast packets that occur with switches Some switches delay multicast packets vs the unicast token. This patch works around that problem by providing a new tuneable called miss_count_const. This tuneable works by counting the number of times a message is found missing and once reaching the const value, marks it as missing in the retransmit list. This improves performance and doesn't display warning messages about missed multicast messages when operating in these switching environments. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-01-11 10:34:46 -07:00
Jan Friesse	b9df4424b1	Display warning when not possible to form cluster This may typically happen if local firewall is enabled. Patch adds new item to statistics called continuous_gather, which holds the number of consecutive entries into the gather state. If this number is bigger than MAX_NO_CONT_GATHER, warning message is displayed. This is also used on exiting, so stop of corosync is now possible even with enabled firewall. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2010-12-03 10:11:11 +01:00
Steven Dake	bb05aed93f	Add the UDPU transport The UDPU transport is useful for those deployments which can't use multicast. UDPU works by using UDP unicast, which is fully supported by every switch manufacturer by default and doesn't rely on a functional IGMP implementation. An example of the UDPU transport is contained in the corosync.conf.example.udpu file which shows a 16 node cluster. This file should be copied to each node in the cluster and IP addresses changed as appropriate. Amended to remove dead udpu REUSEADDR socket option. Signed-off-by: Steven Dake <sdake@redhat.com>	2010-11-18 14:21:30 -07:00
Steven Dake	fef259970a	Remove cancel token retransmit timeout. git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3012 fd59a12c-fef9-0310-b244-a6a79926bd2f	2010-08-03 17:31:33 +00:00
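
Note on commit e48ddf99a6: the early return described in that message can be reproduced with a few lines of C. This is only a sketch under stated assumptions, not the totemsrp code: `ARU_RANGE_LIMIT` and `aru_update_skipped` are hypothetical names, the 1024 limit is the figure quoted in the commit message, and sequence-number wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the 1024 limit quoted in the commit message. */
#define ARU_RANGE_LIMIT 1024u

/*
 * Returns true when the range check described in the commit message
 * would make update_aru return without advancing my_aru.
 * Assumption: no sequence-number wraparound, so high >= aru.
 */
static bool aru_update_skipped(uint32_t my_aru, uint32_t my_high_seq_received)
{
    assert(my_high_seq_received >= my_aru);
    uint32_t range = my_high_seq_received - my_aru;
    return range > ARU_RANGE_LIMIT;
}
```

With the values reported for node C (my_aru 0x805, my_high_seq_received 0xC2C) the range is 1063, so the check skips the update; a gap of exactly 1024 would still be processed.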

218 Commits
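
Note on commit 6365150ae2: the crash described there comes from using a possibly negative `char` as a table index. The sketch below shows one common way to guard such a lookup; it is not the patch itself. `HANDLER_COUNT` and `message_type_valid` are hypothetical names, and the assumption that the type is carried in the first byte of the message is made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical number of entries in a message handler table. */
#define HANDLER_COUNT 6

/*
 * Returns true if the first byte of msg is a usable handler index.
 * Converting to unsigned char first means a byte such as 0xF3 is seen
 * as 243 rather than as a negative value on platforms where plain
 * char is signed, so a single upper-bound check is enough.
 */
static bool message_type_valid(const char *msg, size_t len)
{
    assert(msg != NULL);
    if (len < 1) {
        return false;
    }
    unsigned char type = (unsigned char)msg[0];
    return type < HANDLER_COUNT;
}
```

A caller would run this check before indexing the handler table and drop the packet when it fails, which is the situation the commit message associates with nodes that disagree about encryption.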