mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-07-27 07:24:01 +00:00

Author	SHA1	Message	Date
Jan Friesse	9dfc7d0040	Add cmapctl tool corosync-cmapctl is direct replacement for corosync-objctl Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 09:19:18 +01:00
Jan Friesse	a8fb7c07e2	Move cfg service to use icmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 09:19:18 +01:00
Jan Friesse	8dc460bdfb	Move votequorum to use icmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 09:19:18 +01:00
Jan Friesse	a9e1fc3877	Move testquorum to icmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 09:19:18 +01:00
Jan Friesse	8a45e2b152	Move corosync core to use icmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 09:19:17 +01:00
Jan Friesse	b3c99977de	Add user library to use cmap service Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 09:19:17 +01:00
Jan Friesse	a2824073c7	Add cmap service Cmap service is application developer interface to icmap and it is direct replacement for confdb. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 09:19:17 +01:00
Jan Friesse	525e6a6ebe	Add icmap Icmap is replacement for objdb, based on libqb map (trie). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 09:19:17 +01:00
Angus Salkeld	c4498197b5	TODO: remove "message/queue size" todo's Signed-off-by: Angus Salkeld <asalkeld@redhat.com>	2011-12-15 10:53:46 +11:00
Angus Salkeld	7b02f176df	Check for the correct message size in totempg_groups_joined_reserve() Currently: - send_reserve() adds to the reserve - msg_count_send_ok() tests ((avail - totempg_reserved) > msg_count) So essentially we are checking to see if 2 * msg_count can fit in the q. So instead I am using byte_count_send_ok (size) to see if the message will fit then calling send_reserve() Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 10:43:11 +11:00
Angus Salkeld	2ba4ebe09e	Fix cpgbench (large message sizes) To allow async cpg messages of 1M we need to: 1) increase the totem queue size 4 times 2) align the critical level to one large message free There are a number of reasons for doing this: We can't let cpg_mcast_joined() fail because the user will not see it and will assume it has succeeded. The reason we are getting good performance is by providing a negative feedback loop from the totem q to the IPC/poll system. This relies on 4 q states low/med/high/crit. With messages of size 1M you now have a q of size one and now go from level low to crit instantly then back to low as messages are put on and taken off. I don't think this is the best behaviour. Having a q size of 4 allows the system to utilize the q better and gives us time to respond to changes in the q level. To effectively achieve flow control with a q of size 1 would require all the clients to request the space on the q like is done in totempg_groups_joined_reserve() but probably in shared memory. This would take quite a bit of re-work. Signed-off-by: Angus Salkeld <asalkeld@redhat.com>	2011-12-15 10:43:11 +11:00
Angus Salkeld	94b11502cb	LOG: get the logging to work from loaded quorum modules Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 10:10:54 +11:00
Angus Salkeld	5aa44cd20b	Tweak the increment in cpgbench so the message size gets to 1M Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-15 10:04:45 +11:00
Angus Salkeld	a748700cde	Be more flexible (correct) with flowcontrol. Many functions do not require flowcontrol and are two-way so they can get failures from corosync. Only cpg_mcast_joined() _really_ needs the current level of flowcontrol. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-14 12:03:42 +11:00
Fabio M. Di Nitto	f872241738	quorum-tools: add quorum monitoring option Reviewed-by: Steven Dake <sdake@redhat.com> Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2011-12-13 10:43:43 +01:00
Fabio M. Di Nitto	7d1570d052	quorum-tool: reduce amount of init/finalize Reviewed-by: Steven Dake <sdake@redhat.com> Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2011-12-13 10:42:34 +01:00
Fabio M. Di Nitto	57aa099b0b	quorum-tools: change internal get_quorum_type don't leak memory, better error reporting and improve status output when there is no quorum configured also fix some coding style based on review input Reviewed-by: Steven Dake <sdake@redhat.com> Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2011-12-13 10:20:31 +01:00
Fabio M. Di Nitto	c7f57614c2	quorum-tool: add return codes to show status -1 indicates an error communicating with corosync/quorum/votequorum service 0 node is not quorate 1 node is quorate also add more error reporting and a couple of missing calls to finalize Reviewed-by: Steven Dake <sdake@redhat.com> Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2011-12-13 10:13:27 +01:00
Fabio M. Di Nitto	1504ab84d1	quorum-tool: update copyright date Reviewed-by: Steven Dake <sdake@redhat.com> Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2011-12-13 10:12:53 +01:00
Fabio M. Di Nitto	2866387956	quorum-tools: fix options/help text Reviewed-by: Steven Dake <sdake@redhat.com> Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2011-12-13 10:12:23 +01:00
Angus Salkeld	c317ee433f	LOG: Fix a crash in the shutdown. Reviewed-by: Steven Dake <sdake@redhat.com> Signed-off-by: Angus Salkeld <asalkeld@redhat.com>	2011-12-13 15:00:42 +11:00
Steven Dake	620b86d1ad	Change mailing list in configure.ac to discuss@lists.corosync.org Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-12-06 09:01:27 -07:00
Steven Dake	c6701adb14	Add silent rules to corosync make to more easily find warnings Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-12-06 09:00:32 -07:00
Jan Friesse	e5952176d6	hdb* functions already return -error value So it's wrong to define hdb_error_to_cs and pass it a -error value, because this creates --error = error = CS_OK. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-12-01 08:52:32 +01:00
Yunkai Zhang	232ac5a7fe	Correct nodeid in memb_state_commit_token_send function Signed-off-by: Yunkai Zhang <qiushu.zyk@taobao.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-30 11:21:22 -07:00
Steven Dake	e48ddf99a6	From: Yunkai Zhang: Today, I observed one of the reasons that corosync runs into the FAILED TO RECEIVE state. There were five nodes (A,B,C,D,E) in my testing, and I limited the UDP transmission rate of node C with iptables commands: iptables -A INPUT -i eth0 -p udp -m limit --limit 10000/s --limit-burst 1 -j ACCEPT iptables -A INPUT -i eth0 -p udp -j DROP After one hour, node C had been missing some MCAST messages; its state was as follows: ==state of C node== my_aru:0x805 my_high_seq_received:0xC2C my_aru_count:7 =>received MCAST message with seq:806 from node B =>enter message_handler_mcast =>add this message to regular_sort_queue ... =>enter update_aru function => range = (my_high_seq_received - my_aru) = (0xC2C - 0x805) = 1063 => if range>1024, do nothing and return directly. ==END== According to this logic, once (my_high_seq_received-my_aru)>1024, my_aru will not be updated even though corosync can receive MCAST messages retransmitted by other nodes. But at that time, my_aru_count was only 7. So corosync on node C would stay in this state until my_aru_count increased to fail_to_recv_const (the default value is 2500). This is a long time for corosync, and it is wasted. To solve this issue, maybe we can enlarge the range condition in the update_aru function? Or we can just ignore the range check, which seems harmless, because we already use fail_to_recv_const to control this. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-11-29 10:59:11 -07:00
Yunkai Zhang	19652c3d7c	Correct nodeid of token when we retransmit it Although an incorrect nodeid will not affect the program's logic, it is confusing when we add logs to record the transmission path of the token in debug mode. Signed-off-by: Yunkai Zhang <qiushu.zyk@taobao.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-28 05:56:28 -07:00
Yunkai Zhang	d991400372	Fixed bug when corosync receives JoinMSG in OPERATIONAL state According to the totem protocol, a node should enter GATHER state when it receives a JoinMSG in OPERATIONAL state. If we discard it in OPERATIONAL state, the nodes sending this JoinMSG cannot receive a response until other nodes reach the token lost timeout. This bug causes nodes that have entered GATHER state to spend more time rejoining the ring, which makes nodes reach the token expired timeout more easily. Signed-off-by: Yunkai Zhang <qiushu.zyk@taobao.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-26 08:52:26 -07:00
Steven Dake	25a6701e9d	Remove unchecked return error in test code Signed-off-by: Steven Dake <sdake@redhat.com>	2011-11-26 08:50:25 -07:00
Steven Dake	b7207138d6	Remove unused variable from latest cpg work that merged all config changes Signed-off-by: Steven Dake <sdake@redhat.com>	2011-11-26 08:50:25 -07:00
Steven Dake	bdd03a4bb7	Remove unchecked return problem in test code Signed-off-by: Steven Dake <sdake@redhat.com>	2011-11-26 08:50:25 -07:00
Steven Dake	aa76b79f24	Remove unchecked return warning Signed-off-by: Steven Dake <sdake@redhat.com>	2011-11-26 08:50:25 -07:00
Steven Dake	bcbb7e028c	Remove use of NULL in test agent Signed-off-by: Steven Dake <sdake@redhat.com>	2011-11-26 08:50:25 -07:00
Steven Dake	f601c73436	Remove unchecked return error Signed-off-by: Steven Dake <sdake@redhat.com>	2011-11-26 08:50:25 -07:00
Steven Dake	73a0adf10e	Correct typing in memory_map function in lib/cpg.c Signed-off-by: Steven Dake <sdake@redhat.com>	2011-11-26 08:50:25 -07:00
Angus Salkeld	0290297b42	Fix last warnings so we can build with --enable-fatal-warnings Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-15 09:42:26 +11:00
Angus Salkeld	92ca91fa66	TOTEM: better clean up on exit Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-11 09:08:04 +11:00
Angus Salkeld	a6729003a6	OBJDB: free up resources on exit Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-11 09:06:50 +11:00
Angus Salkeld	0fc51c40fd	LOG: cleanup logging resources at exit Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-11 09:05:08 +11:00
Angus Salkeld	21f1008be8	Clean up the poll loop resources on exit Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-11 08:13:08 +11:00
Angus Salkeld	f5a31e55a2	Add calls to missing object_find_destroy() to fix mem leaks Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-11 08:12:13 +11:00
Angus Salkeld	390391acba	Free mem allocated by getaddrinfo Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-11-11 08:11:17 +11:00
Yunkai Zhang	43bead3645	Send one confchg event per CPG group to CPG client We found that sheepdog receives more than one confchg msg when a network partition occurs. For example, suppose the cluster has 4 nodes: N1, N2, N3, N4, and they form a single ring initially. After a while, a network partition occurs and the single ring divides into two sub-rings: ring(N1, N2, N3) and ring(N4). The sheepdog in ring(N4) will receive the following confchg messages in turn: memb: N2,N3,N4 Left:N1 Joined:null memb: N3,N4 Left:N2 Joined:null memb: N4 Left:N3 Joined:null This patch fixes this bug, and the client will receive only one confchg event in this case: memb: N4 Left:N1,N2,N3 Joined:null Signed-off-by: Yunkai Zhang <qiushu.zyk@taobao.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-10-31 17:05:38 +11:00
Anton Jouline	a358791d5b	Adding support for dynamic membership with UDPU transport Add a new object called totem.interface.dynamic to allow creation/deletion of new child objects using the corosync-objctl utility: to add new member: linux# corosync-objctl -c totem.interface.dynamic.10-211-55-12 to delete an existing member: linux# corosync-objctl -d totem.interface.dynamic.10-211-55-12 Corosync will dynamically add these members to the configuration and start communicating with those nodes. Signed-off-by: Anton Jouline <anton.jouline@cbsinteractive.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-10-27 23:52:16 -07:00
Jan Friesse	783dd4e553	Remove unused buf and len variables in log_printf Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-10-25 16:29:10 +02:00
Jan Friesse	26db8b21b2	api: Change some of totempg definitions Recent changes in patch "Get rid of hdb usage in totempg.h interface" caused an incompatibility between corosync API and totempg. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-10-24 17:43:36 +02:00
Jan Friesse	87821f52a6	totemmrp: Allow compilation without warnings Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-10-24 17:43:32 +02:00
Jan Friesse	1711aea72f	Allow compilation of totempg without warnings Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-10-24 17:43:28 +02:00
Jan Friesse	99bbf4cc78	logsys.h: Properly define LEAVE macro Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-10-24 14:24:52 +02:00
Angus Salkeld	2cf37d4063	Set the size of the blackbox to the size on flatiron Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-10-22 17:42:53 +11:00


2844 Commits