mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-10-31 12:37:19 +00:00

Author	SHA1	Message	Date
Jan Friesse	90da72cd7f	cfg: Check interface status and name length Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	335da1ecfd	cfg: Check number of interfaces Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	5dc3fc4bda	totemrrp: Make status string shorter Status string should be same length as needed for cfg ringstatusget function. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:11 +02:00
Jan Friesse	845a625908	totem: Don't leak instance variable on crypto fail Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:25 +02:00
Jan Friesse	93286a344e	totemudpu: Handle fd leak in totemudpu Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:21 +02:00
Jan Friesse	421de34972	totemconfig: Check length of rrp_mode string Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:15 +02:00
Jan Friesse	675da75759	coroparse: Ensure that config items fits into cmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:05 +02:00
Jan Friesse	e094ab2e2c	votequorum: Prevent leak in qdevice_is_configured Also LEAVE from function is now properly logged. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-17 15:47:27 +02:00
Jan Friesse	4310d84e4d	Initialize error variable in ykd_init Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:57 +02:00
Jan Friesse	92b900da67	Initialize node_found in nodelist_to_interface fun Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:57 +02:00
Jan Friesse	903e02875d	Initialize item in cmap_mcast_send Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	f198955644	votequorum: Assert sender nodeid is known Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	56ee492471	Check result of logsys_subsys_create Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	d5d4cdb972	Check logsys_format_set result in logsys setup Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	90f8a68a2b	Use proper totem_ip_address size in memset Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	df6b87f293	Free icmap strings in logconfig Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	ce9c69da03	Properly break MAIN_CP_CB_DATA_STATE_QDEVICE state Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	d5d3fb4d45	Do not dereference format_buffer when it's NULL Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	96a89a0085	Check icmap str get for clustername Even this check is really not needed, it's nice to have it and on fault ensure that cluster_name is really NULL. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	966f461b69	Properly check result of stat func in coroparse Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	e684e4ca6f	Remove unnecessary mmap in cpg Code for zero-copy in cpg does following mmaps: - Mmap anonymous, private memory to some address (-> malloc) - Mmap shared memory of fd to address returned by first mmap (effectively shadows first mapping) This is not necessary and only one mapping is needed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-05-21 14:46:15 +02:00
Jan Friesse	8429d01389	Detect big scheduling pauses Add poll timer scheduler to be called 3 times per token timeout. If poll timer was not called for more than 0.8 * token timeout, it means corosync process was not scheduled and either token_timeout should be increased or load should be reduced (useful for VM, where host is overcommitted so VM is not scheduled as expected). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-04-08 09:58:42 +02:00
Jan Friesse	86b074dc1a	Support for numerical uid/gid Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-04-02 09:32:10 +02:00
Andrei Belov	005e7fd3b9	Improved POSIX-compliant handling of getpwnam_r() and getgrnam_r(). Signed-off-by: Andrei Belov <defanator@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-03-28 16:32:53 +01:00
Jan Friesse	0e3d1a9c51	totempg: Make iov_delv local variable Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-03-21 14:24:23 +01:00
Xia Li	ca6051e80c	Convert the nodeid byte order to be aligned with network order When using corosync with clear_node_high_bit setting to yes, the highest bit is cleared. When all the cluster nodes are in one subnet, we probably configure the IP addresses as follows: node1: 147.2.207.64 node2: 147.2.207.192 If the byte order of the nodeid is little endian, wiping off the highest bit will make the two nodes have the same nodeid! This patch fixes this by converting the nodeid to network order. Signed-off-by: Xia Li <xli@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-03-19 16:39:59 +01:00
Jeremy Fitzhardinge	52f88d04ea	Handle ERANGE from getpwnam_r / getgrnam_r These functions return ERANGE if the supplied buffer is too small to fit a line. Try doubling the buffer a few times until it works.	2013-03-07 16:59:51 -08:00
Jan Friesse	66172a501a	Handle unexpected closing brace in config file If configuration file contains closing brace before opening brace at top level, configuration parsing is stopped and file is not completely parsed. Solution is to detect extra closing brace and display error. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-01-31 16:11:22 +01:00
Jan Friesse	663489d277	Handle colon in configuration file If colon was entered as part of value on end of value, it is deleted. This makes impossible to enter (legal) IPv6 address ending with :: (like fed0::). Also when line contains both brace and colon, it is parsed twice (first as key = value and second as start of section). This is handled by continue in if section. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-01-31 16:11:18 +01:00
Fabio M. Di Nitto	98d0245c7e	votequorum: port to sync API (take 2) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-31 15:32:07 +01:00
Fabio M. Di Nitto	55dc09ea23	totemconfig: enforce hmac config when crypto is enabled Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 12:31:47 +01:00
Kazunori INOUE	1ad21e384e	log: move Corosync started log messages "Corosync Cluster Engine ... started" message is shown after logsys is fully configured. Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 11:52:26 +01:00
Fabio M. Di Nitto	ed6bca3293	crypto: drop < 2.3 protocols and onwire compat Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 11:49:32 +01:00
Fabio M. Di Nitto	b3f456a8ce	totemcrypto: fix hmac key initialization Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 11:23:32 +01:00
Jan Friesse	6127be1806	Move qb_loop creation after daemonization Creating qb_loop before daemonization is not problem for poll or epoll type loops, but it's problem for kqueue, because kqueue is not shared in child with parent after fork. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-12-12 11:47:42 +01:00
Jan Friesse	dd588d004e	Add option to specify ip version Default is ipv4. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-12-03 14:02:32 +01:00
Jan Friesse	92e0f9c7bb	Add waiting_trans_ack also to fragmentation layer Patch for support waiting_trans_ack may fail if there is synchronization happening between delivery of fragmented message. In such situation, fragmentation layer is waiting for message with correct number, but it will never arrive. Solution is to handle (callback) change of waiting_trans_ack and use different queue. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:48:12 +01:00
Jan Friesse	2d4e7bebb5	Handle segfault in backlog_get If instance->memb_state is not OPERATION or RECOVERY, we were passing NULL to cs_queue_used call. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:48:07 +01:00
Steven Dake	402638929e	Fix problem with sync operations under very rare circumstances This patch creates a special message queue for synchronization messages. This prevents a situation in which messages are queued in the new_message_queue but have not yet been originated from corrupting the synchronization process. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:47:57 +01:00
Fabio M. Di Nitto	220d659b38	totemcrypto: implement crypto packet format 2.2 and crypto_compat: config opt Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-22 11:13:30 +01:00
Evgeny Barskiy	e3f615b4a0	corosync to start in infiniband + redundant ring active/passive mode Corosync now works with infiniband transport in any redundant ring mode Signed-off-by: Evgeny Barskiy <barskiy@rts.ru> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-21 10:28:57 +01:00
Fabio M. Di Nitto	ed63c812af	votequorum: fix handling of expected_votes/votes changes from cmapctl and allow natural selection to take place.... Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-20 15:45:57 +01:00
Jan Friesse	3cd4f9a1f5	Add support for selecting IPC type Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-08 12:16:11 +01:00
Jan Friesse	89809ec80e	Check successful initialization of IPC Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-08 12:16:06 +01:00
Angus Salkeld	abc3b6abed	Try reduce the number of sprintf's Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-07 21:28:31 +11:00
Jan Friesse	d4db2ea535	If failed_to_recv is set, consensus can be empty If failed_to_recv is set (node detect itself not able to receive message), we can end up with assert, because my_failed_list and my_member_list are same list. This is happening because we are not following specification and we allow to mark node itself as failed. Because if failed_to_recv is set and we reached consensus across nodes, single node membership is created (ignoring both fail list and member_list), we can skip assert. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-05 15:16:25 +01:00
Jacek Konieczny	07832748f2	link libtotem_pg to libqb The libtotem_pg library uses symbols from libqb, so it should be explicitly linked with it. This doesn't cause problems for corosync binary itself, as it is linked to both libraries, but can cause problems if anything else links to libtotem_pg.so and automated checkers can show this as a library problem. Signed-off-by: Jacek Konieczny <jajcus@jajcus.net> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-10-29 16:49:19 +01:00
Jan Friesse	8a9869eeec	Correctly check if service was unloaded my_processing_idx is pointer to received service list, instead of global service number. If we check state of service we should use service_id instead of my_processing_idx. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-17 15:06:36 +02:00
Jan Friesse	c165bf4f51	Define AES_*_KEY_LENGTH if not defined Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-17 15:06:32 +02:00
Fabio M. Di Nitto	20c5871525	totemcrypto: add support for different encryption methods (backport from nsscrypto kronosnet code) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-10-15 10:00:16 +02:00
Jan Friesse	fc50443f5f	Make totemiba compile again Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2012-10-08 17:44:09 +02:00
Jan Friesse	b7635ab9f7	Return back "Totem is unable to form..." message This patch returns back SUBJ functionality. It rely on fact, that sendmsg will return error, and if such error is returned for long time, it's probably because of firewall. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:35 +02:00
Jan Friesse	d042671369	Move "Totem is unable to form..." message to main Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:33 +02:00
Jan Friesse	6c3b337b37	Use unix socket for local multicast loop Instead of rely on multicast loop functionality of kernel, we now use unix socket created by socketpair to deliver multicast messages to local node. This handles problems with improperly configured local firewall. So if output/input to/from ethernet interface is blocked, node is still able to create single node membership. Dark side of the patch is fact, that membership is always created, so "Totem is unable to form a cluster..." will never appear (same applies to continuous_gather key). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:30 +02:00
Jan Friesse	4354ed6ecb	Store config_version of other nodes Config version of other nodes is stored in runtime.totem.pg.mrp.srp.members.NODEID.config_version key. Also when local config_version is changed, all nodes are informed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-03 11:26:35 +02:00
Jan Friesse	d2a85593c4	Support for check of config version on start Config version is requested from other nodes. If our config version is not 0 and differs from highest config version of other nodes, corosync quits. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:32 +02:00
Jan Friesse	73b0fe688d	Make cmap_mcast_send return correct error code Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:28 +02:00
Jan Friesse	a273be58ae	Make service_build contain correct number of msgs Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:24 +02:00
Jan Friesse	3c019f2130	Align items in cmap_mcast_send Aligning function (kernel style magic) MAR_ALIGN_UP is used for aligning of items in req_exec_cmap_mcast message. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:20 +02:00
Jan Friesse	2214a60639	Support for flt and dbl in mcast_endian_convert Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:17 +02:00
Jan Friesse	cbaa2977ae	Add support for sending cmap values to wire Function is little more complex, but it is designed to be used in future without big changes. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:07 +02:00
Jan Friesse	6825c1d39b	Parse config_version as 64-bit uint Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:02 +02:00
Jan Friesse	373ded0652	Don't access invalid mem in totemconfig interfaces When ringnumber in config file was set to value bigger or equal to INTERFACE_MAX, we are using this big value as index to totemconfig interfaces array, resulting to access to invalid memory and segfault. Instead of that, ringnumber is now checked and proper error message is printed if value is too big. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-27 13:54:39 +02:00
Jan Friesse	5ce59f49ba	Move some totem and cpg messages to trace level Messages which are flow messages, rather than lifecycle are now logged in trace level. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-19 11:03:16 +02:00
Jan Friesse	5717655019	Add support for debug level trace in config file Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-19 11:03:10 +02:00
Fabio M. Di Nitto	8a2e936381	icmap: fix mapping return codes Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-09-12 08:18:50 +02:00
Fabio M. Di Nitto	bb5946babb	build: clean AM_CFLAGS and AM_CPPFLAGS usage around also set common include dirs. fPIC and DPIC are automatically detected and added as required by libtool. We don't need to carry it around. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-09-07 09:04:07 +02:00
Fabio M. Di Nitto	fa92e4068a	totemconfig: drop unnecessary includes Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-09-07 09:04:06 +02:00
Jan Friesse	7fe307383f	Remove newline in logsys_config_file_set_unlocked Also remove commented leftover. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-06 09:39:18 +02:00
Jan Friesse	bd30fe3dcd	Make threaded log work Previous two log related patches tried to solve few problems with threaded libqb, but introduced regressions when running in daemon mode. This patch takes bigger hammer and hopefully solves all problems. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-06 09:39:15 +02:00
Jan Friesse	bd138085ca	Ensure qb_log thread is started Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-05 09:10:57 +02:00
Jan Friesse	7026fffdf9	Ensure no garbage left in msghdr for sendmsg call Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:37 +02:00
Jan Friesse	120b7fac7b	Use uint8_t in setsockopt when needed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:35 +02:00
Jan Friesse	ee59122ad7	OpenBSD getifaddrs returns netmask without sa_family So we relax netmask check and set to same family as ipaddr if needed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:33 +02:00
Jan Friesse	932829bfca	Add header files when needed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:31 +02:00
Angus Salkeld	0e86aa4ac6	Fix cpg_membership_get() The wrong size was getting set in exec/cpg.c Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-31 14:48:35 +10:00
Fabio M. Di Nitto	6d28d51284	build: bring SOLARIS up to the same standard as other OSes drop all SOLARIS specific ifdefs and replace them with feature checks Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	a0a14c68e3	totemip: clean up headers a lot more getifaddrs is always available if there is freeifaddr. all BSD and openindiana have it defined in ifaddr.h. drop a bunch of obsoleted headers. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	18929089d1	build: drop MAP_ANONYMOUS check from configure define it only in case it's not there Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	5c5db34e56	build: make libstatgrab the de facto default for monitoring service drop duplicate code and remove the last COROSYNC_LINUX ifdefs around Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	a1c154e6fa	build: use MADV_NOSYNC only when it's defined so far only FreeBSD defines it. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	6098ef2c14	build: make exec/totemip os detection free Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Jan Friesse	dbe0e9e382	Log: Use threaded mode for syslog and file log Syslog and file log can block, so it's good idea to use libqb threaded mode to prevent it. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-30 09:46:48 +02:00
Jan Friesse	9f6e6a990b	Use native IPC mechanism Instead of hardcoded SHM, we should use NATIVE, so libqb is able to find out what is best/available mechanism. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-30 09:45:46 +02:00
Fabio M. Di Nitto	427fdd4558	build: fix build on openindiana 151a openindiana toolchain is rather messy. This is the first cut only Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	9f7181b533	build: drop more dlopen leftovers from dinosaur era... Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	dd4d7f86e6	build: make monitoring optional in corosync exec Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	8f96347100	build: respect watchdog conditional when building corosync exec Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	76d18f964d	build: use libtool for linking Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:48 +02:00
Tim Beale	6129ce5b59	Remove redundant default-config code We were checking 'hold_timeout == 0' in 3 different places when setting up the default totem config. Signed-off-by: Tim Beale <tlbeale@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-21 14:26:50 +02:00
Tim Beale	77ea036c72	Remove unused structure Nowhere in the corosync codebase references this structure. Signed-off-by: Tim Beale <tlbeale@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-21 14:11:48 +02:00
Jan Friesse	397cc89f01	Make logging of WD and MON service correct MON and WD services are using fsm.h, which calls log function. Such messages were incorrectly logged as SERV (or random service) which made debugging hard. Solution is to add callback parameter to fsm functions and do actual logging there. Handling of failure states is also done in callback now. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-16 14:45:15 +02:00
Jan Friesse	e3cef955bf	IPC: Call lib function only when it's possible send_ok was incorrectly tested as boolean, even it's errno type variable. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:52 +02:00
Jan Friesse	8014b2facf	Close sockets after deleting from poll This will remove (non critical) debug message from QB about polling on closed FD. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:44 +02:00
Jan Friesse	2d10e2bbea	cpg: Check input param name_t length IPC is using buffer of CS_MAX_NAME_LENGTH for name. If user calls function with longer string, such string can be passed to service incomplete. Solution is to not allow string larger than CS_MAX_NAME_LENGTH and return error. Same applies to cpg service. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:35 +02:00
Jan Friesse	6f6988afff	Handle sync and service unload correctly When sync started and service is unloaded in meantime, it can happen that sync will call sync_* functions on unloaded service. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:26 +02:00
Jan Friesse	dfe34d330c	service: remove leftovers from mt corosync Multithreaded corosync used to use many ugly workarounds. One of them is shutdown process, where we had to solve problem with two locks. This was solved by scheduling jobs between service exit_fn call and actual service unload. Sadly this can cause to receive message from other node in that meantime causing corosync to segfault on exit. Because corosync is now single threaded, we don't need such hacks any longer. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:16 +02:00
Fabio M. Di Nitto	423e37b4ca	votequorum: change init/clean up to deal with exit races Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-08 09:03:57 +02:00
Fabio M. Di Nitto	50308cb08d	quorumtool: make output more meaningful there is really no point to have a per node view of (vote)quorum since all the info are always there. drop the -n option for status/display nodes and improve the output to provide a full cluster view at any given time. Old format: [root@fedora-master-node2 ~]# corosync-quorumtool -s Quorum information ------------------ Date: Mon Aug 6 10:22:27 2012 Quorum provider: corosync_votequorum Nodes: 2 Ring ID: 8 Quorate: Yes Votequorum information ---------------------- Node ID: 3254954176 Node state: Member Node votes: 1 Qdevice votes: 1 Expected votes: 3 Highest expected: 3 Total votes: 3 Quorum: 2 Flags: Quorate Qdevice Membership information ---------------------- Nodeid Votes Name 3238176960 1 fedora-master-node1.int.fabbione.net 3254954176 1 fedora-master-node2.int.fabbione.net 0 1 QDEVICE (Alive/Voting/NoMasterWins) New format: [root@fedora-master-node1 tools]# ./corosync-quorumtool -s Quorum information ------------------ Date: Mon Aug 6 15:50:03 2012 Quorum provider: corosync_votequorum Nodes: 2 Ring ID: 48 Quorate: Yes Votequorum information ---------------------- Expected votes: 3 Highest expected: 3 Total votes: 3 Quorum: 2 Flags: Quorate Qdevice Membership information ---------------------- Nodeid Votes Qdevice Name 3238176960 1 A,V,MW fedora-master-node1.int.fabbione.net 3254954176 1 NR fedora-master-node2.int.fabbione.net 0 1 QDEVICE Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	6b270c6cd1	votequorum: make the last QDEVICE define name consistent with everything else Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	302545e112	votequorum: add missing return call Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	379b203677	votequorum: make master_wins check stricter Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	9c50f33509	votequorum: add ENTER/LEAVE for consistency Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	2f369e7039	votequorum: delegate qdevice_master_wins setting to qdevice votequorum has no business to decide if master_wins setting is correct or not. only the qdevice can decide and should set the value for votequorum. Logic is: - user requests master_wins from config - corosync starts - qdevice starts - qdevice reads cmap values / register with votequorum - qdevice decides if the node can support master_wins or not and tells votequorum - at this point votequorum can check if an unquorate node is part of the master_wins partition it is the qdevice responsibility to keep that value up to date in votequorum and the value can be changed at runtime. this commit also exchange per node master_wins information to lay down the infrastructure to verify discrepancies in node config for master_wins (coming next on this channel). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	cc7bfeb462	votequorum: drop votequorum_qdevice_getinfo and collapse data into getinfo it's really pointless to have basically a duplicated API call to transfer one value and one name. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	65a6c29a31	votequorum: external defines should all be prefixed with VOTEQUORUM_ Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	2a37b56c49	votequorum: drop _FLAG_ from defines those are all info flags.. it's redundant and inconsistent Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	3416eacbec	votequorum: fix define name to match reality Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	86dd11b28e	qdevice: implement master_wins partition in previous incarnation of qdisk + cman, master_wins was restricted to 2 node only. In this new version it is possible to use master_wins for any cluster size. Let's assume a 4 node cluster. Each node votes 1, qdevice votes 3. node 1 becomes qdevice master node 2/3/4 no In case of a split (let's assume 2/2): partition 1: {4, 1} partition 2: {1, 1} node 2 in partition 1 would normally be unquorate, leaving effectively only node 1 active. master_wins allows node 2 to recognize to be part of a quorate partition (since node1 is broadcasting that qdevice is voting) and retain quorum. node1 has never lost quorate status since qdevice is voting there. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	aa295be834	votequorum: fix flag check for qdevice votes propagation and cleanup similar code to make it more readable Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	2dae49e54a	votequorum: remove last instance of state and rename it to cast_vote also align naming of vote to cast_vote for info calls Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	3fed1af077	votequorum: several major bug fixes and code cleanup - add a protection check to avoid spurious messages on membership change - greatly simplify processing of nodeinfo, since the only data that we send for qdevice over nodeinfo is the number of votes - fix a flag check to trigger quorum calculation that would leave a cluster unquorate under certain conditions Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	62659dbb21	votequorum: move to the new flag structure simplify different code path as checks are simpler, separate ALIVE and CAST_VOTE Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	c9e207ec92	votequorum: simplify getinfo data and protect against call against quorum node Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	f2b25936e5	votequorum: use REGISTERED flag consistently Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	0bcb4cddcc	votequorum: simplify internal qdevice_getinfo function as data are moving around we can drop lots of special cases Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	43d1439600	votequorum: add qdevice CAST_VOTE status/flag this is a preparation commit for the next changes. right now it is no more than an alias to ALIVE. CAST_VOTE is required to support master/slave feature from qdevice. Effectively a quorum device can be: Not registered / registered (connected to API but nothing else is happening) if registered: Not alive / alive (quorum device is petting the API via poll and timer is running) if alive: Not voting (slave) / voting (master) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto	987e26f8d1	votequorum: rename NODE_FLAGS_QDEVICE_STATE to NODE_FLAGS_QDEVICE_ALIVE STATE is confusing and overloaded term in votequorum as it's used for nodes and other bits. make the name unique and ALIVE means that the qdevice is heartbeating to votequorum. improve display of the status in tools and tests. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto	4621a6cd02	votequorum: rename NODE_FLAGS_QDEVICE to NODE_FLAGS_QDEVICE_REGISTERED make the flag name explicit Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Jan Friesse	fed7fc23e1	Don't call sync_* funcs for unloaded services When service is unloaded, sync shouldn't call sync_init|process|activate and abort functions. It happens very rarely, but in process of unloading all services, totem can recreate membership and bad things can happen (service is unloaded, so there may be access to already freed memory, ...) Solution is to fetch services sync handlers every time when we are building service list instead of using precreated one. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-08-02 09:34:58 +02:00
Jan Friesse	9fb7979370	Introduce SERVICES_COUNT_MAX macro Sync/service was using maximal number of services either in numeric form (magic constant) or inconsistently, this means using SERVICE_HANDLER_MAXIMUM_COUNT which means maximal number of handlers. New macro solves this. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-08-02 09:32:05 +02:00
Jan Friesse	537bf56fcc	cpg: Be more verbose for procjoin message Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-07-30 10:22:16 +02:00
Jan Friesse	04dac3ff5d	Correctly free state string in wd Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-07-12 15:53:04 +02:00
Jan Friesse	e4d75d1ab3	Revert "Free state variable allocated in wd_resource_state_is_ok" This reverts commit `01c63ca17c`.	2012-07-11 17:04:41 +02:00
Jan Friesse	a966506c1e	cpg: Enhance downlist selection algorithm Let's say we have 2 nodes: - node 2 is paused - node 1 create membership (one node) - node 2 is unpaused Result is that node 1 downlist is selected, so it means that from node 2 point of view, node 1 was never down. Patch solves situation by adding additional check for largest previous membership. So current tests are: 1) largest (previous #nodes - #nodes known to have left) 2) (then) largest previous membership 3) (and last as a tie-breaker) node with smallest nodeid Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:42 +02:00
Jan Friesse	f3457c5d49	cpg: Print cpg name to debug informations In downlist and joinlist debug output group was printed in nonsense format of integer to pointer to array. Now it's printed by full name. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:39 +02:00
Jan Friesse	35446d6bcc	cpg: Process join list after downlists let's say following situation will happen: - we have 3 nodes - on wire messages looks like D1,J1,D2,J2,D3,J3 (D is downlist, J is joinlist) - let's say, D1 and D3 contains node 2 - it means that J2 is applied, but right after that, D1 (or D3) is applied what means, node 2 is again considered down It's solved by collecting joinlists and apply them after downlist, so order is: - apply best matching downlist - apply all joinlists Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:35 +02:00
Jan Friesse	816d7687b0	cpg: Never choose downlist with localnode Test scenario is as follows: - node 1, node 2 - node 1 is paused - node 2 sees node 1 dead - node 1 unpaused - node 1 and 2 both choose same downlist message which includes node 2 -> node 2 is effectively disconnected Patch includes additional test if left_node is localnode. If so, such downlist is ignored. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:32 +02:00
Jerome FLESCH	99faa3b864	When flushing, discard only memb_join messages Patch solves problem when 1 ring out of 2 went up/down quite often. The simplest setup to reproduce bug is following: - 2 VMs, connected by 2 network interfaces - OS: Linux - On one of the VMs, a test program sending some CPG messages (see the script "test_corosync.sh" joined to this mail for example) Here are the Corosync logs we get when we do this setup: Jun 06 16:23:40 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. Jun 06 16:23:40 corosync [CPG ] chosen downlist: sender r(0) ip(192.168.56.104) r(1) ip(192.168.57.104) ; members(old:1 left:0) Jun 06 16:23:40 corosync [MAIN ] Completed service synchronization, ready to provide service. Jun 06 16:24:37 corosync [TOTEM ] Marking ringid 1 interface 192.168.57.105 FAULTY Jun 06 16:24:38 corosync [TOTEM ] Automatically recovered ring 1 Jun 06 16:25:33 corosync [TOTEM ] Marking ringid 1 interface 192.168.57.105 FAULTY Jun 06 16:25:34 corosync [TOTEM ] Automatically recovered ring 1 Jun 06 16:26:35 corosync [TOTEM ] Marking ringid 1 interface 192.168.57.105 FAULTY Jun 06 16:26:36 corosync [TOTEM ] Automatically recovered ring 1 (...) The second ring goes down about every 2 minutes and automatically back up right after. We spent some time looking for the commit that introduced this bug, and it appears it's due to the following one: Corosync 1.3.3 -> 1.3.4: `e27a58d93d` Corosync 1.4.1 -> 1.4.2: `be608c0502` Commit message: Ignore memb_join messages during flush operations I had a look at this commit, and it seems to me it's dropping too many packets: Because of this commit, while totemrrp_recv_flush() is called, Corosync drops memb_join packets, but also ORF tokens. In the end, it seems that sometimes, we drop so many of them that Corosync marks the ring as faulty. To fix that, only memb_join messages are dropped now. Signed-off-by: Jerome FLESCH <jerome.flesch@netasq.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-06-11 10:59:30 +02:00
Jan Friesse	2766e57ce5	Store fdata with timestamp and pid in name This should allow easier handling of various blackbox dumps. Original fdata name is now symlink to latest created dump. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-06-05 12:19:42 +02:00
Jan Friesse	7ce332a713	totemudpu: Bind sending sockets to bindto address Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-31 09:28:52 +02:00
Fabio M. Di Nitto	f008cf442c	rename mainconfig to logconfig Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-05-29 09:36:00 +02:00
Fabio M. Di Nitto	b283ef8f12	mainconfig: allow mainconfig logic to be used both internally and externally corosync logging configuration logic is rather complex and in order to make it simpler to reuse (at least within corosync/ tree) we need to be able to use both icmap and cmap. the patch might seem controversial, but it reduces heaps of code around from qdevices (coming next). It might be useful to consider moving this to a common shared library but there aren't enough users yet and a shared lib would force corosync to link with cmap (that we do not want at all costs) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-29 09:04:03 +02:00
Angus Salkeld	5831136c87	LOG: make sure the log target is enabled. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-05-29 14:02:42 +10:00
Angus Salkeld	e6b35bdb7a	LOG: handle closing unused logfiles better This fixes a bug where having a second log file will close the previous one. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-05-29 14:02:42 +10:00
Angus Salkeld	e6afc761fe	LOG: be more explicit about the qb file names else we can get messages being put in the wrong subsys. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-05-29 14:02:42 +10:00
Jan Friesse	2894f33c4f	totemip: Support bind to exact address Logic for binding now works in following way: - Try to find exact match - If not exact match is found, use first found network address This allows set concrete IP even if network settings contains two IPs on same network. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-24 14:01:12 +02:00
Jan Friesse	aaa575e091	totemip: insert items in correct order list_add_tail is used instead of list_add so ip addresses are inserted in same order as returned by getifaddrs. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-24 14:01:08 +02:00
Jan Friesse	0791f44c41	Include ringid in processor joined log message This should help correlate syslog entries with their blackbox counterparts. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Andrew Beekhof <andrew@beekhof.net>	2012-05-17 14:58:04 +02:00
Fabio M. Di Nitto	f2444effd0	icmap: don't leak memory when changing ro/rw status on a key Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto	1dcb2d43d9	icmap: fix a valgrind errors (pass 1) clean up a lot of allocated blocks at exit. those changes has no runtime effects, but it makes valgrind output a bit more useful by dropping over 700 errors/warnings to skip over every single run. there are still a few icmap related valgrind errors but those need some more complex and timeconsuming investigation. pre patch: ==21844== HEAP SUMMARY: ==21844== in use at exit: 1,229,321 bytes in 1,516 blocks ==21844== total heap usage: 7,191 allocs, 5,675 frees, 3,819,853 bytes allocated ==21844== LEAK SUMMARY: ==21844== definitely lost: 3,617 bytes in 11 blocks ==21844== indirectly lost: 21,960 bytes in 11 blocks ==21844== possibly lost: 1,080,101 bytes in 131 blocks ==21844== still reachable: 123,643 bytes in 1,363 blocks ==21844== suppressed: 0 bytes in 0 blocks ==21844== ERROR SUMMARY: 136 errors from 136 contexts (suppressed: 0 from 0) post patch: ==25793== HEAP SUMMARY: ==25793== in use at exit: 1,185,870 bytes in 808 blocks ==25793== total heap usage: 9,427 allocs, 8,619 frees, 4,156,841 bytes allocated ==25793== LEAK SUMMARY: ==25793== definitely lost: 3,697 bytes in 12 blocks ==25793== indirectly lost: 22,248 bytes in 13 blocks ==25793== possibly lost: 1,079,655 bytes in 113 blocks ==25793== still reachable: 80,270 bytes in 670 blocks ==25793== suppressed: 0 bytes in 0 blocks ==25793== ERROR SUMMARY: 119 errors from 119 contexts (suppressed: 0 from 0) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto	d2872aec70	crypto init: release *_slot resource after init Those are only used at init phase and we can free some memory for the system. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-04-20 10:57:16 +02:00
Fabio M. Di Nitto	b34c1e2870	ipcs: allow connections only after all services are ready this fixes a rather annoying race condition at startup where a client connects to corosync "too fast" before the service is ready to operate and client gets some random data during initialization phase. With this fix, we allow connections to ipc only after the main engine is operational and configured (and after the first totem transition). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-04-16 13:39:03 +02:00
Jan Friesse	f89d7b715f	Always allocate totemrrp stats array This prevents segfault when rrp mode is set with only one ring. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-04-10 09:08:42 +02:00
Jan Friesse	92ead6106f	Properly parse uidgid files Full path to key is now tested rather than key name only. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-04-10 09:08:36 +02:00
Fabio M. Di Nitto	cde4468581	totemcrypt: fix build warning (unused variable) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-27 12:06:46 +02:00
Fabio M. Di Nitto	4378915a33	totemcrypto: major code cleanup (no functional or onwire changes) - cleanup include list - reorder code and functions (crypto then hash) - split crypt/decrypt/hash functions - some micro optimizations by dropping a few memcpy - make the code more readable (better var names and buffers mapping) - improve exit paths on error (return codes and free) - store crypto header size instead of recalculating it per packet Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-27 11:43:07 +02:00
Jan Friesse	e925f42165	Make ifaces_get work with dynamic no_rings Commit which added number of addresses to srp_address structure didn't count with totemsrp_ifaces_get where whole structure was copied instead of addresses only. This is now fixed. Also to make API totempg forward compatible, size of interfaces array must be passed to ifaces_get like functions to prevent memory overwrite. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-26 11:54:26 +02:00
Jan Friesse	124ff4339c	Add no_addrs field in srp_addr structure This should allow us future change to dynamic number of rings without breaking wire compatibility. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-22 14:03:38 +01:00
Jan Friesse	7a0a39b949	Mark few more icmap keys as read only Also most of the key settings are now centralized in one function, so it's easier to audit. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-16 09:37:25 +01:00
Jan Friesse	e57b5b9e6d	crypto: Remove sha224 and add md5 hash SHA224 is not supported on RHEL6 and also it's kind of weird. Instead of that, md5 can now be configured. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-15 17:36:56 +01:00
Jan Friesse	3b7c2f0588	Update crypto_set API Also few leftovers from cfg is removed and version of totempg is increased to 5 to reflect all changes we made Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-15 17:33:53 +01:00
Fabio M. Di Nitto	c75153feb4	crypto: allocate padding in crypto_header while it might seem a waste of space by using 2 extra bytes in the crypto_config_header, it actually gives us the option to grow "unknown at this time" features without hopefully breaking onwire compat Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-15 12:55:11 +01:00
Fabio M. Di Nitto	4a2d503643	crypto: add new hashing methods and fix config defaults add support for sha224/256/384/512 change config defaults to match coroparse and totemconfig Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-15 10:55:32 +01:00
Fabio M. Di Nitto	737de4dbd4	crypto: change network packets and add dynamic crypto header/data The new network packet will look: struct crypto_config_header * that provides info on crypto/hashing hash_block[size based on hashing function] (if hash is selected) salt[SALT_SIZE] (if crypto is selected) ...data... and we kill the concept of crypto_security_header completely since values are now dynamic for hash_block_size. the reason why hash_block needs to be there, is because we do hash salt in case both hashing and crypto are selected. the crypto_config_header is totally transparent to totem and to any underlaying crypto functions. as we go cleaning, also use HASH_BLOCK_SIZE to generate hash_block. the input buffer and output buffer size are dependent on the algo used to hash. we can now determine the real header size and adjust net_mtu properly at startup. This will allow in future to use any algorithm since size is dynamic. some part of the code still needs some polishing to make it more readable (specially the mapping of pointers into the packet is still a bit obscure). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-14 15:57:01 +01:00
Fabio M. Di Nitto	c3f7d0ef3e	totem: don't send garbage onwire if we fail to crypt Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-14 15:30:40 +01:00
Fabio M. Di Nitto	452800c958	crypto: add crypto config to network data this add 2 bytes at the end of the each packet to propagate config info. in case there is a config mismatch packet must be rejected. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-14 12:32:10 +01:00
Fabio M. Di Nitto	0a6a6bbcfa	crypto: drop secauth and make crypto none work again keep totem.secauth config key for compatibility if the key is NOT set, crypto will default to aes256/sha1 if the key is set to "off", crypto is disabled. this reflects pretty much old behavior keywords totem.crypto_cipher and totem.crypto_hash can override secauth individually. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-14 11:28:36 +01:00
Jan Friesse	ab1675f0fe	Parse and use hash and crypto from config file Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-13 17:38:59 +01:00
Jan Friesse	cb97ed186a	Rename totemcrypto Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-13 17:38:46 +01:00
Fabio M. Di Nitto	55e8476697	crypto: mask the crypto operations from totem packet size management totem doesn't need to understand what crypto does. totem needs to be able to tell crypto: "those are data, play with them" and crypto needs to return: "here are your scrambled data and the new size" similar to decrypt/verify. this way we add enough dynamic within crypto to change header size and all at any given time (for different hash algorithm for example) without affecting on wire compat. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-13 15:50:58 +01:00
Jan Friesse	42a2f69e6f	onecrypt: move encryption code to crypto.c This will remove duplicity of code. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-13 12:23:13 +01:00
Jan Friesse	b5f7dcefeb	cfg: remove crypto_set Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-13 12:23:10 +01:00
Jan Friesse	8cdd2fc493	Remove libtomcrypt Tomcrypt in corosync is for long time not updated. Because we have support for libnss, libtomcrypt can be removed. Also few leftovers (AES is 256 bits, not 128, ...) are removed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-13 09:19:47 +01:00
Fabio M. Di Nitto	20a5289074	drop evs service there are several reasons for this: 1) evs is only partially implemented with no plans to complete it typedef enum { EVS_TYPE_UNORDERED, /* not implemented */ EVS_TYPE_FIFO, /* same as agreed */ EVS_TYPE_AGREED, EVS_TYPE_SAFE /* not implemented */ } evs_guarantee_t; 2) evs has no users in any upstream distribution and no search engine can find any other upstream using it. 3) the only reason (I was told) to carry around evs was that evs receives the full ring_id struct from totem. This is only partially correct because while the structures are prepared to carry around those data, they are never transmitted from corosync engine down the IPC line to the user. CPG ring_id contains the exact same information and it's actually less buggy (due to prototyping of the info). worst case scenario where a user really absolutely need libevs, it can be easily reimplemented as libcpg wrapper and avoid lots of code duplication. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-12 15:51:50 +01:00
Fabio M. Di Nitto	c00502a70a	build: drop another leftover from the past Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-12 07:13:04 +01:00
Fabio M. Di Nitto	fd79118110	build: drop last LCRSO references Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-12 07:12:20 +01:00
Fabio M. Di Nitto	eb3d49ef7d	pload: make it a test service and not a public one pload is a performance benchmark that measures the onwire speed of corosync. problem is that once pload has been executed, the cluster is basically dead. turn pload into a test tool, by removing corosync-pload tool and user library. cleanup pload code to make it more readable and drop lots of unnecessary stuff. add test/ploadstart tool that can configure and start pload via cmap calls. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-12 07:11:51 +01:00
Fabio M. Di Nitto	142ce8c3a1	totem: drop crypt_accept: concept/option this was another old onwire compat mode that is not useful any longer. we can safely move the new model by default. According to Honza (real hardware 1 node testing) there is no performance impact. My tests (8 nodes VM cluster), there is up to 10/12% performance improvements up to 1M packet size where old and new models are equal. As a side note, nss still shows to be a performance loss on both real and virtual hw (without any kind of nss hw acceleration). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-10 07:08:30 +01:00
Angus Salkeld	03b32d7fad	Fix typo in stats key name. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-09 21:54:51 +11:00
Angus Salkeld	41b4416bd4	Remove unused function logsys_priority_name_get() Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-09 21:54:51 +11:00
Angus Salkeld	f628ccba8b	Add pid, hostname and process name to the logfile Note this is only for file targets not stderr or syslog. https://bugzilla.redhat.com/show_bug.cgi?id=789925 Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-09 21:54:51 +11:00
Fabio M. Di Nitto	e0e27e3d12	utils: cleanup main daemon exit codes some of them are not in use anymore and can be dropped. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-09 11:15:44 +01:00
Fabio M. Di Nitto	8f6e5ff530	sync: kill evil and syncv1 in one shot this change breaks onwire compatibility. cpg is the only user of sync_* interface and it's the only service that will require extra testing. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-09 11:15:08 +01:00
Fabio M. Di Nitto	64fd946086	votequorum: move last malloc/alloca buf to static this should guarantee that votequorum won't fail under high memory pressure. Price is 3500 bytes extra preallocated at startup. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto	90c602902c	votequorum: fix node allocation memory leak stop using malloc for each new node, because we cannot free the memory easily. Move to a static allocated buffer that can contain PROCESSOR_MAX + qdevice cluster_node instead. We can never have more than PROCESSOR_MAX nodes anyway and the memory footprint is small enough compared to memory leaks (those can effectively happen only in very dynamic clusters with tons of different nodes joining/leaving with different nodeids). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto	2d7a8ab29a	votequorum: rename leave_remove to allow_downscale pointed out that leave_remove can be easily confused with the old cman leave_remove behavior. The two are substantially different and we need to avoid confusion both for users and our support team. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	75b3dc0f4e	votequorum: fix handling of config updates cmap changes are local to the node only and should not be broadcasted as configuration changes. if any change has happened to us, we will inform other nodes via send_nodeinfo. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	861d2c90ef	votequorum: free our data and lists on exit this is mostly to avoid valgrind errors on exit and make the output more readable. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	edf0728323	votequorum: disallow special features vs qdevice simply taking the safest path here since integration of qdevice is not fully complete Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	33ea03f426	votequorum: fix node check based on reconfig parameter Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	43e08bb143	votequorum: make a common function to calculate votes and cluster members Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	692fd72468	votequorum: incorporate static config into dynamic no functional changes or extra features yet Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	f960d0a342	votequorum: move all configuration in votequorum_readconfig Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	3a717fc8e9	votequorum: start moving from static to fully dynamic config Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	f12bfc5ad8	votequorum: disallow wait_for_all and qdevice operations The problem here is that user expectations, when using both modes at the same time, have not been set yet. There are 2/3 options that need investigation. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	4a93ff267f	votequorum: improve debugging output Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Jan Friesse	25381738c2	Always set interface_up in totemip_iface_check Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-02 09:41:36 +01:00
Fabio M. Di Nitto	e34f095551	votequorum: fix node->flags type when receiving nodeinfo messages old_flags was set to uint16_t but it needs to be uint32_t. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto	6a9e4760da	votequorum: fix segfault in wfa status update This is a regression introduced by `cb5fd775`: when reading static config, us->flags does not exist yet and therefore setting it will cause a segfault. Move the settings after cluster_node *us is created, with the long term plan to simply kill the whole _static readconfig bits in favour of dynamic (runtime changeable) bits. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto	cb5fd77501	votequorum: major rework to fix qdevice API and integration with core

qdevice is a very special node in the cluster and it adds a certain amount of complexity and special cases across the code.

Most of the qdevice data is shared across the cluster (name/votes), but effectively each node has a different view of the qdevice (registered/unregistered/voting/etc.).

With this change, we align the qdevice view across the nodes, exchanging more data between nodes, and we fix how qdevice behaves and how it is configured.

The only side effect is that the amount of data transmitted on the wire is slightly higher.

The qdevice API is still disabled by default. This means that the amount of real changes in current code is a lot smaller than it appears from this patch.

TODO: documentation/man pages need to be updated once this change is in (and behavior finalized).

User visible changes:

- configuration (coroparse, exec/votequorum):

  the quorum device section is now standalone within the quorum section.

  quorum {
      provider: corosync_votequorum
      device {
          model: (name)
          timeout: (millisec)
          votes:
      }
  }

  the keyword "model:" is mandatory to enable qdevice in configuration and should express the name of the script/daemon that will provide the qdevice. Looking into the future, an init script or systemd service will look for that name in /path/to/be/decided/name and start/stop qdevice.

  timeout: defines the maximum interval the qdevice implementation has available between polls (see votequorum_qdevice_poll.3) before the device is considered dead and its votes discarded.

  votes: is now a configuration parameter and not an API call. quorum devices don't care what they need to vote. votes is autocalculated when a nodelist is available and all nodes in the list vote 1. Otherwise this parameter is mandatory.

- configuration (exec/votequorum):

  startup and runtime configuration changes have been improved. errors at startup are considered fatal. errors at runtime have different exit paths.

  startup:
  * quorum.two_node and qdevice are incompatible.
  * quorum.expected_votes requires quorum.device.votes.
  * quorum.expected_votes - quorum.device.votes cannot be lower than 2.
  * qdevice and last_man_standing are mutually exclusive.
  * qdevice and auto_tie_breaker are mutually exclusive.

  runtime config changes:
  * quorum.two_node and qdevice are incompatible: if quorum device is alive, two_node is disabled. if quorum device is not alive and node count is 2, two_node is enabled, and quorum device cannot be registered.
  * if either last_man_standing or auto_tie_breaker were enabled at startup, and at runtime quorum device is configured, quorum device registration will be blocked.
  * if quorum.expected_votes is configured but not quorum.device.votes, quorum device registration will be blocked.
  * if quorum.device.votes is not configured and we cannot automatically calculate it, quorum device registration will be blocked.
  * An error in configuring quorum.expected_votes and quorum.device.votes will block quorum device registration.

  blocking quorum device registration also means dropping the votes.

  quorum.device.votes (either set or automatically calculated) is now used to determine current expected_votes in the cluster.

- logging (exec/votequorum):

  all errors from configuration are treated as WARNING/CRITICAL. lots of extra DEBUG output is added (see internal changes too).

- corosync-quorumtool (tools/corosync-quorumtool):

  * added option to forcefully kick out a quorum device from the local node. This is for emergency recovery only and it is only available when the qdevice API is built-in.

  * Improved status output, specifically add node state and qdevice information

    [root@fedora-master-node2 coro]# corosync-quorumtool -s
    Version:          1.99.4.12-9c7d-dirty
    Quorum type:      corosync_votequorum
    Nodes:            2
    Ring ID:          132
    Quorate:          Yes
    Node votes:       1
    Node state:       Member
    Expected votes:   3
    Highest expected: 3
    Total votes:      3
    Quorum:           2
    Flags:            Quorate Qdevice

    Nodeid     Votes  Name
         1         1  fedora-master-node1.int.fabbione.net
         2         1  fedora-master-node2.int.fabbione.net
         0         1  QDEVICE (Voting)

  * allow printing status for any node in the cluster known to the local node.

    [root@fedora-master-node1 coro]# corosync-quorumtool -s
    Version:          1.99.4.12-9c7d-dirty
    Quorum type:      corosync_votequorum
    Nodes:            2
    Ring ID:          144
    Quorate:          Yes
    Node votes:       1
    Node state:       Member
    Expected votes:   3
    Highest expected: 3
    Total votes:      2
    Quorum:           2
    Flags:            Quorate

    Nodeid     Votes  Name
         1         1  fedora-master-node1.int.fabbione.net
         2         1  fedora-master-node2.int.fabbione.net

    [root@fedora-master-node1 coro]# corosync-quorumtool -s -n 2
    Version:          1.99.4.12-9c7d-dirty
    Quorum type:      corosync_votequorum
    Nodes:            2
    Ring ID:          144
    Quorate:          Yes
    Node votes:       1
    Node state:       Member
    Expected votes:   3
    Highest expected: 3
    Total votes:      3
    Quorum:           2
    Flags:            Quorate Qdevice

    Nodeid     Votes  Name
         1         1  fedora-master-node1.int.fabbione.net
         2         1  fedora-master-node2.int.fabbione.net
         0         1  QDEVICE (Voting)

Internal changes:

- change qdevice timer to not run all the time, but only when necessary.
- change votequorum_nodeinfo on-wire data to use flags instead of uint8_t and add QDEVICE status.
- allocate nodeid 0 to qdevice since it's the only real nodeid that can be reserved.
- change send_nodeinfo to allow sending nodeinfo for any node so that we can share qdevice info across the cluster (and this might be useful in future if we need to sync internal cluster view).
- add votequorum api call to update qdevice name
- add runtime data if quorum device has been forcefully disabled by config error
- add qdevice votes to expected_votes calculation (this is probably the biggest difference vs cman)
- change votequorum_read_nodelist_configuration so that we can autocalculate votes for qdevice (we need the nodecount vs votes).
- add all checks for startup/runtime config (see above).
- do not make qdevice part of the membership_list received from totem. None of our users care about it and it is not a real node.
- change onwire message handlers to deal with "data for this node from any node" case and understand nodeid 0 for qdevice info
- always allocate qdevice at startup. this simplifies code a lot.
- dispatch qdevice nodeinfo on membership changes.
- inform libvotequorum users when a qdevice is registered
- improve substantially qdevice api and add a simple barrier based on qdevice name.
- add qdevice API barrier at cluster level. This feature allows only one qdevice name to be active in the cluster at any time.
- qdevice getinfo can now report status for qdevice on any node.
- change slightly the way the qdevice API is built-in/out: only the libvotequorum calls are #ifdef'out now. Doing so in the core is too complex and would make the code unreadable with the risk of missing a bit or two effectively introducing an on-wire incompatibility if we will ever turn the API on.
- probably added some bugs on the way...

TODO: update qdevice_* API once the above is settled and test qdevice integration with other features.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> (only second part)	2012-02-27 09:30:26 +01:00
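The startup checks listed in commit cb5fd77501 can be sketched as a small validator. This is only an illustration in Python, not the votequorum implementation: the function name `check_qdevice_startup`, the dict layout, and the rule that qdevice counts as enabled once `model` is present are assumptions made for the example.

```python
def check_qdevice_startup(quorum):
    """Return the list of fatal startup errors for a quorum config dict.

    Hypothetical helper: `quorum` mirrors the quorum { ... } section as a
    plain dict, with the device { ... } subsection under the "device" key.
    """
    errors = []
    device = quorum.get("device")
    # Assumption: qdevice is enabled only when device.model is set.
    if device is None or "model" not in device:
        return errors

    if quorum.get("two_node"):
        errors.append("quorum.two_node and qdevice are incompatible")
    if quorum.get("last_man_standing"):
        errors.append("qdevice and last_man_standing are mutually exclusive")
    if quorum.get("auto_tie_breaker"):
        errors.append("qdevice and auto_tie_breaker are mutually exclusive")

    expected = quorum.get("expected_votes")
    votes = device.get("votes")
    if expected is not None:
        if votes is None:
            errors.append("quorum.expected_votes requires quorum.device.votes")
        elif expected - votes < 2:
            errors.append(
                "quorum.expected_votes - quorum.device.votes cannot be lower than 2"
            )
    return errors


if __name__ == "__main__":
    ok = {"expected_votes": 3, "device": {"model": "mydev", "votes": 1}}
    bad = {"two_node": 1, "expected_votes": 2, "device": {"model": "mydev", "votes": 1}}
    print(check_qdevice_startup(ok))
    print(check_qdevice_startup(bad))
```

Per the commit message, any such error at startup is fatal, while the same conditions at runtime block quorum device registration instead.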
Jan Friesse	c30c088597	Tweak nodeid warning Nodeid warning now appears only when both totem.nodeid and nodelist nodeid exist. When nodelist nodeid is not defined, totem.nodeid is used. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-21 16:33:56 +01:00
Jan Friesse	04720649ba	iba: Use configured node id Corosync was ignoring nodeid for iba transport and always used autogenerated one. Original patch by: Jason Dillaman <jdillama@redhat.com> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-02-21 16:27:16 +01:00
Angus Salkeld	40727bd6a3	Convert the common lib into a shared lib. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-02-21 20:26:08 +11:00
Jan Friesse	88ae75d6c2	Allow autoconfiguration of interface section Thanks to totemip_getifaddrs infrastructure it's now possible to use nodelist information to autoconfigure interface bindnetaddr. Together with cluster_name, interface section can be completely omitted. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-16 10:47:57 +01:00
Jan Friesse	ba13537471	totemconfig: ensure suffix for ringX_addr The patch makes sure that the ringX_addr key really has the _addr suffix. Previously, it was possible to enter ringXanything and it was interpreted as ringX_addr. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-16 10:47:57 +01:00
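The ringX_addr check described in commit ba13537471 amounts to matching the whole key rather than only its prefix. A minimal sketch in Python, assuming a hypothetical helper `ring_number` (the real check lives in the C totemconfig code and may look quite different):

```python
import re

# The whole key must be "ring<number>_addr"; nothing may follow the suffix.
RING_ADDR_KEY = re.compile(r"ring([0-9]+)_addr")


def ring_number(key):
    """Return the ring number for a valid ringX_addr key, else None."""
    match = RING_ADDR_KEY.fullmatch(key)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for key in ("ring0_addr", "ring1_addr", "ring0anything", "ring0_addrX"):
        print(key, ring_number(key))
```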
Jan Friesse	8cde53aa99	cmap: Handle NULL in [i]cmap_set_string value Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-16 10:47:57 +01:00
Jan Friesse	88ddaecfe9	Create solaris specific getifaddrs This not only makes it possible to use generic totemip_iface_check, but also fixes some problems with previous implementation (fixed mask, not very well supported ipv6, ...) Tested on OpenIndiana 151a Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-16 10:47:57 +01:00
Jan Friesse	fd47fddcaf	Add totemip_iface_check based on totemip_getifaddrs Also Linux and BSD/Darwin specific bits are no longer needed, so they are gone. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-16 10:47:57 +01:00
Jan Friesse	27e9988486	Add generic implementation of getifaddrs Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-16 10:47:56 +01:00
Angus Salkeld	023c4fa0cc	Move hdb_error_to_cs to corotypes.h Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-14 11:10:14 +11:00
Steven Dake	415ef892ad	Remove empty testquorum.c file Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-02-13 17:05:04 -07:00
Steven Dake	2ad0cdc832	Update copyright header dates in exec directory Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-02-13 17:05:04 -07:00
Steven Dake	4ee9550f80	Remove jhash.h since it is not used We would use libqb for hashing now if we needed hashing. cpg no longer uses jhash.h. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>	2012-02-13 17:05:04 -07:00
Steven Dake	815375411e	Remove unused or unimplemented CFG apis Remove: cfg_statetrack, cfg_statetrackstop, cfg_administrativestateset, cfg_administrativestateget, cfg_serviceload, cfg_serviceunload. Rev SO to 5.0.0 Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-02-13 17:04:49 -07:00
Fabio M. Di Nitto	8840113704	votequorum: fix variable init Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-09 16:49:25 +01:00
Fabio M. Di Nitto	e3ba920307	votequorum: fix possible memory corruption nodeid = 0 is a valid nodeid and the node associated with it should not be freed Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-09 16:49:25 +01:00
Fabio M. Di Nitto	939a7b2d66	quorum: don't leak memory on error Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-09 16:49:25 +01:00
Angus Salkeld	6cd576b0f5	move hdb_error_to_cs to common_lib Note the previous inconsistent implementation. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-09 10:45:56 +11:00
Angus Salkeld	da483b8121	Add a common library that can be shared between libs and corosync We have always had this problem and worked around it by copying code or using inline functions. Both not good IMO. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-09 10:45:56 +11:00
Steven Dake	7592e3b61e	Remove include/engine/quorum and integrate it into exec/engine.h Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-08 08:31:10 -07:00
Steven Dake	01c63ca17c	Free state variable allocated in wd_resource_state_is_ok Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-07 08:42:58 -07:00
Steven Dake	c05cbb65bc	Remove leaked resource error from wd_resource_state_is_ok Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-07 08:42:58 -07:00
Steven Dake	190dba3933	Remove use after free and free of uninit value in mainconfig error path Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-07 08:42:58 -07:00
Steven Dake	46a2b1a297	Remove use after free in corosync_main_config_set in error path Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-07 08:42:58 -07:00
Fabio M. Di Nitto	cff57430d6	votequorum: fix quorum_ringid setting before any delivery occurs Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-02-07 14:07:09 +01:00
Angus Salkeld	8992acb815	LOG: add libqb as a "subsys" So we can see libqb internal logs Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-07 10:53:56 +11:00
Jan Friesse	546aea23cf	cmap: Check RO flag in adjust int function Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-06 16:37:00 +01:00
Jiaju Zhang	dd9e177af7	CPG: Send CPG_REASON_PROCDOWN when really needed This patch fixes an issue where, if cpg_finalize() was called just after cpg_leave(), CPG_REASON_PROCDOWN might be sent even though CPG_REASON_LEAVE had already been sent. This behavior does not match what the man page describes: "CPG_REASON_PROCDOWN - the process left a group without calling cpg_leave()." It also confuses CPG's clients, because one process leaving results in two different reasons being sent. The root cause is that cpg_leave() returns after adding the LEAVE message to the sending queue, while the cpg's group name has not been cleared yet. If cpg_finalize() is called at that moment, it determines whether cpg_leave() was called only by checking the cpg's group name, which is not sufficient. Signed-off-by: Jiaju Zhang <jjzhang@suse.de> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-06 08:07:54 -07:00
Fabio M. Di Nitto	3b77dd9d83	votequorum: fix expected votes manual override from quorumtools votequorum internal quorum/expected_vote check was slightly too conservative and was not done correctly when leave_remove feature is enabled. this fix allows admins to effectively override expected_votes and drive ev_barrier as expected. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-02-03 10:33:33 +01:00
Jan Friesse	0929dcb68c	Better checks of integer values in coroparse Instead of atoi, strtol is used. This allows detection of typical problems like empty value of key and incorrectly entered numbers. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-03 09:16:43 +01:00
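The difference between the two parsing styles in commit 0929dcb68c can be illustrated as follows. This is a Python sketch of the idea only, not the coroparse C code: `lenient_atoi` imitates how C `atoi` silently accepts an empty value or trailing garbage, and `strict_int` imitates a `strtol` call followed by a check that the whole string was consumed. Both function names are hypothetical.

```python
import re


def lenient_atoi(text):
    """Imitate C atoi(): parse leading digits, ignore the rest, 0 if none."""
    match = re.match(r"\s*([+-]?[0-9]+)", text)
    return int(match.group(1)) if match else 0


def strict_int(text):
    """Imitate strtol() plus an end-of-string check: reject anything odd."""
    if not re.fullmatch(r"\s*[+-]?[0-9]+", text):
        raise ValueError("not a valid integer: %r" % text)
    return int(text)


if __name__ == "__main__":
    for value in ("5000", "", "12abc"):
        try:
            strict = strict_int(value)
        except ValueError as err:
            strict = "error (%s)" % err
        print(repr(value), "lenient:", lenient_atoi(value), "strict:", strict)
```

With the lenient version an empty value quietly becomes 0 and "12abc" becomes 12, which are the typical problems the commit message says the stricter parsing detects.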
Fabio M. Di Nitto	230231fedb	votequorum: add runtime internal data to icmap runtime.votequorum.* specifically ev_barrier, two_node, lowest_node_id and wait_for_all_status are values that change internally at runtime and keeping track of those can make debugging rather easy, especially when LOG_DEBUG is not set. Also track our node id. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-By: Christine Caulfield <ccaulfie@redhat.com>	2012-02-02 16:36:57 +01:00
Jan Friesse	33e5ce8d56	Show correct error when open of logfile failed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-02 09:30:49 +01:00
Jan Friesse	a80febda7e	Store error str if can't open logfile Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-02 09:30:49 +01:00
Angus Salkeld	af9cfc7b55	IPC: reference count the connection whilst flushing the outq Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-02 11:34:26 +11:00
Angus Salkeld	45cb05f1ad	IPC: allow for failures in the connection_created callback Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-02-01 08:51:13 +11:00
Fabio M. Di Nitto	46b7b155a4	votequorum: add leave_remove option this also cleans up NODESTATE for good. JOINING was never used Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-01-31 16:58:08 +01:00
Fabio M. Di Nitto	c16086bead	votequorum: honor onwire node flags change internal flags were not propagated correctly in the node status Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-01-31 10:20:32 +01:00
Fabio M. Di Nitto	9fa83dabbe	quorum: fix load/unload priority for quorum services all main services are loaded at priority 1. vsf_quorum and votequorum did not specify a priority and automatically defaulted to 0, which has the special meaning of being loaded last and unloaded last. this is not correct behavior and limits what votequorum can do at shutdown, for example notify other nodes that it is leaving (something that cannot be gathered by totem membership change callback). fix vsf_quorum to load at priority 1 as the other default services and bump votequorum to 2 (needs to unload before everything else currently known). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-01-31 10:16:52 +01:00
Fabio M. Di Nitto	a2b960d109	service: fix service unload regression introduced by lcrso dropping service exec_exit_fn was not honored because the loop was looking into the wrong icmap key Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-01-31 10:16:16 +01:00
Fabio M. Di Nitto	fc61b20a8a	votequorum: drop unnecessary flags code inspection shows that those internal flags are never used Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-01-31 10:14:19 +01:00
Steven Dake	007e5c9458	Honor exec_init_fn call exec_init_fn now either returns NULL (success) or a string which indicates the error that occurred during service engine initialization. If an error occurs, corosync will exit. This patch adds ykd and makes other suggestions from Fabio Di Nitto. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>	2012-01-30 14:05:09 -07:00
Fabio M. Di Nitto	ccd36af00e	votequorum: rename qdisk to qdevice a quorum device is not necessarily a disk and this also aligns various names to be generic Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-By: Christine Caulfield <ccaulfie@redhat.com>	2012-01-27 11:17:02 +01:00
Fabio M. Di Nitto	769fc913f3	quorum: drop quorum.quorate config option it's unused / unnecessary Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-By: Christine Caulfield <ccaulfie@redhat.com>	2012-01-27 11:16:36 +01:00
Fabio M. Di Nitto	b05477859f	votequorum: fix expected_votes propagation It is not correct to randomly accept expected_votes from any node in the cluster. We can only allow expected_votes from quorate nodes. A quorate cluster is "always" right and has the correct expected_votes. One of the different bug triggers: quorum { expected_votes: 8 auto_tie_breaker: 1 last_man_standing: 1 } Start all 8 nodes. Cleanly shut down 2 nodes. Wait for lms to kick in. Kill the 3 nodes with the highest nodeid (we want to retain a quorate partition of 3 nodes). Start one node again -> cluster will be unquorate. This happens because the node rebooting/rejoining with non-current cluster status will propagate an expected_votes of 8, while in reality the cluster is down to expected_votes: 3. 4 nodes are still < 5 (quorum for 8 nodes/votes). In order to avoid this condition, we need to exchange expected_votes information among nodes but we cannot randomly trust everybody. 1) Allow expected_votes to be changed cluster-wide only if the information is coming from a quorate node. 2) Fix node->expected_votes based on quorate status. 3) Allow a joining node to decrease quorum and expected_votes if the node is not yet quorate, but it's joining a quorate cluster. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-01-26 14:32:54 +01:00
Fabio M. Di Nitto	88e6830df1	votequorum: fix auto_tie_breaker design and simplify code a lot auto_tie_breaker requires knowing the lowest node id in the currently quorate partition and not of the whole cluster. this allows us to determine the lowest node id as soon as we are quorate and remove the complexity of reading it from WFA or nodelist. At the same time it adds the flexibility for dynamic nodeids in a cluster. drop requirement on WFA if nodelist is not specified. update man page Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-01-26 14:32:54 +01:00
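The arithmetic behind the bug trigger in commit b05477859f can be worked through in a few lines. The sketch below assumes one vote per node and a strict-majority quorum of `expected_votes // 2 + 1`. That formula is an assumption, chosen because it agrees with the numbers in the commit message (quorum 5 for 8 votes); the function names are hypothetical and this is not the votequorum code.

```python
def quorum_for(expected_votes):
    """Strict majority of expected_votes (assumed formula, see lead-in)."""
    return expected_votes // 2 + 1


def is_quorate(total_votes, expected_votes):
    """A partition is quorate when its votes reach the quorum."""
    return total_votes >= quorum_for(expected_votes)


if __name__ == "__main__":
    # Before the fix: the rejoining node propagates its stale expected_votes: 8.
    print("quorum for 8:", quorum_for(8))             # 5
    print("4 nodes, expected 8:", is_quorate(4, 8))   # False: 4 < 5

    # After the fix: the quorate partition's view (expected_votes: 3) is kept,
    # and the joining node brings the total to 4 votes.
    print("quorum for 3:", quorum_for(3))             # 2
    print("4 nodes, expected 3:", is_quorate(4, 3))   # True
```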
Fabio M. Di Nitto	40aa40ed84	votequorum: drop NODESTATE_LEAVING this is another leftover from cman compatibility layer Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-01-26 14:32:54 +01:00
Fabio M. Di Nitto	269e0c4970	votequorum: change quorum.expected_votes override behavior as agreed on the mailing list, quorum.expected_votes should override automatically calculated expected_votes from nodelist. Also simplify the code to handle expected_votes. "silly defaults" is now unnecessary because votequorum does config sanity checks upfront. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-01-25 14:06:27 +01:00
Fabio M. Di Nitto	efbf5282f9	votequorum: two_node should enable wait_for_all by default This avoids fencing races at startup of a cluster. It is still possible to override WFA by explicitly setting wait_for_all: 0 Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-01-25 07:04:24 +01:00
Angus Salkeld	14fd1c927a	Add debug log messages to corosync for join/leave This is needed by cts. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-25 11:33:09 +11:00
Angus Salkeld	3698b78de9	LOG: make sure that debug works to syslog Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-25 11:33:09 +11:00
Jan Friesse	e89201b9c9	totemiba: Remove unused wthread.h include Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-01-24 16:28:55 +01:00
Fabio M. Di Nitto	78edc1f24b	votequorum: add support for nodelist config bits expected votes is now calculated automatically and quorum.expected_votes can be used to override nodelist calculation. The highest of the two values is used for runtime. quorum_votes can be specified either in the node list or in quorum.votes. The node list has priority over global. propagate votequorum initialization errors (due to config inconsistencies) back to vsf_quorum. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-01-23 11:46:34 +01:00
Angus Salkeld	3131601ce2	Remove all unnecessary "\n" from log messages These look ugly, are inconsistently done and just have to be removed later in libqb before calling syslog. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-23 13:08:23 +11:00
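The vote resolution described in commit 78edc1f24b can be sketched like this. The sketch is illustrative Python under stated assumptions: the helper names, the dict layout, and the default of 1 vote when neither the node entry nor quorum.votes sets a value are all hypothetical. Commit 269e0c4970, listed earlier in this log, later changed quorum.expected_votes so that it overrides the nodelist calculation instead of taking the highest value.

```python
def node_votes(node, global_votes=None):
    """Votes for one nodelist entry: the node list wins over quorum.votes."""
    if "quorum_votes" in node:
        return node["quorum_votes"]
    # Assumption: fall back to 1 vote when nothing is configured.
    return global_votes if global_votes is not None else 1


def runtime_expected_votes(nodelist, configured_expected=None, global_votes=None):
    """Highest of the nodelist calculation and quorum.expected_votes."""
    calculated = sum(node_votes(node, global_votes) for node in nodelist)
    if configured_expected is None:
        return calculated
    return max(calculated, configured_expected)


if __name__ == "__main__":
    nodelist = [{"nodeid": 1}, {"nodeid": 2, "quorum_votes": 3}, {"nodeid": 3}]
    print(runtime_expected_votes(nodelist))                          # 5
    print(runtime_expected_votes(nodelist, configured_expected=8))   # 8
    print(runtime_expected_votes(nodelist, global_votes=2))          # 7
```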
Angus Salkeld	61c0995e1c	Shorten some really long lines in main.c Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-23 13:08:23 +11:00
Jan Friesse	0c2e3c8408	Make local_node ring0 address read-only Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-20 11:09:37 +01:00
Jan Friesse	d6cbdd9b84	Support for dynamic nodelist udpu member change Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-20 11:08:35 +01:00
Jan Friesse	16007acbef	Use nodeid provided in nodelist Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-20 11:08:35 +01:00
Jan Friesse	de70c0007c	Support udpu members in nodelist Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-20 11:08:35 +01:00
Jan Friesse	c8a62d8b3c	Add local_node_pos icmap key Key contains local node position in nodelist Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-01-20 11:08:35 +01:00


1917 Commits