mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-10-31 08:44:50 +00:00

Author	SHA1	Message	Date
Jan Friesse	58176d6779	Add token_coefficient option Token coefficient is used only when nodelist is specified and contains at least 3 nodes. If so, real token timeout is then computed as token + (number_of_nodes - 2) * token_coefficient. This allows cluster to scale without manually changing token timeout every time new node is added. This value can be set to 0 resulting in effective removal of this feature. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:17 +01:00
Jan Friesse	9a8de87c34	totemconfig: Log errors on key change and reload When volatile key was changed (cmap set or reload) and checks fails, nothing was logged. Values are now checked and error string is logged on problems. Also totem_config is dumped to log (DEBUG level) after every volatile key change and every reload. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:14 +01:00
Jan Friesse	b95ebd640e	totemconfig: Key change process dependencies When key with dependency was changed, dependent keys were not recomputed. Nice example is consensus timeout. If token timeout was changed, consensus timeout was not recomputed correctly (neither via cmap change of key nor via cfg reload). Solution is almost complete refactor of handling volatile defaults. totem_volatile_config_read now handles not only storing cmap key to totem_config structure, but also checking of existence, comparing with zero value and properly storing defaults. totem_set_volatile_defaults is gone. Its function was split into totem_volatile_config_read and totem_volatile_config_validate functions. Reload callback and change of key callback are now mostly same functions and both call totem_volatile_config_read. Patch also fixes small memory leak. totem.vsftype key is not used for long time and original totem_volatile_config_read wasn't freeing allocated memory returned by icmap_get_string. Whole reading of totem.vsftype is removed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:12 +01:00
Jan Friesse	eeb2384157	Really clear totemconfig nodes on reload When reload was called nodes were constantly added to totemconfig nodelist. So simple corosync-cfgtool -R resulted very quickly in filling whole array and segfault. Solution is to clear member_count. Clearing is also moved directly to put_nodelist_members_to_config to make sure it's always processed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:09 +01:00
Jan Friesse	1b6abcc7d5	Log: Make reload of logging work When reload was called multiple times (~20), logging to file stopped working. Main problem was hidden in the fact that log file was opened multiple times, because even though target_id was shared via subsystem loggers, file name was not. Solution is to ALWAYS set proper log file name into subsystem logger (copy is stored). This not only fixes the problem but also removes small leak. Also if filename didn't change, function can return sooner. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:13:33 +01:00
Jan Friesse	2f0cad20a9	config: Handle totem_set_volatile_defaults errors When totem_set_volatile_defaults is called from totem_config_validate return code is unchecked. It's then perfectly possible to set (for example) join timeout to very small value (1) and consensus value is then set to 0 making corosync unable to create membership. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-17 10:04:00 +01:00
Jan Friesse	e1801ba497	votequorum: Properly initialize atb and atb_string icmap_get_* behavior is to NOT modify passed variable when it doesn't succeed. So we must initialize variable before icmap_get_* call. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2014-02-26 16:59:02 +01:00
Jan Friesse	ff67daa55f	mon: Make monitoring work Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:20 +01:00
Jan Friesse	099f704cdd	mon: Pass correct pointer to inst Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:16 +01:00
Jan Friesse	57ff693b70	mon: Fix comparison typo Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:13 +01:00
Jan Friesse	e1e2390b61	mon: Make mon compilable with libstatgrab ver 0.9 Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:10 +01:00
Jan Friesse	fbe8768f1b	cpg: Make sure left nodes are really removed When node is paused and other nodes have in meantime exited cpg process, paused node after resume doesn't update its membership correctly, so on previously paused node exited cpg process is still visible. Solution is to compare join list with cpd and remove all pids which are not included in join list. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:14 +01:00
Jan Friesse	83c63b247f	cpg: Make sure nodeid is always logged as hex num Also number is prefixed by 0x so it's easier to spot that number is hexadecimal. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:10 +01:00
Jan Friesse	fcf26e0303	cpg: Refactor mh_req_exec_cpg_procleave Most of functionality is moved to do_proc_leave function to make it reusable. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:05 +01:00
Jan Friesse	38c04d9a66	totemsrp: Fix typo with cont gather Patch `f3ffd3da5c` introduced named states of state-machine, but sadly contains logical problem causing stats.continuous_gather increasing even when it shouldn't. Problem is not critical, because continuous_gather is set to 0 on successful membership creation. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-18 16:12:57 +01:00
Christine Caulfield	90d448af3b	votequorum: Add extended options to auto_tie_breaker This patch adds more flexibility to the auto_tie_breaker feature of votequorum. With this, not only can the lowest nodeid be used as a tie breaker, but also the highest, or a node from a nominated list. If there is a list of nodes, the first node in the list that was not part of the previous partition is used. This allows the user to specify a preferred set of nodes but prevents a split-brain if the cluster divides evenly with a node in each half. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-02-17 16:29:45 +00:00
Masatake YAMATO	fa71067a93	Free object allocated at quorum_register_callback Memory object allocated with malloc at quorum_register_callback is not freed. The object is linked to internal_trackers_list. The object is unlinked at quorum_unregister_callback. However, it is not freed at the function. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-01-23 17:18:44 +01:00
Jan Friesse	45dd9861ff	Properly check result of symlink Error message is displayed when it's impossible to create symlink to fdata file. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:31 +01:00
Jan Friesse	5c54f941ac	Fix cppchecks warning Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:29 +01:00
Jan Friesse	178c0d82d9	Close devnull file handler Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:26 +01:00
Jason	cfbb021e13	totem: Drop invalid join msg in operational state According to the totem paper, if a processor receives a join message in the operational state and if the receiver's identifier is in the join message's fail list, then join message should be ignored. By applying this validation of join messages, we can avoid unnecessary switching from operational state to gather state (or even rings that cannot be merged) like the following to happen. 1. Initially, there is only one ring containing three nodes, say ring(A,B,C). 2. A and B network partition, "in the same time", C is down. 3. Node A sends join message with proclist:A,B,C. faillist:NULL. Node B sends join message with proclist:A,B,C. faillist:NULL. 4. Both A and B consensus timeout due to network partition. 5. A and B network remerged. 6. Node A sends join message with proclist:A,B,C. faillist:B,C. and create ring(A). Node B sends join message with proclist:A,B,C. faillist:A,C. and create ring(B). 7. Say join message with proclist:A,B,C. faillist:A,C which sent by node B is received by node A because network remerged. 8. Node A shifts to gather state and send out a modified join message with proclist:A,B,C. faillist:B. Such join message will prevent both A and B from merging. 9. Node A consensus timeout (caused by waiting node C) and sends join message with proclist:A,B,C. faillist:B,C again. Same thing happens on node B, so A and B will dead loop forever in step 7, 8 and 9. As the paper also said: "If a processor receives a join message in the operational state and if the sender's identifier is in the receiver's my_proclist and the join message's ring_seq is less than the receiver's ring sequence number, then it ignores the join message too." So this patch applies these validations of join messages altogether. Signed-off-by: Jason <huzhijiang@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-01-13 14:46:13 +01:00
Christine Caulfield	ff6a43edb3	votequorum: Add persistent expected_votes tracking. This patch adds the option to store expected_votes to persistent storage. This is needed to allow_downscale to operate properly. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-07 15:30:11 +00:00
Jan Friesse	b88c0766fe	logsys: Make logging of totem work again Because of change in libqb (9abb686) logging of TOTEM subsystem stopped working. Instead of rely on previous behavior (implicit substring match), all totem files are now explicitly given. Also QB subsystem now uses comma separated filelist instead of previous function calling. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-11-04 12:32:35 +01:00
Masatake YAMATO	f3ffd3da5c	totemsrp: Show English message when memb_state_gather_enter is called The reason why memb_state_gather_enter is invoked was printed in integer code. This patch introduces human readable English messages for the code. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-10-24 16:46:17 +02:00
Yevheniy Demchenko	805b3423ee	totemiba: Check if configured MTU is allowed by HW Solution uses approximation of totem structures. This needs to be rewritten in proper way. Also MTU checking should be implemented for IP transports. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:27:08 +02:00
Yevheniy Demchenko	8f14a5788f	totemiba: Fix parameters position for poll_add Parameters in functions like mcast_cq_send_event_fn, ... were defined in incorrect order. Also their names were weird. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:50 +02:00
Yevheniy Demchenko	c5d4a0762f	totemiba: Del channel fd from poll before destroy Corosync freezes after several peer node connects/disconnects. The freeze happens in recv_token_cq_recv_event_fn in ibv_get_cq_event call. The problems is in fact, that after each peer node connect, recv_token_accept_destroy is called, which tries to call poll_dispatch_delete _after_ freeing of completion_channel. As completion_channel contains fd, handlers are not disconnected from poller properly. This leads to complete inconsistency in subsequent calls to handlers. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:04 +02:00
Yevheniy Demchenko	5046de387b	totemiba: Properly allocate RDMA buffers 1. In UD mode receiving side of RDMA application should have enough space in buffer to hold data and GRH. Also, sge.length on the receiving side should be set to max_msg_size + sizeof (struct ibv_grh). Current corosync doesn't take GRH into account and does not work if mtu is set to the real mtu of IB port (it works if netmtu is set to < 2048-40). 2. ibv_wc.byte_len is the actual length of the received packet, i.e. msg_len + GRH. GRH length should be subtracted in further processing. If not, it might cause problems when messages get retransmitted, as their apparent size will constantly grow. 3. Current corosync will not work with rdma and mtus > 2048. Most modern IB HW supports 4096 mtu. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:00 +02:00
Christine Caulfield	1a046793cb	Reload: Add atomic reload to log config When a reload is in progress, wait until it has all finished before re-reading all of the logging parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:10:07 +01:00
Christine Caulfield	c0bfd48928	Reload: Add atomic reload to totemconfig When a reload is in progress, wait until the whole thing has finished before setting parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:55 +01:00
Christine Caulfield	82fbffc34b	Reload: Add reload code to cfg Add the code to do the actual corosync.conf reload to cfg, along with a corosync-cfgtool -R command to trigger it Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:41 +01:00
Christine Caulfield	bc47c583bd	Reload: Make coroparse use a designated icmap hash table Pass an icmap hashtable into coroparse so we can load it into a temporary one during reload Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:06 +01:00
Jan Friesse	95133a5d77	icmap: Add func to test equality of two key values Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-09-10 17:02:12 +02:00
Christine Caulfield	8567887abb	[PATCH] Replace freopen with open/dup2 when daemonizing This patch replaces the existing freopen method of forcing stdin/out/err to /dev/null with the more usual system of open/dup2. While I don't like posting patches I don't fully understand, this patch seems to fix a problem where stdout/err get assigned to a socket causing double logging output on systemd. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-10 15:33:31 +01:00
Christine Caulfield	3663622576	Add log message to exit signal handler I've seen a few instances where corosync has shut down for apparently 'no reason'. In fact most of the time the shutdown has been caused by an external source (often an init script) but it's not been obvious what has happened and people implicate the daemon. This patch simply adds a log message to the signal handler when it is called so that the cause of the shutdown is obvious. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-03 14:04:50 +01:00
Jan Friesse	26ef8e15db	icmap: Add map copy function Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:46 +02:00
Jan Friesse	e363f8b06d	icmap: Add function to return item data pointer icmap_get_r is now implemented using this function. Function is not very safe tho defined as static. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:41 +02:00
Jan Friesse	624cd439aa	icmap: Fix value len checking for strings Implementation should allow pass only parts of string (shorten string) and must prohibit reading of uninitialized memory. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:37 +02:00
Jan Friesse	04ddddd6d2	icmap: Add function to return global icmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:32 +02:00
Jan Friesse	e5a528c5cb	icmap: Allow multiple icmap instances Patch adds reentrant version of most of functions (with exception of RO flags support and tracking) to allow multiple icmap instances existence inside corosync. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-27 15:23:52 +02:00
Michael Chapman	2740cfd1ea	Fix scheduler pause-detection timeout qb_loop_timer_add expects the timeout to be in nanoseconds, but we were passing the value in milliseconds. Scale the timeout appropriately. Signed-off-by: Michael Chapman <mike@very.puzzling.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-08-19 09:03:24 +02:00
David Vossel	b424acc3a0	ipc_glue: proper ref counting during service connection iteration Signed-off-by: David Vossel <dvossel@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-07-04 13:05:52 +02:00
David Vossel	aa8e56a0fe	ipc_glue: Remove connection unref with no matching reference. We don't reference the connection object on creation, so there is no reason to dereference it on disconnect. Signed-off-by: David Vossel <dvossel@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-07-04 13:05:36 +02:00
David Vossel	771b239603	ipc_glue: Fixes connection ref count leak Signed-off-by: David Vossel <dvossel@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-07-04 13:05:02 +02:00
Christine Caulfield	074e57910e	The corosync message "A processor joined or left the membership" is vague and unhelpful. People have to look for the following quorum message and try to deduce which nodes have joined or left from that and past membership messages, even though the routine printing the message already has this information to hand. This patch fixes that message so that it prints the nodeids of the nodes that have joined/left the cluster. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-By: Jan Friesse <jfriesse@redhat.com>	2013-06-27 14:44:46 +01:00
Jan Friesse	615d7592fb	Log: Output parse errors to syslog When corosync was started in daemon mode and there was parse error, no way existed how to find out what happened (this is usual situation with systemd enabled systems). Solution seems to be output to syslog by default. Also redundant line with setting logsys is removed because it's no longer needed, because FORK and THREADED mode options has no longer effect. FORK is handled by libqb by default and THREADED mode is forced by calling logsys_thread_start. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-21 11:21:42 +02:00
Jan Friesse	d6dd2e455d	totemconfig: Prevent leak of cluster_name str Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-21 11:21:33 +02:00
Jan Friesse	7cba14fb61	service: Fix memleak in service_unlink_and_exit Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-21 11:21:29 +02:00
Jan Friesse	514eb0f37d	ipc_glue: Check service name len Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	f7beba46c5	ipc_glue: Introduce constant for service name len Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	90da72cd7f	cfg: Check interface status and name length Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	335da1ecfd	cfg: Check number of interfaces Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	5dc3fc4bda	totemrrp: Make status string shorter Status string should be same length as needed for cfg ringstatusget function. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:11 +02:00
Jan Friesse	845a625908	totem: Don't leak instance variable on crypto fail Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:25 +02:00
Jan Friesse	93286a344e	totemudpu: Handle fd leak in totemudpu Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:21 +02:00
Jan Friesse	421de34972	totemconfig: Check length of rrp_mode string Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:15 +02:00
Jan Friesse	675da75759	coroparse: Ensure that config items fits into cmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:05 +02:00
Jan Friesse	e094ab2e2c	votequorum: Prevent leak in qdevice_is_configured Also LEAVE from function is now properly logged. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-17 15:47:27 +02:00
Jan Friesse	4310d84e4d	Initialize error variable in ykd_init Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:57 +02:00
Jan Friesse	92b900da67	Initialize node_found in nodelist_to_interface fun Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:57 +02:00
Jan Friesse	903e02875d	Initialize item in cmap_mcast_send Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	f198955644	votequorum: Assert sender nodeid is known Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	56ee492471	Check result of logsys_subsys_create Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	d5d4cdb972	Check logsys_format_set result in logsys setup Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	90f8a68a2b	Use proper totem_ip_address size in memset Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	df6b87f293	Free icmap strings in logconfig Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	ce9c69da03	Properly break MAIN_CP_CB_DATA_STATE_QDEVICE state Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	d5d3fb4d45	Do not dereference format_buffer when it's NULL Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	96a89a0085	Check icmap str get for clustername Even this check is really not needed, it's nice to have it and on fault ensure that cluster_name is really NULL. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	966f461b69	Properly check result of stat func in coroparse Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	e684e4ca6f	Remove unnecessary mmap in cpg Code for zero-copy in cpg does following mmaps: - Mmap anonymous, private memory to some address (-> malloc) - Mmap shared memory of fd to address returned by first mmap (effectively shadows first mapping) This is not necessary and only one mapping is needed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-05-21 14:46:15 +02:00
Jan Friesse	8429d01389	Detect big scheduling pauses Add poll timer scheduler to be called 3 times per token timeout. If poll timer was not called for more than 0.8 * token timeout, it means corosync process was not scheduled and either token_timeout should be increased or load should be reduced (useful for VM, where host is overcommitted so VM is not scheduled as expected). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-04-08 09:58:42 +02:00
Jan Friesse	86b074dc1a	Support for numerical uid/gid Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-04-02 09:32:10 +02:00
Andrei Belov	005e7fd3b9	Improved POSIX-compliant handling of getpwnam_r() and getgrnam_r(). Signed-off-by: Andrei Belov <defanator@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-03-28 16:32:53 +01:00
Jan Friesse	0e3d1a9c51	totempg: Make iov_delv local variable Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-03-21 14:24:23 +01:00
Xia Li	ca6051e80c	Convert the nodeid byte order to be aligned with network order When using corosync with clear_node_high_bit setting to yes, the highest bit is cleared. When all the cluster nodes are in one subnet, we probably configure the IP addresses as follows: node1: 147.2.207.64 node2: 147.2.207.192 If the byte order of the nodeid is little endian, wiping off the highest bit will make the two nodes have the same nodeid! This patch fixes this by converting the nodeid to network order. Signed-off-by: Xia Li <xli@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-03-19 16:39:59 +01:00
Jeremy Fitzhardinge	52f88d04ea	Handle ERANGE from getpwnam_r / getgrnam_r These functions return ERANGE if the supplied buffer is too small to fit a line. Try doubling the buffer a few times until it works.	2013-03-07 16:59:51 -08:00
Jan Friesse	66172a501a	Handle unexpected closing brace in config file If configuration file contains closing brace before opening brace at top level, configuration parsing is stopped and file is not completely parsed. Solution is to detect extra closing brace and display error. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-01-31 16:11:22 +01:00
Jan Friesse	663489d277	Handle colon in configuration file If colon was entered as part of value on end of value, it is deleted. This makes impossible to enter (legal) IPv6 address ending with :: (like fed0::). Also when line contains both brace and colon, it is parsed twice (first as key = value and second as start of section). This is handled by continue in if section. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-01-31 16:11:18 +01:00
Fabio M. Di Nitto	98d0245c7e	votequorum: port to sync API (take 2) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-31 15:32:07 +01:00
Fabio M. Di Nitto	55dc09ea23	totemconfig: enforce hmac config when crypto is enabled Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 12:31:47 +01:00
Kazunori INOUE	1ad21e384e	log: move Corosync started log messages "Corosync Cluster Engine ... started" message is shown after logsys is full configured. Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 11:52:26 +01:00
Fabio M. Di Nitto	ed6bca3293	crypto: drop < 2.3 protocols and onwire compat Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 11:49:32 +01:00
Fabio M. Di Nitto	b3f456a8ce	totemcrypto: fix hmac key initialization Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 11:23:32 +01:00
Jan Friesse	6127be1806	Move qb_loop creation after daemonization Creating qb_loop before daemonization is not problem for poll or epoll type loops, but it's problem for kqueue, because kqueue is not shared in child with parent after fork. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-12-12 11:47:42 +01:00
Jan Friesse	dd588d004e	Add option to specify ip version Default is ipv4. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-12-03 14:02:32 +01:00
Jan Friesse	92e0f9c7bb	Add waiting_trans_ack also to fragmentation layer Patch for support waiting_trans_ack may fail if there is synchronization happening between delivery of fragmented message. In such situation, fragmentation layer is waiting for message with correct number, but it will never arrive. Solution is to handle (callback) change of waiting_trans_ack and use different queue. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:48:12 +01:00
Jan Friesse	2d4e7bebb5	Handle segfault in backlog_get If instance->memb_state is not OPERATION or RECOVERY, we were passing NULL to cs_queue_used call. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:48:07 +01:00
Steven Dake	402638929e	Fix problem with sync operations under very rare circumstances This patch creates a special message queue for synchronization messages. This prevents a situation in which messages are queued in the new_message_queue but have not yet been originated from corrupting the synchronization process. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:47:57 +01:00
Fabio M. Di Nitto	220d659b38	totemcrypto: implement crypto packet format 2.2 and crypto_compat: config opt Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-22 11:13:30 +01:00
Evgeny Barskiy	e3f615b4a0	corosync to start in infiniband + redundant ring active/passive mode Corosync now works with infiniband transport in any redundant ring mode Signed-off-by: Evgeny Barskiy <barskiy@rts.ru> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-21 10:28:57 +01:00
Fabio M. Di Nitto	ed63c812af	votequorum: fix handling of expected_votes/votes changes from cmapctl and allow natural selection to take place.... Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-20 15:45:57 +01:00
Jan Friesse	3cd4f9a1f5	Add support for selecting IPC type Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-08 12:16:11 +01:00
Jan Friesse	89809ec80e	Check successful initialization of IPC Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-08 12:16:06 +01:00
Angus Salkeld	abc3b6abed	Try reduce the number of sprintf's Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-07 21:28:31 +11:00
Jan Friesse	d4db2ea535	If failed_to_recv is set, consensus can be empty If failed_to_recv is set (node detect itself not able to receive message), we can end up with assert, because my_failed_list and my_member_list are same list. This is happening because we are not following specification and we allow to mark node itself as failed. Because if failed_to_recv is set and we reached consensus across nodes, single node membership is created (ignoring both fail list and member_list), we can skip assert. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-05 15:16:25 +01:00
Jacek Konieczny	07832748f2	link libtotem_pg to libqb The libtotem_pg library uses symbols from libqb, so it should be explicitly linked with it. This doesn't cause problems for corosync binary itself, as it is linked to both libraries, but can cause problems if anything else links to libtotem_pg.so and automated checkers can show this as a library problem. Signed-off-by: Jacek Konieczny <jajcus@jajcus.net> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-10-29 16:49:19 +01:00
Jan Friesse	8a9869eeec	Correctly check if service was unloaded my_processing_idx is pointer to received service list, instead of global service number. If we check state of service we should use service_id instead of my_processing_idx. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-17 15:06:36 +02:00
Jan Friesse	c165bf4f51	Define AES_*_KEY_LENGTH if not defined Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-17 15:06:32 +02:00
Fabio M. Di Nitto	20c5871525	totemcrypto: add support for different encryption methods (backport from nsscrypto kronosnet code) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-10-15 10:00:16 +02:00
Jan Friesse	fc50443f5f	Make totemiba compile again Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2012-10-08 17:44:09 +02:00
Jan Friesse	b7635ab9f7	Return back "Totem is unable to form..." message This patch returns back SUBJ functionality. It relies on the fact that sendmsg will return error, and if such error is returned for long time, it's probably because of firewall. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:35 +02:00
Jan Friesse	d042671369	Move "Totem is unable to form..." message to main Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:33 +02:00
Jan Friesse	6c3b337b37	Use unix socket for local multicast loop Instead of relying on multicast loop functionality of kernel, we now use unix socket created by socketpair to deliver multicast messages to local node. This handles problems with improperly configured local firewall. So if output/input to/from ethernet interface is blocked, node is still able to create single node membership. Dark side of the patch is the fact that membership is always created, so "Totem is unable to form a cluster..." will never appear (same applies to continuous_gather key). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:30 +02:00
Jan Friesse	4354ed6ecb	Store config_version of other nodes Config version of other nodes is stored in runtime.totem.pg.mrp.srp.members.NODEID.config_version key. Also when local config_version is changed, all nodes are informed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-03 11:26:35 +02:00
Jan Friesse	d2a85593c4	Support for check of config version on start Config version is requested from other nodes. If our config version is not 0 and differs from highest config version of other nodes, corosync quits. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:32 +02:00
Jan Friesse	73b0fe688d	Make cmap_mcast_send return correct error code Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:28 +02:00
Jan Friesse	a273be58ae	Make service_build contain correct number of msgs Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:24 +02:00
Jan Friesse	3c019f2130	Align items in cmap_mcast_send Aligning function (kernel style magic) MAR_ALIGN_UP is used for aligning of items in req_exec_cmap_mcast message. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:20 +02:00
Jan Friesse	2214a60639	Support for flt and dbl in mcast_endian_convert Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:17 +02:00
Jan Friesse	cbaa2977ae	Add support for sending cmap values to wire Function is little more complex, but it is designed to be used in future without big changes. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:07 +02:00
Jan Friesse	6825c1d39b	Parse config_version as 64-bit uint Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:02 +02:00
Jan Friesse	373ded0652	Don't access invalid mem in totemconfig interfaces When ringnumber in config file was set to value bigger or equal to INTERFACE_MAX, we are using this big value as index to totemconfig interfaces array, resulting to access to invalid memory and segfault. Instead of that, ringnumber is now checked and proper error message is printed if value is too big. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-27 13:54:39 +02:00
Jan Friesse	5ce59f49ba	Move some totem and cpg messages to trace level Messages which are flow messages, rather than lifecycle are now logged in trace level. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-19 11:03:16 +02:00
Jan Friesse	5717655019	Add support for debug level trace in config file Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-19 11:03:10 +02:00
Fabio M. Di Nitto	8a2e936381	icmap: fix mapping return codes Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-09-12 08:18:50 +02:00
Fabio M. Di Nitto	bb5946babb	build: clean AM_CFLAGS and AM_CPPFLAGS usage around also set common include dirs. fPIC and DPIC are automatically detected and added as required by libtool. We don't need to carry it around. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-09-07 09:04:07 +02:00
Fabio M. Di Nitto	fa92e4068a	totemconfig: drop unnecessary includes Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-09-07 09:04:06 +02:00
Jan Friesse	7fe307383f	Remove newline in logsys_config_file_set_unlocked Also remove commented leftover. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-06 09:39:18 +02:00
Jan Friesse	bd30fe3dcd	Make threaded log work Previous two log related patches tried to solve few problems with threaded libqb, but introduced regressions when running in daemon mode. This patch takes bigger hammer and hopefully solves all problems. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-06 09:39:15 +02:00
Jan Friesse	bd138085ca	Ensure qb_log thread is started Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-05 09:10:57 +02:00
Jan Friesse	7026fffdf9	Ensure no garbage left in msghdr for sendmsg call Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:37 +02:00
Jan Friesse	120b7fac7b	Use uint8_t in setsockopt when needed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:35 +02:00
Jan Friesse	ee59122ad7	OpenBSD getifaddrs returns netmask without sa_family So we relax netmask check and set to same family as ipaddr if needed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:33 +02:00
Jan Friesse	932829bfca	Add header files when needed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-09-03 09:34:31 +02:00
Angus Salkeld	0e86aa4ac6	Fix cpg_membership_get() The wrong size was getting set in exec/cpg.c Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-31 14:48:35 +10:00
Fabio M. Di Nitto	6d28d51284	build: bring SOLARIS up to the same standard as other OSes drop all SOLARIS specific ifdefs and replace them with feature checks Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	a0a14c68e3	totemip: clean up headers a lot more getifaddrs is always available if there is freeifaddr. all BSD and openindiana have it defined in ifaddr.h. drop a bunch of obsoleted headers. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	18929089d1	build: drop MAP_ANONYMOUS check from configure define it only in case it's not there Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	5c5db34e56	build: make libstatgrab the de facto default for monitoring service drop duplicate code and remove the last COROSYNC_LINUX ifdefs around Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	a1c154e6fa	build: use MADV_NOSYNC only when it's defined so far only FreeBSD defines it. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	6098ef2c14	build: make exec/totemip os detection free Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Jan Friesse	dbe0e9e382	Log: Use threaded mode for syslog and file log Syslog and file log can block, so it's good idea to use libqb threaded mode to prevent it. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-30 09:46:48 +02:00
Jan Friesse	9f6e6a990b	Use native IPC mechanism Instead of hardcoded SHM, we should use NATIVE, so libqb is able to find out what is best/available mechanism. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-30 09:45:46 +02:00
Fabio M. Di Nitto	427fdd4558	build: fix build on openindiana 151a openindiana toolchain is rather messy. This is the first cut only Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	9f7181b533	build: drop more dlopen leftovers from dinosaur era... Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	dd4d7f86e6	build: make monitoring optional in corosync exec Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	8f96347100	build: respect watchdog conditional when building corosync exec Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	76d18f964d	build: use libtool for linking Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:48 +02:00
Tim Beale	6129ce5b59	Remove redundant default-config code We were checking 'hold_timeout == 0' in 3 different places when setting up the default totem config. Signed-off-by: Tim Beale <tlbeale@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-21 14:26:50 +02:00
Tim Beale	77ea036c72	Remove unused structure Nowhere in the corosync codebase references this structure. Signed-off-by: Tim Beale <tlbeale@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-21 14:11:48 +02:00
Jan Friesse	397cc89f01	Make logging of WD and MON service correct MON and WD services are using fsm.h, which calls log function. Such messages were incorrectly logged as SERV (or random service) which made debugging hard. Solution is to add callback parameter to fsm functions and do actual logging there. Handling of failure states is also done in callback now. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-16 14:45:15 +02:00
Jan Friesse	e3cef955bf	IPC: Call lib function only when it's possible send_ok was incorrectly tested as boolean, even it's errno type variable. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:52 +02:00
Jan Friesse	8014b2facf	Close sockets after deleting from poll This will remove (non critical) debug message from QB about polling on closed FD. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:44 +02:00
Jan Friesse	2d10e2bbea	cpg: Check input param name_t length IPC is using buffer of CS_MAX_NAME_LENGTH for name. If user calls function with longer string, such string can be passed to service incomplete. Solution is to not allow string larger than CS_MAX_NAME_LENGTH and return error. Same applies to cpg service. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:35 +02:00
Jan Friesse	6f6988afff	Handle sync and service unload correctly When sync started and service is unloaded in meantime, it can happen that sync will call sync_* functions on unloaded service. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:26 +02:00
Jan Friesse	dfe34d330c	service: remove leftovers from mt corosync Multithreaded corosync used to use many ugly workarounds. One of them is shutdown process, where we had to solve problem with two locks. This was solved by scheduling jobs between service exit_fn call and actual service unload. Sadly this can cause to receive message from other node in that meantime causing corosync to segfault on exit. Because corosync is now single threaded, we don't need such hacks any longer. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:16 +02:00
Fabio M. Di Nitto	423e37b4ca	votequorum: change init/clean up to deal with exit races Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-08 09:03:57 +02:00
Fabio M. Di Nitto	50308cb08d	quorumtool: make output more meaningful there is really no point to have a per node view of (vote)quorum since all the info are always there. drop the -n option for status/display nodes and improve the output to provide a full cluster view at any given time. Old format: [root@fedora-master-node2 ~]# corosync-quorumtool -s Quorum information ------------------ Date: Mon Aug 6 10:22:27 2012 Quorum provider: corosync_votequorum Nodes: 2 Ring ID: 8 Quorate: Yes Votequorum information ---------------------- Node ID: 3254954176 Node state: Member Node votes: 1 Qdevice votes: 1 Expected votes: 3 Highest expected: 3 Total votes: 3 Quorum: 2 Flags: Quorate Qdevice Membership information ---------------------- Nodeid Votes Name 3238176960 1 fedora-master-node1.int.fabbione.net 3254954176 1 fedora-master-node2.int.fabbione.net 0 1 QDEVICE (Alive/Voting/NoMasterWins) New format: [root@fedora-master-node1 tools]# ./corosync-quorumtool -s Quorum information ------------------ Date: Mon Aug 6 15:50:03 2012 Quorum provider: corosync_votequorum Nodes: 2 Ring ID: 48 Quorate: Yes Votequorum information ---------------------- Expected votes: 3 Highest expected: 3 Total votes: 3 Quorum: 2 Flags: Quorate Qdevice Membership information ---------------------- Nodeid Votes Qdevice Name 3238176960 1 A,V,MW fedora-master-node1.int.fabbione.net 3254954176 1 NR fedora-master-node2.int.fabbione.net 0 1 QDEVICE Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	6b270c6cd1	votequorum: make the last QDEVICE define name consistent with everything else Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	302545e112	votequorum: add missing return call Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	379b203677	votequorum: make master_wins check stricter Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	9c50f33509	votequorum: add ENTER/LEAVE for consistency Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	2f369e7039	votequorum: delegate qdevice_master_wins setting to qdevice votequorum has no business to decide if master_wins setting is correct or not. only the qdevice can decide and should set the value for votequorum. Logic is: - user requests master_wins from config - corosync starts - qdevice starts - qdevice reads cmap values / register with votequorum - qdevice decides if the node can support master_wins or not and tells votequorum - at this point votequorum can check if an unquorate node is part of the master_wins partition it is the qdevice responsibility to keep that value up to date in votequorum and the value can be changed at runtime. this commit also exchange per node master_wins information to lay down the infrastructure to verify discrepancies in node config for master_wins (coming next on this channel). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	cc7bfeb462	votequorum: drop votequorum_qdevice_getinfo and collapse data into getinfo it's really pointless to have basically a duplicated API call to transfer one value and one name. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	65a6c29a31	votequorum: external defines should all be prefixed with VOTEQUORUM_ Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	2a37b56c49	votequorum: drop _FLAG_ from defines those are all info flags.. it's redundant and inconsistent Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	3416eacbec	votequorum: fix define name to match reality Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	86dd11b28e	qdevice: implement master_wins partition in previous incarnation of qdisk + cman, master_wins was restricted to 2 node only. In this new version it is possible to use master_wins for any cluster size. Let's assume a 4 node cluster. Each node votes 1, qdevice votes 3. node 1 becomes qdevice master node 2/3/4 no In case of a split (let's assume 2/2): partition 1: {4, 1} partition 2: {1, 1} node 2 in partition 1 would normally be unquorate, leaving effectively only node 1 active. master_wins allows node 2 to recognize to be part of a quorate partition (since node1 is broadcasting that qdevice is voting) and retain quorum. node1 has never lost quorate status since qdevice is voting there. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	aa295be834	votequorum: fix flag check for qdevice votes propagation and cleanup similar code to make it more readable Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	2dae49e54a	votequorum: remove last instance of state and rename it to cast_vote also align naming of vote to cast_vote for info calls Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	3fed1af077	votequorum: several major bug fixes and code cleanup - add a protection check to avoid spurious messages on membership change - greatly simplify processing of nodeinfo, since the only data that we send for qdevice over nodeinfo is the number of votes - fix a flag check to trigger quorum calculation that would leave a cluster unquorate under certain conditions Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	62659dbb21	votequorum: move to the new flag structure simplify different code path as checks are simpler, separate ALIVE and CAST_VOTE Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	c9e207ec92	votequorum: simplify getinfo data and protect against call against quorum node Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	f2b25936e5	votequorum: use REGISTERED flag consistently Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	0bcb4cddcc	votequorum: simplify internal qdevice_getinfo function as data are moving around we can drop lots of special cases Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	43d1439600	votequorum: add qdevice CAST_VOTE status/flag this is a preparation commit for the next changes. right now it is no more than an alias to ALIVE. CAST_VOTE is required to support master/slave feature from qdevice. Effectively a quorum device can be: Not registered / registered (connected to API but nothing else is happening) if registered: Not alive / alive (quorum device is petting the API via poll and timer is running) if alive: Not voting (slave) / voting (master) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto	987e26f8d1	votequorum: rename NODE_FLAGS_QDEVICE_STATE to NODE_FLAGS_QDEVICE_ALIVE STATE is confusing and overloaded term in votequorum as it's used for nodes and other bits. make the name unique and ALIVE means that the qdevice is heartbeating to votequorum. improve display of the status in tools and tests. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto	4621a6cd02	votequorum: rename NODE_FLAGS_QDEVICE to NODE_FLAGS_QDEVICE_REGISTERED make the flag name explicit Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Jan Friesse	fed7fc23e1	Don't call sync_* funcs for unloaded services When service is unloaded, sync shouldn't call sync_init\|process\|activate and abort functions. It happens very rare, but in process of unloading all services, totem can recreate membership and bad things can happen (service is unloaded, so there may be access to already freed memory, ...) Solution is to fetch services sync handlers in every time when we are building service list instead of using precreated one. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-08-02 09:34:58 +02:00
Jan Friesse	9fb7979370	Introduce SERVICES_COUNT_MAX macro Sync/service was using maximal number of services in either numeric form (magic constant) or inconsistently, this means using SERVICE_HANDLER_MAXIMUM_COUNT which means maximal number of handlers. New macro solves this. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-08-02 09:32:05 +02:00
Jan Friesse	537bf56fcc	cpg: Be more verbose for procjoin message Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-07-30 10:22:16 +02:00
Jan Friesse	04dac3ff5d	Correctly free state string in wd Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-07-12 15:53:04 +02:00
Jan Friesse	e4d75d1ab3	Revert "Free state variable allocated in wd_resource_state_is_ok" This reverts commit `01c63ca17c`.	2012-07-11 17:04:41 +02:00
Jan Friesse	a966506c1e	cpg: Enhance downlist selection algorithm Let's say we have 2 nodes: - node 2 is paused - node 1 create membership (one node) - node 2 is unpaused Result is that node 1 downlist is selected, so it means that from node 2 point of view, node 1 was never down. Patch solves situation by adding additional check for largest previous membership. So current tests are: 1) largest (previous #nodes - #nodes known to have left) 2) (then) largest previous membership 3) (and last as a tie-breaker) node with smallest nodeid Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:42 +02:00
Jan Friesse	f3457c5d49	cpg: Print cpg name to debug informations In downlist and joinlist debug output group was printed in nonsense format of integer to pointer to array. Now it's printed by full name. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:39 +02:00
Jan Friesse	35446d6bcc	cpg: Process join list after downlists let's say following situation will happen: - we have 3 nodes - on wire messages looks like D1,J1,D2,J2,D3,J3 (D is downlist, J is joinlist) - let's say, D1 and D3 contains node 2 - it means that J2 is applied, but right after that, D1 (or D3) is applied what means, node 2 is again considered down It's solved by collecting joinlists and apply them after downlist, so order is: - apply best matching downlist - apply all joinlists Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:35 +02:00
Jan Friesse	816d7687b0	cpg: Never choose downlist with localnode Test scenario is follows: - node 1, node 2 - node 1 is paused - node 2 sees node 1 dead - node 1 unpaused - node 1 and 2 both choose same downlist message which includes node 2 -> node 2 is effectively disconnected Patch includes additional test if left_node is localnode. If so, such downlist is ignored. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:32 +02:00
Jerome FLESCH	99faa3b864	When flushing, discard only memb_join messages Patch solves problem when 1 ring out of 2 went up/down quite often. The simplest setup to reproduce bug is following: - 2 VMs, connected by 2 network interfaces - OS: Linux - On one of the VMs, a test program sending some CPG messages (see the script "test_corosync.sh" joined to this mail for example) Here are the Corosync logs we get when we do this setup: Jun 06 16:23:40 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. Jun 06 16:23:40 corosync [CPG ] chosen downlist: sender r(0) ip(192.168.56.104) r(1) ip(192.168.57.104) ; members(old:1 left:0) Jun 06 16:23:40 corosync [MAIN ] Completed service synchronization, ready to provide service. Jun 06 16:24:37 corosync [TOTEM ] Marking ringid 1 interface 192.168.57.105 FAULTY Jun 06 16:24:38 corosync [TOTEM ] Automatically recovered ring 1 Jun 06 16:25:33 corosync [TOTEM ] Marking ringid 1 interface 192.168.57.105 FAULTY Jun 06 16:25:34 corosync [TOTEM ] Automatically recovered ring 1 Jun 06 16:26:35 corosync [TOTEM ] Marking ringid 1 interface 192.168.57.105 FAULTY Jun 06 16:26:36 corosync [TOTEM ] Automatically recovered ring 1 (...) The second ring goes down about every 2 minutes and automatically back up right after. We spent some time looking for the commit that introduced this bug, and it appears it's due to the following one: Corosync 1.3.3 -> 1.3.4: `e27a58d93d` Corosync 1.4.1 -> 1.4.2: `be608c0502` Commit message: Ignore memb_join messages during flush operations I had a look at this commit, and it seems to me it's dropping too many packets: Because of this commit, while totemrrp_recv_flush() is called, Corosync drops memb_join packets, but also ORF tokens. In the end, it seems that sometimes, we drop so many of them that Corosync marks the ring as faulty. To fix that, only memb_join messages are dropped now. Signed-off-by: Jerome FLESCH <jerome.flesch@netasq.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-06-11 10:59:30 +02:00
Jan Friesse	2766e57ce5	Store fdata with timestamp and pid in name This should allow easier handling of various blackbox dumps. Original fdata name is now symlink to latest created dump. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-06-05 12:19:42 +02:00
Jan Friesse	7ce332a713	totemudpu: Bind sending sockets to bindto address Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-31 09:28:52 +02:00
Fabio M. Di Nitto	f008cf442c	rename mainconfig to logconfig Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-05-29 09:36:00 +02:00
Fabio M. Di Nitto	b283ef8f12	mainconfig: allow mainconfig logic to be used both internally and externally corosync logging configuration logic is rather complex and in order to make it simpler to reuse (at least within corosync/ tree) we need to be able to use both icmap and cmap. the patch might seem controversial, but it reduces heaps of code around from qdevices (coming next). It might be useful to consider moving this to a common shared library but there aren't enough users yet and a shared lib would force corosync to link with cmap (that we do not want at all costs) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-29 09:04:03 +02:00
Angus Salkeld	5831136c87	LOG: make sure the log target is enabled. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-05-29 14:02:42 +10:00
Angus Salkeld	e6b35bdb7a	LOG: handle closing unused logfiles better This fixes a bug where having a second log file will close the previous one. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-05-29 14:02:42 +10:00
Angus Salkeld	e6afc761fe	LOG: be more explicit about the qb file names else we can get messages being put in the wrong subsys. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-05-29 14:02:42 +10:00
Jan Friesse	2894f33c4f	totemip: Support bind to exact address Logic for binding now works in the following way: - Try to find an exact match - If no exact match is found, use the first found network address This allows setting a concrete IP even if network settings contain two IPs on the same network. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-24 14:01:12 +02:00
Jan Friesse	aaa575e091	totemip: insert items in correct order list_add_tail is used instead of list_add so ip addresses are inserted in same order as returned by getifaddrs. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-24 14:01:08 +02:00
Jan Friesse	0791f44c41	Include ringid in processor joined log message This should help correlate syslog entries with their blackbox counterparts. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Andrew Beekhof <andrew@beekhof.net>	2012-05-17 14:58:04 +02:00
Fabio M. Di Nitto	f2444effd0	icmap: don't leak memory when changing ro/rw status on a key Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto	1dcb2d43d9	icmap: fix valgrind errors (pass 1) clean up a lot of allocated blocks at exit. those changes have no runtime effects, but they make valgrind output a bit more useful by dropping over 700 errors/warnings to skip over every single run. there are still a few icmap related valgrind errors but those need some more complex and time-consuming investigation. pre patch: ==21844== HEAP SUMMARY: ==21844== in use at exit: 1,229,321 bytes in 1,516 blocks ==21844== total heap usage: 7,191 allocs, 5,675 frees, 3,819,853 bytes allocated ==21844== LEAK SUMMARY: ==21844== definitely lost: 3,617 bytes in 11 blocks ==21844== indirectly lost: 21,960 bytes in 11 blocks ==21844== possibly lost: 1,080,101 bytes in 131 blocks ==21844== still reachable: 123,643 bytes in 1,363 blocks ==21844== suppressed: 0 bytes in 0 blocks ==21844== ERROR SUMMARY: 136 errors from 136 contexts (suppressed: 0 from 0) post patch: ==25793== HEAP SUMMARY: ==25793== in use at exit: 1,185,870 bytes in 808 blocks ==25793== total heap usage: 9,427 allocs, 8,619 frees, 4,156,841 bytes allocated ==25793== LEAK SUMMARY: ==25793== definitely lost: 3,697 bytes in 12 blocks ==25793== indirectly lost: 22,248 bytes in 13 blocks ==25793== possibly lost: 1,079,655 bytes in 113 blocks ==25793== still reachable: 80,270 bytes in 670 blocks ==25793== suppressed: 0 bytes in 0 blocks ==25793== ERROR SUMMARY: 119 errors from 119 contexts (suppressed: 0 from 0) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto	d2872aec70	crypto init: release *_slot resource after init Those are only used at init phase and we can free some memory for the system. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-04-20 10:57:16 +02:00
Fabio M. Di Nitto	b34c1e2870	ipcs: allow connections only after all services are ready this fixes a rather annoying race condition at startup where a client connects to corosync "too fast" before the service is ready to operate and client gets some random data during initialization phase. With this fix, we allow connections to ipc only after the main engine is operational and configured (and after the first totem transition). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-04-16 13:39:03 +02:00
Jan Friesse	f89d7b715f	Always allocate totemrrp stats array This prevents segfault when rrp mode is set with only one ring. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-04-10 09:08:42 +02:00
Jan Friesse	92ead6106f	Properly parse uidgid files Full path to key is now tested rather than key name only. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-04-10 09:08:36 +02:00
Fabio M. Di Nitto	cde4468581	totemcrypt: fix build warning (unused variable) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-27 12:06:46 +02:00
Fabio M. Di Nitto	4378915a33	totemcrypto: major code cleanup (no functional or onwire changes) - cleanup include list - reorder code and functions (crypto then hash) - split crypt/decrypt/hash functions - some micro optimizations by dropping a few memcpy - make the code more readable (better var names and buffers mapping) - improve exit paths on error (return codes and free) - store crypto header size instead of recalculating it per packet Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-27 11:43:07 +02:00
Jan Friesse	e925f42165	Make ifaces_get work with dynamic no_rings The commit which added the number of addresses to the srp_address structure didn't account for totemsrp_ifaces_get, where the whole structure was copied instead of the addresses only. This is now fixed. Also, to make the totempg API forward compatible, the size of the interfaces array must be passed to ifaces_get-like functions to prevent memory overwrite. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-26 11:54:26 +02:00
Jan Friesse	124ff4339c	Add no_addrs field in srp_addr structure This should allow us future change to dynamic number of rings without breaking wire compatibility. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-22 14:03:38 +01:00
Jan Friesse	7a0a39b949	Mark few more icmap keys as read only Also most of the key settings are now centralized in one function, so it's easier to audit. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-16 09:37:25 +01:00


1917 Commits