mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-10-18 01:42:05 +00:00

Author	SHA1	Message	Date
Jan Friesse	57ff693b70	mon: Fix comparsion typo Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:13 +01:00
Jan Friesse	e1e2390b61	mon: Make mon compilable with libstatgrab ver 0.9 Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:10 +01:00
Jan Friesse	fbe8768f1b	cpg: Make sure left nodes are really removed When a node is paused and other nodes have in the meantime exited the cpg process, the paused node doesn't update its membership correctly after resume, so the exited cpg process is still visible on the previously paused node. Solution is to compare join list with cpd and remove all pids which are not included in join list. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:14 +01:00
Jan Friesse	83c63b247f	cpg: Make sure nodid is always logged as hex num Also number is prefixed by 0x so it's easier to spot that number is hexadecimal. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:10 +01:00
Jan Friesse	fcf26e0303	cpg: Refactor mh_req_exec_cpg_procleave Most of functionality is moved to do_proc_leave function to make it reusable. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:05 +01:00
Jan Friesse	38c04d9a66	totemsrp: Fix typo with cont gather Patch `f3ffd3da5c` introduced named states of state-machine, but sadly contains logical problem causing stats.continuous_gather increasing even when it shouldn't. Problem is not critical, because continuous_gather is set to 0 on successful membership creation. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-18 16:12:57 +01:00
Christine Caulfield	90d448af3b	votequorum: Add extended options to auto_tie_breaker This patch adds more flexibility to the auto_tie_breaker feature of votequorum. With this, not only can the lowest nodeid be used as a tie breaker, but also the highest, or a node from a nominated list. If there is a list of nodes, the first node in the list that was not part of the previous partition is used. This allows the user to specify a preferred set of nodes but prevents a split-brain if the cluster divides evenly with a node in each half. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-02-17 16:29:45 +00:00
Masatake YAMATO	fa71067a93	Free object allocated at quorum_register_callback Memory object allocated with malloc at quorum_register_callback is not freed. The object is linked to internal_trackers_list. The object is unlinked at quorum_unregister_callback. However, it is not freed at the function. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-01-23 17:18:44 +01:00
Jan Friesse	45dd9861ff	Properly check result of symlink Error message is displayed when it's impossible to create symlink to fdata file. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:31 +01:00
Jan Friesse	5c54f941ac	Fix cppchecks warning Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:29 +01:00
Jan Friesse	178c0d82d9	Close devnull file handler Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:26 +01:00
Jason	cfbb021e13	totem: Drop invalid join msg in operational state According to the totem paper, if a processor receives a join message in the operational state and if the receivers identifier is in the join messages fail list, then join message should be ignored. By applying this validation of join messages, we can avoid unnecessary switching from operational state to gather state(or even lead to rings can not be merged) like the following to happen. 1. Initially, there is only one ring contains three nodes, say ring(A,B,C). 2. A and B network partition, "in the same time", C is down. 3. Node A sends join message with proclist:A,B,C. faillist:NULL. Node B sends join message with proclist:A,B,C. faillist:NULL. 4. Both A and B consensus timeout due to network partition. 5. A and B network remerged. 6. Node A sends join message with proclist:A,B,C. faillist:B,C. and create ring(A). Node B sends join message with proclist:A,B,C. faillist:A,C. and create ring(B). 7. Say join message with proclist:A,B,C. faillist:A,C which sent by node B is received by node A because network remerged. 8. Node A shifts to gather state and send out a modified join message with proclist:A,B,C. faillist:B. Such join message will prevent both A and B from merging. 9. Node A consensus timeout (caused by waiting node C) and sends join message with proclist:A,B,C. faillist:B,C again. Same thing happens on node B, so A and B will dead loop forever in step 7, 8 and 9. As the paper also said: "If a processor receives a join message in the operational state and if the sender's identifier is in the receiver's my_proclist and the join message's ring_seq is less than the receiver's ring sequence number, then it ignores the join message too." So these patch applying these validations of join messages altogether. Signed-off-by: Jason <huzhijiang@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-01-13 14:46:13 +01:00
Christine Caulfield	ff6a43edb3	votequorum: Add persistent expected_votes tracking. This patch adds the option to store expected_votes to persistent storage. This is needed to allow_downscale to operate properly. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-07 15:30:11 +00:00
Jan Friesse	b88c0766fe	logsys: Make logging of totem work again Because of change in libqb (9abb686) logging of TOTEM subsystem stopped working. Instead of rely on previous behavior (implicit substring match), all totem files are now explicitly given. Also QB subsystem now uses comma separated filelist instead of previous function calling. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-11-04 12:32:35 +01:00
Masatake YAMATO	f3ffd3da5c	totemsrp: Show English message when memb_state_gather_enter is called The reason why memb_state_gather_enter is invoked was printed in integer code. This patch introduces human readable English messages for the code. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-10-24 16:46:17 +02:00
Yevheniy Demchenko	805b3423ee	totemiba: Check if configured MTU is allowed by HW Solution uses an approximation of totem structures. This needs to be rewritten in a proper way. Also MTU checking should be implemented for IP transports. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:27:08 +02:00
Yevheniy Demchenko	8f14a5788f	totemiba: Fix parameters position for poll_add Parameters in functions like mcast_cq_send_event_fn, ... were defined in incorrect order. Also their names were weird. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:50 +02:00
Yevheniy Demchenko	c5d4a0762f	totemiba: Del channel fd from poll before destroy Corosync freezes after several peer node connects/disconnects. The freeze happens in recv_token_cq_recv_event_fn in ibv_get_cq_event call. The problems is in fact, that after each peer node connect, recv_token_accept_destroy is called, which tries to call poll_dispatch_delete _after_ freeing of completion_channel. As completion_channel contains fd, handlers are not disconnected from poller properly. This leads to complete inconsistency in subsequent calls to handlers. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:04 +02:00
Yevheniy Demchenko	5046de387b	totemiba: Properly allocate RDMA buffers 1. In UD mode the receiving side of an RDMA application should have enough space in the buffer to hold data and GRH. Also, sge.length on the receiving side should be set to max_msg_size + sizeof (struct ibv_grh). Current corosync doesn't take the GRH into account and does not work if mtu is set to the real mtu of the IB port (it works if netmtu is set to < 2048-40). 2. ibv_wc.byte_len is the actual length of the received packet, i.e. msg_len + GRH. GRH length should be subtracted in further processing. If not, it might cause problems when messages get retransmitted, as their apparent size will constantly grow. 3. Current corosync will not work with rdma and mtus > 2048. Most modern IB HW supports 4096 mtu. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:00 +02:00
Christine Caulfield	1a046793cb	Reload: Add atomic reload to log config When a reload is in progress, wait until it has all finished before re-reading all of the logging parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:10:07 +01:00
Christine Caulfield	c0bfd48928	Reload: Add atomic reload to totemconfig When a reload is in progress, wait until the whole thing has finished before setting parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:55 +01:00
Christine Caulfield	82fbffc34b	Reload: Add reload code to cfg Add the code to do the actual corosync.conf reload to cfg, along with a corosync-cfgtool -R command to trigger it Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:41 +01:00
Christine Caulfield	bc47c583bd	Reload: Make coroparse use a designated icmap hash table Pass an icmap hashtable into coroparse so we can load it into a temporary one during reload Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:06 +01:00
Jan Friesse	95133a5d77	icmap: Add func to test equality of two key values Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-09-10 17:02:12 +02:00
Christine Caulfield	8567887abb	[PATCH] Replace freopen with open/dup2 when daemonizing This patch replaces the existing freopen method of forcing stdin/out/err to /dev/null with the more usual system of open/dup2. While I don't like posting patches I don't fully understand, this patch seems to fix a problem where stdout/err get assigned to a socket causing double logging output on systemd. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-10 15:33:31 +01:00
Christine Caulfield	3663622576	Add log message to exit signal handler I've seen a few instances where corosync has shut down for apparently 'no reason'. In fact most of the time the shutdown has been caused by an external source (often an init script) but it's not been obvious what has happened and people implicate the daemon. This patch simply adds a log message to the signal handler when it is called so that the cause of the shutdown is obvious. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-03 14:04:50 +01:00
Jan Friesse	26ef8e15db	icmap: Add map copy function Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:46 +02:00
Jan Friesse	e363f8b06d	icmap: Add function to return item data pointer icmap_get_r is now implemented using this function. Function is not very safe tho defined as static. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:41 +02:00
Jan Friesse	624cd439aa	icmap: Fix value len checking for strings Implementation should allow pass only parts of string (shorten string) and must prohibit reading of uninitialized memory. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:37 +02:00
Jan Friesse	04ddddd6d2	icmap: Add function to return global icmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:32 +02:00
Jan Friesse	e5a528c5cb	icmap: Allow multiple icmap instances Patch adds reentrant version of most of functions (with exception of RO flags support and tracking) to allow multiple icmap instances existence inside corosync. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-27 15:23:52 +02:00
Michael Chapman	2740cfd1ea	Fix scheduler pause-detection timeout qb_loop_timer_add expects the timeout to be in nanoseconds, but we were passing the value in milliseconds. Scale the timeout appropriately. Signed-off-by: Michael Chapman <mike@very.puzzling.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-08-19 09:03:24 +02:00
David Vossel	b424acc3a0	ipc_glue: proper ref counting during service connection iteration Signed-off-by: David Vossel <dvossel@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-07-04 13:05:52 +02:00
David Vossel	aa8e56a0fe	ipc_glue: Remove connection unref with no matching reference. We don't reference the connection object on creation, so there is no reason to dereference it on disconnect. Signed-off-by: David Vossel <dvossel@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-07-04 13:05:36 +02:00
David Vossel	771b239603	ipc_glue: Fixes connection ref count leak Signed-off-by: David Vossel <dvossel@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-07-04 13:05:02 +02:00
Christine Caulfield	074e57910e	The corosync message "A processor joined or left the membership" is vague and unhelpful. People have to look for the following quorum message and try to deduce which nodes have joined or left from that and past membership messages, even though the routine printing the message already has this information to hand. This patch fixes that message so that it prints the nodeids of the nodes that have joined/left the cluster. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-By: Jan Friesse <jfriesse@redhat.com>	2013-06-27 14:44:46 +01:00
Jan Friesse	615d7592fb	Log: Output parse errors to syslog When corosync was started in daemon mode and there was parse error, no way existed how to find out what happened (this is usual situation with systemd enabled systems). Solution seems to be output to syslog by default. Also redundant line with setting logsys is removed because it's no longer needed, because FORK and THREADED mode options has no longer effect. FORK is handled by libqb by default and THREADED mode is forced by calling logsys_thread_start. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-21 11:21:42 +02:00
Jan Friesse	d6dd2e455d	totemconfig: Prevent leak of cluster_name str Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-21 11:21:33 +02:00
Jan Friesse	7cba14fb61	service: Fix memleak in service_unlink_and_exit Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-21 11:21:29 +02:00
Jan Friesse	514eb0f37d	ipc_glue: Check service name len Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	f7beba46c5	ipc_glue: Introduce constant for service name len Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	90da72cd7f	cfg: Check interface status and name length Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	335da1ecfd	cfg: Check number of interfaces Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	5dc3fc4bda	totemrrp: Make status string shorter Status string should be the same length as needed for cfg ringstatusget function. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:11 +02:00
Jan Friesse	845a625908	totem: Don't leak instance variable on crypto fail Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:25 +02:00
Jan Friesse	93286a344e	totemudpu: Handle fd leak in totemudpu Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:21 +02:00
Jan Friesse	421de34972	totemconfig: Check length of rrp_mode string Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:15 +02:00
Jan Friesse	675da75759	coroparse: Ensure that config items fits into cmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:05 +02:00
Jan Friesse	e094ab2e2c	votequorum: Prevent leak in qdevice_is_configured Also LEAVE from function is now properly logged. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-17 15:47:27 +02:00
Jan Friesse	4310d84e4d	Initialize error variable in ykd_init Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:57 +02:00
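
The two join-message checks described in commit cfbb021e13 above can be illustrated with a minimal sketch. This is not the code from that patch: the structs `join_msg` and `node_state`, the constant `MAX_NODES`, and both function names are hypothetical and exist only for this example. It assumes node identifiers are plain integers and that both lists are small fixed-size arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 16 /* hypothetical limit, chosen only for this sketch */

/* Hypothetical, simplified join message (not corosync's real layout). */
struct join_msg {
    uint32_t sender;               /* node id of the sender */
    uint64_t ring_seq;             /* ring sequence number carried by the message */
    uint32_t fail_list[MAX_NODES]; /* nodes the sender considers failed */
    size_t fail_count;
};

/* Hypothetical, simplified state of the receiving node. */
struct node_state {
    uint32_t my_id;
    uint64_t my_ring_seq;
    uint32_t my_proc_list[MAX_NODES];
    size_t my_proc_count;
};

/* Linear search for a node id in a short list. */
bool list_contains(const uint32_t *list, size_t count, uint32_t id)
{
    assert(count <= MAX_NODES);
    for (size_t i = 0; i < count; i++) {
        if (list[i] == id) {
            return true;
        }
    }
    return false;
}

/*
 * Returns true when a join message that arrives while the receiver is in
 * operational state should be ignored, following the two rules quoted in
 * the commit message.
 */
bool ignore_join_in_operational(const struct node_state *self,
                                const struct join_msg *msg)
{
    /* Rule 1: the receiver's identifier is in the message's fail list. */
    if (list_contains(msg->fail_list, msg->fail_count, self->my_id)) {
        return true;
    }
    /* Rule 2: sender is in my_proc_list and the message's ring_seq is older. */
    if (list_contains(self->my_proc_list, self->my_proc_count, msg->sender) &&
        msg->ring_seq < self->my_ring_seq) {
        return true;
    }
    return false;
}
```

In step 7 of the scenario in that commit message, node A (operational in ring(A)) receives node B's join message with fail list A,C. Under rule 1 the sketch reports that message as ignorable, so A would stay in operational state rather than shifting to gather.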

1758 Commits