mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2026-02-05 15:04:25 +00:00

Author	SHA1	Message	Date
Jan Friesse	72cf15af27	votequorum: Do not process events during reload During reload, local_node_pos is deleted and its reinstatement is handled in totemconfig after the reload is finished. votequorum handles these events and tries to reload its configuration. This led to logging slightly scary messages (even though nothing bad is happening, because after local_node_pos is reinstated everything is back to normal). Solution is to stop processing events during reload. Sadly, simple tracking of config.reload_in_progress doesn't work because the LibQB event triggering order is undefined, so the votequorum reload handler can be called before totemconfig (and before local_node_pos is reinstated). So a new config.totemconfig_reload_in_progress key is defined with very similar semantics to config.reload_in_progress but set inside the totem_reload_notify function. Votequorum then uses this new key. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-27 11:40:21 +02:00
Jan Friesse	c8e3f14fdb	Make config.reload_in_progress key read only It's not very good idea to allow user apps changing internal key reload_in_progress. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-27 11:40:18 +02:00
Jan Friesse	4e9716ed30	coroparse: More strict numbers parsing Previous safe_atoi didn't check the range of input values, so if, for example, the user used -1 as the token timeout, it was converted to UINT32_MAX without letting the user know. Another safe_atoi problem was using strtol. This works pretty well on 64-bit systems, where a long integer is usually 64 bits long; sadly, on 32-bit systems it is usually 32 bits long. And because strtol returns a signed integer, it was not possible to enter a 32-bit value with the highest bit set. Solution is to use strtoll, whose return type is guaranteed to be at least 64 bits long, and check the value range. The error message now also contains information about the expected value range. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-12 14:49:00 +02:00
Jan Friesse	da46ecfc30	Move ringid store and load from totem library Functions for storing and loading the ring id were in the totem library. This causes a problem: what to do when it's impossible to load or store the ring id. The easy solution seemed to be assert, but sadly this makes it hard for the user to find out what happened (because corosync was just aborted and logsys didn't flush). Solution is to move these functions to main.c, where it is much easier to handle the error. This also makes libtotem free of any file system operations. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:54:57 +02:00
Jan Friesse	d310b251c3	Introduce get_run_dir function Run dir (LOCALSTATEDIR/lib/corosync) was hardcoded thru whole codebase. Totemsrp was trying to create and chdir into it, but also takes into account environment variable COROSYNC_RUN_DIR creating inconsistency. get_run_dir correctly returns COROSYNC_RUN_DIR (when set) or LOCALSTATEDIR/lib/corosync. This is now used by all functions instead of hardcoded string. All occurrences of mkdir/chdir are removed from totemsrp and chdir is now called in main function. Mkdir call is completely removed, because it was not used anyway (check in main.c was called before totemsrp init, so mkdir was never called) and also make install and/or package system should take care of creating this directory with correct permissions/context. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:53:18 +02:00
Jan Friesse	8f13a98320	logsys: Log warning if flightrecorder init fails Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:36:10 +02:00
Jan Friesse	19c5b63ff5	logsys: Log error if blackbox cannot be created Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:36:08 +02:00
Jan Friesse	e905f92bf5	totemiba: Fix incorrect failed log message rdma_join_multicast failed ... message parameters were swapped. Also, information about multicast join is now logged as notice. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2014-05-15 15:28:51 +02:00
Yevheniy Demchenko	4d6a18d8a5	totemiba: Add multicast recovery Totemiba wasn't able to survive SubnetManager handover or restart. If SM was migrated to another node, corosync logged "multicast error" and lost connectivity. This commit should solve this situation. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-05-14 14:51:07 +02:00
hfu	d0dc9ae93c	Indent: Remove newline before else branch start Signed-off-by: hfu <askfuhu@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-05-09 11:38:02 +02:00
hfu	b6e2c8024d	Indent: Remove space in negation of expression Signed-off-by: hfu <askfuhu@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-05-09 11:37:47 +02:00
Jan Friesse	7557fdec48	config: Allow dynamic change of token_coefficient A token_coefficient change in cmap didn't trigger a change, so the only way to change token_coefficient was to edit the config file and reload. The patch lets the key totem.token_coefficient be processed so token_coefficient can be changed dynamically. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-05-07 15:55:26 +02:00
Jan Friesse	58176d6779	Add token_coefficient option Token coefficient is used only when nodelist is specified and contains at least 3 nodes. If so, real token timeout is then computed as token + (number_of_nodes - 2) * token_coefficient. This allows cluster to scale without manually changing token timeout every time new node is added. This value can be set to 0 resulting in effective removal of this feature. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:17 +01:00
Jan Friesse	9a8de87c34	totemconfig: Log errors on key change and reload When a volatile key was changed (cmap set or reload) and the checks failed, nothing was logged. Values are now checked and an error string is logged on problems. Also, totem_config is dumped to the log (DEBUG level) after every volatile key change and every reload. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:14 +01:00
Jan Friesse	b95ebd640e	totemconfig: Key change process dependencies When a key with a dependency was changed, dependent keys were not recomputed. A nice example is the consensus timeout. If the token timeout was changed, the consensus timeout was not recomputed correctly (neither via cmap change of the key nor via cfg reload). Solution is an almost complete refactor of the handling of volatile defaults. totem_volatile_config_read now handles not only storing the cmap key to the totem_config structure, but also checking of existence, comparing with zero value and properly storing defaults. totem_set_volatile_defaults is gone. Its function was split into the totem_volatile_config_read and totem_volatile_config_validate functions. The reload callback and the key change callback are now mostly the same function and both call totem_volatile_config_read. Patch also fixes a small memory leak. The totem.vsftype key has not been used for a long time and the original totem_volatile_config_read wasn't freeing allocated memory returned by icmap_get_string. The whole reading of totem.vsftype is removed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:12 +01:00
Jan Friesse	eeb2384157	Really clear totemconfig nodes on reload When reload was called nodes were constantly added to totemconfig nodelist. So simple corosync-cfgtool -R resulted very quickly in filling whole array and segfault. Solution is to clear member_count. Clearing is also moved directly to put_nodelist_members_to_config to make sure it's always processed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:09 +01:00
Jan Friesse	1b6abcc7d5	Log: Make reload of logging work When reload was called multiple times (~20), logging to file stopped working. The main problem was that the log file was opened multiple times, because even though target_id was shared via subsystem loggers, the file name was not. Solution is to ALWAYS set the proper log file name into the subsystem logger (a copy is stored). This not only fixes the problem but also removes a small leak. Also, if the filename didn't change, the function can return sooner. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:13:33 +01:00
Jan Friesse	2f0cad20a9	config: Handle totem_set_volatile_defaults errors When totem_set_volatile_defaults is called from totem_config_validate return code is unchecked. It's then perfectly possible to set (for example) join timeout to very small value (1) and consensus value is then set to 0 making corosync unable to create membership. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-17 10:04:00 +01:00
Jan Friesse	e1801ba497	votequorum: Properly initialize atb and atb_string icmap_get_* behavior is to NOT modify the passed variable when it doesn't succeed. So we must initialize the variable before the icmap_get_* call. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2014-02-26 16:59:02 +01:00
Jan Friesse	ff67daa55f	mon: Make monitoring work Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:20 +01:00
Jan Friesse	099f704cdd	mon: Pass correct pointer to inst Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:16 +01:00
Jan Friesse	57ff693b70	mon: Fix comparsion typo Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:13 +01:00
Jan Friesse	e1e2390b61	mon: Make mon compilable with libstatgrab ver 0.9 Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:10 +01:00
Jan Friesse	fbe8768f1b	cpg: Make sure left nodes are really removed When a node is paused and other nodes have in the meantime exited a cpg process, the paused node doesn't update its membership correctly after resume, so the exited cpg process is still visible on the previously paused node. Solution is to compare the join list with cpd and remove all pids which are not included in the join list. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:14 +01:00
Jan Friesse	83c63b247f	cpg: Make sure nodid is always logged as hex num Also number is prefixed by 0x so it's easier to spot that number is hexadecimal. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:10 +01:00
Jan Friesse	fcf26e0303	cpg: Refactor mh_req_exec_cpg_procleave Most of functionality is moved to do_proc_leave function to make it reusable. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:05 +01:00
Jan Friesse	38c04d9a66	totemsrp: Fix typo with cont gather Patch `f3ffd3da5c` introduced named states of state-machine, but sadly contains logical problem causing stats.continuous_gather increasing even when it shouldn't. Problem is not critical, because continuous_gather is set to 0 on successful membership creation. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-18 16:12:57 +01:00
Christine Caulfield	90d448af3b	votequorum: Add extended options to auto_tie_breaker This patch adds more flexibility to the auto_tie_breaker feature of votequorum. With this, not only can the lowest nodeid be used as a tie breaker, but also the highest, or a node from a nominated list. If there is a list of nodes, the first node in the list that was not part of the previous partition is used. This allows the user to specify a preferred set of nodes but prevents a split-brain if the cluster divides evenly with a node in each half. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-02-17 16:29:45 +00:00
Masatake YAMATO	fa71067a93	Free object allocated at quorum_register_callback Memory object allocated with malloc at quorum_register_callback is not freed. The object is linked to internal_trackers_list. The object is unlinked at quorum_unregister_callback. However, it is not freed at the function. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-01-23 17:18:44 +01:00
Jan Friesse	45dd9861ff	Properly check result of symlink Error message is displayed when it's impossible to create symlink to fdata file. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:31 +01:00
Jan Friesse	5c54f941ac	Fix cppchecks warning Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:29 +01:00
Jan Friesse	178c0d82d9	Close devnull file handler Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:26 +01:00
Jason	cfbb021e13	totem: Drop invalid join msg in operational state According to the totem paper, if a processor receives a join message in the operational state and the receiver's identifier is in the join message's fail list, then the join message should be ignored. By applying this validation of join messages, we can avoid unnecessary switching from operational state to gather state (which can even leave rings unable to merge), as in the following scenario. 1. Initially, there is only one ring containing three nodes, say ring(A,B,C). 2. The network between A and B partitions and, at the same time, C goes down. 3. Node A sends join message with proclist:A,B,C. faillist:NULL. Node B sends join message with proclist:A,B,C. faillist:NULL. 4. Both A and B hit the consensus timeout due to the network partition. 5. The network between A and B is remerged. 6. Node A sends join message with proclist:A,B,C. faillist:B,C. and creates ring(A). Node B sends join message with proclist:A,B,C. faillist:A,C. and creates ring(B). 7. Say the join message with proclist:A,B,C. faillist:A,C sent by node B is received by node A because the network remerged. 8. Node A shifts to gather state and sends out a modified join message with proclist:A,B,C. faillist:B. Such a join message will prevent both A and B from merging. 9. Node A hits the consensus timeout (caused by waiting for node C) and sends join message with proclist:A,B,C. faillist:B,C again. The same thing happens on node B, so A and B will loop forever in steps 7, 8 and 9. As the paper also said: "If a processor receives a join message in the operational state and if the sender's identifier is in the receiver's my_proclist and the join message's ring_seq is less than the receiver's ring sequence number, then it ignores the join message too." So this patch applies these validations of join messages altogether. Signed-off-by: Jason <huzhijiang@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-01-13 14:46:13 +01:00
Christine Caulfield	ff6a43edb3	votequorum: Add persistent expected_votes tracking. This patch adds the option to store expected_votes to persistent storage. This is needed for allow_downscale to operate properly. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-07 15:30:11 +00:00
Jan Friesse	b88c0766fe	logsys: Make logging of totem work again Because of a change in libqb (9abb686), logging of the TOTEM subsystem stopped working. Instead of relying on the previous behavior (implicit substring match), all totem files are now explicitly given. The QB subsystem also now uses a comma-separated file list instead of the previous function calls. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-11-04 12:32:35 +01:00
Masatake YAMATO	f3ffd3da5c	totemsrp: Show English message when memb_state_gather_enter is called The reason why memb_state_gather_enter is invoked was printed in integer code. This patch introduces human readable English messages for the code. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-10-24 16:46:17 +02:00
Yevheniy Demchenko	805b3423ee	totemiba: Check if configured MTU is allowed by HW Solution uses an approximation of totem structures. This needs to be rewritten in a proper way. Also, MTU checking should be implemented for IP transports. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:27:08 +02:00
Yevheniy Demchenko	8f14a5788f	totemiba: Fix parameters position for poll_add Parameters in functions like mcast_cq_send_event_fn, ... were defined in incorrect order. Also their names were weird. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:50 +02:00
Yevheniy Demchenko	c5d4a0762f	totemiba: Del channel fd from poll before destroy Corosync freezes after several peer node connects/disconnects. The freeze happens in recv_token_cq_recv_event_fn in the ibv_get_cq_event call. The problem is that after each peer node connect, recv_token_accept_destroy is called, which tries to call poll_dispatch_delete _after_ freeing of completion_channel. As completion_channel contains the fd, handlers are not disconnected from the poller properly. This leads to complete inconsistency in subsequent calls to handlers. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:04 +02:00
Yevheniy Demchenko	5046de387b	totemiba: Properly allocate RDMA buffers 1. In UD mode the receiving side of an RDMA application should have enough space in the buffer to hold data and GRH. Also, sge.length on the receiving side should be set to max_msg_size + sizeof (struct ibv_grh). Current corosync doesn't take the GRH into account and does not work if mtu is set to the real mtu of the IB port (it works if netmtu is set to < 2048-40). 2. ibv_wc.byte_len is the actual length of the received packet, i.e. msg_len + GRH. The GRH length should be subtracted in further processing. If not, it might cause problems when messages get retransmitted, as their apparent size will constantly grow. 3. Current corosync will not work with rdma and mtus > 2048. Most modern IB HW supports 4096 mtu. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:00 +02:00
Christine Caulfield	1a046793cb	Reload: Add atomic reload to log config When a reload is in progress, wait until it has all finished before re-reading all of the logging parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:10:07 +01:00
Christine Caulfield	c0bfd48928	Reload: Add atomic reload to totemconfig When a reload is in progress, wait until the whole thing has finished before setting parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:55 +01:00
Christine Caulfield	82fbffc34b	Reload: Add reload code to cfg Add the code to do the actual corosync.conf reload to cfg, along with a corosync-cfgtool -R command to trigger it Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:41 +01:00
Christine Caulfield	bc47c583bd	Reload: Make coroparse use a designated icmap hash table Pass an icmap hashtable into coroparse so we can load it into a temporary one during reload Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:06 +01:00
Jan Friesse	95133a5d77	icmap: Add func to test equality of two key values Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-09-10 17:02:12 +02:00
Christine Caulfield	8567887abb	[PATCH] Replace freopen with open/dup2 when daemonizing This patch replaces the existing freopen method of forcing stdin/out/err to /dev/null with the more usual system of open/dup2. While I don't like posting patches I don't fully understand, this patch seems to fix a problem where stdout/err get assigned to a socket causing double logging output on systemd. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-10 15:33:31 +01:00
Christine Caulfield	3663622576	Add log message to exit signal handler I've seen a few instances where corosync has shut down for apparently 'no reason'. In fact most of the time the shutdown has been caused by an external source (often an init script) but it's not been obvious what has happened and people implicate the daemon. This patch simply adds a log message to the signal handler when it is called so that the cause of the shutdown is obvious. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-03 14:04:50 +01:00
Jan Friesse	26ef8e15db	icmap: Add map copy function Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:46 +02:00
Jan Friesse	e363f8b06d	icmap: Add function to return item data pointer icmap_get_r is now implemented using this function. Function is not very safe tho defined as static. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:41 +02:00
Jan Friesse	624cd439aa	icmap: Fix value len checking for strings Implementation should allow pass only parts of string (shorten string) and must prohibit reading of uninitialized memory. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:37 +02:00
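
The "Add token_coefficient option" entry (58176d6779) above gives the effective token timeout as token + (number_of_nodes - 2) * token_coefficient, applied only when a nodelist with at least 3 nodes is present. The following is a minimal sketch of that arithmetic, not corosync's actual code: the function names are hypothetical, a node count of 0 is assumed to stand for "no nodelist specified", and the millisecond values in the examples are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a corosync function): compute the effective
 * token timeout as described in the commit message. number_of_nodes == 0
 * is assumed here to mean that no nodelist was configured.
 * Overflow of the 32-bit result is not guarded against in this sketch.
 */
uint32_t effective_token_timeout(uint32_t token, uint32_t token_coefficient,
                                 unsigned int number_of_nodes)
{
    /* The coefficient only applies to a nodelist with at least 3 nodes. */
    if (number_of_nodes < 3) {
        return token;
    }

    return token + (uint32_t)(number_of_nodes - 2) * token_coefficient;
}

/* Worked examples with arbitrary values; call from main() or a test. */
void token_timeout_examples(void)
{
    /* Two nodes: the coefficient is ignored. */
    assert(effective_token_timeout(1000, 650, 2) == 1000);
    /* Five nodes: 1000 + (5 - 2) * 650 = 2950. */
    assert(effective_token_timeout(1000, 650, 5) == 2950);
    /* A coefficient of 0 effectively removes the feature. */
    assert(effective_token_timeout(1000, 0, 16) == 1000);
}
```

With a coefficient of 0 the result is always the plain token value, which matches the commit's note that 0 effectively disables the feature.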


1779 Commits
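
The "coroparse: More strict numbers parsing" entry (4e9716ed30) describes replacing strtol with strtoll and adding a range check so that input such as -1 is rejected instead of being silently converted to UINT32_MAX. The sketch below illustrates that general technique using only the standard C library. It is not the safe_atoi from corosync: the function name, signature, and return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not corosync's safe_atoi): parse a decimal string
 * and accept it only if it lies within [min_val, max_val].
 * Returns 0 and stores the value in *res on success, -1 otherwise.
 */
int parse_ranged_number(const char *str, long long min_val,
                        long long max_val, long long *res)
{
    char *endptr;
    long long val;

    assert(res != NULL);

    if (str == NULL || *str == '\0') {
        return -1;
    }

    errno = 0;
    /* long long is at least 64 bits wide, even on 32-bit systems. */
    val = strtoll(str, &endptr, 10);

    if (errno == ERANGE) {
        return -1;      /* does not fit into long long */
    }
    if (endptr == str || *endptr != '\0') {
        return -1;      /* no digits, or trailing garbage */
    }
    if (val < min_val || val > max_val) {
        return -1;      /* outside the expected range, e.g. -1 */
    }

    *res = val;
    return 0;
}
```

Because the parsed value is held in a long long, a caller can pass 0 and 4294967295 as the bounds to accept any 32-bit unsigned value, including those with the highest bit set, while still rejecting negative input.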