mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2026-01-13 20:40:50 +00:00

Author	SHA1	Message	Date
Christine Caulfield	bd2e6b5d9d	votequorum: Don't update expected_votes display if value is too high If expected_votes was set via the library but the calculation decides it's too high, then an error is correctly returned but the value is still set in the nodes' expected_votes field and turns up in the corosync-quorumtool display. This patch separates out the quorum calculation from the updating of expected_votes per node to prevent this from happening. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-09-13 14:28:56 +01:00
Ferenc Wágner	cf10a754e9	Fix various typos occured -> occurred parantheses -> parentheses configuraton -> configuration aquire -> acquire retrive -> retrieve prefered -> preferred Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-09-12 09:50:11 +02:00
Jan Friesse	f837f95dfe	Config: Flag config uidgid entries Uidgid entries parsed from configuration files now have the prefix (uidgid.config.) so they are distinguishable from dynamically added entries. Entries added from the config file are pruned on reload if they no longer exist in the config file (dynamic ones stay unaffected). Also the whole uidgid.config. prefix is made read only. This makes PCMK work again after configuration reload is called. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-08-04 16:13:48 +02:00
HideoYamauchi	71c9035c27	Low: totemsrp: Addition of the log. Signed-off-by: HideoYamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-08-01 10:11:45 +02:00
Jan Friesse	1925074909	Fix few bugs found by coverity Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2016-06-28 13:58:43 +02:00
Christine Caulfield	0665aca9e1	quorum: revert patch that adds qdevice (node 0) to quorum callback Revert patch 9f54f0a1fad7dad42c55562a50dfb9d773e6a660 as it causes more troubles than it solves. Code that uses the quorum nodelist to get a list of actual nodes in the cluster for communication break using this as well as the display from corosync-quorumtool Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:43 +02:00
Christine Caulfield	c9c6d9e30f	quorum: Return qdevice nodeid in the quorum callbacks (if active). Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:41 +02:00
Christine Caulfield	e41b256c67	votequorum: Allow wait_for_all with qdevice Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:39 +02:00
Christine Caulfield	98548e1880	qnetd: lms: Fix search for node/ring_id check We were looking for us in other node lists, rather than others in our nodelist. Also, remove debug print in votequorum.c Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:39 +02:00
Christine Caulfield	3a5d51fca7	votequorum: Fix up quorum/nodelist callbacks This patch tidies the two state change callbacks and explains them in the man page: The difference between votequorum_nodelist_notification_t and votequorum_quorum_notification_t is subtle but important. The 'nodelist' callback is sent at the start of a cluster state transition and contains the new ring_id and only the list of nodes that are included in the sync state - i.e. only active nodes. No quorum information is included in this callback because it is not available at that time. The 'quorum' callback is sent after the cluster state transition has completed and does contain quorum information. In addition, the nodelist contains a list of all nodes known to votequorum (whether up or down) and their state as well as information about the quorum device attached (if any). quorum callbacks will not be sent for qdevice up and down events unless they affect quorum. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:39 +02:00
Christine Caulfield	cf0028c86e	votequorum: split callbacks into nodelist and quorum This split is needed for qdevice, so that it gets the ring_id and nodelist as part of the sync process and not afterwards - when quorum has been calculated. As this is an unsupported API I'm not too worried about breaking existing code - all the clients I know of are using the quorum API anyway as they should be. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:38 +02:00
Jan Friesse	44df76a7ee	config: get_cluster_mcast_addr error is not fatal Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2016-06-28 13:57:14 +02:00
Ferenc Wágner	c76ee39f61	Fix typo: Diabled -> disabled Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-06-22 14:26:48 +02:00
Ferenc Wágner	b1de8efd15	Fix typo: aquire -> acquire Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-06-22 14:26:28 +02:00
Ferenc Wágner	841f48e253	Fix typo: Uknown -> Unknown Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-06-22 14:26:22 +02:00
Christine Caulfield	f2a1fcc5bf	logconfig: Fix logging reload disabling logfiles In my previous logconfig patch, adding a subsys so the logging stanzas could disable logging to a file, because the subsys closed the file used by the main logging. This patch only applies defaults to higher-level logging and non-deprecated keys. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-27 17:36:30 +02:00
yuusuke	2ef086bd9b	wd: Warn if values are out of range Signed-off-by: yuusuke <yusk.iida@gmail.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-05-27 10:38:30 +02:00
yuusuke	39cd6b3d1d	parser: WD Read type correctly from corosync.conf Signed-off-by: yuusuke <yusk.iida@gmail.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-05-27 10:36:24 +02:00
Christine Caulfield	571b1621e9	Add some more RO keys Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-24 12:33:55 +02:00
Christine Caulfield	125848d80a	Reapply config defaults corosync.conf reload There were several places where defaults were not restored if the keys were removed from corosync.conf and the file reloaded. This patch adds those back so that reloading corosync.conf has the expected effect when keys are deleted. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-24 12:33:35 +02:00
Jan Friesse	b93d75abc4	schedwrk: Cleanup and make it work on PPC BE Schedwrk is passing hdb handle (64-bit) to totempg_callback_token_create as a context. Context is defined to be pointer, so there is conversion function which stores 64-bit hdb_handle into pointer. Potentially, pointer can be 32-bit. This means, check part of hdb is discarded (and have to get special no_check value in schedwrk_do) later. This works quite well on 32-bit Little-Endian system. Sadly on Big-Endian system, check partition of hdb is stored instead of value. Result is error of hdb_handle_get call. Proposed solution is to pass handle pointer to totempg_callback_token_create as context. This means full hdb (check + value) can be used in schedwrk_do (easier detection of memory corruption). Main reason for this patch is to remove usage of pointer as integer value. Small drawback of given solution is that handle pointer must be memory allocated on heap or static memory, making API more bug-prone. Current usage of schedwrk API across corosync always use memory in .text section (safe), so it's not a problem. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-05-17 16:29:25 +02:00
Valentin Vidic	8d8d4a936a	wd: make watchdog device configurable Add configuration option resources.watchdog_device allowing runtime selection of watchdog device. Useful for newer servers having more than one watchdog available (IPMI and iTCO). Special value "off" disables watchdog in configuration rather than just using build options. Useful when watchdog device is needed elsewhere (SBD cluster stonith service). Signed-off-by: Valentin Vidic <Valentin.Vidic@CARNet.hr> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-03 15:47:15 +02:00
Christine Caulfield	1e2de52ef1	logging: Use our own version of basename basename() function has some potentially odd issues on other platforms. So, to be safe, here's an internal version. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-03 15:31:29 +02:00
Christine Caulfield	d245831d65	logsys: fix TOTEM logging when corosync built out of tree If corosync is built out-of-tree (passing --srcdir to configure) then TOTEM logging doesn't print anything. This is caused by the source filenames (from __FILE__ at compilation time) having the configured path in them - in this example ../corosync/exec/totemudp.c etc. The list of totem source filenames passed to libqb logging facility only has the basenames so the filenames never match up as libqb does an exact string match. I looked into fixing this in libqb but it causes a regression. We can't simply basename() __FILE__ at the point of calling log_printf as it's also common to use __FILE__ to generate the logging source, and using basename() on both removes the distinction between similarly named files from different directories which could be a requirement. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-04-26 09:49:53 +01:00
Christine Caulfield	aab55a004b	parser: Make config file parser more hierarchy pass 'state' down the stack so that the state of the hierarchy doesn't get lost when there are unexpected items in the config hierarchy. Don't bother setting 'state' on SECTION_END as there's no point now we're going back up the stack. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-04-22 13:01:04 +02:00
Jan Friesse	60565b7da7	totemconfig: Explicitly pass IP version If resolver was set to prefer IPv6 (almost always) and interface section was not defined (almost all config files created by pcs), IP version was set to mcast_addr.family. Because mcast_addr.family was unset (reset to zero), IPv6 address was returned causing failure in totemsrp. Solution is to pass correct IP version stored in totem_config->ip_version. Patch also simplifies get_cluster_mcast_addr. It was using mix of explicitly passed IP version and bindnet IP version. Also return value of get_cluster_mcast_addr is now properly checked. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-04-07 14:45:05 +02:00
Jan Friesse	600fb4084a	totempg: Fix memory leak Previously there were two free lists. One for operational and one for transitional state. Because every node starts in transitional state and always ends in the operational state, assembly was always put to normal state free list and never in transitional free list, so new assembly structure was always allocated after new node connected. Solution is to have only one free list. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Steven Dake <stdake@cisco.com>	2016-02-10 15:57:20 +01:00
Richard B Winters	028c473886	Fix spelling error in binary corosync - Changed paramater to parameter in exec/logcconfig.c Change-Id: I8a24b0ef5c6621dc6c19d7decbdfe7a255afd10d Signed-off-by: Richard B Winters <rik@mmogp.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-01-27 18:29:25 +01:00
Ruben Kerkhof	37f092bbed	totemsrp: Fix clang warning (tautological compare) gsfrom is always >= 0 Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-01-04 17:28:14 +01:00
Ruben Kerkhof	da3288217c	Remove a few unused variables and functions Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-01-04 17:11:06 +01:00
Ruben Kerkhof	479ec4dbf0	Check for fdatasync If we don't have it, fall back to fsync Fixes the build on FreeBSD Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-12-16 16:43:27 +01:00
Hideo Yamauchi	5ab922701a	quorum: Display node id as unsigned int. Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-11-27 15:56:54 +01:00
Christine Caulfield	165561df9b	totemudp: Move udp bind() so that multicast works with IPv6 It seems that the IPv6 multicast parameters only take effect when bind() is called, so I've moved the mcast recv socket bind() to the bottom of totemudp_build_sockets_ip(). Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-11-16 16:00:36 +00:00
Christine Caulfield	a71ec5d95d	votequorum: Don't send multiple callbacks when nodes join This patch aligns the votequorum callbacks so that they are the same as the quorum ones. Previously it was quite common for votequorum to send one callback for every node in the cluster when a single new node joined (because it sent one for every nodeinfo message it received). This new system makes much more sense in itself and being consistent with the internal quorum is also an advantage! Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-10-22 11:45:26 +01:00
Ferenc Wágner	73910bd66e	totmesrp: Fix typo in log message Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-08-26 09:26:26 +02:00
Christine Caulfield	d64ee7b531	wd: fix setting of watchdog timeouts Fix setting of initial watchdog timeout, and also changing of timeout. Remove redundant starting of timer in exec_init_fn Signed-off-by: Kazunori INOUE <kazunori.inoue3@gmail.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-07-14 10:04:06 +01:00
Jason HU	15b2e94cca	CFG: Prevent CFG originating messages during SYNC During SYNC, corosync-cfgtool -R/-H commands can pass through IPC then send totem messages. This may corrupt assembly_list_inuse/assembly_list_free if those messages are received after SYNC is done. The solution is marking related CFG APIs as CS_LIB_FLOW_CONTROL_REQUIRED. Signed-off-by: Jason HU <huzhijiang@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-07-02 16:49:38 +02:00
Christine Caulfield	b9f5c290b7	votequorum: Fix auto_tie_breaker behaviour in odd-sized clusters auto_tie_breaker can behave incorrectly in the case of a cluster with an odd number of nodes. It's possible for a partition to have quorum while the other side has the ATB node, and both will continue working. (Of course in a properly configured cluster one side will be fenced but that becomes an indeterminate race .. just what ATB is supposed to avoid). This patch prevents ATB from running in a partition if the 'other' partition might have quorum, and also mandates the use of wait_for_all in clusters with an odd number of nodes so that a quorate partition cannot start services or fence an existing partition with the tie breaker node. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-06-18 09:57:59 +01:00
Christine Caulfield	ab8942f626	totemsrp: Improve logging of left/down nodes This patch from Hideo Yamauchi improves the logging of whether nodes leave the cluster cleanly or uncleanly, making it easier to determine if a node was shut down by the operator. There is also the possibility that a LEAVE message could get missed (due to the node being in flush state) so this can also make that clearer. The modifications are as follows. Change 1) I added the list which maintained LEAVE node to totemsrp. Change 2) I added registration, a search, the handling of to clear LEAVE node. Change 3) I added the output to log. Change 4) I changed an output level of the log. Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-06-12 16:16:45 +01:00
Christine Caulfield	53f67a2a79	totem: Log a message if JOIN or LEAVE message is ignored As per recent email thread, this patch adds a log message if a JOIN or LEAVE message is discarded while corosync is flushing the receive queue. While ignoring a JOIN message is harmless (it will be resent), ignoring a LEAVE message can cause a longer state transition as it is treated as a node crashing rather than leaving gracefully, so the system admin might be confused as to the cause. Unfortunately, we can't (at the totemudp level) distinguish between JOIN or LEAVE messages without a lot more protocol-specific code creeping in the lower layer so the message is left ambiguous. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2015-04-17 15:49:53 +01:00
Christine Caulfield	997074cc3e	totemconfig: Check for duplicate nodeids Having duplicate nodeids in corosync.conf can play havoc with a cluster, so (as suggested by someone on this list) here is some code to check that all nodeids are unique. Even if a nodeid is not specified it will check to be sure that the ID generated from the IP address (ipv4 only) does not clash with one that is provided. It logs all non-unique nodeids to syslog, but only the last is reported on the command-line to the user which should be enough to get them to check further. At startup this will cause corosync to fail to start. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2015-04-10 14:22:07 +01:00
Christine Caulfield	82526d2fe9	quorum: don't allow quorum_trackstart to be called twice If quorum_trackstart() or votequorum_trackstart() are called twice with CS_TRACK_CHANGES then the client gets added twice to the notifications list effectively corrupting it. Users have reported segfaults in corosync when they did this (by mistake!). As there's already a tracking_enabled flag in the private-data, we check that before adding to the list again and return an error if the process is already registered. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-16 11:37:52 +00:00
Christine Caulfield	8cc8e51363	cpg: Add support for messages larger than 1Mb If a cpg client sends a message larger than 1Mb (actually slightly less to allow for internal buffers) cpg will now fragment that into several corosync messages before sending it around the ring. cpg_mcast_joined() can now return CS_ERR_INTERRUPT which means that the cpg membership was disrupted during the send operation and the message needs to be resent. The new API call cpg_max_atomic_msgsize_get() returns the maximum size of a message that will not be fragmented internally. New test program cpghum was written to stress test this functionality, it checks message integrity and order of receipt. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-05 16:45:15 +00:00
Andrey N. Groshev	5d9acc5604	totemsrp: Format member list log as unsigned int Signed-off-by: Andrey N. Groshev <greenx@yandex.ru> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-05 16:34:07 +01:00
Christine Caulfield	c832ade034	Don't allow both two_node and auto_tie_breaker in corosync.conf The two_node and auto_tie_breaker options are incompatible as they specify conflicting methods of determining the quorate half of a cluster partition. This patch detects this error in corosync.conf, issues a message and disables two_node if auto_tie_breaker is present. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-02 15:50:21 +00:00
Christine Caulfield	314a01c98e	Votequorum: Fix auto_tie_breaker default The default for auto_tie_breaker should be 'lowest' - which is what it was before the extended ATB functionality of auto_tie_breaker_node was added, and what the documentation states. However this was broken so that if auto_tie_breaker_node was not specified then auto_tie_breaker itself was ignored. This patch fixes that. It also fixes a typo in a comment. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-02 15:48:01 +00:00
Jan Friesse	d77cec24d0	Handle adding and removing UDPU members atomically When config file is reloaded with removed UDPU member, internal icmap index of nodelist.node can change. This can result in removal and then adding back node. This, with UDPU alive filtering (where member is by default considered as not a member) makes corosync not send messages to such members resulting in new membership creation. Solution is to properly test which members were really deleted and added (instead of relying on internal and dynamic naming of icmap hash table key name). Also truly dynamic add and remove node (via cmap) is now handled by same function so totem_config->interfaces is now updated properly. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2015-01-21 16:37:26 +01:00
Jan Friesse	252b38ab8a	corosync_ring_id_store: Use safer permissions corosync_ring_id_store should use same (safer) permissions as corosync_ring_id_create_or_load for (eventually) newly created ringid file. Credit to Sjerek for finding this problem. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-01-20 11:21:05 +01:00
Jason	4ee84c51fa	totem: Ignore duplicated commit tokens in recovery In active rrp mode, commit tokens are treated as mcast data messages, thus, rrp directly delivers them to srp layer by active_mcast_recv(). This will result in duplicated commit tokens being received by srp from different heartbeat links. If node is in recovery state and has already sent out the initial orf token, those duplicated commit tokens will cause message_handler_memb_commit_token() to send initial orf token again! This is wrong because it resets the orf token content in instance->orf_token_retransmit, which breaks the token retransmission state. Furthermore, by sending those initial orf tokens again and again, it may lead active_token_recv() to drop some subsequent orf tokens. It is OK for rrp because srp will do token retransmission, but as said above, srp retransmission state has already been broken, so finally we meet a "token lost in recovery state" condition caused by software. If token timeout value is large, then it will take a long time to create a new ring. This can be reproduced by having two nodes set to active rrp mode, with two heartbeat links. Then with one node always on, let the other one do stop/start again and again. It has a low probability to reproduce. In theory, I think, the more heartbeat links used, the more easily it can be reproduced. This problem can be resolved by letting message_handler_memb_commit_token() ignore duplicated commit tokens in recovery state if node (the ring representation) has already sent out the initial orf token. Different from prev take, this version does not depend on stored token data but uses originated_orf_token in totemsrp_instance to remember if initial orf token has been already originated for current membership. Signed-off-by: Jason <huzhijiang@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-01-15 17:33:04 +01:00
Jan Friesse	e0ac861efd	Log auto-recovery of ring only once Make sure to log auto-recovery of ring only once. Every MESSAGE_TYPE_RING_TEST_ACTIVATE receive is logged, but with lower priority and more detailed information. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-01-14 18:13:29 +01:00
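
Commits 1e2de52ef1 and d245831d65 in the log above describe replacing basename() with an internal version so that out-of-tree source paths such as ../corosync/exec/totemudp.c can be matched against bare file names. The following is only a minimal sketch of what such a helper might look like, not the code from those commits. The name sketch_basename is hypothetical, and the sketch assumes '/' is the only path separator and that the input must not be modified.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from corosync): return a pointer to the
 * final path component of 'path' without modifying or copying the string.
 * Assumes '/' is the only directory separator.
 */
static const char *sketch_basename(const char *path)
{
	const char *slash;

	assert(path != NULL);

	/* Find the last '/' in the path, if any. */
	slash = strrchr(path, '/');

	/* With no '/', the whole string is already the file name. */
	return (slash != NULL) ? slash + 1 : path;
}
```

With this sketch, sketch_basename("../corosync/exec/totemudp.c") yields "totemudp.c", and a path with no directory part is returned unchanged. Unlike some platform basename() implementations, it never writes to its argument, which is one plausible reason for preferring an internal version.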

1860 Commits