mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2026-01-12 23:55:10 +00:00

Author	SHA1	Message	Date
Jan Friesse	89ab80f694	totemconfig: Put autogenerated nodeid back to cmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2021-08-02 15:13:23 +02:00
Jan Friesse	6e57e5a96b	totemconfig: Do not process totem.nodeid totem.nodeid is relict from times when nodelist was not required and totemsrp was sending whole membership with ip addresses. With Corosync 3 ip addresses are no longer sent so it is not possible to find "next" node ip address where to send token (because only nodeid is sent) without having information about all of the nodes stored locally. When totem.nodeid was configured it was partly used and other parts (most notably totemudpu_token_target_set) were using autogenerated nodeid. Together it was not possible to create even single node membership. Solution is to ignore totem.nodeid completely (and display warning when it is set). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2021-08-02 15:13:04 +02:00
Christine Caulfield	1d217b9a34	knet: Fix node status display Currently if there is a gap in the links (eg link0 is missing) corosync-cfgtool -s will still display the links as 0,1,2,3... even if they are 1,2,5,6... Also display the KNET transport type with the link in corosync-cfgtool -s & -n Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2021-07-29 14:38:53 +02:00
Jan Friesse	c9996fdd0f	main: Add support for cgroup v2 and auto mode Support for cgroup v2 is very similar to cgroup v1 just checking (and writing) different file. Because of all the problems described later with cgroup v2 new "auto" mode (new default) is added. This mode first tries to set rr scheduling and moves Corosync to root cgroup only if it fails. Testing this feature is a bit harder than with cgroup v1 so it's probably worth noting in this commit message. 1. Copy some service file (I've used httpd service) and set CPUQuota=30% in the [service] section. 2. Check /sys/fs/cgroup/cgroup.subtree_control - there should be no "cpu" 3. Start modified service 4. Check /sys/fs/cgroup/cgroup.subtree_control - there should be "cpu" 5. Start corosync - It should be able to get rt priority When move_to_root_cgroup is disabled (applies only for kernels with CONFIG_RT_GROUP_SCHED enabled), behavior differs: - If corosync is started before modified service, so there is no "cpu" in /sys/fs/cgroup/cgroup.subtree_control corosync starts without problem and gets rt priority. Starting modified service later will never add "cpu" into /sys/fs/cgroup/cgroup.subtree_control (because corosync is holding rt priority and it is placed in the non-root cgroup by systemd). - When corosync is started after modified service, so "cpu" is in /sys/fs/cgroup/cgroup.subtree_control, corosync is not able to get RT priority. It's worth noting problems when cgroup v2 is used together with systemd logging described in corosync.conf(5) man page. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2021-07-23 15:31:52 +02:00
Christine Caulfield	24b787248b	stats: fix crash when iterating over deleted keys The libqb map API leaves 'ownership' of the data with the caller but does its own lifetime management, so it can easily happen that map_rm() is called and the data deleted by the caller. But if an iterator is running over that item then the map entry will not get removed (leaving dangling pointers) until later. libqb has a hack-y callback that tells the owner when it is safe to delete the allocated memory, so we hook into that. icmap is already using this. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2021-06-03 10:14:47 +02:00
Jan Friesse	fc7b420e94	Revert "main: Add support for cgroup v2" This reverts commit `57e6b86b53`. We are in process of finding better solution so reverting for now. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2021-05-21 08:38:17 +02:00
Jan Friesse	791cc6c939	cfg: corosync_cfg_trackstop blocks forever corosync_cfg_trackstop expects reply but that was never sent. Make sure to send reply so corosync_cfg_trackstop works. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2021-05-19 18:28:45 +02:00
Jan Friesse	57e6b86b53	main: Add support for cgroup v2 Support for cgroup v2 is very similar to cgroup v1 just checking (and writing) different file. Testing this feature is a bit harder than with cgroup v1 so it's probably worth noting in this commit message. 1. Copy some service file (I've used httpd service) and set CPUQuota=30% in the [service] section. 2. Check /sys/fs/cgroup/cgroup.subtree_control - there should be no "cpu" 3. Start modified service 4. Check /sys/fs/cgroup/cgroup.subtree_control - there should be "cpu" 5. Start corosync - It should be able to get rt priority When move_to_root_cgroup is disabled, behavior differs: - If corosync is started before modified service, so there is no "cpu" in /sys/fs/cgroup/cgroup.subtree_control corosync starts without problem and gets rt priority. Starting modified service later will never add "cpu" into /sys/fs/cgroup/cgroup.subtree_control (because corosync is holding rt priority and it is placed in the non-root cgroup by systemd). - When corosync is started after modified service, so "cpu" is in /sys/fs/cgroup/cgroup.subtree_control, corosync is not able to get RT priority. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2021-05-10 15:47:32 +02:00
Jan Friesse	27369481e5	main: Mark crypto_model key read only ... to be aligned with crypto_cypher and crypto_hash. Reload (corosync-cfgtool -R) works without any problem and changing of key is not supported anyway. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2021-04-14 18:08:00 +02:00
Jan Friesse	a95b3df953	totemconfig: Ensure strncpy is always terminated Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2021-04-14 18:07:50 +02:00
Jan Friesse	52d457a455	config: Properly check crypto and compress models Use knet_get_crypto_list to find knet supported crypto models and use them instead of hardcoded list. Also fix compression handling. Previously knet_compression_model value was not checked at all and was directly passed to knet. Use knet_get_compress_list to find knet supported compress models and use them to check validity of config file and for more informative error message. Lastly enhance corosync version display with information about available crypto/compression models. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2021-04-14 18:07:20 +02:00
Fabio M. Di Nitto	650a3f15cf	knet: pass correct handle to knet_handle_compress totemknet_configure_compression was using knet_context just to gather the knet handle / instance. On first time config knet_context is not initialized till much later in the code, passing some random garbage pointers to knet_handle_compress, that would crash later trying to acquire a mutex lock. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2021-04-06 11:08:28 +02:00
Johannes Krupp	8835de5dae	totemconfig: fix integer underflow and logic bug Fix integer underflow when computing `namelen` in `nodelist_byname`, always use computed `namelen`. Fixes #626. Signed-off-by: Johannes Krupp <johannes.krupp@cispa.saarland> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2021-03-29 14:05:04 +02:00
liangxin1300	cb5c77c557	totemconfig: change udp netmtu value as a constant Instead of using "magic number" use UDP_NETMTU constant. Signed-off-by: liangxin1300 <XLiang@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2021-03-25 10:48:47 +01:00
Dan Streetman	4f171ea584	totemknet: retry knet_handle_new if it fails Retry knet_handle_new without privileged operations if it fails knet_handle_new can fail with ENAMETOOLONG if its privileged operations fail, which can happen if we're running as a user process or in an unprivileged container. This adds a cmap key 'allow_knet_handle_fallback' that defaults to no, which is the current behavior of exiting with error if the knet_handle can't be created with privileged operations. If the new cmap key is set to 'yes' and the knet_handle creation fails, fallback to creating the handle using unprivileged operations is tried. Signed-off-by: Dan Streetman <ddstreet@canonical.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2021-03-18 17:21:06 +01:00
Dan Streetman	2d29f68e66	main: Check memlock rlimit Don't lock all current and future memory if can't increase memlock rlimit. If we fail to increase our RLIMIT_MEMLOCK, then locking all our current and future memory is extremely dangerous; once our memory use reaches our RLIMIT_MEMLOCK, memory allocations will start failing, very likely leading to our entire process crashing. This can happen if we aren't a privileged process, for example if running as non-root user, or inside an unprivileged container. Signed-off-by: Dan Streetman <ddstreet@canonical.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2021-03-11 14:19:14 +01:00
Christine Caulfield	8278e48d34	main: Close race condition when moving to statedir Found by covscan which also didn't like us 'leaking' the fd to the lockfile. So close that too. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2021-03-04 16:04:16 +01:00
Christine Caulfield	461cf49467	cfg: Reinstate cfg tracking CFG tracking was removed in `815375411e`, probably as a mistake, as part of the tidy up of cfg and the removal of dynamic loading. This means that shutdown tracking (using cfg_try_shutdown()) stopped working. This patch restores the trackstart & trackstop API calls (renamed to be more consistent with the existing libraries) so that shutdown tracking can be used again. Change cfg.shutdown_timeout to be in milliseconds rather than seconds and use libqb macros for conversion. Add --force option to corosync-cfgtool -H Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2021-01-14 16:09:46 +01:00
Jan Friesse	d76fc6ab85	cfg: Improve nodestatusget versioning Patch tries to make nodestatusget really extendable. Following changes are implemented: - corosync_cfg_node_status_version_t is added with (for now) single value CFG_NODE_STATUS_V1 - corosync_knet_node_status renamed to corosync_cfg_node_status_v1 (it isn't really knet because it works as well for udp(u)) - struct res_lib_cfg_nodestatusget_version is added which holds only ipc result header and version on same position as for corosync_cfg_node_status_v1 - corosync_cfg_node_status_get requires version and pointer to one of corosync_cfg_node_status_v structures - request is handled in case switches to make adding new version easier Also fix following bugs: - totempg_nodestatus_get error was retyped to cs_error_t without any meaning. - header.error was not checked at all in the library Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-11-26 16:16:49 +01:00
Christine Caulfield	9e7f62d27d	cfg: New API to get extended node/link infomation Currently we horribly over-use totempg_ifaces_get() to retrieve information about knet interfaces. This is an attempt to improve on that. All transports are supported (so not only Knet but also UDP(U)). This patch builds best against the "onwire-upgrade" branch of knet as that's what sparked my interest in getting more information out. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-11-26 16:15:50 +01:00
Jan Friesse	4a2f48b17b	totemknet: Check both cipher and hash for crypto Previously only crypto cipher was used as a way to find out if crypto is enabled or disabled. This usually works ok until cipher is set to none and hash to some other value (like sha1). Such config is perfectly valid and it was not supported correctly. As a solution, check both cipher and hash. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-11-12 13:47:15 +01:00
Ferenc Wágner	3d5481c9ef	The ring id file needn't be executable At the same time simplify the overwrite logic and stop clearing the umask (which is unexpected and quite pointless here, as applications can't really protect the users from their own pathological settings). Signed-off-by: Ferenc Wágner <wferi@debian.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-11-10 14:16:07 +01:00
liangxin1300	e17ac2503c	totemconfig: remove redundant nodeid error log Signed-off-by: liangxin1300 <XLiang@suse.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-10-19 11:31:52 +02:00
Aleksei Burlakov	98bfd9988b	totemsrp: More informative messages ... when token and consensus timeouts pop. Signed-off-by: Aleksei Burlakov <aburlakov@suse.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-10-15 16:46:51 +02:00
Jan Friesse	8221f7802a	config: Increase default token timeout to 3000 ms Default token timeout of 1000 ms was often changed by users because of other workloads on machine which may make corosync responding a bit later than needed and resulting in token loss. 3000 ms was chosen as a compromise between token timeout increase and allow live cluster upgrade (other nodes should receive token by node with new default on time). It doesn't affect token_coefficient so final token timeout still depends on number of configured nodes (just base is higher). This change slows down failover a bit so for clusters where failover times are important, please change the token timeout in configuration file corosync.conf as: totem { version: 2 token: 1000 ... Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-10-15 16:42:53 +02:00
Jan Friesse	4eb3629728	quorum: Add support for nodelist callback Current quorum callback contains only actual view list and there is no way how to find out joined/left nodes. This cannot be emulated by user app, because when corosync restarts before other nodes notices then view list is unchanged (ring id is changed though). Solution is to implement similar callback as for cpg which contains ring id, member list, joined list and left list. To implement such callback and keep backwards compatibility, quorum_model_initialize is introduced. Its behavior is similar to cpg_model_initialize. This allows passing model v1, which contains enhanced quorum (full ring id is passed instead of just seq number) and nodelist callbacks. To find out which events should be sent by corosync daemon, new message MESSAGE_REQ_QUORUM_MODEL_GETTYPE is used. Quorum library on init was sending MESSAGE_REQ_QUORUM_GETTYPE. When model v1 is requested the MESSAGE_REQ_QUORUM_MODEL_GETTYPE is used, which contains model number so corosync knows that client is using model v1 and can send enhanced quorum and nodelist events. Nodelist event is (for now) sent both in case of change of membership and also when requested, also when CS_TRACK_CURRENT is requested, but then left_list and joined_list is left empty, because they don't make too much sense there. New test application testquorummodel is added as an example of new API usage. Also during patch development, I found few bugs here and there, which are also fixed: - quorum_initialize was never returning error code returned by MESSAGE_REQ_QUORUM_GETTYPE call (always returned CS_OK) - Allocated memory in send_library_notification was based on sizeof(unsigned int) instead of mar_uint32_t. That's not wrong, but it makes more sense to use sizeof(mar_uint32_t) instead (big thanks to Chrissie for englishify the man pages) Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-10-12 13:22:11 +02:00
Jan Friesse	40d636e9ef	totemsrp: Move token received callback Trigger token received callback only for valid token. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-09-29 15:51:49 +02:00
liangxin1300	1aaa2467b9	totemconfig: improve linknumber checking Check whether linknumber larger than INTERFACE_MAX and display error if so. Signed-off-by: liangxin1300 <XLiang@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-09-18 12:12:26 +02:00
liangxin1300	9461f87218	totemconfig: add interface number to the error str Signed-off-by: liangxin1300 <XLiang@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-09-17 15:40:26 +02:00
liangxin1300	ad2f1c6272	cfg: enhance message_handler_req_lib_cfg_killnode When executing corosync-cfgtool -k <nodeid> to kill node: * Check whether nodeid exists * Check whether the node was joined Signed-off-by: liangxin1300 <XLiang@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-09-17 15:26:10 +02:00
liangxin1300	f0e1eaff2d	totemconfig: validate totem.transport value Signed-off-by: liangxin1300 <XLiang@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-09-03 16:00:31 +02:00
Christine Caulfield	5f71445be0	config: Allow reconfiguration of crypto options Needs new knet crypto API. If it's not available, then fall back to the old API and forbid changing crypto while running. To avoid us being dependent on the leader node, each node sends its own crypto_reconfig_phase messages so we can guarantee that the reconfiguration always completes on each node. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-07-09 16:54:16 +02:00
Christine Caulfield	f8b63083e1	config: Fix crash when a reload fails twice Have string values stored in char arrays in totem_config so we don't get into a mess with the pointers. Also remove vsftype (which hasn't been used since corosync 1) Use strncpy even though we know the string is fine. Keep covscan happy Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-04-24 16:27:18 +02:00
Christine Caulfield	4ddc96cd4e	config: Don't free pointers used by transports reload failed for UDP[U] because they had saved pointers to the interfaces[] array, so memcpy into that rather than re-allocate it. Also, move the check for different IP address families so it also gets run at reload time. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-04-24 16:27:09 +02:00
Christine Caulfield	7cb539e2e3	config: don't reload vquorum if reload fails Fix an 'error: success' style message by propagating error_string back down the stack. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-04-24 16:27:01 +02:00
Christine Caulfield	600072ef38	cfg: Improve error return to cfgtool -R Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-04-24 16:26:54 +02:00
Christine Caulfield	f078fff6eb	config: Reorganise the config system To be more reliable & maintainable The basic plan here is to fix reloads to be more stable using read/parse/verify/build/commit stages, so that any errors will not leave corosync in an unstable state. This should also make the code more maintainable as currently the verify/commit stages are horribly intertwined. Also: - Fix local_node_pos not being updated in the new map during validation (broke adding and removing new nodes in the middle of the list). - Fix reconfiguration so that nodes are indexed by nodeid and not their position in the list. This is an old bug that's just been carried over Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-04-24 16:26:44 +02:00
Jan Friesse	1777d9992c	Revert "totemip: compare sin6_scope_id and interface_num" This reverts commit `efd34df531` to make master compile after revert of `934c47ed43`. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2020-04-22 13:30:36 +02:00
Jan Friesse	cd6cc90a6f	Revert "totemip: Add support for sin6_scope_id" This reverts commit `934c47ed43` which is causing protocol incompatibility in needle. Master seems to be not affected, but it needs more checking. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2020-04-22 13:30:19 +02:00
Christine Caulfield	c631951ef5	icmap: icmap_init_r() leaks if trie_create() fails Thanks to Coverity for finding this Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-03-26 14:42:41 +01:00
Jan Friesse	ca320beac2	votequorum: set wfa status only on startup Previously reload of configuration with enabled wait_for_all result in set of wait_for_all_status which set cluster_is_quorate to 0 but didn't inform the quorum service so votequorum and quorum information may get out of sync. Example is 1 node cluster, which is extended to 3 nodes. Quorum service reports cluster as a quorate (incorrect) and votequorum as not-quorate (correct). Similar behavior happens when extending cluster in general, but some configurations are less incorrect (3->4). Discussed solution was to inform quorum service but that would mean every reload would cause loss of quorum until all nodes would be seen again. Such behaviour is consistent but seems to be a bit too strict. Proposed solution sets wait_for_all_status only on startup and doesn't touch it during reload. This solution fulfills requirement of "cluster will be quorate for the first time only after all nodes have been visible at least once at the same time." because node clears wait_for_all_status only after it sees all other nodes or joins cluster which is quorate. It also solves problem with extending cluster, because when cluster becomes unquorate (1->3) wait_for_all_status is set. Added assert is only for ensure that I haven't missed any case when quorate cluster may become unquorate. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-03-24 14:13:32 +01:00
Jan Friesse	0c16442f2d	votequorum: Change check of expected_votes Previously value of new expected_votes was checked so newly computed quorum value was in the interval <total_votes / 2, total_votes>. The upper range prevented the cluster to become unquorate, but bottom check was almost useless because it allowed to change expected_votes so it is smaller than total_votes. Solution is to check if expected_votes is bigger or equal to total_votes and for quorate cluster only check if cluster doesn't become unquorate (for unquorate cluster one can set upper range freely - as it is perfectly possible when using config file) Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-03-13 09:06:55 +01:00
Jan Friesse	35662dd0ec	main: Add schedmiss timestamp into message This is useful for matching schedmiss event in stats map with logged event. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-02-27 08:37:35 +01:00
liangxin1300	efd34df531	totemip: compare sin6_scope_id and interface_num When user configure a specific interface like vlan with the same IPv6 link-local address, Corosync should compare sin6_scope_id with interface_num, to make sure got the right interface to bind Signed-off-by: liangxin1300 <XLiang@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-02-21 15:46:22 +01:00
Jan Friesse	38d1d10d39	totemip: Remove unused totemip_copy_endian_convert Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-02-17 17:31:55 +01:00
Jan Friesse	934c47ed43	totemip: Add support for sin6_scope_id sin6_scope_id was not present in totemip structure making impossible to use link-local ipv6 address. Patch adds sin6_scope_id and changes convert/copy functions to use it (formally also comparator functions should be changed, but it seems to cause more harm and it is not really needed). This makes corosync work with link-local addresses fine for both UDPU and Knet transport as long as interface specification is used (so fe80::xxxx:xxxx:xxxx:xxxx%eth0). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-02-17 17:31:42 +01:00
Jan Friesse	720a892751	cfgtool: Improve link status display Totemknet is enhanced to use 'n' character for localhost and not adding status, because it is safe to expect that localhost link is always connected. corosync-cfgtool is enhanced to properly decode 'n', '?' and 'd' characters and display its meaning for extended status. Special characters are also documented in man page. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-02-12 13:08:25 +01:00
Hideo Yamauchi	0143ee9a2f	totemknet: Change the initial value of the status Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-02-10 16:41:22 +01:00
Jan Friesse	ebd05fa008	stats: Use nanoseconds from epoch for schedmiss Using monotonic time is not working because it doesn't have to match time from epoch. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2020-01-23 17:58:41 +01:00
Christine Caulfield	48b6894ef4	stats: Add stats for scheduler misses This patch adds a stats.schedmiss.* set of entries that are a record of the last 10 times corosync was not scheduled in time. These entries are kept in reverse order (so stats.schedmiss.0.* is always the latest one kept) and the values, including the timestamp, are in milliseconds. It's also possible to use a cmap tracker to follow these events, which might be useful. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2020-01-22 17:06:10 +01:00
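
Commit 8221f7802a above notes that raising the default token to 3000 ms leaves token_coefficient untouched, so the final token timeout still depends on the number of configured nodes. A minimal sketch of that arithmetic follows. It assumes the formula given in the corosync.conf(5) man page (token + (number_of_nodes - 2) * token_coefficient, applied only when the nodelist has at least 3 nodes). The function name effective_token_timeout_ms is hypothetical and is not part of corosync.

```c
#include <stdint.h>

/*
 * Hypothetical helper for illustration only; not a corosync function.
 * Assumes the rule described in corosync.conf(5): token_coefficient is
 * applied only when the nodelist contains at least 3 nodes.
 */
uint32_t effective_token_timeout_ms(uint32_t token_ms,
                                    uint32_t token_coefficient_ms,
                                    uint32_t node_count)
{
    if (node_count < 3) {
        /* Small clusters use the configured token value unchanged. */
        return token_ms;
    }
    /* Each node beyond the second adds one coefficient to the base. */
    return token_ms + (node_count - 2) * token_coefficient_ms;
}
```

Under these assumptions, with a coefficient of 650 ms (taken here to be the documented default), a 5-node cluster would see 3000 + 3 * 650 = 4950 ms with the new default and 1000 + 3 * 650 = 2950 ms with token: 1000 set explicitly.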


2130 Commits