mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-08-14 17:12:34 +00:00

Author	SHA1	Message	Date
Fabio M. Di Nitto	c00502a70a	build: drop another leftover from the past Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-12 07:13:04 +01:00
Fabio M. Di Nitto	b654661b4c	build: drop obsoleted SOCKETDIR option yet another leftover from the past that can go away Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-12 07:12:48 +01:00
Fabio M. Di Nitto	fd79118110	build: drop last LCRSO references Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-12 07:12:20 +01:00
Fabio M. Di Nitto	eb3d49ef7d	pload: make it a test service and not a public one pload is a performance benchmark that measures the onwire speed of corosync. problem is that once pload has been executed, the cluster is basically dead. turn pload into a test tool, by removing corosync-pload tool and user library. cleanup pload code to make it more readable and drop lots of unnecessary stuff. add test/ploadstart tool that can configure and start pload via cmap calls. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-12 07:11:51 +01:00
Fabio M. Di Nitto	142ce8c3a1	totem: drop crypt_accept: concept/option this was another old onwire compat mode that is not useful any longer. we can safely move to the new model by default. According to Honza (real hardware 1 node testing) there is no performance impact. In my tests (8 nodes VM cluster), there is up to a 10/12% performance improvement up to 1M packet size, where old and new models are equal. As a side note, nss still shows to be a performance loss on both real and virtual hw (without any kind of nss hw acceleration). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-10 07:08:30 +01:00
Angus Salkeld	03b32d7fad	Fix typo in stats key name. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-09 21:54:51 +11:00
Angus Salkeld	41b4416bd4	Remove unused function logsys_priority_name_get() Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-09 21:54:51 +11:00
Angus Salkeld	f628ccba8b	Add pid, hostname and process name to the logfile Note this is only for file targets not stderr or syslog. https://bugzilla.redhat.com/show_bug.cgi?id=789925 Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-09 21:54:51 +11:00
Fabio M. Di Nitto	a6ffed0a52	drop last references to compatibility: whitetank Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-03-09 11:38:54 +01:00
Fabio M. Di Nitto	e0e27e3d12	utils: cleanup main daemon exit codes some of them are not in use anymore and can be dropped. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-09 11:15:44 +01:00
Fabio M. Di Nitto	8f6e5ff530	sync: kill evil and syncv1 in one shot this change breaks onwire compatibility. cpg is the only user of sync_* interface and it's the only service that will require extra testing. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-09 11:15:08 +01:00
Jan Friesse	da878290d9	man: Add cmap pages to index.html Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-06 10:04:26 +01:00
Jan Friesse	500ae491e3	man: Add description of cpg_iteration_* functions Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-06 10:04:22 +01:00
Jan Friesse	eaa34a6255	man: Fix cmap_iter_finalize typo Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-06 10:04:18 +01:00
Angus Salkeld	b3f940e6fc	Treat ENOMSG as TRY_AGAIN. ENOMSG is returned by the ringbuffer when you attempt to read a message and there is nothing there to read. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-06 07:58:09 +11:00
Angus Salkeld	e6aab06573	Add common IPC errors. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-06 07:58:09 +11:00
Fabio M. Di Nitto	e996e36083	quorumtool: improve display of status data always display membership data from the local node display when a node is unknown to the local node instead of an error from IPC. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto	64fd946086	votequorum: move last malloc/alloca buf to static this should guarantee that votequorum won't fail under high memory pressure. Price is 3500 bytes extra preallocated at startup. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto	90c602902c	votequorum: fix node allocation memory leak stop using malloc for each new node, because we cannot free the memory easily. Move to a statically allocated buffer that can contain PROCESSOR_MAX + qdevice cluster_node instead. We can never have more than PROCESSOR_MAX nodes anyway and the memory footprint is small enough compared to memory leaks (those can effectively happen only in very dynamic clusters with tons of different nodes joining/leaving with different nodeids). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto	2d7a8ab29a	votequorum: rename leave_remove to allow_downscale pointed out that leave_remove can be easily confused with the old cman leave_remove behavior. The two are substantially different and we need to avoid confusion both for users and our support team. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	75b3dc0f4e	votequorum: fix handling of config updates cmap changes are local to the node only and should not be broadcasted as configuration changes. if any change has happened to us, we will inform other nodes via send_nodeinfo. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	861d2c90ef	votequorum: free our data and lists on exit this is mostly to avoid valgrind errors on exit and make the output more readable. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	edf0728323	votequorum: disallow special features vs qdevice simply taking the safest path here since integration of qdevice is not fully complete Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	33ea03f426	votequorum: fix node check based on reconfig parameter Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto	43e08bb143	votequorum: make a common function to calculate votes and cluster members Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	692fd72468	votequorum: incorporate static config into dynamic no functional changes or extra features yet Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	f960d0a342	votequorum: move all configuration in votequorum_readconfig Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	3a717fc8e9	votequorum: start moving from static to fully dynamic config Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	f12bfc5ad8	votequorum: disallow wait_for_all and qdevice operations The problem here is that user expectations, when using both modes at the same time, have not been set yet. There are 2/3 options that need investigation. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto	4a93ff267f	votequorum: improve debugging output Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-03-02 14:36:47 +01:00
Jan Friesse	25381738c2	Always set interface_up in totemip_iface_check Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-02 09:41:36 +01:00
Fabio M. Di Nitto	e34f095551	votequorum: fix node->flags type when receiving nodeinfo messages old_flags was set to uint16_t but it needs to be uint32_t. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto	6a9e4760da	votequorum: fix segfault in wfa status update this is a regression introduced by `cb5fd775` when reading static config us->flags does not exist yet and therefore setting it will cause a segfault. Move the settings after cluster_node *us is created, with the long term plan to simply kill the whole _static readconfig bits in favour of dynamic (runtime changeable) bits. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto	66466906a1	quorumtool: improve Membership information output align nodeid, votes and name to make it all more readable Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto	d8471ee873	quorumtool: make output more human friendly and retain machine parsable bits Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto	197ea4ade0	quorumtools: fix typo in man page Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto	cbd4bb93c1	quorumtools: drop unused option parsing Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto	4345328a7d	quorumtool: fix version display info we don't need that on every run Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto	725b7a61d9	quorumtool: swap node state and node votes output there is no point to show the votes if the node is dead Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-27 12:41:04 +01:00
Fabio M. Di Nitto	03c76be696	votequorum: fix votequorum_getinfo man page and align struct name Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-27 12:41:04 +01:00
Fabio M. Di Nitto	c439bc3aa8	quorumtool: update man page and help text improve error output since this is more than a debugging tool now Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-27 12:41:04 +01:00
Fabio M. Di Nitto	cb5fd77501	votequorum: major rework to fix qdevice API and integration with core

qdevice is a very special node in the cluster and it adds a certain amount of complexity and special cases across the code.

most of the qdevice data are shared across the cluster (name/votes) but effectively each node has a different view of the qdevice (registered/unregistered/voting/etc.)

with this change, we align the qdevice view across the nodes, exchanging more data between nodes, and we fix how qdevice behaves and is configured.

The only side effect is that the amount of data transmitted on wire is slightly higher.

The qdevice API is still disabled by default. This means that the amount of real changes in current code is a lot smaller than it appears from this patch.

TODO: documentation/man pages need to be updated once this change is in (and behavior finalized).

User visible changes:

- configuration (coroparse, exec/votequorum): the quorum device section is now standalone within the quorum.

    quorum {
        provider: corosync_votequorum
        device {
            model: (name)
            timeout: (millisec)
            votes:
        }
    }

  the keyword "model:" is mandatory to enable qdevice in configuration and should express the name of the script/daemon that will provide the qdevice. Looking into the future, an init script or systemd service will look for that name in /path/to/be/decided/name and start/stop qdevice.

  timeout: defines the maximum interval the qdevice implementation has available between polls (see votequorum_qdevice_poll.3) before the device is considered dead and votes discarded

  votes: is now a configuration parameter and not an API call. quorum devices don't care what they need to vote. votes is autocalculated when a nodelist is available and all nodes in the list vote 1. Otherwise this parameter is mandatory.

- configuration (exec/votequorum): startup and runtime configuration changes have been improved. errors at startup are considered fatal. errors at runtime have different exit paths.

  startup:
  * quorum.two_node and qdevice are incompatible.
  * quorum.expected_votes requires quorum.device.votes.
  * quorum.expected_votes - quorum.device.votes cannot be lower than 2.
  * qdevice and last_man_standing are mutually exclusive.
  * qdevice and auto_tie_breaker are mutually exclusive.

  runtime config changes:
  * quorum.two_node and qdevice are incompatible: if quorum device is alive, two_node is disabled. if quorum device is not alive and node count is 2, two_node is enabled, and quorum device cannot be registered
  * if either last_man_standing or auto_tie_breaker were enabled at startup, and at runtime quorum device is configured, quorum device registration will be blocked.
  * if quorum.expected_votes is configured but not quorum.device.votes, quorum device registration will be blocked.
  * if quorum.device.votes is not configured and we cannot automatically calculate it, quorum device registration will be blocked.
  * An error in configuring quorum.expected_votes and quorum.device.votes will block quorum device registration.

  blocking quorum device registration also means dropping the votes.

  quorum.device.votes (either set or automatically calculated) is now used to determine current expected_votes in the cluster.

- logging (exec/votequorum): all errors from configuration are treated as WARNING/CRITICAL. lots of extra DEBUG output is added (see internal changes too).

- corosync-quorumtool (tools/corosync-quorumtool):

  * added option to forcefully kick out a quorum device from the local node. This is for emergency recovery only and it is only available when qdevice API is built-in.

  * Improved status output, specifically add node state and qdevice information

    [root@fedora-master-node2 coro]# corosync-quorumtool -s
    Version:          1.99.4.12-9c7d-dirty
    Quorum type:      corosync_votequorum
    Nodes:            2
    Ring ID:          132
    Quorate:          Yes
    Node votes:       1
    Node state:       Member
    Expected votes:   3
    Highest expected: 3
    Total votes:      3
    Quorum:           2
    Flags:            Quorate Qdevice

    Nodeid     Votes  Name
         1         1  fedora-master-node1.int.fabbione.net
         2         1  fedora-master-node2.int.fabbione.net
         0         1  QDEVICE (Voting)

  * allow to print status for any node in the cluster known to local node.

    [root@fedora-master-node1 coro]# corosync-quorumtool -s
    Version:          1.99.4.12-9c7d-dirty
    Quorum type:      corosync_votequorum
    Nodes:            2
    Ring ID:          144
    Quorate:          Yes
    Node votes:       1
    Node state:       Member
    Expected votes:   3
    Highest expected: 3
    Total votes:      2
    Quorum:           2
    Flags:            Quorate

    Nodeid     Votes  Name
         1         1  fedora-master-node1.int.fabbione.net
         2         1  fedora-master-node2.int.fabbione.net

    [root@fedora-master-node1 coro]# corosync-quorumtool -s -n 2
    Version:          1.99.4.12-9c7d-dirty
    Quorum type:      corosync_votequorum
    Nodes:            2
    Ring ID:          144
    Quorate:          Yes
    Node votes:       1
    Node state:       Member
    Expected votes:   3
    Highest expected: 3
    Total votes:      3
    Quorum:           2
    Flags:            Quorate Qdevice

    Nodeid     Votes  Name
         1         1  fedora-master-node1.int.fabbione.net
         2         1  fedora-master-node2.int.fabbione.net
         0         1  QDEVICE (Voting)

Internal changes:

- change qdevice timer to not run all the time, but only when necessary.
- change votequorum_nodeinfo on wire data to use flags instead of uint8_t and add QDEVICE status.
- allocate nodeid 0 to qdevice since it's the only real nodeid that can be reserved.
- change send_nodeinfo to allow to send nodeinfo for any node so that we can share qdevice info across the cluster (and this might be useful in future if we need to sync internal cluster view).
- add votequorum api call to update qdevice name
- add runtime data if quorum device has been forcefully disabled by config error
- add qdevice votes to expected_votes calculation (this is probably the biggest difference vs cman)
- change votequorum_read_nodelist_configuration so that we can autocalculate votes for qdevice (we need the nodecount vs votes).
- add all checks for startup/runtime config (see above).
- do not make qdevice part of the membership_list received from totem. None of our users care about it and it is not a real node.
- change onwire message handlers to deal with "data for this node from any node" case and understand nodeid 0 for qdevice info
- always allocate qdevice at startup. this simplifies code a lot.
- dispatch qdevice nodeinfo on membership changes.
- inform libvotequorum users when a qdevice is registered
- improve substantially qdevice api and add a simple barrier based on qdevice name.
- add qdevice API barrier at cluster level. This feature allows only one qdevice name to be active in the cluster at any time.
- qdevice getinfo can now report status for qdevice on any node.
- change slightly the way the qdevice API is built-in/out: only the libvotequorum calls are #ifdef'out now. Doing so in the core is too complex and would make the code unreadable with the risk of missing a bit or two effectively introducing an on-wire incompatibility if we will ever turn the API on.
- probably added some bugs on the way...

TODO: update qdevice_* API once the above is settled and test qdevice integration with other features.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com> (only second part)	2012-02-27 09:30:26 +01:00
Fabio M. Di Nitto	9c7d1d3096	build: fix fallout from switching to common shared lib when building corosync on a clean system or for the very first time, corosync_common needs to be visible both via -L for link and for the LD_PATH, otherwise the linker cannot resolve normal library dependencies. This issue does NOT affect corosync users, but it's confined to internal corosync only. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-02-22 09:43:47 +01:00
Jan Friesse	c92225393b	Document SAM_RECOVERY_POLICY_CMAP Also all irrelevant references for SAM_RECOVERY_POLICY_CONFDB are corrected. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-21 16:33:56 +01:00
Jan Friesse	c30c088597	Tweak nodeid warning Nodeid warning now appears only when both totem.nodeid and nodelist nodeid exists. When nodelist nodeid is not defined, totem.nodeid is used. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-21 16:33:56 +01:00
Jan Friesse	61b2a85ebe	spec: Add optional xmlconf Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-02-21 16:27:16 +01:00
Jan Friesse	04720649ba	iba: Use configured node id Corosync was ignoring nodeid for iba transport and always used autogenerated one. Original patch by: Jason Dillaman <jdillama@redhat.com> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-02-21 16:27:16 +01:00
Angus Salkeld	40727bd6a3	Convert the common lib into a shared lib. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-02-21 20:26:08 +11:00
Jan Friesse	88ae75d6c2	Allow autoconfiguration of interface section Thanks to totemip_getifaddrs infrastructure it's now possible to use nodelist information to autoconfigure interface bindnetaddr. Together with cluster_name, interface section can be completely omitted. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-16 10:47:57 +01:00
Jan Friesse	ba13537471	totemconfig: ensure suffix for ringX_addr Patch makes sure, that ringX_addr key has really _addr suffix. Previously, it was possible to enter ringXanything and it was interpreted as ringX_addr. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-02-16 10:47:57 +01:00
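
The stricter key check described in commit ba13537471 can be illustrated with a short sketch. This is not the totemconfig C code; it is a minimal Python illustration, and the names `RING_ADDR_KEY` and `ring_number` are hypothetical. It assumes a key is valid only when it is exactly `ring`, a ring number, and the `_addr` suffix.

```python
import re

# Hypothetical pattern: "ring", one or more digits, then the literal "_addr" suffix.
RING_ADDR_KEY = re.compile(r"ring(\d+)_addr")


def ring_number(key):
    """Return the ring number for a well-formed ringX_addr key, else None."""
    # fullmatch rejects anything before "ring" or after "_addr".
    match = RING_ADDR_KEY.fullmatch(key)
    return int(match.group(1)) if match else None
```

With this check, a key such as `ring0anything` is rejected instead of being read as `ring0_addr`, which is the behavior change the commit message describes.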


3073 Commits
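
The startup rules listed in commit cb5fd77501 can be read as a small validation routine. The sketch below is an illustration only, not the votequorum implementation: the function name `qdevice_startup_errors` and the use of a plain dict for the parsed `quorum { }` section are assumptions made for this example. It treats the presence of `model` in the `device` section as what enables qdevice, as the commit message states.

```python
def qdevice_startup_errors(quorum):
    """Return the startup configuration errors for a quorum section.

    `quorum` is a plain dict standing in for the parsed quorum { } section;
    this layout is an assumption of the sketch, not corosync's data model.
    """
    errors = []
    device = quorum.get("device")
    # "model:" is mandatory to enable qdevice; without it no rule applies.
    if not device or "model" not in device:
        return errors

    if quorum.get("two_node"):
        errors.append("quorum.two_node and qdevice are incompatible")
    if quorum.get("last_man_standing"):
        errors.append("qdevice and last_man_standing are mutually exclusive")
    if quorum.get("auto_tie_breaker"):
        errors.append("qdevice and auto_tie_breaker are mutually exclusive")

    expected = quorum.get("expected_votes")
    votes = device.get("votes")
    if expected is not None:
        if votes is None:
            errors.append("quorum.expected_votes requires quorum.device.votes")
        elif expected - votes < 2:
            errors.append(
                "quorum.expected_votes - quorum.device.votes cannot be lower than 2"
            )
    return errors
```

Since the commit message says errors at startup are considered fatal, a caller would presumably refuse to start when the returned list is non-empty. The runtime rules, which block quorum device registration instead, are not covered by this sketch.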