mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-08-03 20:00:48 +00:00

Author	SHA1	Message	Date
Angus Salkeld	076e8b74f7	STATS: add the service name to the connection name. This helps to quickly identify what service the application is connected to. The object will now look like: runtime.connections.corosync-objctl:CONFDB:19654:13.service_id=11 runtime.connections.corosync-objctl:CONFDB:19654:13.client_pid=19654 etc... This also makes it clearer to receivers of the dbus/snmp events what is going on. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-29 13:48:13 +11:00
Angus Salkeld	4991ccd3d8	NOTIFYD: prevent duplicate quorate events. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-29 13:48:09 +11:00
Angus Salkeld	a97e1f0813	NOTIFYD: fix retrieving the application's parent name. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-29 13:47:42 +11:00
Jan Friesse	b4bef1cbf5	cfgtool: print list of IP with space between items Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-24 17:42:09 +01:00
Jan Friesse	f6df7823fa	cpgtool: print list of IP with space between items Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-24 17:42:09 +01:00
Jan Friesse	033f7ced10	cfg_get_node_addrs: Return correct addresses Zero element array behavior is very different from normal array or pointer. This behavior is the root of the problem in not returning correctly filled array of addresses. This appeared only in rrp mode, where more than one address is returned. All memcpy's are now correctly converted to copy pointer to char. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-24 17:42:08 +01:00
Steven Dake	7d5e588931	totemsrp: free messages originated in recovery rather than rely on messages_free Relying on messages_free may seem like it should work, but it leads to a situation where every node has released the messages, yet some nodes think messages are missing. The output then looks like "Retransmit: #" in repetition. This patch frees those messages immediately during the transition to the OPERATIONAL state and sets the internal variables totemsrp depends upon to the proper values. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-03-24 09:25:15 -07:00
Steven Dake	ef05817ce5	totemsrp: Only restore old ring id information one time The current code stores the current ring information every time a commit token is generated. This causes the old ring id used for comparison purposes to increase if a token is lost in commit or recovery, resulting in failure of totem. This patch changes the behavior to only store the old ring id one time when the commit token is received, and then further commit token ring id saves are not done until OPERATIONAL is reached. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-03-24 09:22:34 -07:00
Steven Dake	1a7b7a39f4	totemsrp: Remove recv_flush code The recv_flush code is no longer necessary because of the miss_count_count addition. It can in some cases lead to register corruption because of interactions with -fstack-protector, the recursive nature of how this code works, and interactions with the optimizer in some versions of gcc. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2011-03-24 09:21:27 -07:00
Angus Salkeld	75087f7c1b	confdb: send notifications from the main thread not IPC thread corosync-notifyd has exposed an issue with confdb notifications. The normal state of affairs is: IPC thread > lock > objdb > lock objdb notification whilst really useful turn things around: <middle of big call chain> objdb > lock > confdb > ipc > lock This reverse ordering of locks causes a horrible dead lock. I see this patch as a work around until corosync-2.0 when most of the threads and locking disappear. This patch adds a pipe to confdb service. When we get a objdb notification a struct gets written to the pipe. The poll loop then runs the dispatch in the main thread. In the dispatch we call the real ipc_dispatch_send(). Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-24 07:54:42 +11:00
Steven Dake	d99fba72e6	Resolve abort during simultaneous stopping of at least 4 nodes consider 5 nodes. node 3,4 stopped (by random stopping) node 1,2,5 form new configuration and during recovery node 1 and node 2 are stopped (via service service corosync stop). This causes 5 never to finish recovery within the timeout period, triggering a token loss in recovery. Bug #623176 resolved an assert which happens because the full ring id was being restored. The resolution to Bug #623176 was to not restore the full ring id, and instead operate (according to specifications) the new ring id. Unfortunately this exposes a problem whereby the restarting of nodes 1-4 generate the same ring id. This ring id gets to the recovery failed node 5 which is now in gather, and triggers a condition not accounted for in the original totem specification. It appears later work from Dr. Agarwal's PhD dissertation considers this scenario. That solution entails rejecting the regular token in the above condition. Since the ring id is also used to make decisions for commit token acceptance, we must also take care to reject the regular token in all cases after transitioning from OPERATIONAL. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-21 09:26:35 -07:00
Angus Salkeld	7004457014	notifyd: dispatch only one message at a time. This is to avoid getting stuck in the dispatch processing messages when the user is trying to shutdown the service. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-21 09:24:01 -07:00
Angus Salkeld	0ad2494ae7	Fix some "set but not used" warnings [-Wunused-but-set-variable] Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-16 07:13:42 +11:00
Angus Salkeld	c9dee9eaa7	Remove the ttl option from udpu and rely on the kernel ttl setting. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2011-03-15 19:35:23 +11:00
Angus Salkeld	86ada30aa4	Fix the ttl defaults and range 1) both IPv4 and IPv6 mcast should default to ttl=1 2) the range should be 0..255 0 is valid meaning localhost only (cluster of one) Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2011-03-15 19:34:46 +11:00
Russell Bryant	9909a20859	Add Doxyfile to .gitignore Signed-off-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-03-15 11:08:45 +11:00
Angus Salkeld	b6ba64c1eb	docs: auto-generate the version Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-12 19:39:04 +11:00
Russell Bryant	5da4d5479a	Convert existing documentation to doxygen format. This patch modifies most of the existing comments in header files to be in a format that doxygen can interpret. This provides another significant improvement to the web/pdf/etc generated documentation without having to add new content. Signed-off-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-03-12 15:03:16 +11:00
Zane Bitter	dddaeef21c	Allocate packet buffers in the transport drivers This change paves the way for eliminating a copy within the Infiniband driver in the future by transferring responsibility for allocating and freeing message buffers to the transport driver layer. Tested under valgrind on a single-node cluster. Signed-off-by: Zane Bitter <zane.bitter@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-11 20:38:28 -07:00
Zane Bitter	2303525125	Fix minor errors in man page documentation for corosync.conf * Correct 'See Also' reference to corosync.conf(5) in corosync(8) man page * Update path to default config (now /etc/corosync/corosync.conf) Signed-off-by: Zane Bitter <zane.bitter@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-10 01:25:08 -07:00
Steven Dake	6aa47fde95	Fix abort when token is lost in RECOVERY state A commit token should be rejected when a token is lost in the recovery state. This occurs naturally because the ring id increases by 4 for every new ring. Prior to this patch, if the token was lost, the old ring id information was restored, causing a commit token to be accepted when it should be rejected. This erroneously accepted commit token would lead to an assertion which is fixed by this patch. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-03-07 17:15:05 -07:00
Russell Bryant	c112ee8c89	Add content for the doxygen main page. This creates some content on the main page of the documentation generated by doxygen. The main page includes the license and a link to the project web site. Signed-off-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-07 08:42:01 -06:00
Russell Bryant	e5456008d0	Resolve a couple of doxygen warnings. This resolves a couple of doxygen warnings. First, the group needed a name. Second, all of the functions in the file were added to the group but doxygen complained about the lack of an end to the grouping. Signed-off-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-07 08:39:58 -06:00
Russell Bryant	7478a3e136	Update doxygen configuration file. The included doxygen configuration file was a bit stale. It included some options that were obsolete and caused doxygen to generate some warnings when running it. Most of the changes here were simply done by running "doxygen -u" to automatically update the file. It added its documentation for the options and removed the obsolete options. This also includes one configuration change, which is to set EXTRACT_ALL to yes. This instructs doxygen to generate documentation pages for all files, public functions, and public data structures even if they are not currently documented using doxygen syntax. Doxygen is capable of generating some useful documentation on its own, such as dependency graphs. Signed-off-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-07 08:38:53 -06:00
Russell Bryant	8ed864ddc5	Minor build system updates for doxygen. The configure script has been updated to check for the doxygen and dot applications (from doxygen and graphviz). The results from these checks are now used in the Makefile to ensure that the tools are installed when you run "make doxygen". If they are not, it will generate a helpful error message. Signed-off-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-07 08:36:53 -06:00
Russell Bryant	a609f79f1f	Ensure that strings are null terminated after strncpy(). From the strcpy(3) man page, the following warning is given: The strncpy() function is similar, except that at most n bytes of src are copied. Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated. The current corosync code base does not take this warning into account when using strncpy, potentially resulting in non-null terminated strings. Signed-off-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-07 08:30:03 -06:00
Russell Bryant	1be0c3bdc6	Add -l option to corosync-keygen. This option (-l or --less-secure) causes corosync-keygen to read from /dev/urandom instead of /dev/random to ensure that no input is required from the user. It may be useful when this command is used from a script. Signed-off-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-03-05 10:02:57 -06:00
Steven Dake	7471c88346	Don't assert when ring id file is less than 8 bytes If the ring id file for the processor is less than 8 bytes, totemsrp would assert. Our speculation is that this condition happens during a fencing operation or local filesystem corruption. With this patch, Corosync will create fresh ring id file data when the incorrect number of bytes are read from the ring id. Amend to use sizeof the strerror string length and PATH_MAX for the path length. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-02-24 15:34:39 -07:00
Steven Dake	d9b2f3937b	snmp: Allow building of corosync on already existing older install of corosync When building corosync against older libraries already installed on the system, the corosync-notifyd application uses the wrong Makefile.am commands. This results in the SNMPLIBS (which includes -L/usr/lib64) coming before the proper LDADD flags. The result is an inability to compile on an already existing installation. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-02-24 15:23:37 -07:00
Jan Friesse	894ece6a14	objdb: destroy all handles in _clear_object Patch replaces free for object_instance with handle_destroy to remove leaks in handles (and also memory leak). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-24 12:15:01 +01:00
Jan Friesse	41aeecc4ef	Iterate all items in object_reload_notification Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-23 13:36:28 +01:00
Jan Friesse	12163b62d2	corosync-fplay: use uint32_t and remove bit-shift The flight recorder records all data in 32 bit words. Use uint32_t type rather than unsigned int. Also remove bit-shift with multiply by sizeof uint32_t. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-23 13:31:38 +01:00
Jan Friesse	d3e9382d57	corosync-fplay: Use size_t length mod in printf Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-23 13:31:31 +01:00
Jan Friesse	7b0517f5e9	corosync-fplay: handle too large rec_size Corrupted files may contain items with rec_size larger than g_record buffer and/or flt_data_size. Also g_record array size is now defined as constant. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-22 10:11:48 +01:00
Jan Friesse	c5e8237325	logsys: Properly lock flt data before dump Data needs to be locked, otherwise resulting fdata file may be incorrect. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-22 10:11:11 +01:00
Jan Friesse	88515e3d20	logsys: Don't leak fd on successful fdata dump Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-22 10:09:10 +01:00
Russell Bryant	907d974352	Add calls to pthread_attr_destroy(). This patch adds a couple of missing calls to pthread_attr_destroy(). There were a couple of instances where pthread_attr_init() was being used without a corresponding call to pthread_attr_destroy(). This also localizes the pthread_attr_t to the function where it is needed instead of having it persist (the man page specifically states that destroying the attributes structure has no effect on threads created using the attributes). Signed-off-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-21 12:14:07 -07:00
Angus Salkeld	4c9b8d3acf	CTS: wait (consistently) for 15 minutes for events Signed-off-by: Angus Salkeld <asalkeld@redhat.com>	2011-02-14 13:40:17 +11:00
Angus Salkeld	d72f6e38a4	autobuild: clean the build dir first. This deletes files like .version that cause problems. Signed-off-by: Angus Salkeld <asalkeld@redhat.com>	2011-02-14 08:13:36 +11:00
Angus Salkeld	4e337c7b05	CTS: temp remove troublesome tests. Right I know - not so good to comment out tests. BUT they are passing but there is some weirdness in ssh reconnecting to these nodes that causes CTS false negatives. So the nodes are watchdogged (as expected) but when they come back up cts gets stuck in a loop re-trying to ssh into them. It's odd as a manual ssh works fine. Basically I think it's more important that we get reliable testing than have these tests in there. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-11 16:57:49 +11:00
Angus Salkeld	f2a961d155	Make node state a string (not an integer) Ryan noticed this inconsistency, all other statuses are strings so this should be too. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Ryan O'Hara <rohara@redhat.com>	2011-02-08 08:10:30 +11:00
Angus Salkeld	e1a6b2ccfb	CONFDB: fix parent_get response id Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-08 08:10:20 +11:00
Angus Salkeld	52cd433df6	MIB: expand the descriptions of the notifications Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-04 09:48:34 -07:00
Lon Hohberger	cca89e0a06	Match up MIB to notifyd & add SNMP quorum events Signed-off-by: Lon Hohberger <lhh@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-02-04 09:48:16 -07:00
Lon Hohberger	6f7182a71f	Make SNMP MIB match what is being sent over DBUS Signed-off-by: Lon Hohberger <lhh@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2011-02-04 09:47:58 -07:00
Angus Salkeld	2a568d6e79	Add dbus and snmp notifier This is to send dbus events on major cluster events: - membership changes - application connect/disconnect from corosync - quorum changes dbus events can then be converted into snmp traps by foghorn or corosync-notifyd can be run to directly send snmp traps. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Signed-off-by: Lon Hohberger <lhh@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Russell Bryant <russell@russellbryant.net> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2011-02-04 09:47:35 -07:00
Angus Salkeld	89e4c1c048	CONFDB: add confdb_object_name_get() This is useful when tracking object changes. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-04 09:47:15 -07:00
Angus Salkeld	34cb488999	STATS: fix key name length on "join_count" Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-04 09:46:52 -07:00
Angus Salkeld	4da371f4f7	STATS: increase the space for application names Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-02-04 09:44:12 -07:00
Jan Friesse	fbbb3f01cb	Handle "nocluster" kernel parameter in init script Init script checks kernel parameters and refuses to start corosync if nocluster parameter exist on boot time. The init script will continue to work as expected from console/tty after boot. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2011-01-31 14:27:36 +01:00

2538 Commits