This is avoid getting stuck in the dispatch processing
messages when the user is trying to shutdown the service.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
1) both IPv4 and IPv6 mcast should default to ttl=1
2) the range should be 0..255
0 is valid meaning localhost only (cluster of one)
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
This patch modifies most of the existing comments in header files to be
in a format that doxygen can interpret. This provides another
significant improvement to the web/pdf/etc generated documentation
without having to add new content.
Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
This change paves the way for eliminating a copy within the Infiniband
driver in the future by transferring responsibility for allocating and
freeing message buffers to the transport driver layer.
Tested under valgrind on a single-node cluster.
Signed-off-by: Zane Bitter <zane.bitter@gmail.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
A commit token should be rejected when a token is lost in the recovery
state. This occurs naturally because the ring id increases by 4 for
every new ring. Prior to this patch, if the token was lost, the old
ring id information was restored, causing a commit token to be accepted
when it should be rejected. This erronously accepted commit token would
lead to an assertion which is fixed by this patch.
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
This creates some content on the main page of the documentation
generated by doxygen. The main page includes the license and a link
to the project web site.
Signed-off-by: Russell Bryant <russell@russellbryant.net>
eviewed-by: Steven Dake <sdake@redhat.com>
This resolves a couple of doxygen warnings. First, the group needed a
name. Second, all of the functions in the file were added to the group
but doxygen complained about the lack of an end to the grouping.
Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
The included doxygen configuration file was a bit stale. It included
some options that were obsolete and caused doxygen to generate some
warnings when running it. Most of the changes here were simply done by
running "doxygen -u" to automatically update the file. It added its
documentation for the options and removed the obsolete options.
This also includes one configuration change, which is to set EXTRACT_ALL
to yes. This instructs doxygen to generate documentation pages for all
files, public functions, and public data structures even if they are not
currently documented using doxygen syntax. Doxygen is capable of
generating some useful documentation on its own, such as dependency
graphs.
Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
The configure script has been updated to check for the doxygen and dot
applications (from doxygen and graphviz). The results from these checks
are now used in the Makefile to ensure that the tools are installed when
you run "make doxygen". If they are not, it will generate a helpful
error message.
Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
From the strcpy(3) man page, the following warning is given:
The strncpy() function is similar, except that at most n bytes of src
are copied. Warning: If there is no null byte among the first n bytes
of src, the string placed in dest will not be null-terminated.
The current corosync code base does not take this warning into account
when using strncpy, potentially resulting in non-null terminated strings.
Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
This option (-l or --less-secure) causes corosync-keygen to read from
/dev/urandom instead of /dev/random to ensure that no input is required
from the user. It may be useful when this command is used from a
script.
Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
If the ring id file for the processor is less then 8 bytes, totemsrp would
assert. Our speculation is that this condition happens during a fencing
operation or local filesystem corruption.
With this patch, Corosync will create fresh ring id file data when the
incorrect number of bytes are read from the ring id.
Amend to use sizeof the strerror string length and PATH_MAX for the path length.
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
When building corosync against older libraries already installed on the system,
the corosync-notifyd application uses the wrong Makefile.am commands. This
results in the SNMPLIBS (which includes -L/usr/lib64) coming before the proper
LDADD flags. The result is an inability to compile on an already existing
installation.
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
Patch replaces free for object_instance with handle_destroy to remove
leaks in handles (and also memory leak).
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
The flight recorder records all data in 32 bit words. Use uint32_t type
rather then unsigned int. Also remove bit-shift with multiply by sizeof
uint32_t.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Corrupted files may contain items with rec_size larger then g_record
buffer and/or flt_data_size.
Also g_record array size is now defined as constant.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Data needs to be locked, otherwise resulting fdata file may be
incorrect.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
This patch adds a couple of missing calls to pthread_attr_destroy().
There were a couple of instances where pthread_attr_init() was being
used without a cooresponding call to pthread_attr_destroy(). This also
localizes the pthread_attr_t to the function where it is needed instead
of having it persist (the man page specifically states that destroying
the attributes structure has no effect on threads created using the
attributes).
Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
Right I know - not so good to comment out tests.
BUT they are passing but there is some weirdness
in ssh reconnecting to these nodes that causes CTS false
negatives.
So the nodes are watchdogged (as expected) but when they come
back up cts gets stuck in a loop re-trying to ssh into
them. It odd as a manual ssh works fine.
Basically I think it's more important the we get reliable
testing than have these test in there.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Ryan noticed this inconsistency, all other status's
are string so this should be too.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Seven Dake <sdake@redhat.com>
Reviewed-by: Ryan O'Hara <rohara@redhat.com>
This is to send dbus events on major cluster events:
- membership changes
- application connect/dissconnet from corosync
- quorum changes
dbus events can then be converted into snmp traps by foghorn or
corosync-notifyd can be run to directly send snmp traps.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Signed-off-by: Lon Hohberger <lhh@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Init script checks kernel parameters and refuses to start corosync if
nocluster parameter exist on boot time. The init script will
continue to work as expected from console/tty after boot.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
New objdb var runtime.totem.pg.mrp.srp.firewall_enabled_or_nic_failure
is set to 1 if continuous_gather is larger then MAX_NO_CONT_GATHER.
Under normal conditions, value of variable is 0.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Some switches delay multicast packets vs the unicast token. This patch works
around that problem by providing a new tuneable called miss_count_const. This
tuneable works by counting the number of times a message is found missing
and once reaching the const value, marks it as missing in the retransmit list.
This improves performance and doesn't display warning messages about missed
multicast messages when operating in these switching environments.
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
This prevents a shared mem leak if corosync dies while clients
are connected.
Calling cpg_finalize() did not release the shared mem as
coroipcc_msg_send_reply_receive() returned an error and
thus coroipcc_service_disconnect() did not get called.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
This may typically happen if local firewall is enabled. Patch adds new
item to statistics called continuous_gather where is number of
continuous entered gather state. If this number is bigger then
MAX_NO_CONT_GATHER, warning message is displayed. This is also used on
exiting, so stop of corosync is now possible even with enabled firewall.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>