Commit Graph

2513 Commits

Author SHA1 Message Date
Russell Bryant
a609f79f1f Ensure that strings are null terminated after strncpy().
From the strcpy(3) man page, the following warning is given:
  The strncpy() function is similar, except that at most n bytes of src
  are  copied.  Warning: If there is no null byte among the first n bytes
  of src, the string placed in dest will not be null-terminated.

The current corosync code base does not take this warning into account
when using strncpy, potentially resulting in non-null terminated strings.

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-07 08:30:03 -06:00
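The pattern this commit message describes can be sketched as follows. This is a minimal illustration, not code from the corosync tree; the helper name `copy_terminated` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the corosync sources): copy src into dest
 * and guarantee a terminating null byte, truncating src if necessary. */
static void copy_terminated(char *dest, const char *src, size_t dest_size)
{
    assert(dest != NULL && src != NULL && dest_size > 0);

    /* Copy at most dest_size - 1 bytes so the last byte stays free. */
    strncpy(dest, src, dest_size - 1);

    /* strncpy() does not terminate dest when src is too long, so do it here. */
    dest[dest_size - 1] = '\0';
}
```

Calling `copy_terminated(buf, input, sizeof(buf))` leaves `buf` usable as a C string even when `input` is longer than the buffer.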
Russell Bryant
1be0c3bdc6 Add -l option to corosync-keygen.
This option (-l or --less-secure) causes corosync-keygen to read from
/dev/urandom instead of /dev/random to ensure that no input is required
from the user.  It may be useful when this command is used from a
script.

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-05 10:02:57 -06:00
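A rough sketch of the choice the -l option makes, assuming the key is simply read as raw bytes from the selected device. The function name `read_key` and its parameters are hypothetical and do not mirror the actual corosync-keygen code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: fill key with len random bytes.
 * less_secure != 0 selects /dev/urandom, which does not wait for entropy.
 * Returns 0 on success, -1 on failure. */
static int read_key(unsigned char *key, size_t len, int less_secure)
{
    const char *source = less_secure ? "/dev/urandom" : "/dev/random";
    FILE *fp;
    size_t got;

    assert(key != NULL);

    fp = fopen(source, "rb");
    if (fp == NULL) {
        return -1;
    }
    /* Unbuffered, so no more bytes than requested are drawn from the device. */
    setvbuf(fp, NULL, _IONBF, 0);
    got = fread(key, 1, len, fp);
    fclose(fp);

    return (got == len) ? 0 : -1;
}
```

A script would use the less secure mode, since reading /dev/random may block until enough entropy is available.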
Steven Dake
7471c88346 Don't assert when ring id file is less than 8 bytes
If the ring id file for the processor is less than 8 bytes, totemsrp would
assert.  Our speculation is that this condition happens during a fencing
operation or local filesystem corruption.

With this patch, Corosync will create fresh ring id file data when the
incorrect number of bytes are read from the ring id.

Amended to use sizeof for the strerror string length and PATH_MAX for the path length.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-02-24 15:34:39 -07:00
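The recovery behaviour described in this commit might look roughly like the sketch below. It assumes the ring id data is a single 64-bit value stored in a file; the function name `load_ring_seq`, the return codes, and the choice of 0 as fresh data are illustrative assumptions, not the totemsrp implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sketch: load a 64-bit ring sequence number from path.
 * Returns 1 if the stored value was used, 0 if the file was missing or
 * short and fresh data was written instead, -1 if the file cannot be written. */
static int load_ring_seq(const char *path, uint64_t *seq)
{
    FILE *fp;
    size_t got = 0;

    assert(path != NULL && seq != NULL);

    fp = fopen(path, "rb");
    if (fp != NULL) {
        got = fread(seq, 1, sizeof(*seq), fp);
        fclose(fp);
    }
    if (got == sizeof(*seq)) {
        return 1;
    }

    /* Short or missing file: create fresh data instead of asserting. */
    *seq = 0;
    fp = fopen(path, "wb");
    if (fp == NULL) {
        return -1;
    }
    got = fwrite(seq, 1, sizeof(*seq), fp);
    if (fclose(fp) != 0 || got != sizeof(*seq)) {
        return -1;
    }
    return 0;
}
```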
Steven Dake
d9b2f3937b snmp: Allow building of corosync on already existing older install of corosync
When building corosync against older libraries already installed on the system,
the corosync-notifyd application uses the wrong Makefile.am commands.  This
results in the SNMPLIBS (which includes -L/usr/lib64) coming before the proper
LDADD flags.  The result is an inability to compile on an already existing
installation.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-02-24 15:23:37 -07:00
Jan Friesse
894ece6a14 objdb: destroy all handles in _clear_object
Patch replaces free for object_instance with handle_destroy to remove
leaks in handles (and also memory leak).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-24 12:15:01 +01:00
Jan Friesse
41aeecc4ef Iterate all items in object_reload_notification
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-23 13:36:28 +01:00
Jan Friesse
12163b62d2 corosync-fplay: use uint32_t and remove bit-shift
The flight recorder records all data in 32 bit words. Use uint32_t type
rather than unsigned int. Also replace the bit-shift with a multiplication
by sizeof uint32_t.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-23 13:31:38 +01:00
Jan Friesse
d3e9382d57 corosync-fplay: Use size_t length mod in printf
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-23 13:31:31 +01:00
Jan Friesse
7b0517f5e9 corosync-fplay: handle too large rec_size
Corrupted files may contain items with rec_size larger than the g_record
buffer and/or flt_data_size.

Also the g_record array size is now defined as a constant.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-22 10:11:48 +01:00
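The kind of bounds check this commit describes can be sketched as below. The constant `RECORD_WORDS_MAX`, its value, and the function `rec_size_ok` are assumptions made for illustration; the actual corosync-fplay code and buffer size may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed record buffer size in 32-bit words; the real constant may differ. */
#define RECORD_WORDS_MAX 1024

/* Hypothetical check: return 1 if a record of rec_size words starting at
 * word index pos fits both the record buffer and the data that was read
 * from the file (data_words words in total), otherwise 0. */
static int rec_size_ok(uint32_t rec_size, size_t pos, size_t data_words)
{
    assert(pos <= data_words);

    if (rec_size > RECORD_WORDS_MAX) {
        return 0;   /* would overflow the record buffer */
    }
    if (rec_size > data_words - pos) {
        return 0;   /* would read past the end of the file data */
    }
    return 1;
}
```

A reader of a possibly corrupted file would call this before copying `rec_size * sizeof(uint32_t)` bytes into the record buffer.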
Jan Friesse
c5e8237325 logsys: Properly lock flt data before dump
Data needs to be locked, otherwise resulting fdata file may be
incorrect.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-22 10:11:11 +01:00
Jan Friesse
88515e3d20 logsys: Don't leak fd on successful fdata dump
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-22 10:09:10 +01:00
Russell Bryant
907d974352 Add calls to pthread_attr_destroy().
This patch adds a couple of missing calls to pthread_attr_destroy().

There were a couple of instances where pthread_attr_init() was being
used without a corresponding call to pthread_attr_destroy().  This also
localizes the pthread_attr_t to the function where it is needed instead
of having it persist (the man page specifically states that destroying
the attributes structure has no effect on threads created using the
attributes).

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-21 12:14:07 -07:00
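A minimal sketch of the init/destroy pairing with a function-local attributes object follows. The wrapper `start_worker`, the example thread body `set_flag`, and the stack size are hypothetical and chosen only to make the example complete.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Example thread body, used only for illustration. */
static void *set_flag(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}

/* Hypothetical wrapper: create a thread with a custom stack size.
 * The attributes object lives only inside this function. */
static int start_worker(pthread_t *tid, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int res;

    assert(tid != NULL && fn != NULL);

    res = pthread_attr_init(&attr);
    if (res != 0) {
        return res;
    }
    res = pthread_attr_setstacksize(&attr, 256 * 1024);
    if (res == 0) {
        res = pthread_create(tid, &attr, fn, arg);
    }
    /* Destroying attr has no effect on threads already created with it. */
    pthread_attr_destroy(&attr);
    return res;
}
```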
Angus Salkeld
4c9b8d3acf CTS: wait (consistently) for 15 minutes for events
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-02-14 13:40:17 +11:00
Angus Salkeld
d72f6e38a4 autobuild: clean the build dir first.
This deletes files like .version that cause problems.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-02-14 08:13:36 +11:00
Angus Salkeld
4e337c7b05 CTS: temp remove troublesome tests.
Right I know - not so good to comment out tests.
BUT they are passing but there is some weirdness
in ssh reconnecting to these nodes that causes CTS false
negatives.
So the nodes are watchdogged (as expected) but when they come
back up cts gets stuck in a loop re-trying to ssh into
them. It's odd as a manual ssh works fine.

Basically I think it's more important that we get reliable
testing than have these tests in there.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-11 16:57:49 +11:00
Angus Salkeld
f2a961d155 Make node state a string (not an integer)
Ryan noticed this inconsistency: all other statuses
are strings, so this should be too.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Ryan O'Hara <rohara@redhat.com>
2011-02-08 08:10:30 +11:00
Angus Salkeld
e1a6b2ccfb CONFDB: fix parent_get response id
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-08 08:10:20 +11:00
Angus Salkeld
52cd433df6 MIB: expand the descriptions of the notifications
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-04 09:48:34 -07:00
Lon Hohberger
cca89e0a06 Match up MIB to notifyd & add SNMP quorum events
Signed-off-by: Lon Hohberger <lhh@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-02-04 09:48:16 -07:00
Lon Hohberger
6f7182a71f Make SNMP MIB match what is being sent over DBUS
Signed-off-by: Lon Hohberger <lhh@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-02-04 09:47:58 -07:00
Angus Salkeld
2a568d6e79 Add dbus and snmp notifier
This is to send dbus events on major cluster events:
 - membership changes
 - application connect/disconnect from corosync
 - quorum changes

dbus events can then be converted into snmp traps by foghorn or
corosync-notifyd can be run to directly send snmp traps.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Signed-off-by: Lon Hohberger <lhh@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2011-02-04 09:47:35 -07:00
Angus Salkeld
89e4c1c048 CONFDB: add confdb_object_name_get()
This is useful when tracking object changes.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-04 09:47:15 -07:00
Angus Salkeld
34cb488999 STATS: fix key name length on "join_count"
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-04 09:46:52 -07:00
Angus Salkeld
4da371f4f7 STATS: increase the space for application names
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-04 09:44:12 -07:00
Jan Friesse
fbbb3f01cb Handle "nocluster" kernel parameter in init script
The init script checks the kernel parameters and refuses to start corosync if
the nocluster parameter was given at boot time. The init script will
continue to work as expected from console/tty after boot.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-31 14:27:36 +01:00
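The init script itself is a shell script; the check it performs can be sketched in C as a whole-word search of the kernel command line (on Linux this is typically available in /proc/cmdline). The function `cmdline_has_param` is hypothetical and only illustrates the matching logic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if param appears as a whole
 * whitespace-separated word in cmdline, otherwise 0. */
static int cmdline_has_param(const char *cmdline, const char *param)
{
    const char *p = cmdline;
    size_t plen;

    assert(cmdline != NULL && param != NULL && param[0] != '\0');
    plen = strlen(param);

    while (*p != '\0') {
        size_t tok;

        p += strspn(p, " \t\n");     /* skip separators */
        tok = strcspn(p, " \t\n");   /* length of the next word */
        if (tok == plen && strncmp(p, param, plen) == 0) {
            return 1;
        }
        p += tok;
    }
    return 0;
}
```

Matching whole words avoids treating a longer parameter that merely starts with `nocluster` as a match.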
Jan Friesse
5c951ac641 Add objdb firewall_enabled_or_nic_failure
New objdb var runtime.totem.pg.mrp.srp.firewall_enabled_or_nic_failure
is set to 1 if continuous_gather is larger than MAX_NO_CONT_GATHER.
Under normal conditions, the value of the variable is 0.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-12 11:26:25 +01:00
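The relationship between the counter and the flag can be sketched as below. The struct, the function names, and the assumption that the counter is reset once a membership forms are illustrative only; the limit is passed as a parameter because the value of MAX_NO_CONT_GATHER is not given here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statistics holder mirroring the two names in the message. */
struct gather_stats {
    unsigned int continuous_gather;
    int firewall_enabled_or_nic_failure;
};

/* Called each time the gather state is entered. */
static void gather_entered(struct gather_stats *s, unsigned int max_no_cont_gather)
{
    assert(s != NULL);
    s->continuous_gather++;
    s->firewall_enabled_or_nic_failure =
        (s->continuous_gather > max_no_cont_gather) ? 1 : 0;
}

/* Assumption: forming a membership ends the run of gather states. */
static void membership_formed(struct gather_stats *s)
{
    assert(s != NULL);
    s->continuous_gather = 0;
    s->firewall_enabled_or_nic_failure = 0;
}
```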
Angus Salkeld
29755d4526 Add missing entries into .gitignore
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-12 09:42:24 +11:00
Angus Salkeld
20d545e946 remove unused function declaration
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-12 09:42:24 +11:00
Angus Salkeld
6f098bba1d fix timersub warning on freebsd
Make them all protected by #ifndef timersub

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-12 09:42:24 +11:00
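A guarded fallback of the kind this commit describes might look like the following. The macro body is a common portable definition written for illustration, not copied from the corosync headers.

```c
#include <sys/time.h>

/* Only define timersub() when the platform headers have not already done so,
 * which avoids redefinition warnings on systems such as FreeBSD. */
#ifndef timersub
#define timersub(a, b, result)                                \
    do {                                                      \
        (result)->tv_sec = (a)->tv_sec - (b)->tv_sec;         \
        (result)->tv_usec = (a)->tv_usec - (b)->tv_usec;      \
        if ((result)->tv_usec < 0) {                          \
            --(result)->tv_sec;                               \
            (result)->tv_usec += 1000000;                     \
        }                                                     \
    } while (0)
#endif
```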
Steven Dake
6646a864b4 Handle delayed multicast packets that occur with switches
Some switches delay multicast packets vs the unicast token.  This patch works
around that problem by providing a new tuneable called miss_count_const.  This
tuneable works by counting the number of times a message is found missing
and once reaching the const value, marks it as missing in the retransmit list.

This improves performance and doesn't display warning messages about missed
multicast messages when operating in these switching environments.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-01-11 10:34:46 -07:00
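The counting scheme behind miss_count_const can be sketched roughly as follows. The struct `miss_entry` and the function `note_missing` are hypothetical; the real totem code tracks this state differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-message bookkeeping. */
struct miss_entry {
    unsigned int seq;          /* sequence number of the message */
    unsigned int miss_count;   /* times it has been found missing */
};

/* Record one more miss for the entry. Returns 1 once the message has been
 * found missing miss_count_const times and should be added to the
 * retransmit list, otherwise 0. */
static int note_missing(struct miss_entry *e, unsigned int miss_count_const)
{
    assert(e != NULL);

    if (e->miss_count < miss_count_const) {
        e->miss_count++;
    }
    return e->miss_count >= miss_count_const;
}
```

A message delayed by the switch usually arrives before the threshold is reached, so no retransmit is requested for it.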
Angus Salkeld
e0cce2c907 CPG: make sure coroipcc_service_disconnect() is always called.
This prevents a shared mem leak if corosync dies while clients
are connected.

Calling cpg_finalize() did not release the shared mem as
coroipcc_msg_send_reply_receive() returned an error and
thus coroipcc_service_disconnect() did not get called.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-03 21:29:01 +11:00
Angus Salkeld
a9b436c7a1 IPC: send failure message to client if memory maps fail
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-03 21:28:44 +11:00
Jan Friesse
b9df4424b1 Display warning when not possible to form cluster
This may typically happen if a local firewall is enabled. The patch adds a new
statistics item called continuous_gather, which holds the number of
consecutive entries into the gather state. If this number is larger than
MAX_NO_CONT_GATHER, a warning message is displayed. This is also used on
exit, so corosync can now be stopped even with a firewall enabled.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-12-03 10:11:11 +01:00
Fabio M. Di Nitto
bafb69bf75 build: fix make srpm from release tarball
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-12-01 11:58:17 -07:00
Fabio M. Di Nitto
3b34f6092d build: fix rpm build to include corosync-blackbox
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-12-01 11:50:52 -07:00
Steven Dake
8c3680d126 Revert "Always autogen the tree when building an RPM"
This reverts commit d145838a21.
2010-12-01 11:18:19 -07:00
Steven Dake
d145838a21 Always autogen the tree when building an RPM
Since the source tarball never includes the autogen'ed tree in the new source
repo methodology, always autogen the tree.

Signed-off-by: Steven Dake <sdake@redhat.com>
2010-12-01 10:27:06 -07:00
Steven Dake
9096c4d96b Set the max buffer size for sockets
Set the recv buffer to a large size and the send buffer to a large size to
allow the kernel to store more messages before dropping messages.

Amended to change optlen type to socklen_t

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2010-12-01 09:45:30 -07:00
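A minimal sketch of requesting large socket buffers, with the option length held in a socklen_t as the amendment mentions. The helper `set_large_buffers` and the requested size are assumptions; the kernel may grant a different size than the one requested.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: ask for receive and send buffers of `want` bytes and
 * report what the kernel actually granted. Returns 0 on success, -1 on error. */
static int set_large_buffers(int fd, int want, int *got_rcv, int *got_snd)
{
    socklen_t optlen = sizeof(want);

    assert(got_rcv != NULL && got_snd != NULL);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, optlen) == -1) {
        return -1;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &want, optlen) == -1) {
        return -1;
    }

    /* getsockopt() takes a pointer to socklen_t, so an int would be wrong here. */
    optlen = sizeof(*got_rcv);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, got_rcv, &optlen) == -1) {
        return -1;
    }
    optlen = sizeof(*got_snd);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, got_snd, &optlen) == -1) {
        return -1;
    }
    return 0;
}
```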
Steven Dake
00e340d095 The flushing code was introducing data corruption because of recursion errors
that occur as a result of the design of udpu.  Totem no longer requires
the flushing technique because we don't mark a packet as missing until it has
not been seen by a certain number of token rotations per a previous patch.  This
mechanism was introduced to work around a problem in switches where multicast
messages may be delayed by long periods compared to the unicast token.

This patch removes the flushing logic from udpu since it is no longer necessary.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2010-11-28 01:45:08 -07:00
Angus Salkeld
2c46de5ac1 Add totem/interface/ttl config option.
This adds a per-interface config option to
adjust the TTL.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-11-24 14:35:56 +11:00
Fabio M. Di Nitto
565b32c262 build: fix makefile to ship corosync.conf.example.udpu
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2010-11-19 19:55:34 +11:00
Steven Dake
aa03dca478 Merge branch 'topic-udpu'
Conflicts:
	Makefile.am

Signed-off-by: Steven Dake <sdake@redhat.com>
2010-11-18 15:03:19 -07:00
Steven Dake
b403fcbea9 Remove dead SO_REUSEADDR code
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2010-11-18 14:51:17 -07:00
Steven Dake
bb05aed93f Add the UDPU transport
The UDPU transport is useful for those deployments which can't use multicast.
UDPU works by using UDP unicast, which is fully supported by every switch
manufacturer by default and doesn't rely on a functional IGMP implementation.

An example of the UDPU transport is contained in the corosync.conf.example.udpu
file which shows a 16 node cluster.  This file should be copied to each node
in the cluster and IP addresses changed as appropriate.

Amended to remove dead udpu REUSEADDR socket option.

Signed-off-by: Steven Dake <sdake@redhat.com>
2010-11-18 14:21:30 -07:00
Fabio M. Di Nitto
c835c88f73 build: fix spec file and srpm/rpm generation
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-11-10 10:43:07 -07:00
Fabio M. Di Nitto
b2400314b2 add release script and git based versioning
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-11-10 07:46:53 -07:00
Steven Dake
0730d997b1 Merge branch 'master', remote branch 'origin/master'
2010-11-10 07:08:54 -07:00
Steven Dake
71c89ed653 Add license information to LICENSE file about build process files
A few files licensed under GPLv3+ produce text output but are not used as
part of the runtime or libraries provided by Corosync.  Make that notification
in the LICENSE file.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2010-11-10 07:05:45 -07:00
Angus Salkeld
5e43f750e1 Add -i <num-iterations> to cpgverify
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-10-21 17:27:40 -07:00
Steven Dake
e14ed63828 New topic descriptions based upon work community wants to do
This file describes the topics of interest for development, their start and
finish date, their main developer, and a description of the topic.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2010-10-22 10:02:55 +11:00