Commit Graph

1307 Commits

Author SHA1 Message Date
Angus Salkeld
0ad2494ae7 Fix some "set but not used" warnings [-Wunused-but-set-variable]
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-16 07:13:42 +11:00
Angus Salkeld
c9dee9eaa7 Remove the ttl option from udpu and rely on the kernel ttl setting.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2011-03-15 19:35:23 +11:00
Angus Salkeld
86ada30aa4 Fix the ttl defaults and range
1) both IPv4 and IPv6 mcast should default to ttl=1
2) the range should be 0..255
   0 is valid meaning localhost only (cluster of one)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2011-03-15 19:34:46 +11:00
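
The default and range described in the commit above can be sketched as follows. This is only an illustration, not corosync's actual code: the names DEFAULT_MCAST_TTL, ttl_is_valid and set_mcast_ttl_v4 are hypothetical, and only the IPv4 socket option is shown.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical constant: the commit says multicast should default to ttl=1. */
#define DEFAULT_MCAST_TTL 1

/* Hypothetical validator: accepts the 0..255 range named in the commit.
 * 0 is valid and means "localhost only". */
int ttl_is_valid(int ttl)
{
	return ttl >= 0 && ttl <= 255;
}

/* Hypothetical helper: apply a validated TTL to an IPv4 multicast socket.
 * Returns -1 without touching the socket if the value is out of range. */
int set_mcast_ttl_v4(int fd, int ttl)
{
	unsigned char value;

	if (!ttl_is_valid(ttl)) {
		return -1;
	}
	value = (unsigned char)ttl;
	return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &value, sizeof(value));
}
```

Validating before the setsockopt() call keeps an out-of-range configuration value from being silently truncated to a byte.
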
Russell Bryant
5da4d5479a Convert existing documentation to doxygen format.
This patch modifies most of the existing comments in header files to be
in a format that doxygen can interpret.  This provides another
significant improvement to the web/pdf/etc generated documentation
without having to add new content.

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-03-12 15:03:16 +11:00
Zane Bitter
dddaeef21c Allocate packet buffers in the transport drivers
This change paves the way for eliminating a copy within the Infiniband
driver in the future by transferring responsibility for allocating and
freeing message buffers to the transport driver layer.

Tested under valgrind on a single-node cluster.

Signed-off-by: Zane Bitter <zane.bitter@gmail.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-11 20:38:28 -07:00
Steven Dake
6aa47fde95 Fix abort when token is lost in RECOVERY state
A commit token should be rejected when a token is lost in the recovery
state.  This occurs naturally because the ring id increases by 4 for
every new ring.  Prior to this patch, if the token was lost, the old
ring id information was restored, causing a commit token to be accepted
when it should be rejected.  This erroneously accepted commit token would
lead to an assertion which is fixed by this patch.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-03-07 17:15:05 -07:00
Russell Bryant
c112ee8c89 Add content for the doxygen main page.
This creates some content on the main page of the documentation
generated by doxygen.  The main page includes the license and a link
to the project web site.

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-07 08:42:01 -06:00
Russell Bryant
a609f79f1f Ensure that strings are null terminated after strncpy().
From the strcpy(3) man page, the following warning is given:
  The strncpy() function is similar, except that at most n bytes of src
  are  copied.  Warning: If there is no null byte among the first n bytes
  of src, the string placed in dest will not be null-terminated.

The current corosync code base does not take this warning into account
when using strncpy, potentially resulting in non-null terminated strings.

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-07 08:30:03 -06:00
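
A minimal sketch of the pattern the strncpy() commit above describes, assuming a fixed-size destination buffer. The helper name copy_terminated is hypothetical and is not taken from the corosync source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dest (capacity dest_size bytes) and
 * always leave dest null-terminated, even when src is too long. */
void copy_terminated(char *dest, const char *src, size_t dest_size)
{
	if (dest_size == 0) {
		return;
	}
	/* Copy at most dest_size - 1 bytes so the last byte stays free. */
	strncpy(dest, src, dest_size - 1);
	/* strncpy() does not terminate on truncation, so do it explicitly. */
	dest[dest_size - 1] = '\0';
}
```

With a 4-byte buffer, copying "corosync" leaves "cor" plus the terminator instead of four unterminated bytes.
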
Steven Dake
7471c88346 Don't assert when ring id file is less than 8 bytes
If the ring id file for the processor is less than 8 bytes, totemsrp would
assert.  Our speculation is that this condition happens during a fencing
operation or local filesystem corruption.

With this patch, Corosync creates fresh ring id file data when an
incorrect number of bytes is read from the ring id file.

Amend to use sizeof the strerror string length and PATH_MAX for the path length.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-02-24 15:34:39 -07:00
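
The recovery behaviour described in the ring id commit above might look roughly like this. It is a sketch under assumptions: the function name load_ring_seq, the file layout (a single 8-byte value) and the reset value of 0 are all illustrative and are not confirmed by the commit message.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical loader for an 8-byte ring sequence file.
 * Returns 0 if a full value was read, 1 if the file was missing or short
 * and fresh data was written, and -1 on I/O error. */
int load_ring_seq(const char *path, uint64_t *seq)
{
	ssize_t res;
	int fd;

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0) {
		return -1;
	}
	res = read(fd, seq, sizeof(*seq));
	if (res == (ssize_t)sizeof(*seq)) {
		close(fd);
		return 0;
	}
	/* Short or failed read: recreate the data instead of asserting. */
	*seq = 0;
	if (ftruncate(fd, 0) != 0 || lseek(fd, 0, SEEK_SET) < 0 ||
	    write(fd, seq, sizeof(*seq)) != (ssize_t)sizeof(*seq)) {
		close(fd);
		return -1;
	}
	close(fd);
	return 1;
}
```
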
Jan Friesse
894ece6a14 objdb: destroy all handles in _clear_object
Patch replaces free for object_instance with handle_destroy to remove
leaks in handles (and also memory leak).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-24 12:15:01 +01:00
Jan Friesse
41aeecc4ef Iterate all items in object_reload_notification
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-23 13:36:28 +01:00
Jan Friesse
c5e8237325 logsys: Properly lock flt data before dump
Data needs to be locked, otherwise resulting fdata file may be
incorrect.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-22 10:11:11 +01:00
Jan Friesse
88515e3d20 logsys: Don't leak fd on successful fdata dump
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-22 10:09:10 +01:00
Russell Bryant
907d974352 Add calls to pthread_attr_destroy().
This patch adds a couple of missing calls to pthread_attr_destroy().

There were a couple of instances where pthread_attr_init() was being
used without a corresponding call to pthread_attr_destroy().  This also
localizes the pthread_attr_t to the function where it is needed instead
of having it persist (the man page specifically states that destroying
the attributes structure has no effect on threads created using the
attributes).

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-21 12:14:07 -07:00
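
The pairing of pthread_attr_init() with pthread_attr_destroy() described above can be sketched like this. The helper start_thread, the example worker increment_worker and the 1 MiB stack size are assumptions made for illustration and do not come from the corosync source.

```c
#include <pthread.h>

/* Example worker used only to demonstrate the helper below. */
void *increment_worker(void *arg)
{
	int *counter = arg;

	(*counter)++;
	return NULL;
}

/* Hypothetical helper: the attribute object is local to this function and
 * is destroyed before returning, whether or not thread creation worked. */
int start_thread(pthread_t *thread, void *(*fn)(void *), void *arg)
{
	pthread_attr_t attr;
	int res;

	res = pthread_attr_init(&attr);
	if (res != 0) {
		return res;
	}
	res = pthread_attr_setstacksize(&attr, 1024 * 1024);
	if (res == 0) {
		res = pthread_create(thread, &attr, fn, arg);
	}
	/* Destroying the attributes has no effect on the created thread. */
	pthread_attr_destroy(&attr);
	return res;
}
```
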
Angus Salkeld
34cb488999 STATS: fix key name length on "join_count"
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-04 09:46:52 -07:00
Angus Salkeld
4da371f4f7 STATS: increase the space for application names
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-02-04 09:44:12 -07:00
Jan Friesse
5c951ac641 Add objdb firewall_enabled_or_nic_failure
New objdb var runtime.totem.pg.mrp.srp.firewall_enabled_or_nic_failure
is set to 1 if continuous_gather is larger than MAX_NO_CONT_GATHER.
Under normal conditions, value of variable is 0.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-12 11:26:25 +01:00
Steven Dake
6646a864b4 Handle delayed multicast packets that occur with switches
Some switches delay multicast packets vs the unicast token.  This patch works
around that problem by providing a new tuneable called miss_count_const.  This
tuneable works by counting the number of times a message is found missing
and once reaching the const value, marks it as missing in the retransmit list.

This improves performance and doesn't display warning messages about missed
multicast messages when operating in these switching environments.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-01-11 10:34:46 -07:00
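
The counting idea behind miss_count_const in the commit above can be illustrated with a small sketch. The struct missing_msg and the function note_miss are hypothetical names. The real totem code tracks this state differently, so treat this only as a model of the threshold logic.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one sequence number that has not arrived. */
struct missing_msg {
	unsigned int seq;
	unsigned int miss_count;
};

/* Called each time the message is found missing. Returns true only once
 * it has been found missing miss_count_const times, at which point the
 * caller would add it to the retransmit list. */
bool note_miss(struct missing_msg *msg, unsigned int miss_count_const)
{
	msg->miss_count++;
	return msg->miss_count >= miss_count_const;
}
```

A multicast packet that is merely delayed by the switch arrives before the threshold is reached, so it is never requested again and no warning is logged.
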
Angus Salkeld
a9b436c7a1 IPC: send failure message to client if memory maps fail
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-03 21:28:44 +11:00
Jan Friesse
b9df4424b1 Display warning when not possible to form cluster
This typically happens when a local firewall is enabled. The patch adds a
new statistics item, continuous_gather, which counts consecutive entries
into the gather state. If this number is bigger than MAX_NO_CONT_GATHER,
a warning message is displayed. The counter is also checked on exit, so
corosync can now be stopped even with a firewall enabled.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-12-03 10:11:11 +01:00
Steven Dake
9096c4d96b Set the max buffer size for sockets
Set the recv buffer to a large size and the send buffer to a large size to
allow the kernel to store more messages before dropping messages.

Amended to change optlen type to socklen_t

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2010-12-01 09:45:30 -07:00
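
A rough sketch of the socket buffer change described above, including the socklen_t type mentioned in the amendment. The function name request_socket_buffers and its parameters are hypothetical. The kernel is free to clamp or adjust the requested size, which is why the sketch reads the effective values back.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: request the same size for the receive and send
 * buffers, then report what the kernel actually granted. */
int request_socket_buffers(int fd, int bytes, int *recv_actual, int *send_actual)
{
	socklen_t optlen = sizeof(bytes);

	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, optlen) != 0) {
		return -1;
	}
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, optlen) != 0) {
		return -1;
	}
	optlen = sizeof(*recv_actual);
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, recv_actual, &optlen) != 0) {
		return -1;
	}
	optlen = sizeof(*send_actual);
	if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, send_actual, &optlen) != 0) {
		return -1;
	}
	return 0;
}
```
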
Steven Dake
00e340d095 The flushing code was introducing data corruption because of recursion errors
that occur as a result of the design of udpu.  Totem no longer requires
the flushing technique because we don't mark a packet as missing until it has
not been seen by a certain number of token rotations per a previous patch.  This
mechanism was introduced to work around a problem in switches where multicast
messages may be delayed by long periods compared to the unicast token.

This patch removes the flushing logic from udpu since it is no longer necessary.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2010-11-28 01:45:08 -07:00
Angus Salkeld
2c46de5ac1 Add totem/interface/ttl config option.
This adds a per-interface config option to
adjust the TTL.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-11-24 14:35:56 +11:00
Steven Dake
aa03dca478 Merge branch 'topic-udpu'
Conflicts:
	Makefile.am

Signed-off-by: Steven Dake <sdake@redhat.com>
2010-11-18 15:03:19 -07:00
Steven Dake
b403fcbea9 Remove dead SO_REUSEADDR code
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2010-11-18 14:51:17 -07:00
Steven Dake
bb05aed93f Add the UDPU transport
The UDPU transport is useful for those deployments which can't use multicast.
UDPU works by using UDP unicast, which is fully supported by every switch
manufacturer by default and doesn't rely on a functional IGMP implementation.

An example of the UDPU transport is contained in the corosync.conf.example.udpu
file which shows a 16 node cluster.  This file should be copied to each node
in the cluster and IP addresses changed as appropriate.

Amended to remove dead udpu REUSEADDR socket option.

Signed-off-by: Steven Dake <sdake@redhat.com>
2010-11-18 14:21:30 -07:00
Fabio M. Di Nitto
b2400314b2 add release script and git based versioning
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-11-10 07:46:53 -07:00
Angus Salkeld
f0104b6d31 Add .gitignore files.
Otherwise "git status" is a pain.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-10-21 07:43:46 -07:00
Jan Friesse
7c8cdfb197 Remove delay in library on corosync shutdown
Patch removes the 2-second delay in the library on normal corosync shutdown.
The delay is still present on abnormal shutdown.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3059 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-10-12 13:03:37 +00:00
Angus Salkeld
10be299e7b Check for a properly configured multicast address.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3057 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 22:41:26 +00:00
Angus Salkeld
07d06c0c0f Add monitoring and watchdog services.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3053 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 21:12:03 +00:00
Angus Salkeld
72addbc4cd Add a Finite State Machine.(fsm.h)
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3052 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 21:11:04 +00:00
Angus Salkeld
61b7d85978 Add a Finite State Machine.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3051 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 21:08:01 +00:00
Angus Salkeld
53b0aa47e6 objdb: fix some ugly indentation.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3048 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-25 06:51:36 +00:00
Angus Salkeld
739f9ab1b7 objdb: delete trackers when an object is deleted
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3047 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-25 06:50:21 +00:00
Angus Salkeld
23e3455fa7 objdb: object_created_notification() fix the order of the parent and object handles.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3046 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-25 06:49:28 +00:00
Steven Dake
4ac55e52e4 Patch from Kacper Kowalik to support honoring user defined LDFLAGS.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3042 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-14 18:10:12 +00:00
Steven Dake
71c54f9440 Fix few xopen tsafe issues.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3037 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-31 20:16:20 +00:00
Angus Salkeld
3b320c17ae IPC: return CS_ERR_NO_RESOURCES to library when low on fds.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3029 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-25 01:13:14 +00:00
Angus Salkeld
f95c4f76c3 POLL: gracefully handle running out of file descriptors.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3028 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-25 01:07:37 +00:00
Steven Dake
6992410df6 Remove checking of sub parameters in service.d files.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3024 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-24 18:45:43 +00:00
Steven Dake
5a3c285fbd Properly detect shutdown of corosync process
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3022 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-17 18:08:13 +00:00
Steven Dake
fef259970a Remove cancel token retransmit timeout.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3012 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-03 17:31:33 +00:00
Jan Friesse
93fb44ed0f Allow running only one instance of Corosync
Patch makes Corosync more compliant with common practices
for writing daemons. It creates a pid file (usually
/var/run/corosync.pid) and flocks it, so only one instance
of Corosync can run at a time.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3010 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-02 12:36:20 +00:00
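
The single-instance technique described above (create a pid file and flock it) can be sketched as follows. The function name acquire_pid_lock and the file mode are assumptions. The commit message does not show corosync's actual implementation.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: returns a locked descriptor on success, or -1 if
 * the file cannot be opened or another instance already holds the lock.
 * The descriptor must stay open for the lifetime of the daemon. */
int acquire_pid_lock(const char *path)
{
	char buf[32];
	int fd;
	int len;

	fd = open(path, O_WRONLY | O_CREAT, 0640);
	if (fd < 0) {
		return -1;
	}
	/* Non-blocking exclusive lock: fails at once if already held. */
	if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
		close(fd);
		return -1;
	}
	len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
	if (ftruncate(fd, 0) != 0 || write(fd, buf, (size_t)len) != len) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Because the lock is released automatically when the process exits, a stale pid file left behind by a crash does not block the next start.
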
Steven Dake
8fa6f4f58e Remove consensus check for two node cluster cases which can have smaller
consensus values.  Document in man page the behavior of consensus.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3005 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-27 19:00:37 +00:00
Steven Dake
1135c911cd Fix problem where flow control could lock up ipc under very heavy load in very
rare circumstances.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3001 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-21 17:03:36 +00:00
Fabio M. Di Nitto
2b253383dc Fix logging_daemon config parser code.
Resolves: rhbz#615203


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2997 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-19 06:36:48 +00:00
Angus Salkeld
0fab390ae4 SYNC: always call sync_aborted() in sync_confchg_fn().
1) sync_callbacks.sync_abort can be null.
2) sync_processing is set to 0 after syncv1 is done.
   Then syncv2 processing is done. If we get a config change
   after syncv1 is done, but before syncv2 is done then it won't
   get aborted.



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2995 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-17 04:59:40 +00:00
Angus Salkeld
b10fb56e8e SYNCV2: add debug when messages are discarded
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2993 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-16 02:11:25 +00:00
Angus Salkeld
2b4d150f81 SYNC: add some ENTER() trace points.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2992 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-16 02:09:51 +00:00