Commit Graph

432 Commits

Author SHA1 Message Date
Jan Friesse
26db8b21b2 api: Change some of totempg definitons
Recent changes in patch "Get rid of hdb usage in totempg.h interface"
caused incompatibility between corosync API and totempg.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-10-24 17:43:36 +02:00
Jan Friesse
1711aea72f Allow compilation of totempg without warnings
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-10-24 17:43:28 +02:00
Jan Friesse
99bbf4cc78 logsys.h: Properly define LEAVE macro
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-10-24 14:24:52 +02:00
Angus Salkeld
1b63c3cf57 LOG: update the log defines
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-10-22 10:51:47 +11:00
Angus Salkeld
78a5260c06 LOG: use libqb facility conversion functions
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-10-21 19:34:43 +11:00
Jan Friesse
752239eaa1 rrp: Higher threshold in passive mode for mcast
There were too much false positives with passive mode rrp when high
number of messages were received.

Patch adds new configurable variable rrp_problem_count_mcast_threshold
which is by default 10 times rrp_problem_count_threshold and this is
used as threshold for multicast packets in passive mode. Variable is
unused in active mode.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed by: Steven Dake <sdake@redhat.com>
2011-09-01 11:21:09 +02:00
Steven Dake
e920fef7e9 Get rid of hdb usage in totempg.h interface
hdb has some expense and is not necessary in the totempg.so runtime.  This
patch removes the dependence on hdb and instead uses a direct pointer.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-23 22:29:01 -07:00
Steven Dake
bb42020f9a Use qb_hdb instead of mutex based hdb code
Rid ourselves of the mutex usage still in the code base

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-23 12:48:21 -07:00
Steven Dake
9f36a892a8 Move cs_queue.h from include directory to exec directory
This file is only used by totemsrp.c.  Move out of general include
directory.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-22 19:31:33 -07:00
Jan Friesse
99852ab203 Allow compile master on RHEL 6
corosync_timer_handle_t is know conditionally defined to prevent double
definition causing compile fault on RHEL 6 systems.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-09 11:29:48 +02:00
Angus Salkeld
37e17e7a94 libqb: logging & trace
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:16 +10:00
Angus Salkeld
f717bc60e1 libqb: make timer api a wrapper around qb_loop timers.
- change timeout value to nano seconds
- fix timer handles (don't alloc on stack)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Angus Salkeld
c6895faa05 libqb: change ipc -> qb_ipc
IPC: return 0/-ENOBUFS from message handler
IPC: use the new rate_limit API to improve perf.
CPG: add send_async API & hook up flow control
IPC: Fix flow control getting stuck.
IPC: Port the remaining libs to use libqb IPC
IPC: remove libqb flowcontrol API
TEST: put cpg_dispatch() in it's own thread
IPC: cleanup ipc_glue.c name everything cs_ipcs_*()
IPC: add back statistics
IPC: remove coroipcc_ symbols from lib*.versions
IPC: init each se's IPC as it is loaded.
IPC: use the new connection_closed() event to free the context.
IPC: re-add zero copy functionality back
IPC: remove cpg_mcast_joined_async() and make it the default
 -> now cpg_mcast_joined() == cpg_mcast_joined_async()
libqb: expose a libqb error converter
libqb: add missing error conversions
libqb: remove repeat try loop in lib/cpg.c
CPG: fix zero copy mcast
CPG: use newer return codes
Add ENOTCONN to qb_to_cs_error()
libqb: fix error conversion from errno to cs_error_t in confdb
libqb: change errno_to_cs to qb_to_cs_error
libqb: add a cs_strerror() to get a more meaningful message
libqb: fix some confusing error conversions.
libqb: set the timeout on recv's to -1 (wait forever)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Angus Salkeld
fce8a3c3b6 libqb: convert coropoll calls to qb_loop calls.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Tim Beale
04f37df2f7 Add some more stats for debugging
+ overload - number of times client is told to try again
+ invalid_request - message contained invalid paramter, e.g. invalid size
+ msg_queue_avail - messages currently available at the Totem layer
+ msg-queue_reserved - messages currently reserved at the Totem layer

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-19 08:58:41 -07:00
Steven Dake
c544e87bb0 Correct missing poll funtions from service handler struct needed for confdb APIs
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2011-07-15 13:30:41 -07:00
Tim Beale
77f7e5b0fe Fix compile/runtime issues for _POSIX_THREAD_PROCESS_SHARED < 1
For the case where _POSIX_THREAD_PROCESS_SHARED < 1, the code doesn't compile
for corosync v1.3.1. And when it does compile, it crashes on our system - our
version of uClibc seems to always expect a 4th arg. The man pages suggests
the 4th arg is optional, but does say: 'For greater portability it is best to
always call semctl() with four arguments', which is what this patch does.
Also removed semop as it's an unused variable.

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 06:44:22 -07:00
Jiaju Zhang
5dc33c2824 RRP: redundant ring automatic recovery
This patch automatically recovers redundant ring failures.

Please note that this patch introduced rrp_autorecovery_check_timeout
in totem config hence breaks internal ABI. The internal ABI users
of totem.h need to rebuild their binaries.

Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Signed-off-by: Steven Dake <sdake@redhat.com>
Tested-by: Jan Friesse <jfriesse@redhat.com>
Tested-by: Florian Haas <florian.haas@linbit.com>
Tested-by: Jiaju Zhang <jjzhang@suse.de>
2011-07-05 09:13:48 -07:00
Jan Friesse
8c717c22b2 Remove spinlocks
Spinlocks are now removed, because even spinlock can improve
speed is some special cases, in most cases it makes corosync CPU usage
much more intensive and less responsive then if only mutexes are used.

What we were doing is:
pthread_mutex_lock
pthread_spin_lock
pthread_spin_unlock
pthread_mutex_unlock

what is not safe.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-29 12:01:54 +02:00
Jerome Flesch
00434a4f10 Fix usage of strerror_r()/perror()
Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-06-28 09:56:58 +02:00
Jan Friesse
afa0398ca4 mainconfig: Check retval of logsys_format_set
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-06 10:02:34 +02:00
Jerome Flesch
b112672115 coroipcs_handler_dispatch(): Fix conn_info->service security value: -1 is not a good security value since it's equal to SOCKET_SERVICE_INIT
Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2011-05-27 13:40:36 +02:00
Russell Bryant
5da4d5479a Convert existing documentation to doxygen format.
This patch modifies most of the existing comments in header files to be
in a format that doxygen can interpret.  This provides another
significant improvement to the web/pdf/etc generated documentation
without having to add new content.

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-03-12 15:03:16 +11:00
Angus Salkeld
89e4c1c048 CONFDB: add confdb_object_name_get()
This is useful when tracking object changes.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Seven Dake <sdake@redhat.com>
2011-02-04 09:47:15 -07:00
Steven Dake
6646a864b4 Handle delayed multicast packets that occur with switches
Some switches delay multicast packets vs the unicast token.  This patch works
around that problem by providing a new tuneable called miss_count_const.  This
tuneable works by counting the number of times a message is found missing
and once reaching the const value, marks it as missing in the retransmit list.

This improves performance and doesn't display warning messages about missed
multicast messages when operating in these switching environments.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-01-11 10:34:46 -07:00
Jan Friesse
b9df4424b1 Display warning when not possible to form cluster
This may typically happen if local firewall is enabled. Patch adds new
item to statistics called continuous_gather where is number of
continuous entered gather state. If this number is bigger then
MAX_NO_CONT_GATHER, warning message is displayed. This is also used on
exiting, so stop of corosync is now possible even with enabled firewall.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-12-03 10:11:11 +01:00
Angus Salkeld
2c46de5ac1 Add totem/interface/ttl config option.
This adds a per-interface config option to
adjust the TTL.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-11-24 14:35:56 +11:00
Steven Dake
aa03dca478 Merge branch 'topic-udpu'
Conflicts:
	Makefile.am

Signed-off-by: Steven Dake <sdake@redhat.com>
2010-11-18 15:03:19 -07:00
Steven Dake
bb05aed93f Add the UDPU transport
The UDPU transport is useful for those deployments which can't use multicast.
UDPU works by using UDP unicast, which is fully supported by every switch
manufacturer by default and doesn't rely on a functional IGMP implementation.

An example of the UDPU transport is contained in the corosync.conf.example.udpu
file which shows a 16 node cluster.  This file should be copied to each node
in the cluster and IP addresses changed as appropriate.

Amended to remove dead udpu REUSEADDR socket option.

Signed-off-by: Steven Dake <sdake@redhat.com>
2010-11-18 14:21:30 -07:00
Angus Salkeld
f0104b6d31 Add .gitignore files.
Otherwise "git status" is a pain.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@edhat.com>
2010-10-21 07:43:46 -07:00
Jan Friesse
7c8cdfb197 Remove delay in library on corosync shutdown
Patch removes 2 seconds delay in library on normal corosync shutdown.
Delay is still present on abnormal shutdown.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3059 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-10-12 13:03:37 +00:00
Angus Salkeld
10be299e7b Check for a properly configured multicast address.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3057 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 22:41:26 +00:00
Angus Salkeld
83b24b660b WD/SAM integration.
- timestamps -> uint64_t and in nanosecs
- use clock_gettime
- common object naming
- common state names
- timeouts in milliseconds



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3054 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 21:13:15 +00:00
Angus Salkeld
07d06c0c0f Add monitoring and watchdog services.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3053 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 21:12:03 +00:00
Jan Friesse
1a32fc4a6c SAM Confdb integration
Patch add support for Confdb integration with SAM. It's now possible to
use SAM_RECOVERY_POLICY_CONFDB as flag to previous policies.
    
Also new function sam_mark_failed is added for ability to use RECOVERY
policy together with confdb and get expected results (specially with
integration with corosync watchdog)

Patch also makes SAM thread safe.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3050 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 07:34:21 +00:00
Angus Salkeld
397e648080 objdb: fix some strange types (uint8_t* -> void*).
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3045 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-25 06:48:24 +00:00
Angus Salkeld
f95c4f76c3 POLL: gracefully handle running out of file descriptors.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3028 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-25 01:07:37 +00:00
Steven Dake
5a3c285fbd Properly detect shutdown of corosync process
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3022 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-17 18:08:13 +00:00
Steven Dake
985d3d0166 Fix merge error with revision 3001.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3002 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-21 20:48:40 +00:00
Steven Dake
1135c911cd Fix problem where flow control could lock up ipc under very heavy load in very
rare circumstances.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3001 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-21 17:03:36 +00:00
Steven Dake
cf0d63aa3f Patch to fix stack protector sig abort that occurs when ipc buffer is too
short.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2974 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-06-29 18:15:20 +00:00
Steven Dake
0e9f0bfeb4 Make cpg_membership_get() functional.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2855 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-19 05:03:52 +00:00
Angus Salkeld
4ff33854ad add __attribute__((noreturn)) to functions that always exit.
we had some __attribute__((__noreturn__))
and some    __attribute__((noreturn))

I made them all: __attribute__((noreturn))



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2853 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-19 04:34:53 +00:00
Angus Salkeld
ad71b66c7b cov (many): make sure all _set() functions return a signed int
in the body it can return -1, and callers check for -1.
but the return type is unsigned int?



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2849 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-18 00:46:08 +00:00
Angus Salkeld
25a3d310ac cov 10390: remove pointless assert.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2837 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-16 21:32:21 +00:00
Jan Friesse
d5884cd714 SAM integration of quorum
Patch adds integration of SAM and quorum, so it's now possible to use
SAM_RECOVERY_POLICY_QUORUM_QUIT or SAM_RECOVERY_POLICY_QUORUM_RESTART
recovery policy. With these policies, sam_start will block until
corosync is quorate. If quorum is lost during health checking, recovery
action is taken.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2822 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-13 11:20:49 +00:00
Jan Friesse
e8b143595c CPG model_initialize and ringid + members callback
Patch adds new function to initialize cpg, cpg_model_initialize. Model
is set of callbacks. With this function, future addions of models
should  be possible without changing the ABI.

Patch also contains callback in CPG_MODEL_V1 for notification about
Totem membership changes.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2770 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-20 12:40:48 +00:00
Jan Friesse
da6fce352b Support for store user data in SAM
Ability to in-memory storing of user data which survives between
instances of process.

Also ability needed ability for bi-directional communication between
child and parent is added.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2769 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-20 10:32:07 +00:00
Jan Friesse
a568462601 Support for user configurable warning signal
Allow developer configure a signal to be send as a warning signal
before real SIGKILL.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2753 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-01 12:51:07 +00:00
Angus Salkeld
15afe1d192 CTS: add a test for sync events
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2742 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-03-30 07:11:56 +00:00