Commit Graph

2844 Commits

Author SHA1 Message Date
Angus Salkeld
71743b7a65 CTS: log bind() errors better
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-09 10:37:16 +10:00
Angus Salkeld
e50353b8f6 CTS: log cfg results
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-09 10:37:16 +10:00
Angus Salkeld
6be329053d CTS: rename flatiron to needle
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
25842e3c53 CTS: add exit handler to test_agents
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
d0a9235902 CTS: add "Too many open files" to the BadNews pattern
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
e92eb5c520 CTS: impove debug during msgSend test
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
540ee870ed CTS: add logging to test agent
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
8085b224bb CTS: increse wait for node to reboot
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
fbc1084454 CTS: support new pacemaker-cts
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
5a2185683a AUGEAS: fix "tags" log field
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
25751d12d2 TEST: fix the print out when cpg_finalize() fails
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
f55b23eaa9 libqb: use the new cs_strerror() to print out the error message.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
75a16ee20e libqb: fix iov_len in pcmk_test
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
4614c91fef libqb: fix valgring warnings in mon/wd
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
b5afc9283d libqb: change pause_timestamp to uint64_t
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
b8eae0e769 libqb: rip out objdb & serialize locks
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
17a4e6d9e5 libqb: only init IPC on service engines that need it.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
4026d78e0f libqb: remove the lib init/exit from the test service agent
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
b785c5ed08 libqb: use the main loop to shutdown
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
63e16ab583 libqb: remove tsafe.c
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
78e06739b7 libqb: remove worker thread - keep to one thread.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
f717bc60e1 libqb: make timer api a wrapper around qb_loop timers.
- change timeout value to nano seconds
- fix timer handles (don't alloc on stack)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Angus Salkeld
c6895faa05 libqb: change ipc -> qb_ipc
IPC: return 0/-ENOBUFS from message handler
IPC: use the new rate_limit API to improve perf.
CPG: add send_async API & hook up flow control
IPC: Fix flow control getting stuck.
IPC: Port the remaining libs to use libqb IPC
IPC: remove libqb flowcontrol API
TEST: put cpg_dispatch() in it's own thread
IPC: cleanup ipc_glue.c name everything cs_ipcs_*()
IPC: add back statistics
IPC: remove coroipcc_ symbols from lib*.versions
IPC: init each se's IPC as it is loaded.
IPC: use the new connection_closed() event to free the context.
IPC: re-add zero copy functionality back
IPC: remove cpg_mcast_joined_async() and make it the default
 -> now cpg_mcast_joined() == cpg_mcast_joined_async()
libqb: expose a libqb error converter
libqb: add missing error conversions
libqb: remove repeat try loop in lib/cpg.c
CPG: fix zero copy mcast
CPG: use newer return codes
Add ENOTCONN to qb_to_cs_error()
libqb: fix error conversion from errno to cs_error_t in confdb
libqb: change errno_to_cs to qb_to_cs_error
libqb: add a cs_strerror() to get a more meaningful message
libqb: fix some confusing error conversions.
libqb: set the timeout on recv's to -1 (wait forever)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Angus Salkeld
fce8a3c3b6 libqb: convert coropoll calls to qb_loop calls.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Angus Salkeld
6fa114ac8d Add systemd unit files for corosync and corosync-notifyd
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 08:21:02 +10:00
Florian Haas
1957865dd6 corosync.conf.example: add note about host addresses in bindnetaddr
https://lists.linux-foundation.org/pipermail/openais/2011-July/016563.html

Jan Friesse pointed out that bindnetaddr should be set to a host
address (as opposed to a network address) on hosts where multiple
NICs live on the same subnet. Add a comment to that effect to
the example configuration file.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-07 09:50:27 -07:00
Florian Haas
6fa4a339b1 corosync.conf.example: include comments
It's nice to say people should read the man page. It's also naive to
assume that they always do. Include comments in the example config
file itself.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Dan Frincu <dan.frincu@1and1.ro>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-07 09:50:22 -07:00
Florian Haas
178d09ed85 corosync.conf.example: change mcastaddr
Change suggested mcastaddr to one in the 239.255.0.0/16
pseudo-subnet. Multicast addresses outside 239.x.x.x may be IANA
registered and can clash with other services present on the
network. Suggest an address defined as part of the multicast IPv4
Local Scope in RFC 2365.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Dan Frincu <dan.frincu@1and1.ro>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-07 09:50:18 -07:00
Florian Haas
f85b9448f8 corosync.conf.example: change bindnetaddr
Change the example configuration file so "bindnetaddr" has a value
that more obviously looks like a network address. So as not to have
people think they need to set an existing IP address here (and hence,
have non-identical corosync.conf files between nodes).

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Dan Frincu <dan.frincu@1and1.ro>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-07 09:50:14 -07:00
Jan Friesse
d4fb83e971 main: let poll really stop before totempg_finalize
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-26 10:07:08 +02:00
Jan Friesse
ddb5214c2c Revert "totemsrp: Remove recv_flush code"
This reverts commit 1a7b7a39f4.

Reversion is needed to remove overflow of receive buffers and dropping
messages.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2011-07-26 10:05:55 +02:00
MORITA Kazutaka
1d9f444fec totemsrp: fix buffer overflows for large clusters (> 100 nodes)
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-24 13:33:26 -07:00
Jan Friesse
2d75c7058f specfile: Install corosync-signals.conf for dbus
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-21 09:47:33 +02:00
Jan Friesse
a197e7b1ce specfile: use _datadir as var expansion not exec
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-21 09:47:30 +02:00
Jan Friesse
f103fb29b3 specfile: Correct URL and source0
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-21 09:47:27 +02:00
Tim Beale
04f37df2f7 Add some more stats for debugging
+ overload - number of times client is told to try again
+ invalid_request - message contained invalid paramter, e.g. invalid size
+ msg_queue_avail - messages currently available at the Totem layer
+ msg-queue_reserved - messages currently reserved at the Totem layer

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-19 08:58:41 -07:00
Jan Friesse
ad5cda223c rrp: Handle rollower in passive rrp properly
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-18 11:46:56 +02:00
Jan Friesse
d02d288747 rrp: handle rollover in active rrp properly
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-18 11:46:50 +02:00
Jan Friesse
a48c8e517d totemconfig: Change default FAIL_TO_RECV_CONST
Previous default (50) was too low for most modern switch hardware. This
may trigger abort because the aru doesn't increase for 50 token
rotations combined with a defect in how failed to recv conditions are
handled.  By increasing this tunable, the condition should no longer
trigger the errant code.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-18 11:46:21 +02:00
Steven Dake
c544e87bb0 Correct missing poll funtions from service handler struct needed for confdb APIs
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2011-07-15 13:30:41 -07:00
Steven Dake
a3d98f1652 Fix problem where corosync will segfault if there are gaps in recovery queue
Fixes a problem where there are gaps in the recovery queue.  Example my_aru = 5,
but there are messages at 7,8.  8 = my_high_seq_received which results
in data slots taken up in new message queue.  What should really happen
is these last messages should be delivered after a transitional
configuration to maintain SAFE agreement.  We don't have support for
SAFE atm, so it is probably safe just to throw these messages away.  Without
this change, the new message queue on a new configuraton change is out of sync.

Signed-off-by: Steven Dake <sdake@redhat.com>
Tested-by: Tim Beale <tlbeale@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2011-07-15 10:39:57 -07:00
Jan Friesse
57749ec02a totemiba: free send_buf on ibv_reg_mr failure
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-08 08:15:14 +02:00
Florian Haas
051bca82df build: disable RDMA support in RPMs by default
Rather than curiously disable RDMA support by default in configure and
enable it by default in RPM builds, streamline the default
configuration to always turn RDMA support off. It can be enabled in
RPM builds with "--with rdma".

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 07:13:53 -07:00
Florian Haas
e715a455b6 build: set RDMA related _LIBS and _CFLAGS only if building with RDMA support
Having to force {ibverbs,rdmacm}_{LIBS,CFLAGS} looks positively odd;
so this may warrant further review. However, they are definitely not
needed if building without RDMA support.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 07:12:25 -07:00
Florian Haas
17fb819af1 build: make RDMA support an RPM build conditional
Enable RDMA in RPM builds by default to maintain the previous behavior
(which always included --enable-rdma in the %configure invocation).

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 07:11:52 -07:00
Florian Haas
b8809eaf27 build: force LC_ALL=C correctly for dates
Failure to force "C" dates will have RPM et al. complain about invalid
dates and timestamps.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 06:56:18 -07:00
Tim Beale
77f7e5b0fe Fix compile/runtime issues for _POSIX_THREAD_PROCESS_SHARED < 1
For the case where _POSIX_THREAD_PROCESS_SHARED < 1, the code doesn't compile
for corosync v1.3.1. And when it does compile, it crashes on our system - our
version of uClibc seems to always expect a 4th arg. The man pages suggests
the 4th arg is optional, but does say: 'For greater portability it is best to
always call semctl() with four arguments', which is what this patch does.
Also removed semop as it's an unused variable.

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 06:44:22 -07:00
Tim Beale
ba107f0a33 getpwnam_r()/getgrnam_r() returns ERANGE for some systems
On our system the expected buffer length is 256. This means calls to
getpwnam_r()/getgrnam_r() return ERANGE error and corosync fails to startup.
These 2 functions return ERANGE when insufficient buffer space is supplied.
Judging by the man page for getpwnam_r, the correct way to determine the
buffersize on any given system is to use sysconf().

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 06:31:50 -07:00
Jiaju Zhang
5dc33c2824 RRP: redundant ring automatic recovery
This patch automatically recovers redundant ring failures.

Please note that this patch introduced rrp_autorecovery_check_timeout
in totem config hence breaks internal ABI. The internal ABI users
of totem.h need to rebuild their binaries.

Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Signed-off-by: Steven Dake <sdake@redhat.com>
Tested-by: Jan Friesse <jfriesse@redhat.com>
Tested-by: Florian Haas <florian.haas@linbit.com>
Tested-by: Jiaju Zhang <jjzhang@suse.de>
2011-07-05 09:13:48 -07:00
Tim Serong
cfb96c64d9 Correct mailing list address in corosync_overview manpage
Signed-off-by: Tim Serong <tserong@novell.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-04 15:15:13 +02:00