2719 Commits

Author SHA1 Message Date
Florian Haas
1957865dd6 corosync.conf.example: add note about host addresses in bindnetaddr
https://lists.linux-foundation.org/pipermail/openais/2011-July/016563.html

Jan Friesse pointed out that bindnetaddr should be set to a host
address (as opposed to a network address) on hosts where multiple
NICs live on the same subnet. Add a comment to that effect to
the example configuration file.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-07 09:50:27 -07:00
Florian Haas
6fa4a339b1 corosync.conf.example: include comments
It's nice to say people should read the man page. It's also naive to
assume that they always do. Include comments in the example config
file itself.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Dan Frincu <dan.frincu@1and1.ro>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-07 09:50:22 -07:00
Florian Haas
178d09ed85 corosync.conf.example: change mcastaddr
Change suggested mcastaddr to one in the 239.255.0.0/16
pseudo-subnet. Multicast addresses outside 239.x.x.x may be IANA
registered and can clash with other services present on the
network. Suggest an address defined as part of the multicast IPv4
Local Scope in RFC 2365.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Dan Frincu <dan.frincu@1and1.ro>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-07 09:50:18 -07:00
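As a rough illustration of the range recommended in 178d09ed85, the following C sketch checks whether a configured mcastaddr falls inside 239.255.0.0/16. The helper name `in_ipv4_local_scope` is hypothetical and not part of corosync; only the range itself comes from the commit message.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/*
 * Hypothetical helper (not corosync code).
 * Returns 1 if mcastaddr lies in 239.255.0.0/16, 0 if it does not,
 * and -1 if the string is not a valid dotted-quad IPv4 address.
 */
int in_ipv4_local_scope(const char *mcastaddr)
{
    struct in_addr addr;
    uint32_t host_order;

    if (inet_pton(AF_INET, mcastaddr, &addr) != 1) {
        return -1;
    }

    /* Compare the top 16 bits against 239.255 (0xEF, 0xFF). */
    host_order = ntohl(addr.s_addr);
    return (host_order & 0xFFFF0000u) == 0xEFFF0000u;
}
```

Under these assumptions, `in_ipv4_local_scope("239.255.1.1")` returns 1, while 224.0.0.251 (the IANA-registered mDNS group) returns 0.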
Florian Haas
f85b9448f8 corosync.conf.example: change bindnetaddr
Change the example configuration file so "bindnetaddr" has a value
that more obviously looks like a network address, so that people do
not think they need to set an existing IP address here (and hence
end up with non-identical corosync.conf files between nodes).

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Dan Frincu <dan.frincu@1and1.ro>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-07 09:50:14 -07:00
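The difference between a host address and a network address that f85b9448f8 relies on can be sketched in C as follows. This is a minimal illustration, assuming IPv4 and a known prefix length; `network_address` is a hypothetical helper, not a corosync function, and the addresses used with it are made up.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not corosync code).
 * Writes the network address for `host` with the given prefix length
 * into `out`. Returns 0 on success, -1 on invalid input.
 */
int network_address(const char *host, unsigned prefix_len,
                    char *out, size_t out_len)
{
    struct in_addr addr;
    uint32_t mask;

    if (prefix_len > 32 || inet_pton(AF_INET, host, &addr) != 1) {
        return -1;
    }

    /* A shift by 32 bits is undefined in C, so handle /0 separately. */
    mask = (prefix_len == 0) ? 0 : (0xFFFFFFFFu << (32 - prefix_len));

    /* Clear the host bits; s_addr is kept in network byte order. */
    addr.s_addr &= htonl(mask);

    return inet_ntop(AF_INET, &addr, out, (socklen_t)out_len) ? 0 : -1;
}
```

For example, a node at 192.168.1.17/24 yields 192.168.1.0, which is the same value on every node in that subnet and so allows identical configuration files.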
Jan Friesse
d4fb83e971 main: let poll really stop before totempg_finalize
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-26 10:07:08 +02:00
Jan Friesse
ddb5214c2c Revert "totemsrp: Remove recv_flush code"
This reverts commit 1a7b7a39f4.

The revert is needed to prevent receive buffers from overflowing and
messages from being dropped.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2011-07-26 10:05:55 +02:00
MORITA Kazutaka
1d9f444fec totemsrp: fix buffer overflows for large clusters (> 100 nodes)
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-24 13:33:26 -07:00
Jan Friesse
2d75c7058f specfile: Install corosync-signals.conf for dbus
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-21 09:47:33 +02:00
Jan Friesse
a197e7b1ce specfile: use _datadir as var expansion not exec
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-21 09:47:30 +02:00
Jan Friesse
f103fb29b3 specfile: Correct URL and source0
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-21 09:47:27 +02:00
Tim Beale
04f37df2f7 Add some more stats for debugging
+ overload - number of times client is told to try again
+ invalid_request - message contained invalid parameter, e.g. invalid size
+ msg_queue_avail - messages currently available at the Totem layer
+ msg_queue_reserved - messages currently reserved at the Totem layer

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-19 08:58:41 -07:00
Jan Friesse
ad5cda223c rrp: Handle rollover in passive rrp properly
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-18 11:46:56 +02:00
Jan Friesse
d02d288747 rrp: handle rollover in active rrp properly
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-18 11:46:50 +02:00
Jan Friesse
a48c8e517d totemconfig: Change default FAIL_TO_RECV_CONST
Previous default (50) was too low for most modern switch hardware. This
may trigger abort because the aru doesn't increase for 50 token
rotations combined with a defect in how failed to recv conditions are
handled.  By increasing this tunable, the condition should no longer
trigger the errant code.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-18 11:46:21 +02:00
Steven Dake
c544e87bb0 Correct missing poll functions from service handler struct needed for confdb APIs
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2011-07-15 13:30:41 -07:00
Steven Dake
a3d98f1652 Fix problem where corosync will segfault if there are gaps in recovery queue
Fixes a problem where there are gaps in the recovery queue.  Example: my_aru = 5,
but there are messages at 7 and 8, and 8 = my_high_seq_received, which results
in data slots being taken up in the new message queue.  What should really happen
is that these last messages are delivered after a transitional
configuration to maintain SAFE agreement.  We don't have support for
SAFE at the moment, so it is probably safe just to throw these messages away.
Without this change, the new message queue is out of sync after a configuration change.

Signed-off-by: Steven Dake <sdake@redhat.com>
Tested-by: Tim Beale <tlbeale@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2011-07-15 10:39:57 -07:00
Jan Friesse
57749ec02a totemiba: free send_buf on ibv_reg_mr failure
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-08 08:15:14 +02:00
Florian Haas
051bca82df build: disable RDMA support in RPMs by default
Rather than curiously disable RDMA support by default in configure and
enable it by default in RPM builds, streamline the default
configuration to always turn RDMA support off. It can be enabled in
RPM builds with "--with rdma".

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 07:13:53 -07:00
Florian Haas
e715a455b6 build: set RDMA related _LIBS and _CFLAGS only if building with RDMA support
Having to force {ibverbs,rdmacm}_{LIBS,CFLAGS} looks positively odd;
so this may warrant further review. However, they are definitely not
needed if building without RDMA support.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 07:12:25 -07:00
Florian Haas
17fb819af1 build: make RDMA support an RPM build conditional
Enable RDMA in RPM builds by default to maintain the previous behavior
(which always included --enable-rdma in the %configure invocation).

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 07:11:52 -07:00
Florian Haas
b8809eaf27 build: force LC_ALL=C correctly for dates
Failure to force "C" dates will have RPM et al. complain about invalid
dates and timestamps.

Signed-off-by: Florian Haas <florian.haas@linbit.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 06:56:18 -07:00
Tim Beale
77f7e5b0fe Fix compile/runtime issues for _POSIX_THREAD_PROCESS_SHARED < 1
For the case where _POSIX_THREAD_PROCESS_SHARED < 1, the code doesn't compile
for corosync v1.3.1. And when it does compile, it crashes on our system: our
version of uClibc seems to always expect a 4th arg. The man page suggests
the 4th arg is optional, but does say: 'For greater portability it is best to
always call semctl() with four arguments', which is what this patch does.
Also removed semop as it's an unused variable.

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 06:44:22 -07:00
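The four-argument semctl() form that 77f7e5b0fe switches to can be sketched as below. This is an illustration only, not the patched corosync code: `sem_roundtrip` and `union semctl_arg` are hypothetical names, and the union mirrors the `union semun` layout described in the semctl man page.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/*
 * glibc leaves the definition of this union to the caller. It is named
 * semctl_arg here (hypothetical name) to avoid clashing with systems
 * whose headers already define union semun.
 */
union semctl_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Create a private one-element semaphore set, set it to `initial`,
 * read the value back and remove the set again.
 * Returns the value read back, or -1 on error.
 */
int sem_roundtrip(int initial)
{
    union semctl_arg arg;
    int semid;
    int value;

    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1) {
        return -1;
    }

    arg.val = initial;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        arg.val = 0;
        semctl(semid, 0, IPC_RMID, arg);
        return -1;
    }

    /* GETVAL and IPC_RMID ignore the 4th argument, but it is passed anyway. */
    arg.val = 0;
    value = semctl(semid, 0, GETVAL, arg);
    semctl(semid, 0, IPC_RMID, arg);

    return value;
}
```

Passing the union on every call, including commands that ignore it, keeps the call shape identical on C libraries that always read a fourth argument.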
Tim Beale
ba107f0a33 getpwnam_r()/getgrnam_r() returns ERANGE for some systems
On our system the expected buffer length is 256. This means calls to
getpwnam_r()/getgrnam_r() return ERANGE error and corosync fails to start up.
These 2 functions return ERANGE when insufficient buffer space is supplied.
Judging by the man page for getpwnam_r, the correct way to determine the
buffer size on any given system is to use sysconf().

Signed-off-by: Tim Beale <tim.beale@alliedtelesis.co.nz>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-06 06:31:50 -07:00
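A minimal sketch of the sysconf()-based sizing described in ba107f0a33 follows. `lookup_uid` is a hypothetical helper, not the corosync function; the fallback size of 1024, the 1 MiB cap, and the grow-and-retry loop on ERANGE are assumptions added for robustness and are not taken from the commit.

```c
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not corosync code).
 * Looks up the uid for `name`. Returns 0 and stores the uid on success,
 * -1 if the user does not exist or the lookup fails.
 */
int lookup_uid(const char *name, uid_t *uid)
{
    struct passwd pwd;
    struct passwd *result = NULL;
    long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
    /* sysconf() may return -1 when there is no fixed limit (assumed fallback). */
    size_t len = (hint > 0) ? (size_t)hint : 1024;
    char *buf = NULL;
    int rc;

    for (;;) {
        char *tmp = realloc(buf, len);
        if (tmp == NULL) {
            free(buf);
            return -1;
        }
        buf = tmp;

        rc = getpwnam_r(name, &pwd, buf, len, &result);
        if (rc != ERANGE) {
            break;
        }

        /* Buffer still too small: grow and retry, up to an assumed 1 MiB cap. */
        len *= 2;
        if (len > (size_t)1 << 20) {
            free(buf);
            return -1;
        }
    }

    if (rc == 0 && result != NULL) {
        *uid = pwd.pw_uid;
        free(buf);
        return 0;
    }

    free(buf);
    return -1;
}
```

Note that getpwnam_r() returns 0 with a NULL result when the user simply does not exist, so both values need to be checked.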
Jiaju Zhang
5dc33c2824 RRP: redundant ring automatic recovery
This patch automatically recovers redundant ring failures.

Please note that this patch introduces rrp_autorecovery_check_timeout
in the totem config and hence breaks the internal ABI. Internal ABI users
of totem.h need to rebuild their binaries.

Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Signed-off-by: Steven Dake <sdake@redhat.com>
Tested-by: Jan Friesse <jfriesse@redhat.com>
Tested-by: Florian Haas <florian.haas@linbit.com>
Tested-by: Jiaju Zhang <jjzhang@suse.de>
2011-07-05 09:13:48 -07:00
Tim Serong
cfb96c64d9 Correct mailing list address in corosync_overview manpage
Signed-off-by: Tim Serong <tserong@novell.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-07-04 15:15:13 +02:00
Masatake YAMATO
7ba892dac3 fix typos in cpg_mcast_joined.3 and cpg_zcb_mcast_joined.3
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2011-06-29 09:12:31 -07:00
Steven Dake
899052484e Add coverity target to corosync makefile.am
Allow a make coverity target for those developers with coverity tools
available to them.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-06-29 09:11:42 -07:00
Jan Friesse
94d934e0e0 coroipcc: Test _SC_PAGESIZE result
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-29 15:17:49 +02:00
Jan Friesse
8c717c22b2 Remove spinlocks
Spinlocks are now removed. Even though a spinlock can improve
speed in some special cases, in most cases it makes corosync's CPU usage
much more intensive and corosync less responsive than if only mutexes are used.

What we were doing is:
pthread_mutex_lock
pthread_spin_lock
pthread_spin_unlock
pthread_mutex_unlock

which is not safe.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-29 12:01:54 +02:00
Jan Friesse
5458d4f27a votequorum: free newly allocated node if nodeid==0
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-29 11:59:57 +02:00
Jerome Flesch
00434a4f10 Fix usage of strerror_r()/perror()
Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-06-28 09:56:58 +02:00
Steven Dake
ae4a3af340 sched_params log message incorrect
The sched_params parameter was set before being printed.

Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-22 22:46:56 -07:00
Jan Friesse
424200d962 configure.ac: Align --enable-* options description
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-22 11:29:39 +02:00
Jan Friesse
5a6a8a0c9e configure.ac: change edefault to default
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-22 11:25:08 +02:00
Jan Friesse
ae2ac5945b CTS: Test for confdb dispatch deadlock
The test is disabled by default because it depends on SMP and needs about
2GB of RAM. It also tests a race, so the test is unreliable.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-22 11:21:01 +02:00
Jan Friesse
b5d2f4578a confdb: Resolve dispatch deadlock
The following situation could happen:
- one thread is waiting for a write operation to finish (line 853) while
  objdb is locked
- the flush (done in objdb_notify_dispatch) is called in the main thread, but
  this call never happens because the main thread is waiting for the objdb
  lock.

In this situation a deadlock occurs.

This commit solves it by:
- setting the pipe to non-blocking mode
- using the pipe only as a trigger for coropoll
- storing dispatch messages in a list
- processing messages from the list in the main thread

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-22 11:20:55 +02:00
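The approach listed in b5d2f4578a (non-blocking pipe used only as a wakeup trigger, messages kept in a list, main thread draining the list) can be sketched roughly as follows. This is not the confdb code: the `dispatch_queue` type, the `dq_*` functions and the integer payload are all hypothetical stand-ins chosen to keep the example short.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical message and queue types (not corosync structures). */
struct msg {
    struct msg *next;
    int value;
};

struct dispatch_queue {
    pthread_mutex_t lock;
    struct msg *head;
    struct msg *tail;
    int pipe_fd[2];   /* [0] is polled by the main loop, [1] is written by producers */
};

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    return (flags == -1) ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

int dq_init(struct dispatch_queue *q)
{
    q->head = q->tail = NULL;
    if (pthread_mutex_init(&q->lock, NULL) != 0 || pipe(q->pipe_fd) == -1) {
        return -1;
    }
    if (set_nonblocking(q->pipe_fd[0]) == -1 ||
        set_nonblocking(q->pipe_fd[1]) == -1) {
        return -1;
    }
    return 0;
}

/* Producer side: the payload goes into the list, the pipe only gets one byte. */
int dq_push(struct dispatch_queue *q, int value)
{
    struct msg *m = malloc(sizeof(*m));
    char byte = 0;

    if (m == NULL) {
        return -1;
    }
    m->next = NULL;
    m->value = value;

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL) {
        q->tail->next = m;
    } else {
        q->head = m;
    }
    q->tail = m;
    pthread_mutex_unlock(&q->lock);

    /* A full pipe (EAGAIN) means a wakeup is already pending, so never block. */
    if (write(q->pipe_fd[1], &byte, 1) == -1 && errno != EAGAIN) {
        return -1;
    }
    return 0;
}

/* Main-thread side: discard wakeup bytes, then process the whole list.
 * Adds each payload to *sum and returns the number of messages handled. */
int dq_drain(struct dispatch_queue *q, long *sum)
{
    char scratch[64];
    struct msg *m;
    struct msg *next;
    int count = 0;

    while (read(q->pipe_fd[0], scratch, sizeof(scratch)) > 0) {
        /* Nothing to do: the bytes carry no data. */
    }

    pthread_mutex_lock(&q->lock);
    m = q->head;
    q->head = q->tail = NULL;
    pthread_mutex_unlock(&q->lock);

    for (; m != NULL; m = next) {
        next = m->next;
        *sum += m->value;
        free(m);
        count++;
    }
    return count;
}
```

Because the producer never waits on the pipe, it cannot end up holding a lock while blocked on a reader that is itself waiting for that lock, which is the cycle the commit message describes.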
Jan Friesse
e8000c7b9b objdb: save copy of handles in object_find_create
The following situation could happen:
- process 1 creates a find handle through confdb
- process 1 calls find iteration once
- a different process 2 deletes the object pointed to by the process 1 iterator
- process 1 calls iteration again ->
  object_find_instance->find_child_list is an invalid pointer

-> segfault

Now object_find_create creates an array of matching object handles and
object_find_next uses that array together with a check for the name. This
prevents the situation where, between steps 2 and 3, a new object is created
with a different name but sadly with the same handle.

Also good to note that this patch is more or less a quick hack rather
than a proper solution. The real proper solution is to not use pointers
and rather use handles everywhere. This is a big TODO.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-22 11:13:12 +02:00
Jiaju Zhang
c6bfc6b5d6 RRP: Fix ring initialization issue for UDPU mode
Redundant ring has a problem in UDP unicast mode. The problem
is that the second ring is not successfully initialized: the
second time iface_changes happens, the member list for that interface
has not been added, so that ring cannot transmit normal
messages. As a result, the second ring cannot take over the work if the
first ring is down. This patch fixes this issue.

Comments from review:
More work is probably needed in totemnet, where totemnet maintains the
node list and an iterator for it, and totemudpu_member_add adds
state information to a context for the iteration.

In any regard, that is somewhat difficult to test, so I'll merge this
patch for now - keep in mind interface changes on the bindnetaddr will
cause problems with udpu after this patch has been committed.

Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-16 17:23:36 -07:00
Jan Friesse
2e5dc5f322 coroipcc: check recvmsg result in socket_recv
According to the specification, recvmsg can return 0, which means that
the connection is closed. We had this check, but only on systems
other than Linux. recvmsg can return 0 even on Linux, so the check is now
applied on all systems.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-06-10 12:33:19 +02:00
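The behaviour that 2e5dc5f322 guards against can be reproduced with a small C sketch: once the peer of a stream socket has closed, recvmsg() returns 0 rather than an error. The functions `recv_once` and `recv_after_peer_close` are hypothetical and unrelated to the actual socket_recv implementation in coroipcc.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not corosync code).
 * Receives once: returns the byte count, 0 if the peer closed the
 * connection, or -1 on error.
 */
ssize_t recv_once(int fd, void *buf, size_t len)
{
    struct msghdr msg;
    struct iovec iov;

    iov.iov_base = buf;
    iov.iov_len = len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    return recvmsg(fd, &msg, 0);
}

/* Demonstration: returns what recvmsg() reports after the peer has closed. */
ssize_t recv_after_peer_close(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
        return -1;
    }

    close(sv[1]);                      /* the peer goes away */
    n = recv_once(sv[0], buf, sizeof(buf));
    close(sv[0]);

    return n;
}
```

A caller that only treats -1 as failure would keep looping on a closed connection, so a result of 0 needs its own branch on every platform.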
Jan Friesse
9afb4bdaa8 confdb: Properly check result of object_find_create
In confdb_object_iter, the result of object_find_create is now properly
checked. object_find_create can return -1 if the object doesn't exist.
Without this check, an incorrect handle (memory garbage) was directly
passed to object_find_next.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-06-10 12:33:07 +02:00
Jan Friesse
50f05bfa15 crypto: rng_make_prng prevent buf overflow
With bits set to 1023, the 256-byte buf was filled by rng_get_bytes
with up to 257 bytes. Buf is now 258 bytes, so this is no longer a problem.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-10 12:12:05 +02:00
Jan Friesse
afa0398ca4 mainconfig: Check retval of logsys_format_set
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-06 10:02:34 +02:00
Jan Friesse
aa23d20125 testcpgzc: fgets buffer to really allocated size
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 11:11:28 +02:00
Jan Friesse
f95d3b3bf2 cpg: do_proc_join change list_slice to list_add
In this concrete case the result is equivalent, but the change makes coverity happy.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 11:10:08 +02:00
Jan Friesse
531e81602f totemudp: memset of proper size
In totemudp_mcast_thread_state_constructor, memset
sizeof(struct totemudp_mcast_thread_state) bytes instead of the size of
a pointer.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 11:09:27 +02:00
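The class of bug fixed in 531e81602f is easy to show in isolation. In the C sketch below, `struct thread_state` and `thread_state_init` are hypothetical stand-ins for the totemudp structure and its constructor; the field layout is invented for the example.

```c
#include <string.h>

/* Hypothetical stand-in for the real structure (fields are invented). */
struct thread_state {
    unsigned char iobuf[512];
    int fd;
};

/* Zero the whole structure that `state` points to. */
void thread_state_init(struct thread_state *state)
{
    /*
     * Bug pattern: memset(state, 0, sizeof(state)) clears only as many
     * bytes as a pointer occupies (typically 4 or 8), leaving the rest
     * of the structure uninitialized.
     */
    memset(state, 0, sizeof(*state));
}
```

Writing `sizeof(*state)` rather than spelling out the type name keeps the size correct even if the pointer's type is changed later.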
Jan Friesse
ea0a24866c coroipcs: init buf in coroipcs_handler_dispatch
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 11:09:01 +02:00
Jan Friesse
c2a39cb8e2 coroparse: don't leak dirent
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 11:00:56 +02:00
Jan Friesse
d76bb76d1f logsys: _logsys_wthread_create never returns != 0
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 10:59:17 +02:00
Jan Friesse
844c8759d7 notifyd: Check retval of corosync_cfg_initialize
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 10:59:08 +02:00
Jan Friesse
6b9297131c totemconfig: discard check of objdb_get_string ret
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 10:58:15 +02:00