1917 Commits

Author SHA1 Message Date
Jan Friesse
90da72cd7f cfg: Check interface status and name length
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-06-18 14:36:12 +02:00
Jan Friesse
335da1ecfd cfg: Check number of interfaces
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-06-18 14:36:12 +02:00
Jan Friesse
5dc3fc4bda totemrrp: Make status string shorter
The status string should be the same length as needed by the cfg
ringstatusget function.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-06-18 14:36:11 +02:00
Jan Friesse
845a625908 totem: Don't leak instance variable on crypto fail
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-18 14:35:25 +02:00
Jan Friesse
93286a344e totemudpu: Handle fd leak in totemudpu
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-18 14:35:21 +02:00
Jan Friesse
421de34972 totemconfig: Check length of rrp_mode string
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-18 14:35:15 +02:00
Jan Friesse
675da75759 coroparse: Ensure that config items fit into cmap
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-18 14:35:05 +02:00
Jan Friesse
e094ab2e2c votequorum: Prevent leak in qdevice_is_configured
Also LEAVE from function is now properly logged.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-17 15:47:27 +02:00
Jan Friesse
4310d84e4d Initialize error variable in ykd_init
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:57 +02:00
Jan Friesse
92b900da67 Initialize node_found in nodelist_to_interface fun
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:57 +02:00
Jan Friesse
903e02875d Initialize item in cmap_mcast_send
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:56 +02:00
Jan Friesse
f198955644 votequorum: Assert sender nodeid is known
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:56 +02:00
Jan Friesse
56ee492471 Check result of logsys_subsys_create
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:56 +02:00
Jan Friesse
d5d4cdb972 Check logsys_format_set result in logsys setup
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:56 +02:00
Jan Friesse
90f8a68a2b Use proper totem_ip_address size in memset
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:56 +02:00
Jan Friesse
df6b87f293 Free icmap strings in logconfig
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:56 +02:00
Jan Friesse
ce9c69da03 Properly break MAIN_CP_CB_DATA_STATE_QDEVICE state
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:55 +02:00
Jan Friesse
d5d3fb4d45 Do not dereference format_buffer when it's NULL
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:55 +02:00
Jan Friesse
96a89a0085 Check icmap str get for clustername
Even though this check is not strictly needed, it's nice to have, and on failure
it ensures that cluster_name is really NULL.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:55 +02:00
Jan Friesse
966f461b69 Properly check result of stat func in coroparse
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-06-13 10:53:55 +02:00
Jan Friesse
e684e4ca6f Remove unnecessary mmap in cpg
Code for zero-copy in cpg does the following mmaps:
- Mmap anonymous, private memory to some address (-> malloc)
- Mmap shared memory of fd to address returned by first mmap
  (effectively shadows first mapping)

This is not necessary and only one mapping is needed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-05-21 14:46:15 +02:00
Jan Friesse
8429d01389 Detect big scheduling pauses
Add a poll timer that is scheduled 3 times per token timeout.
If the poll timer was not called for more than 0.8 * token timeout, it means
the corosync process was not scheduled, and either token_timeout should be
increased or load should be reduced (useful for a VM whose host is
overcommitted, so the VM is not scheduled as expected).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-04-08 09:58:42 +02:00
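The pause check described in 8429d01389 can be sketched as below. This is only an illustration under stated assumptions, not the corosync implementation: the function names and the millisecond units are hypothetical, and the threshold test is written in integer arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the poll timer fires 3 times per token timeout. */
uint64_t pause_timer_interval_ms(uint64_t token_timeout_ms)
{
    return token_timeout_ms / 3;
}

/*
 * Returns 1 when the gap since the previous timer tick exceeds
 * 0.8 * token timeout, 0 otherwise.
 * elapsed > 0.8 * timeout is evaluated as elapsed * 10 > timeout * 8
 * to avoid floating point.
 */
int pause_detected(uint64_t last_tick_ms, uint64_t now_ms,
                   uint64_t token_timeout_ms)
{
    uint64_t elapsed_ms = now_ms - last_tick_ms;

    return elapsed_ms * 10 > token_timeout_ms * 8;
}
```

With a 3000 ms token timeout the timer would fire every 1000 ms, and a gap of more than 2400 ms between two ticks would be reported as a scheduling pause.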
Jan Friesse
86b074dc1a Support for numerical uid/gid
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-04-02 09:32:10 +02:00
Andrei Belov
005e7fd3b9 Improved POSIX-compliant handling of getpwnam_r() and getgrnam_r().
Signed-off-by: Andrei Belov <defanator@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-03-28 16:32:53 +01:00
Jan Friesse
0e3d1a9c51 totempg: Make iov_delv local variable
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-03-21 14:24:23 +01:00
Xia Li
ca6051e80c Convert the nodeid byte order to be aligned with network order
When using corosync with clear_node_high_bit set to yes,
the highest bit is cleared.  When all the cluster nodes are in
one subnet, we probably configure the IP addresses as follows:

node1: 147.2.207.64
node2: 147.2.207.192

If the byte order of the nodeid is little endian, wiping off the
highest bit will make the two nodes have the same nodeid!

This patch fixes this by converting the nodeid to network order.

Signed-off-by: Xia Li <xli@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-03-19 16:39:59 +01:00
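The collision described in ca6051e80c can be reproduced with plain integer arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the nodeid is derived directly from the four IPv4 address bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Address bytes a.b.c.d read as a little-endian integer (a is the lowest byte). */
uint32_t nodeid_little_endian(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint32_t)a | ((uint32_t)b << 8) |
           ((uint32_t)c << 16) | ((uint32_t)d << 24);
}

/* Address bytes a.b.c.d read in network order (a is the highest byte). */
uint32_t nodeid_network_order(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* What clear_node_high_bit does to the resulting value. */
uint32_t clear_high_bit(uint32_t nodeid)
{
    return nodeid & 0x7FFFFFFFu;
}
```

For 147.2.207.64 and 147.2.207.192 the little-endian values are 0x40CF0293 and 0xC0CF0293, which differ only in the highest bit and so collide once it is cleared. In network order the last octet sits in the lowest byte, so the two nodeids stay distinct.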
Jeremy Fitzhardinge
52f88d04ea Handle ERANGE from getpwnam_r / getgrnam_r
These functions return ERANGE if the supplied buffer is too small to
fit a line.  Try doubling the buffer a few times until it works.
2013-03-07 16:59:51 -08:00
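A minimal sketch of the retry loop described in 52f88d04ea, using the POSIX getpwnam_r() call. The function name lookup_uid, the starting buffer size and the retry limit are assumptions made for illustration and are not taken from the corosync source.

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 0 and stores the uid on success,
 * -1 if the user does not exist or the lookup fails.
 */
int lookup_uid(const char *name, uid_t *uid_out)
{
    size_t buflen = 1024;       /* assumed starting size */
    int attempt;

    for (attempt = 0; attempt < 8; attempt++) {   /* assumed retry limit */
        struct passwd pwd;
        struct passwd *result = NULL;
        char *buf = malloc(buflen);
        int rc;

        if (buf == NULL) {
            return -1;
        }
        rc = getpwnam_r(name, &pwd, buf, buflen, &result);
        if (rc == 0 && result != NULL) {
            *uid_out = result->pw_uid;
            free(buf);
            return 0;
        }
        free(buf);
        if (rc != ERANGE) {
            /* Not found (rc == 0, result == NULL) or a real error. */
            return -1;
        }
        buflen *= 2;            /* buffer too small: double it and retry */
    }
    return -1;
}
```

The same pattern would apply to getgrnam_r() with struct group in place of struct passwd.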
Jan Friesse
66172a501a Handle unexpected closing brace in config file
If the configuration file contains a closing brace before an opening brace
at top level, configuration parsing is stopped and the file is not
completely parsed. The solution is to detect the extra closing brace and display
an error.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-01-31 16:11:22 +01:00
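Detecting an extra closing brace, as in 66172a501a, amounts to tracking nesting depth. The following is a simplified sketch and not the coroparse code: check_braces is a hypothetical name, and the sketch ignores comments and quoting.

```c
#include <assert.h>

/*
 * Returns 0 when braces are balanced, the 1-based line number of the
 * first closing brace that has no matching opening brace, or -1 when
 * a section is still open at the end of the text.
 */
int check_braces(const char *text)
{
    int depth = 0;
    int line = 1;
    const char *p;

    for (p = text; *p != '\0'; p++) {
        if (*p == '\n') {
            line++;
        } else if (*p == '{') {
            depth++;
        } else if (*p == '}') {
            if (depth == 0) {
                return line;    /* closing brace at top level */
            }
            depth--;
        }
    }
    return depth == 0 ? 0 : -1;
}
```

Returning the line number lets the caller report an error instead of silently stopping at the stray brace.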
Jan Friesse
663489d277 Handle colon in configuration file
If a colon was entered at the end of a value, it was deleted.
This made it impossible to enter a (legal) IPv6 address ending with :: (like
fed0::).

Also, when a line contains both a brace and a colon, it was parsed twice (first
as key = value and second as the start of a section). This is handled by
a continue in the if section.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-01-31 16:11:18 +01:00
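One way to keep trailing colons in a value, as 663489d277 requires, is to split only at the first colon and trim whitespace but nothing else. The helper below is a hypothetical sketch, not the parser's actual code.

```c
#include <assert.h>
#include <string.h>

/*
 * Splits "key: value" at the first colon. Whitespace around key and
 * value is trimmed; colons inside or at the end of the value are kept.
 * Returns 0 on success, -1 if there is no colon or a buffer is too small.
 */
int split_key_value(const char *line, char *key, size_t key_size,
                    char *value, size_t value_size)
{
    const char *colon;
    const char *val;
    size_t klen;
    size_t vlen;

    while (*line == ' ' || *line == '\t') {
        line++;
    }
    colon = strchr(line, ':');
    if (colon == NULL) {
        return -1;
    }
    klen = (size_t)(colon - line);
    while (klen > 0 && (line[klen - 1] == ' ' || line[klen - 1] == '\t')) {
        klen--;
    }
    val = colon + 1;
    while (*val == ' ' || *val == '\t') {
        val++;
    }
    vlen = strlen(val);
    while (vlen > 0 && (val[vlen - 1] == ' ' || val[vlen - 1] == '\t' ||
                        val[vlen - 1] == '\n')) {
        vlen--;
    }
    if (klen >= key_size || vlen >= value_size) {
        return -1;
    }
    memcpy(key, line, klen);
    key[klen] = '\0';
    memcpy(value, val, vlen);
    value[vlen] = '\0';
    return 0;
}
```

A line such as "bindnetaddr: fed0::" then yields the key bindnetaddr and the value fed0:: with both trailing colons intact.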
Fabio M. Di Nitto
98d0245c7e votequorum: port to sync API (take 2)
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-01-31 15:32:07 +01:00
Fabio M. Di Nitto
55dc09ea23 totemconfig: enforce hmac config when crypto is enabled
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-01-14 12:31:47 +01:00
Kazunori INOUE
1ad21e384e log: move Corosync started log messages
"Corosync Cluster Engine ... started" message is shown after
logsys is fully configured.

Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-01-14 11:52:26 +01:00
Fabio M. Di Nitto
ed6bca3293 crypto: drop < 2.3 protocols and onwire compat
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-01-14 11:49:32 +01:00
Fabio M. Di Nitto
b3f456a8ce totemcrypto: fix hmac key initialization
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-01-14 11:23:32 +01:00
Jan Friesse
6127be1806 Move qb_loop creation after daemonization
Creating qb_loop before daemonization is not a problem for poll or epoll
type loops, but it is a problem for kqueue, because the kqueue is not
inherited by the child after fork.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-12-12 11:47:42 +01:00
Jan Friesse
dd588d004e Add option to specify ip version
Default is ipv4.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-12-03 14:02:32 +01:00
Jan Friesse
92e0f9c7bb Add waiting_trans_ack also to fragmentation layer
The patch adding support for waiting_trans_ack may fail if synchronization
happens between deliveries of a fragmented message. In such a situation, the
fragmentation layer waits for a message with the correct number, but it
will never arrive.

The solution is to handle (via callback) changes of waiting_trans_ack and use
a different queue.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-11-22 11:48:12 +01:00
Jan Friesse
2d4e7bebb5 Handle segfault in backlog_get
If instance->memb_state is not OPERATION or RECOVERY, we were passing NULL
to the cs_queue_used call.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-11-22 11:48:07 +01:00
Steven Dake
402638929e Fix problem with sync operations under very rare circumstances
This patch creates a special message queue for synchronization messages.
This prevents a situation in which messages are queued in the
new_message_queue but have not yet been originated from corrupting the
synchronization process.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-11-22 11:47:57 +01:00
Fabio M. Di Nitto
220d659b38 totemcrypto: implement crypto packet format 2.2 and crypto_compat: config opt
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-11-22 11:13:30 +01:00
Evgeny Barskiy
e3f615b4a0 corosync to start in infiniband + redundant ring active/passive mode
Corosync now works with infiniband transport in any redundant ring mode

Signed-off-by: Evgeny Barskiy <barskiy@rts.ru>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-11-21 10:28:57 +01:00
Fabio M. Di Nitto
ed63c812af votequorum: fix handling of expected_votes/votes changes from cmapctl
and allow natural selection to take place....

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-11-20 15:45:57 +01:00
Jan Friesse
3cd4f9a1f5 Add support for selecting IPC type
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-11-08 12:16:11 +01:00
Jan Friesse
89809ec80e Check successful initialization of IPC
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-11-08 12:16:06 +01:00
Angus Salkeld
abc3b6abed Try reduce the number of sprintf's
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-11-07 21:28:31 +11:00
Jan Friesse
d4db2ea535 If failed_to_recv is set, consensus can be empty
If failed_to_recv is set (a node detects that it is not able to receive
messages), we can end up with an assert, because my_failed_list and
my_member_list are the same list. This happens because we are not
following the specification and we allow a node to mark itself as failed.
Because a single node membership is created (ignoring both the fail list and
member_list) when failed_to_recv is set and consensus was reached across nodes,
we can skip the assert.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-11-05 15:16:25 +01:00
Jacek Konieczny
07832748f2 link libtotem_pg to libqb
The libtotem_pg library uses symbols from libqb, so it should be
explicitly linked with it. This doesn't cause problems for corosync
binary itself, as it is linked to both libraries, but can cause
problems if anything else links to libtotem_pg.so and automated
checkers can show this as a library problem.

Signed-off-by: Jacek Konieczny <jajcus@jajcus.net>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-10-29 16:49:19 +01:00
Jan Friesse
8a9869eeec Correctly check if service was unloaded
my_processing_idx is a pointer to the received service list, instead of the global
service number. If we check the state of a service, we should use service_id
instead of my_processing_idx.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-17 15:06:36 +02:00
Jan Friesse
c165bf4f51 Define AES_*_KEY_LENGTH if not defined
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-17 15:06:32 +02:00
Fabio M. Di Nitto
20c5871525 totemcrypto: add support for different encryption methods
(backport from nsscrypto kronosnet code)

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-10-15 10:00:16 +02:00
Jan Friesse
fc50443f5f Make totemiba compile again
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2012-10-08 17:44:09 +02:00
Jan Friesse
b7635ab9f7 Return back "Totem is unable to form..." message
This patch restores the functionality named in the subject. It relies on the fact that
sendmsg will return an error, and if such an error is returned for a long time,
it's probably because of a firewall.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-08 16:53:35 +02:00
Jan Friesse
d042671369 Move "Totem is unable to form..." message to main
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-08 16:53:33 +02:00
Jan Friesse
6c3b337b37 Use unix socket for local multicast loop
Instead of relying on the multicast loop functionality of the kernel, we now use
a unix socket created by socketpair to deliver multicast messages to the
local node. This handles problems with an improperly configured local
firewall. So if output/input to/from the ethernet interface is blocked, the node
is still able to create a single node membership.

The dark side of the patch is the fact that membership is always created, so
"Totem is unable to form a cluster..." will never appear (same applies
to the continuous_gather key).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-08 16:53:30 +02:00
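The local delivery path from 6c3b337b37 can be illustrated with socketpair(). This is a self-contained sketch, not the totem transport code: the function name is hypothetical, and the choice of SOCK_DGRAM is an assumption made to preserve message boundaries.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sends a message into one end of a unix socketpair and reads it back
 * from the other end, without touching any network interface.
 * Returns the number of bytes received, or -1 on error.
 */
ssize_t local_loop_roundtrip(const void *msg, size_t len,
                             void *out, size_t out_size)
{
    int fds[2];
    ssize_t received = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) == -1) {
        return -1;
    }
    if (send(fds[0], msg, len, 0) == (ssize_t)len) {
        received = recv(fds[1], out, out_size, 0);
    }
    close(fds[0]);
    close(fds[1]);
    return received;
}
```

Because the pair never leaves the host, a firewall rule on the ethernet interface cannot block it, which is why a single node membership can always be formed.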
Jan Friesse
4354ed6ecb Store config_version of other nodes
Config version of other nodes is stored in
runtime.totem.pg.mrp.srp.members.NODEID.config_version key. Also when
local config_version is changed, all nodes are informed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-03 11:26:35 +02:00
Jan Friesse
d2a85593c4 Support for check of config version on start
The config version is requested from other nodes. If our config version is
not 0 and differs from the highest config version of the other nodes, corosync
quits.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-02 16:04:32 +02:00
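The start-up rule from d2a85593c4 can be written as a small predicate. The sketch below is hypothetical: the function name is invented, and treating "no other nodes answered" as "no mismatch" is an assumption the commit message does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 when the local node should refuse to start: its config
 * version is non-zero and differs from the highest version reported
 * by the other nodes. Returns 0 otherwise.
 */
int config_version_mismatch(uint64_t local_version,
                            const uint64_t *other_versions, size_t count)
{
    uint64_t highest = 0;
    size_t i;

    if (local_version == 0 || count == 0) {
        return 0;   /* version 0 disables the check; count == 0 is assumed OK */
    }
    for (i = 0; i < count; i++) {
        if (other_versions[i] > highest) {
            highest = other_versions[i];
        }
    }
    return local_version != highest;
}
```

The versions are held as 64-bit unsigned integers, matching the way 6825c1d39b parses config_version.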
Jan Friesse
73b0fe688d Make cmap_mcast_send return correct error code
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-02 16:04:28 +02:00
Jan Friesse
a273be58ae Make service_build contain correct number of msgs
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-02 16:04:24 +02:00
Jan Friesse
3c019f2130 Align items in cmap_mcast_send
The aligning function MAR_ALIGN_UP (kernel-style magic) is used to
align items in the req_exec_cmap_mcast message.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-02 16:04:20 +02:00
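Kernel-style align-up macros usually round a size up to the next multiple of a power-of-two alignment with a mask. The macro below is a generic stand-in named EXAMPLE_ALIGN_UP; the real MAR_ALIGN_UP definition may differ and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rounds x up to the next multiple of a. The alignment a must be a
 * power of two, otherwise the mask is meaningless.
 */
#define EXAMPLE_ALIGN_UP(x, a) \
    (((size_t)(x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Function wrapper so the macro can be exercised with ordinary calls. */
size_t example_align_up(size_t x, size_t a)
{
    return EXAMPLE_ALIGN_UP(x, a);
}
```

For example, a 13-byte item followed by an 8-byte aligned item would have the next item start at offset 16.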
Jan Friesse
2214a60639 Support for flt and dbl in mcast_endian_convert
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-02 16:04:17 +02:00
Jan Friesse
cbaa2977ae Add support for sending cmap values to wire
The function is a little more complex, but it is designed to be used in the future
without big changes.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-02 16:04:07 +02:00
Jan Friesse
6825c1d39b Parse config_version as 64-bit uint
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-02 16:04:02 +02:00
Jan Friesse
373ded0652 Don't access invalid mem in totemconfig interfaces
When ringnumber in the config file was set to a value bigger than or equal to
INTERFACE_MAX, we were using this big value as an index into the totemconfig
interfaces array, resulting in access to invalid memory and a segfault.

Instead, ringnumber is now checked and a proper error message is
printed if the value is too big.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-09-27 13:54:39 +02:00
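The fix in 373ded0652 is a bounds check before indexing. A minimal sketch follows, assuming a limit of 2 for EXAMPLE_INTERFACE_MAX purely for illustration; the helper name and the error text are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_INTERFACE_MAX 2   /* assumed value, for illustration only */

/*
 * Returns NULL when ringnumber is a valid index into an array of
 * EXAMPLE_INTERFACE_MAX interfaces, or an error string otherwise.
 */
const char *check_ringnumber(unsigned long ringnumber)
{
    if (ringnumber >= EXAMPLE_INTERFACE_MAX) {
        return "ringnumber is too big";
    }
    return NULL;
}
```

The comparison uses >= because valid indexes run from 0 to the limit minus one, matching the "bigger or equal" wording of the commit message.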
Jan Friesse
5ce59f49ba Move some totem and cpg messages to trace level
Messages which are flow messages, rather than lifecycle messages, are now logged
at trace level.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-09-19 11:03:16 +02:00
Jan Friesse
5717655019 Add support for debug level trace in config file
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-09-19 11:03:10 +02:00
Fabio M. Di Nitto
8a2e936381 icmap: fix mapping return codes
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-09-12 08:18:50 +02:00
Fabio M. Di Nitto
bb5946babb build: clean AM_CFLAGS and AM_CPPFLAGS usage around
also set common include dirs.

fPIC and DPIC are automatically detected and added
as required by libtool. We don't need to carry it around.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-09-07 09:04:07 +02:00
Fabio M. Di Nitto
fa92e4068a totemconfig: drop unnecessary includes
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-09-07 09:04:06 +02:00
Jan Friesse
7fe307383f Remove newline in logsys_config_file_set_unlocked
Also remove commented leftover.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-09-06 09:39:18 +02:00
Jan Friesse
bd30fe3dcd Make threaded log work
The previous two log-related patches tried to solve a few problems with
threaded libqb, but introduced regressions when running in daemon mode.

This patch takes a bigger hammer and hopefully solves all problems.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-09-06 09:39:15 +02:00
Jan Friesse
bd138085ca Ensure qb_log thread is started
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-09-05 09:10:57 +02:00
Jan Friesse
7026fffdf9 Ensure no garbage left in msghdr for sendmsg call
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-09-03 09:34:37 +02:00
Jan Friesse
120b7fac7b Use uint8_t in setsockopt when needed
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-09-03 09:34:35 +02:00
Jan Friesse
ee59122ad7 OpenBSD getifaddrs returns netmask without sa_family
So we relax netmask check and set to same family as ipaddr
if needed

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-09-03 09:34:33 +02:00
Jan Friesse
932829bfca Add header files when needed
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-09-03 09:34:31 +02:00
Angus Salkeld
0e86aa4ac6 Fix cpg_membership_get()
The wrong size was getting set in exec/cpg.c

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-31 14:48:35 +10:00
Fabio M. Di Nitto
6d28d51284 build: bring SOLARIS up to the same standard as other OSes
drop all SOLARIS specific ifdefs and replace them with feature checks

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto
a0a14c68e3 totemip: clean up headers a lot more
getifaddrs is always available if there is freeifaddrs.

all BSD and openindiana have it defined in ifaddrs.h.

drop a bunch of obsoleted headers.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto
18929089d1 build: drop MAP_ANONYMOUS check from configure
define it only in case it's not there

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto
5c5db34e56 build: make libstatgrab the de facto default for monitoring service
drop duplicate code and remove the last COROSYNC_LINUX ifdefs
around

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto
a1c154e6fa build: use MADV_NOSYNC only when it's defined
so far only FreeBSD defines it.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-30 15:00:27 +02:00
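Guarding on the flag itself, as a1c154e6fa does for MADV_NOSYNC, avoids per-OS ifdefs. The wrapper below is a hypothetical sketch of that pattern, not the corosync code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Applies MADV_NOSYNC to a mapping when the platform defines it.
 * Returns the madvise() result, or 0 when the flag is unavailable
 * and there is nothing to do.
 */
int advise_nosync(void *addr, size_t len)
{
#ifdef MADV_NOSYNC
    return madvise(addr, len, MADV_NOSYNC);
#else
    (void)addr;     /* unused on platforms without MADV_NOSYNC */
    (void)len;
    return 0;
#endif
}
```

The same feature-check style fits the MAP_ANONYMOUS change in 18929089d1, where the macro is defined only if the system headers do not already provide it.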
Fabio M. Di Nitto
6098ef2c14 build: make exec/totemip os detection free
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-30 15:00:27 +02:00
Jan Friesse
dbe0e9e382 Log: Use threaded mode for syslog and file log
Syslog and file log can block, so it's a good idea to use libqb threaded
mode to prevent it.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-30 09:46:48 +02:00
Jan Friesse
9f6e6a990b Use native IPC mechanism
Instead of hardcoded SHM, we should use NATIVE, so libqb is able to find
out what is the best available mechanism.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-30 09:45:46 +02:00
Fabio M. Di Nitto
427fdd4558 build: fix build on openindiana 151a
openindiana toolchain is rather messy. This is the first cut only

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto
9f7181b533 build: drop more dlopen leftovers from dinosaur era...
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto
dd4d7f86e6 build: make monitoring optional in corosync exec
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto
8f96347100 build: respect watchdog conditional when building corosync exec
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto
76d18f964d build: use libtool for linking
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-28 15:14:48 +02:00
Tim Beale
6129ce5b59 Remove redundant default-config code
We were checking 'hold_timeout == 0' in 3 different places when setting up
the default totem config.

Signed-off-by: Tim Beale <tlbeale@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-21 14:26:50 +02:00
Tim Beale
77ea036c72 Remove unused structure
Nowhere in the corosync codebase references this structure.

Signed-off-by: Tim Beale <tlbeale@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-21 14:11:48 +02:00
Jan Friesse
397cc89f01 Make logging of WD and MON service correct
MON and WD services are using fsm.h, which calls log function. Such
messages were incorrectly logged as SERV (or random service) which made
debugging hard.

Solution is to add callback parameter to fsm functions and do actual
logging there.

Handling of failure states is also done in the callback now.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-16 14:45:15 +02:00
Jan Friesse
e3cef955bf IPC: Call lib function only when it's possible
send_ok was incorrectly tested as a boolean, even though it's an errno-type
variable.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-09 15:10:52 +02:00
Jan Friesse
8014b2facf Close sockets after deleting from poll
This will remove (non critical) debug message from QB about polling on
closed FD.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-09 15:10:44 +02:00
Jan Friesse
2d10e2bbea cpg: Check input param name_t length
IPC uses a buffer of CS_MAX_NAME_LENGTH for the name. If a user calls a
function with a longer string, that string can be passed to the service
incomplete.

The solution is to not allow strings longer than CS_MAX_NAME_LENGTH
and to return an error.

Same applies to cpg service.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-09 15:10:35 +02:00
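The length check from 2d10e2bbea can be sketched as a checked copy that rejects over-long names instead of truncating them. The helper name is hypothetical, and the sketch assumes the destination size counts the terminating NUL, which the commit message does not specify.

```c
#include <assert.h>
#include <string.h>

/*
 * Copies src into dst only if it fits completely, including the
 * terminating NUL. Returns 0 on success and -1 when the name is too
 * long, leaving dst untouched in that case.
 */
int copy_name_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len >= dst_size) {
        return -1;      /* reject instead of passing a truncated name on */
    }
    memcpy(dst, src, len + 1);
    return 0;
}
```

A caller would pass a buffer sized by CS_MAX_NAME_LENGTH and map the -1 result to whatever error code the API returns.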
Jan Friesse
6f6988afff Handle sync and service unload correctly
When sync has started and a service is unloaded in the meantime, it can happen that
sync will call sync_* functions on the unloaded service.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-09 15:10:26 +02:00
Jan Friesse
dfe34d330c service: remove leftovers from mt corosync
Multithreaded corosync used many ugly workarounds. One of them is the
shutdown process, where we had to solve a problem with two locks. This was
solved by scheduling jobs between the service exit_fn call and the actual
service unload. Sadly, this can cause a message from another node to be received
in the meantime, causing corosync to segfault on exit.

Because corosync is now single threaded, we don't need such hacks any
longer.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-09 15:10:16 +02:00
Fabio M. Di Nitto
423e37b4ca votequorum: change init/clean up to deal with exit races
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-08 09:03:57 +02:00
Fabio M. Di Nitto
50308cb08d quorumtool: make output more meaningful
there is really no point in having a per node view of (vote)quorum
since all the info is always there.

drop the -n option for status/display nodes and improve
the output to provide a full cluster view at any given time.

Old format:

[root@fedora-master-node2 ~]# corosync-quorumtool -s
Quorum information
------------------
Date: Mon Aug 6 10:22:27 2012
Quorum provider: corosync_votequorum
Nodes: 2
Ring ID: 8
Quorate: Yes

Votequorum information
----------------------
Node ID: 3254954176
Node state: Member
Node votes: 1
Qdevice votes: 1
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate Qdevice

Membership information
----------------------
Nodeid Votes Name
3238176960 1 fedora-master-node1.int.fabbione.net
3254954176 1 fedora-master-node2.int.fabbione.net
         0 1 QDEVICE (Alive/Voting/NoMasterWins)

New format:

[root@fedora-master-node1 tools]# ./corosync-quorumtool -s
Quorum information
------------------
Date:             Mon Aug  6 15:50:03 2012
Quorum provider:  corosync_votequorum
Nodes:            2
Ring ID:          48
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
3238176960          1     A,V,MW fedora-master-node1.int.fabbione.net
3254954176          1         NR fedora-master-node2.int.fabbione.net
         0          1            QDEVICE

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
6b270c6cd1 votequorum: make the last QDEVICE define name consistent with everything else
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
302545e112 votequorum: add missing return call
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
379b203677 votequorum: make master_wins check stricter
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
9c50f33509 votequorum: add ENTER/LEAVE for consistency
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
2f369e7039 votequorum: delegate qdevice_master_wins setting to qdevice
votequorum has no business deciding whether the master_wins setting is correct or not.
only the qdevice can decide and should set the value for votequorum.

Logic is:

- user requests master_wins from config
- corosync starts
- qdevice starts
- qdevice reads cmap values / registers with votequorum
- qdevice decides if the node can support master_wins or not and tells votequorum
- at this point votequorum can check if an unquorate node is part of the master_wins
  partition

it is the qdevice's responsibility to keep that value up to date in votequorum, and the
value can be changed at runtime.

this commit also exchanges per node master_wins information to lay down the infrastructure
to verify discrepancies in node config for master_wins (coming next on this channel).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
cc7bfeb462 votequorum: drop votequorum_qdevice_getinfo and collapse data into getinfo
it's really pointless to have basically a duplicated API call
to transfer one value and one name.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
65a6c29a31 votequorum: external defines should all be prefixed with VOTEQUORUM_
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
2a37b56c49 votequorum: drop _FLAG_ from defines
those are all info flags.. it's redundant and inconsistent

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
3416eacbec votequorum: fix define name to match reality
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
86dd11b28e qdevice: implement master_wins partition
in the previous incarnation of qdisk + cman, master_wins was restricted
to 2-node clusters only.

In this new version it is possible to use master_wins for any cluster
size.

Let's assume a 4 node cluster. Each node votes 1, qdevice votes 3.

node 1 becomes qdevice master
nodes 2/3/4 do not

In case of a split (let's assume 2/2):

partition 1: {4, 1}
partition 2: {1, 1}

node 2 in partition 1 would normally be unquorate, leaving effectively
only node 1 active.

master_wins allows node 2 to recognize that it is part of a quorate partition
(since node1 is broadcasting that qdevice is voting) and retain
quorum.

node1 has never lost quorate status since qdevice is voting there.
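
The decision described above can be sketched in C as follows. This is only an illustration under stated assumptions: `struct node_view`, its fields, and `node_is_quorate()` are hypothetical names and are not taken from the votequorum source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node view; not the real votequorum cluster_node. */
struct node_view {
    unsigned int nodeid;
    bool in_partition;       /* node is a member of our current partition */
    bool qdevice_cast_vote;  /* node reports that its qdevice is voting */
};

/*
 * Returns true when the local node may consider itself quorate.
 * locally_quorate: result of the normal vote count on this node.
 * master_wins:     whether master_wins is active for this node.
 */
bool node_is_quorate(bool locally_quorate, bool master_wins,
                     const struct node_view *nodes, size_t count)
{
    size_t i;

    if (locally_quorate) {
        return true;
    }
    if (!master_wins) {
        return false;
    }
    /* master_wins: follow any partition member whose qdevice is voting. */
    for (i = 0; i < count; i++) {
        if (nodes[i].in_partition && nodes[i].qdevice_cast_vote) {
            return true;
        }
    }
    return false;
}
```

In the 2/2 split above, node 2 would call this with node 1 marked as a partition member whose qdevice is voting, and so would retain quorum; nodes 3 and 4 see no voting qdevice in their partition and stay unquorate.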

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
aa295be834 votequorum: fix flag check for qdevice votes propagation
and cleanup similar code to make it more readable

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
2dae49e54a votequorum: remove last instance of state and rename it to cast_vote
also align naming of vote to cast_vote for info calls

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
3fed1af077 votequorum: several major bug fixes and code cleanup
- add a protection check to avoid spurious messages on membership
  change
- greatly simplify processing of nodeinfo, since the only
  data that we send for qdevice over nodeinfo is the number of votes
- fix a flag check to trigger quorum calculation that would
  leave a cluster unquorate under certain conditions

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
62659dbb21 votequorum: move to the new flag structure
simplify different code path as checks are simpler, separate
ALIVE and CAST_VOTE

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
c9e207ec92 votequorum: simplify getinfo data and protect against call against quorum node
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
f2b25936e5 votequorum: use REGISTERED flag consistently
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
0bcb4cddcc votequorum: simplify internal qdevice_getinfo function
as data are moving around we can drop lots of special cases

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
43d1439600 votequorum: add qdevice CAST_VOTE status/flag
this is a preparation commit for the next changes. right now it is
no more than an alias to ALIVE.

CAST_VOTE is required to support master/slave feature from qdevice.

Effectively a quorum device can be:

Not registered / registered (connected to API but nothing else is happening)

if registered:

Not alive / alive (quorum device is petting the API via poll and timer is running)

if alive:

Not voting (slave) / voting (master)
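
The nesting of these three states can be sketched as a small C helper. The bit values and all names below are assumptions made for illustration; the real NODE_FLAGS_QDEVICE_* definitions may differ.

```c
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define QDEV_FLAG_REGISTERED 0x1u
#define QDEV_FLAG_ALIVE      0x2u
#define QDEV_FLAG_CAST_VOTE  0x4u

enum qdevice_level {
    QDEVICE_NOT_REGISTERED,
    QDEVICE_REGISTERED_ONLY,   /* connected to the API, nothing else */
    QDEVICE_ALIVE_NOT_VOTING,  /* heartbeating, slave */
    QDEVICE_VOTING             /* heartbeating, master */
};

/*
 * Each state only counts if the one before it holds:
 * ALIVE requires REGISTERED, CAST_VOTE requires ALIVE.
 */
enum qdevice_level qdevice_level_from_flags(uint32_t flags)
{
    if (!(flags & QDEV_FLAG_REGISTERED)) {
        return QDEVICE_NOT_REGISTERED;
    }
    if (!(flags & QDEV_FLAG_ALIVE)) {
        return QDEVICE_REGISTERED_ONLY;
    }
    if (!(flags & QDEV_FLAG_CAST_VOTE)) {
        return QDEVICE_ALIVE_NOT_VOTING;
    }
    return QDEVICE_VOTING;
}
```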

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto
987e26f8d1 votequorum: rename NODE_FLAGS_QDEVICE_STATE to NODE_FLAGS_QDEVICE_ALIVE
STATE is a confusing and overloaded term in votequorum as it's used for nodes
and other bits.

make the name unique and ALIVE means that the qdevice is heartbeating
to votequorum.

improve display of the status in tools and tests.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto
4621a6cd02 votequorum: rename NODE_FLAGS_QDEVICE to NODE_FLAGS_QDEVICE_REGISTERED
make the flag name explicit

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:16 +02:00
Jan Friesse
fed7fc23e1 Don't call sync_* funcs for unloaded services
When a service is unloaded, sync shouldn't call the sync_init|process|activate
and abort functions. This happens very rarely, but in the process of unloading
all services, totem can recreate membership and bad things can happen
(service is unloaded, so there may be access to already freed memory,
 ...)

Solution is to fetch the services' sync handlers every time we are
building the service list instead of using a precreated one.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-08-02 09:34:58 +02:00
Jan Friesse
9fb7979370 Introduce SERVICES_COUNT_MAX macro
Sync/service was using the maximal number of services either in numeric form
(magic constant) or inconsistently, that is, using
SERVICE_HANDLER_MAXIMUM_COUNT, which means the maximal number of handlers.

New macro solves this.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-08-02 09:32:05 +02:00
Jan Friesse
537bf56fcc cpg: Be more verbose for procjoin message
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-07-30 10:22:16 +02:00
Jan Friesse
04dac3ff5d Correctly free state string in wd
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-07-12 15:53:04 +02:00
Jan Friesse
e4d75d1ab3 Revert "Free state variable allocated in wd_resource_state_is_ok"
This reverts commit 01c63ca17c.
2012-07-11 17:04:41 +02:00
Jan Friesse
a966506c1e cpg: Enhance downlist selection algorithm
Let's say we have 2 nodes:
- node 2 is paused
- node 1 creates membership (one node)
- node 2 is unpaused

Result is that node 1's downlist is selected, which means that from node 2's
point of view, node 1 was never down.

The patch solves the situation by adding an additional check for the largest previous
membership.

So current tests are:
1) largest (previous #nodes - #nodes known to have left)
2) (then) largest previous membership
3) (and last as a tie-breaker) node with smallest nodeid
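
The three tests above can be read as one ordered comparison. The following C sketch shows that ordering; the struct layout and the function name are hypothetical and are not taken from cpg.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of one received downlist message. */
struct downlist_candidate {
    uint32_t sender_nodeid;
    uint32_t old_members;  /* size of the sender's previous membership */
    uint32_t left_nodes;   /* nodes the sender knows to have left */
};

/*
 * Returns true if candidate a should be preferred over candidate b.
 * Assumes left_nodes <= old_members for both candidates.
 */
bool downlist_is_better(const struct downlist_candidate *a,
                        const struct downlist_candidate *b)
{
    uint32_t a_remaining = a->old_members - a->left_nodes;
    uint32_t b_remaining = b->old_members - b->left_nodes;

    /* 1) largest (previous #nodes - #nodes known to have left) */
    if (a_remaining != b_remaining) {
        return a_remaining > b_remaining;
    }
    /* 2) then largest previous membership */
    if (a->old_members != b->old_members) {
        return a->old_members > b->old_members;
    }
    /* 3) tie-breaker: smallest nodeid */
    return a->sender_nodeid < b->sender_nodeid;
}
```

If, in the paused-node scenario, node 1 reported old:1 left:0 and node 2 reported old:2 left:1 (illustrative numbers, not taken from a log), both would tie on the first test and the second test would pick node 2's downlist.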

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-06-14 15:15:42 +02:00
Jan Friesse
f3457c5d49 cpg: Print cpg name to debug information
In downlist and joinlist debug output, the group was printed in a nonsensical
format of integer to pointer to array.

Now it's printed by full name.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-06-14 15:15:39 +02:00
Jan Friesse
35446d6bcc cpg: Process join list after downlists
Suppose the following situation happens:
- we have 3 nodes
- on-wire messages look like D1,J1,D2,J2,D3,J3 (D is downlist, J is
  joinlist)
- let's say D1 and D3 contain node 2
- it means that J2 is applied, but right after that, D1 (or D3) is
  applied, which means node 2 is again considered down

This is solved by collecting joinlists and applying them after the downlist, so
the order is:
- apply best matching downlist
- apply all joinlists
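
A much simplified C sketch of this ordering follows. It models membership as a bitmask of node ids, which is an assumption made only for illustration (real join lists carry per-process data), and every name in it is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum sync_msg_type { SYNC_MSG_DOWNLIST, SYNC_MSG_JOINLIST };

/* Hypothetical message: bit n of node_mask set means node n is listed. */
struct sync_msg {
    enum sync_msg_type type;
    uint32_t node_mask;
};

/*
 * Applies the chosen downlist first and every joinlist afterwards,
 * regardless of the order in which the messages arrived.
 * Returns the resulting membership mask.
 */
uint32_t apply_sync_msgs(uint32_t members, const struct sync_msg *msgs,
                         size_t count, size_t chosen_downlist)
{
    size_t i;

    /* Step 1: apply the best matching downlist only. */
    if (chosen_downlist < count &&
        msgs[chosen_downlist].type == SYNC_MSG_DOWNLIST) {
        members &= ~msgs[chosen_downlist].node_mask;
    }
    /* Step 2: apply all collected joinlists. */
    for (i = 0; i < count; i++) {
        if (msgs[i].type == SYNC_MSG_JOINLIST) {
            members |= msgs[i].node_mask;
        }
    }
    return members;
}
```

With the D1,J1,D2,J2,D3,J3 sequence above, node 2 ends up as a member because J2 is applied after the downlist rather than being undone by a later D3.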

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-06-14 15:15:35 +02:00
Jan Friesse
816d7687b0 cpg: Never choose downlist with localnode
Test scenario is as follows:
- node 1, node 2
- node 1 is paused
- node 2 sees node 1 dead
- node 1 unpaused
- node 1 and 2 both choose the same downlist message which includes node 2 ->
node 2 is effectively disconnected

Patch includes an additional test of whether left_node is localnode. If so, such
downlist is ignored.
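
The extra test can be sketched in C like this. The function name and the flat array of node ids are assumptions for illustration, not the actual cpg data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns false when the downlist names the local node as having left,
 * in which case the caller should ignore that downlist.
 */
bool downlist_is_usable(const uint32_t *left_nodes, size_t left_count,
                        uint32_t local_nodeid)
{
    size_t i;

    for (i = 0; i < left_count; i++) {
        if (left_nodes[i] == local_nodeid) {
            return false;
        }
    }
    return true;
}
```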

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-06-14 15:15:32 +02:00
Jerome FLESCH
99faa3b864 When flushing, discard only memb_join messages
Patch solves a problem where 1 ring out of 2 went up/down quite often.

The simplest setup to reproduce the bug is the following:
- 2 VMs, connected by 2 network interfaces
- OS: Linux
- On one of the VMs, a test program sending some CPG messages (see the
  script "test_corosync.sh" attached to this mail for example)

Here are the Corosync logs we get when we do this setup:

Jun 06 16:23:40 corosync [TOTEM ] A processor joined or left the
membership and a new membership was formed.
Jun 06 16:23:40 corosync [CPG   ] chosen downlist: sender r(0)
ip(192.168.56.104) r(1) ip(192.168.57.104) ; members(old:1 left:0)
Jun 06 16:23:40 corosync [MAIN  ] Completed service synchronization,
ready to provide service.
Jun 06 16:24:37 corosync [TOTEM ] Marking ringid 1 interface
192.168.57.105 FAULTY
Jun 06 16:24:38 corosync [TOTEM ] Automatically recovered ring 1
Jun 06 16:25:33 corosync [TOTEM ] Marking ringid 1 interface
192.168.57.105 FAULTY
Jun 06 16:25:34 corosync [TOTEM ] Automatically recovered ring 1
Jun 06 16:26:35 corosync [TOTEM ] Marking ringid 1 interface
192.168.57.105 FAULTY
Jun 06 16:26:36 corosync [TOTEM ] Automatically recovered ring 1
(...)

The second ring goes down about every 2 minutes and automatically back
up right after.

We spent some time looking for the commit that introduced this bug, and
it appears to be due to the following one:
Corosync 1.3.3 -> 1.3.4: e27a58d93d
Corosync 1.4.1 -> 1.4.2: be608c0502
Commit message: Ignore memb_join messages during flush operations

I had a look at this commit, and it seems to me it's dropping too many
packets:
Because of this commit, while totemrrp_recv_flush() is called, Corosync
drops memb_join packets, but also ORF tokens. In the end, it seems that
sometimes, we drop so many of them that Corosync marks the ring as
faulty.

To fix that, only memb_join messages are dropped now.
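
The narrower filter can be sketched in C as below. The message kinds and function names are hypothetical stand-ins, not the real totem message types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message kinds for this sketch. */
enum totem_msg_kind {
    TOTEM_MSG_ORF_TOKEN,
    TOTEM_MSG_MCAST,
    TOTEM_MSG_MEMB_JOIN
};

/* During a flush, only memb_join messages are discarded. */
bool keep_message_during_flush(enum totem_msg_kind kind)
{
    return kind != TOTEM_MSG_MEMB_JOIN;
}

/* Counts how many of the queued messages survive a flush. */
size_t count_kept_during_flush(const enum totem_msg_kind *kinds, size_t count)
{
    size_t kept = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (keep_message_during_flush(kinds[i])) {
            kept++;
        }
    }
    return kept;
}
```

The point of the sketch is that ORF tokens pass through the filter, so a flush by itself no longer starves the ring of tokens.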

Signed-off-by: Jerome FLESCH <jerome.flesch@netasq.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-06-11 10:59:30 +02:00
Jan Friesse
2766e57ce5 Store fdata with timestamp and pid in name
This should allow easier handling of various blackbox dumps. The original
fdata name is now a symlink to the latest created dump.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-06-05 12:19:42 +02:00
Jan Friesse
7ce332a713 totemudpu: Bind sending sockets to bindto address
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-05-31 09:28:52 +02:00
Fabio M. Di Nitto
f008cf442c rename mainconfig to logconfig
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-05-29 09:36:00 +02:00
Fabio M. Di Nitto
b283ef8f12 mainconfig: allow mainconfig logic to be used both internally and externally
corosync logging configuration logic is rather complex and in order
to make it simpler to reuse (at least within corosync/ tree)
we need to be able to use both icmap and cmap.

the patch might seem controversial, but it removes heaps of code
from qdevices (coming next).

It might be useful to consider moving this to a common shared library
but there aren't enough users yet and a shared lib would force
corosync to link with cmap (which we want to avoid at all costs)

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-05-29 09:04:03 +02:00
Angus Salkeld
5831136c87 LOG: make sure the log target is enabled.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-05-29 14:02:42 +10:00
Angus Salkeld
e6b35bdb7a LOG: handle closing unused logfiles better
This fixes a bug where having a second log file will close
the previous one.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-05-29 14:02:42 +10:00
Angus Salkeld
e6afc761fe LOG: be more explicit about the qb file names
else we can get messages being put in the wrong subsys.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-05-29 14:02:42 +10:00
Jan Friesse
2894f33c4f totemip: Support bind to exact address
Logic for binding now works in following way:
- Try to find exact match
- If no exact match is found, use first found network address

This allows setting a specific IP even if the network settings contain two IPs on the
same network.
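
A minimal C sketch of this two-pass lookup, assuming IPv4 addresses in host byte order. The struct and function names are hypothetical; the real totemip code works on its own address type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface entry: IPv4 address and netmask, host byte order. */
struct iface_addr {
    uint32_t addr;
    uint32_t netmask;
};

/*
 * Returns the index of the interface to bind to, or -1 if none matches.
 * Pass 1 looks for an exact address match; pass 2 falls back to the
 * first interface on the same network as bindnet.
 */
int select_bind_iface(const struct iface_addr *ifaces, size_t count,
                      uint32_t bindnet)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (ifaces[i].addr == bindnet) {
            return (int)i;
        }
    }
    for (i = 0; i < count; i++) {
        if ((ifaces[i].addr & ifaces[i].netmask) ==
            (bindnet & ifaces[i].netmask)) {
            return (int)i;
        }
    }
    return -1;
}
```

With 192.168.1.10/24 and 192.168.1.20/24 configured on the host, asking for 192.168.1.20 selects the second entry, while asking for 192.168.1.0 falls back to the first.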

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-05-24 14:01:12 +02:00
Jan Friesse
aaa575e091 totemip: insert items in correct order
list_add_tail is used instead of list_add so ip addresses are inserted
in same order as returned by getifaddrs.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-05-24 14:01:08 +02:00
Jan Friesse
0791f44c41 Include ringid in processor joined log message
This should help correlate syslog entries with their blackbox
counterparts.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Andrew Beekhof <andrew@beekhof.net>
2012-05-17 14:58:04 +02:00
Fabio M. Di Nitto
f2444effd0 icmap: don't leak memory when changing ro/rw status on a key
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto
1dcb2d43d9 icmap: fix valgrind errors (pass 1)
clean up a lot of allocated blocks at exit.
those changes have no runtime effects, but they make valgrind
output a bit more useful by removing over 700 errors/warnings that had to be
skipped over on every single run.

there are still a few icmap related valgrind errors but those need
some more complex and time-consuming investigation.

pre patch:

==21844== HEAP SUMMARY:
==21844==     in use at exit: 1,229,321 bytes in 1,516 blocks
==21844==   total heap usage: 7,191 allocs, 5,675 frees, 3,819,853 bytes allocated

==21844== LEAK SUMMARY:
==21844==    definitely lost: 3,617 bytes in 11 blocks
==21844==    indirectly lost: 21,960 bytes in 11 blocks
==21844==      possibly lost: 1,080,101 bytes in 131 blocks
==21844==    still reachable: 123,643 bytes in 1,363 blocks
==21844==         suppressed: 0 bytes in 0 blocks

==21844== ERROR SUMMARY: 136 errors from 136 contexts (suppressed: 0 from 0)

post patch:

==25793== HEAP SUMMARY:
==25793==     in use at exit: 1,185,870 bytes in 808 blocks
==25793==   total heap usage: 9,427 allocs, 8,619 frees, 4,156,841 bytes allocated

==25793== LEAK SUMMARY:
==25793==    definitely lost: 3,697 bytes in 12 blocks
==25793==    indirectly lost: 22,248 bytes in 13 blocks
==25793==      possibly lost: 1,079,655 bytes in 113 blocks
==25793==    still reachable: 80,270 bytes in 670 blocks
==25793==         suppressed: 0 bytes in 0 blocks

==25793== ERROR SUMMARY: 119 errors from 119 contexts (suppressed: 0 from 0)

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto
d2872aec70 crypto init: release *_slot resource after init
Those are only used at init phase and we can free some memory for the system.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-04-20 10:57:16 +02:00
Fabio M. Di Nitto
b34c1e2870 ipcs: allow connections only after all services are ready
this fixes a rather annoying race condition at startup where a client
connects to corosync "too fast" before the service is ready to operate
and client gets some random data during initialization phase.

With this fix, we allow connections to ipc only after the main engine
is operational and configured (and after the first totem transition).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-04-16 13:39:03 +02:00
Jan Friesse
f89d7b715f Always allocate totemrrp stats array
This prevents segfault when rrp mode is set with only one ring.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-04-10 09:08:42 +02:00
Jan Friesse
92ead6106f Properly parse uidgid files
Full path to key is now tested rather than key name only.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-04-10 09:08:36 +02:00
Fabio M. Di Nitto
cde4468581 totemcrypt: fix build warning (unused variable)
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-27 12:06:46 +02:00
Fabio M. Di Nitto
4378915a33 totemcrypto: major code cleanup (no functional or onwire changes)
- cleanup include list
- reorder code and functions (crypto then hash)
- split crypt/decrypt/hash functions
- some micro optimizations by dropping a few memcpy
- make the code more readable (better var names and buffers mapping)
- improve exit paths on error (return codes and free)
- store crypto header size instead of recalculating it per packet

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-27 11:43:07 +02:00
Jan Friesse
e925f42165 Make ifaces_get work with dynamic no_rings
The commit which added the number of addresses to the srp_address structure didn't
account for totemsrp_ifaces_get, where the whole structure was copied instead
of addresses only. This is now fixed.

Also, to make the totempg API forward compatible, the size of the interfaces array
must be passed to ifaces_get-like functions to prevent memory overwrite.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-26 11:54:26 +02:00
Jan Friesse
124ff4339c Add no_addrs field in srp_addr structure
This should allow us a future change to a dynamic number of rings without
breaking wire compatibility.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-22 14:03:38 +01:00
Jan Friesse
7a0a39b949 Mark few more icmap keys as read only
Also most of the key settings are now centralized in one function, so
it's easier to audit.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-16 09:37:25 +01:00
Jan Friesse
e57b5b9e6d crypto: Remove sha224 and add md5 hash
SHA224 is not supported on RHEL6 and also it's kind of weird. Instead of
that, md5 can now be configured.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-15 17:36:56 +01:00
Jan Friesse
3b7c2f0588 Update crypto_set API
Also a few leftovers from cfg are removed and the version of totempg is
increased to 5 to reflect all changes we made

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-15 17:33:53 +01:00
Fabio M. Di Nitto
c75153feb4 crypto: allocate padding in crypto_header
while it might seem a waste of space by using 2 extra bytes in
the crypto_config_header, it actually gives us the option
to grow "unknown at this time" features without hopefully
breaking onwire compat

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-15 12:55:11 +01:00
Fabio M. Di Nitto
4a2d503643 crypto: add new hashing methods and fix config defaults
add support for sha224/256/384/512

change config defaults to match coroparse and totemconfig

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-15 10:55:32 +01:00
Fabio M. Di Nitto
737de4dbd4 crypto: change network packets and add dynamic crypto header/data
The new network packet will look like:

struct crypto_config_header * that provides info on crypto/hashing
hash_block[size based on hashing function] (if hash is selected)
salt[SALT_SIZE] (if crypto is selected)
...data...

and we kill the concept of crypto_security_header completely since
values are now dynamic for hash_block_size.

the reason why hash_block needs to be there, is because we do
hash salt in case both hashing and crypto are selected.

the crypto_config_header is totally transparent to totem
and to any underlying crypto functions.

as we go cleaning, also use HASH_BLOCK_SIZE to generate hash_block.
the input buffer and output buffer size are dependent on the algo
used to hash.

we can now determine the real header size and adjust net_mtu properly
at startup. This will allow in future to use any algorithm since
size is dynamic.

some parts of the code still need some polishing to make them more
readable (especially the mapping of pointers into the packet
is still a bit obscure).
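
A C sketch of how the header size could follow from this layout. It is illustrative only: the struct fields, the enum, the function names, and the SALT_SIZE value of 16 are assumptions, and only the digest lengths (MD5 16, SHA-1 20, SHA-256 32, SHA-384 48, SHA-512 64 bytes) are standard facts.

```c
#include <stddef.h>
#include <stdint.h>

#define SALT_SIZE 16  /* assumed value for this sketch */

/* Hypothetical header: one byte each for the cipher and hash selection. */
struct crypto_config_header {
    uint8_t cipher_type;
    uint8_t hash_type;
};

enum pkt_hash {
    PKT_HASH_NONE, PKT_HASH_MD5, PKT_HASH_SHA1,
    PKT_HASH_SHA256, PKT_HASH_SHA384, PKT_HASH_SHA512
};

/* Digest length in bytes for each hash; 0 when hashing is disabled. */
size_t hash_block_size(enum pkt_hash hash)
{
    switch (hash) {
    case PKT_HASH_MD5:    return 16;
    case PKT_HASH_SHA1:   return 20;
    case PKT_HASH_SHA256: return 32;
    case PKT_HASH_SHA384: return 48;
    case PKT_HASH_SHA512: return 64;
    default:              return 0;
    }
}

/* config header + hash_block (if hashing) + salt (if encrypting) */
size_t crypto_header_size(enum pkt_hash hash, int cipher_enabled)
{
    size_t size = sizeof(struct crypto_config_header);

    size += hash_block_size(hash);
    if (cipher_enabled) {
        size += SALT_SIZE;
    }
    return size;
}

/* Bytes left for data once the crypto header is taken out of net_mtu. */
size_t crypto_payload_size(size_t net_mtu, enum pkt_hash hash,
                           int cipher_enabled)
{
    size_t header = crypto_header_size(hash, cipher_enabled);

    return net_mtu > header ? net_mtu - header : 0;
}
```

Because the size is computed from the selected algorithm rather than fixed at compile time, adding a new hash only needs a new entry in the size lookup. Any other protocol overhead that the real code subtracts from net_mtu is ignored here.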

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-14 15:57:01 +01:00
Fabio M. Di Nitto
c3f7d0ef3e totem: don't send garbage onwire if we fail to crypt
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-14 15:30:40 +01:00
Fabio M. Di Nitto
452800c958 crypto: add crypto config to network data
this adds 2 bytes at the end of each packet to propagate
config info.

in case there is a config mismatch, the packet must be rejected.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-14 12:32:10 +01:00
Fabio M. Di Nitto
0a6a6bbcfa crypto: drop secauth and make crypto none work again
keep totem.secauth config key for compatibility

if the key is NOT set, crypto will default to aes256/sha1
if the key is set to "off", crypto is disabled.
this reflects pretty much old behavior

keywords totem.crypto_cipher and totem.crypto_hash can
override secauth individually.
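
The precedence described above can be sketched in C as follows. The function and struct names are hypothetical, a NULL argument stands for a key that is not set, and treating any secauth value other than "off" like the unset case is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct crypto_choice {
    const char *cipher;
    const char *hash;
};

/*
 * secauth, crypto_cipher and crypto_hash are the values of the
 * totem.secauth, totem.crypto_cipher and totem.crypto_hash keys,
 * or NULL when the key is not set.
 */
struct crypto_choice resolve_crypto(const char *secauth,
                                    const char *crypto_cipher,
                                    const char *crypto_hash)
{
    /* Key not set: default to aes256/sha1. */
    struct crypto_choice choice = { "aes256", "sha1" };

    /* secauth "off" disables crypto. */
    if (secauth != NULL && strcmp(secauth, "off") == 0) {
        choice.cipher = "none";
        choice.hash = "none";
    }
    /* The new keywords override secauth individually. */
    if (crypto_cipher != NULL) {
        choice.cipher = crypto_cipher;
    }
    if (crypto_hash != NULL) {
        choice.hash = crypto_hash;
    }
    return choice;
}
```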

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-14 11:28:36 +01:00
Jan Friesse
ab1675f0fe Parse and use hash and crypto from config file
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 17:38:59 +01:00
Jan Friesse
cb97ed186a Rename totemcrypto
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 17:38:46 +01:00
Fabio M. Di Nitto
55e8476697 crypto: mask the crypto operations from totem packet size management
totem doesn't need to understand what crypto does.

totem needs to be able to tell crypto: "those are data, play with them"
and crypto needs to return: "here are your scrambled data and the new size"

similar to decrypt/verify.

this way we add enough dynamic within crypto to change header size and all
at any given time (for different hash algorithm for example) without
affecting on wire compat.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-13 15:50:58 +01:00
Jan Friesse
42a2f69e6f onecrypt: move encryption code to crypto.c
This will remove code duplication.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 12:23:13 +01:00
Jan Friesse
b5f7dcefeb cfg: remove crypto_set
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 12:23:10 +01:00
Jan Friesse
8cdd2fc493 Remove libtomcrypt
Tomcrypt in corosync has not been updated for a long time. Because we have
support for libnss, libtomcrypt can be removed.

Also a few leftovers (AES is 256 bits, not 128, ...) are removed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-13 09:19:47 +01:00
Fabio M. Di Nitto
20a5289074 drop evs service
there are several reasons for this:

1) evs is only partially implemented with no plans to complete it

typedef enum {
       EVS_TYPE_UNORDERED, /* not implemented */
       EVS_TYPE_FIFO,          /* same as agreed */
       EVS_TYPE_AGREED,
       EVS_TYPE_SAFE           /* not implemented */
} evs_guarantee_t;

2) evs has no users in any upstream distribution and no search
   engine can find any other upstream using it.

3) the only reason (I was told) to carry around evs was that evs
   receives the full ring_id struct from totem. This is only
   partially correct because while the structures are prepared
   to carry around those data, they are never transmitted from
   corosync engine down the IPC line to the user.
   CPG ring_id contains the exact same information and it's
   actually less buggy (due to prototyping of the info).

in the worst case scenario, where a user really absolutely needs libevs,
it can be easily reimplemented as a libcpg wrapper, avoiding
lots of code duplication.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 15:51:50 +01:00
Fabio M. Di Nitto
c00502a70a build: drop another leftover from the past
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:13:04 +01:00
Fabio M. Di Nitto
fd79118110 build: drop last LCRSO references
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:12:20 +01:00
Fabio M. Di Nitto
eb3d49ef7d pload: make it a test service and not a public one
pload is a performance benchmark that measures the onwire
speed of corosync.

problem is that once pload has been executed, the cluster
is basically dead.

turn pload into a test tool, by removing corosync-pload tool
and user library.

cleanup pload code to make it more readable and drop lots
of unnecessary stuff.

add test/ploadstart tool that can configure and start pload
via cmap calls.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:11:51 +01:00
Fabio M. Di Nitto
142ce8c3a1 totem: drop crypt_accept: concept/option
this was another old onwire compat mode that is not useful any longer.

we can safely move to the new model by default.

According to Honza (real hardware 1 node testing) there is no
performance impact.

In my tests (8 nodes VM cluster), there is up to 10/12% performance
improvement up to 1M packet size, where old and new models are equal.

As a side note, nss still shows to be a performance loss on both
real and virtual hw (without any kind of nss hw acceleration).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-10 07:08:30 +01:00
Angus Salkeld
03b32d7fad Fix typo in stats key name.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-09 21:54:51 +11:00
Angus Salkeld
41b4416bd4 Remove unused function logsys_priority_name_get()
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-09 21:54:51 +11:00
Angus Salkeld
f628ccba8b Add pid, hostname and process name to the logfile
Note this is only for file targets not stderr or syslog.

https://bugzilla.redhat.com/show_bug.cgi?id=789925

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-09 21:54:51 +11:00
Fabio M. Di Nitto
e0e27e3d12 utils: cleanup main daemon exit codes
some of them are not in use anymore and can be dropped.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-09 11:15:44 +01:00
Fabio M. Di Nitto
8f6e5ff530 sync: kill evil and syncv1 in one shot
this change breaks onwire compatibility.

cpg is the only user of sync_* interface and it's the only
service that will require extra testing.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-09 11:15:08 +01:00
Fabio M. Di Nitto
64fd946086 votequorum: move last malloc/alloca buf to static
this should guarantee that votequorum won't fail under high memory
pressure. Price is 3500 bytes extra preallocated at startup.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto
90c602902c votequorum: fix node allocation memory leak
stop using malloc for each new node, because we cannot free the memory
easily. Move to a static allocated buffer that can contain
PROCESSOR_MAX + qdevice cluster_node instead.
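
A C sketch of such a statically allocated pool. The PROCESSOR_MAX value, the struct fields, and the `node_get`/`node_put` helpers are assumptions for illustration; the commit does not say how slots are looked up or reused.

```c
#include <stddef.h>
#include <stdint.h>

#define PROCESSOR_MAX 384                   /* assumed value for this sketch */
#define NODE_POOL_SIZE (PROCESSOR_MAX + 1)  /* every node plus one qdevice */

/* Hypothetical, much reduced node record. */
struct cluster_node {
    uint32_t nodeid;
    int in_use;
};

/* Preallocated once; no malloc per node, so nothing can leak. */
static struct cluster_node node_pool[NODE_POOL_SIZE];

/*
 * Returns the slot already holding nodeid, or claims the first free
 * slot for it. Returns NULL only if the pool is full.
 */
struct cluster_node *node_get(uint32_t nodeid)
{
    struct cluster_node *free_slot = NULL;
    size_t i;

    for (i = 0; i < NODE_POOL_SIZE; i++) {
        if (node_pool[i].in_use && node_pool[i].nodeid == nodeid) {
            return &node_pool[i];
        }
        if (!node_pool[i].in_use && free_slot == NULL) {
            free_slot = &node_pool[i];
        }
    }
    if (free_slot != NULL) {
        free_slot->nodeid = nodeid;
        free_slot->in_use = 1;
    }
    return free_slot;
}

/* Marks a slot as reusable; the memory itself is never freed. */
void node_put(struct cluster_node *node)
{
    node->in_use = 0;
}
```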

We can never have more than PROCESSOR_MAX nodes anyway and the memory
footprint is small enough compared to memory leaks (those can
effectively happen only in very dynamic clusters with tons of different
nodes joining/leaving with different nodeids).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto
2d7a8ab29a votequorum: rename leave_remove to allow_downscale
pointed out that leave_remove can be easily confused with the old
cman leave_remove behavior. The two are substantially different
and we need to avoid confusion both for users and our support team.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
75b3dc0f4e votequorum: fix handling of config updates
cmap changes are local to the node only and should not be broadcast
as configuration changes.

if any change has happened to us, we will inform other nodes via
send_nodeinfo.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
861d2c90ef votequorum: free our data and lists on exit
this is mostly to avoid valgrind errors on exit and make the output
more readable.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
edf0728323 votequorum: disallow special features vs qdevice
simply taking the safest path here since integration of qdevice is not
fully complete

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
33ea03f426 votequorum: fix node check based on reconfig parameter
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
43e08bb143 votequorum: make a common function to calculate votes and cluster members
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
692fd72468 votequorum: incorporate static config into dynamic
no functional changes or extra features yet

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
f960d0a342 votequorum: move all configuration in votequorum_readconfig
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
3a717fc8e9 votequorum: start moving from static to fully dynamic config
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
f12bfc5ad8 votequorum: disallow wait_for_all and qdevice operations
The problem here is that user expectations, when using both modes
at the same time, have not been set yet. There are 2/3 options
that need investigation.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
4a93ff267f votequorum: improve debugging output
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Jan Friesse
25381738c2 Always set interface_up in totemip_iface_check
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-02 09:41:36 +01:00
Fabio M. Di Nitto
e34f095551 votequorum: fix node->flags type when receiving nodeinfo messages
old_flags was set to uint16_t but it needs to be uint32_t.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto
6a9e4760da votequorum: fix segfault in wfa status update
this is a regression introduced by cb5fd775

when reading static config us->flags does not exist yet and therefore
setting it will cause a segfault.

Move the settings after cluster_node *us is created, with the long
term plan to simply kill the whole _static readconfig bits
in favour of dynamic (runtime changeable) bits.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto
cb5fd77501 votequorum: major rework to fix qdevice API and integration with core
qdevice is a very special node in the cluster and it adds a certain
amount of complexity and special cases across the code.

most of the qdevice data are shared across the cluster (name/votes)
but effectively each node has a different view of the qdevice
(registered/unregistered/voting/etc.)

with this change, we align the qdevice view across the nodes,
exchanging more data between nodes, and we fix how qdevice behaves
and how it is configured.

The only side effect is that the amount of data transmitted on wire
is slightly higher.

The qdevice API is still disabled by default. This means that
the amount of real change in current code is a lot smaller
than this patch makes it appear.

TODO: documentation/man pages need to be updated once
      this change is in (and behavior finalized).

User visible changes:

- configuration (coroparse, exec/votequorum):
  the quorum device section is now standalone within the quorum.

  quorum {
    provider: corosync_votequorum
    device {
      model: (name)
      timeout: (millisec)
      votes:
    }
  }

  the keyword "model:" is mandatory to enable qdevice in configuration
  and should express the name of the script/daemon that will provide
  the qdevice. Looking into the future, an init script or systemd
  service will look for that name in /path/to/be/decided/name
  and start/stop qdevice.

  timeout: defines the maximum interval the qdevice implementation
  has available between polls (see votequorum_qdevice_poll.3) before
  the device is considered dead and its votes are discarded

  votes: is now a configuration parameter and not an API call.
  quorum devices don't care what they need to vote.
  votes is autocalculated when a nodelist is available and all
  nodes in the list vote 1. Otherwise this parameter is mandatory.

- configuration (exec/votequorum):
  startup and runtime configuration changes have been improved.
  errors at startup are considered fatal. errors at runtime
  have different exit paths.

  startup:

  * quorum.two_node and qdevice are incompatible.
  * quorum.expected_votes requires quorum.device.votes.
  * quorum.expected_votes - quorum.device.votes cannot be lower
    than 2.
  * qdevice and last_man_standing are mutually exclusive.
  * qdevice and auto_tie_breaker are mutually exclusive.

  runtime config changes:

  * quorum.two_node and qdevice are incompatible:
    if quorum device is alive, two_node is disabled.
    if quorum device is not alive and node count is 2, two_node is
       enabled, and quorum device cannot be registered

  * if either last_man_standing or auto_tie_breaker were enabled
    at startup, and at runtime quorum device is configured,
    quorum device registration will be blocked.

  * if quorum.expected_votes is configured but not quorum.device.votes,
    quorum device registration will be blocked.

  * if quorum.device.votes is not configured and we cannot
    automatically calculate it, quorum device registration will be blocked.

  * An error in configuring quorum.expected_votes and quorum.device.votes
    will block quorum device registration.

blocking quorum device registration also means dropping the votes.

quorum.device.votes (either set or automatically calculated) is now
used to determine current expected_votes in the cluster.

- logging (exec/votequorum):

  all errors from configuration are treated as WARNING/CRITICAL.

  lots of extra DEBUG output is added (see internal changes too).

- corosync-quorumtool (tools/corosync-quorumtool):

  * added option to forcefully kick out a quorum device from the local
    node. This is for emergency recovery only and it is only
    available when qdevice API is built-in.

  * Improved status output, specifically add node state and qdevice
    information

[root@fedora-master-node2 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          132
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)

  * allow to print status for any node in the cluster known to
    local node.

[root@fedora-master-node1 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net

[root@fedora-master-node1 coro]# corosync-quorumtool -s -n 2
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)

Internal changes:

- change qdevice timer to not run all time, but only when necessary.
- change votequorum_nodeinfo on wire data to use flags instead of uint8_t
  and add QDEVICE status.
- allocate nodeid 0 to qdevice since it's the only real
  nodeid that can be reserved.
- change send_nodeinfo to allow to send nodeinfo for any node
  so that we can share qdevice info across the cluster
  (and this might be useful in future if we need to sync
   internal cluster view).
- add votequorum api call to update qdevice name
- add runtime data if quorum device has been forcefully disabled
  by config error
- add qdevice votes to expected_votes calculation (this
  is probably the biggest difference vs cman)
- change votequorum_read_nodelist_configuration so that
  we can autocalculate votes for qdevice (we need the nodecount
  vs votes).
- add all checks for startup/runtime config (see above).
- do not make qdevice part of the membership_list received from
  totem. None of our users care about it and it is not a real node.
- change onwire message handlers to deal with "data for this node from any node"
  case and understand nodeid 0 for qdevice info
- always allocate qdevice at startup. this simplifies code a lot.
- dispatch qdevice nodeinfo on membership changes.
- inform libvotequorum users when a qdevice is registered
- improve substantially qdevice api and add a simple
  barrier based on qdevice name.
- add qdevice API barrier at cluster level. This feature allows
  only one qdevice name to be active in the cluster at any time.
- qdevice getinfo can now report status for qdevice on any node.
- change slightly the way the qdevice API is built-in/out:
  only the libvotequorum calls are #ifdef'ed out now. Doing so in
  the core is too complex and would make the code unreadable
  with the risk of missing a bit or two effectively introducing
  an on-wire incompatibility if we will ever turn the API on.
- probably added some bugs on the way...

TODO: update qdevice_* API once the above is settled and test
      qdevice integration with other features.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com> (only second part)
2012-02-27 09:30:26 +01:00
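The startup checks listed in cb5fd77501 can be read as a small validation pass over the quorum section. Below is a minimal sketch in Python (corosync itself is C, and this is not its code). The function name, the dict layout, and the error strings are hypothetical; only the five rules come from the commit message.

```python
def check_qdevice_startup(quorum):
    """Return a list of error strings for a quorum config.

    `quorum` is a hypothetical dict view of the quorum {} section, e.g.
    {"expected_votes": 3, "device": {"model": "name", "votes": 1}}.
    """
    errors = []
    device = quorum.get("device")
    # The "model" keyword is mandatory to enable qdevice at all.
    if not device or "model" not in device:
        return errors
    if quorum.get("two_node"):
        errors.append("quorum.two_node and qdevice are incompatible")
    if quorum.get("last_man_standing"):
        errors.append("qdevice and last_man_standing are mutually exclusive")
    if quorum.get("auto_tie_breaker"):
        errors.append("qdevice and auto_tie_breaker are mutually exclusive")
    expected = quorum.get("expected_votes")
    votes = device.get("votes")
    if expected is not None:
        if votes is None:
            errors.append("quorum.expected_votes requires quorum.device.votes")
        elif expected - votes < 2:
            errors.append("expected_votes - device.votes cannot be lower than 2")
    return errors
```

Per the commit message, any error found at startup would be fatal, while the runtime equivalents block quorum device registration instead.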
Jan Friesse
c30c088597 Tweak nodeid warning
Nodeid warning now appears only when both totem.nodeid and nodelist
nodeid exists. When nodelist nodeid is not defined, totem.nodeid is
used.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-21 16:33:56 +01:00
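The nodeid rule from c30c088597 can be sketched as a two-input decision. This is an illustration in Python, not corosync code. The function name is hypothetical, and the precedence of the nodelist nodeid when both values are set is an assumption; the commit message only states when the warning appears and what happens when the nodelist nodeid is missing.

```python
def resolve_nodeid(totem_nodeid, nodelist_nodeid):
    """Return (nodeid, warn) for the two possible nodeid sources.

    Either argument may be None when the key is not configured.
    """
    if nodelist_nodeid is None:
        # No nodelist nodeid: totem.nodeid is used and no warning is shown.
        return totem_nodeid, False
    # Assumption: the nodelist nodeid wins when both are configured.
    # The warning appears only when both values exist.
    return nodelist_nodeid, totem_nodeid is not None
```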
Jan Friesse
04720649ba iba: Use configured node id
Corosync was ignoring nodeid for iba transport and always used
autogenerated one.

Original patch by: Jason Dillaman <jdillama@redhat.com>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-21 16:27:16 +01:00
Angus Salkeld
40727bd6a3 Convert the common lib into a shared lib.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-21 20:26:08 +11:00
Jan Friesse
88ae75d6c2 Allow autoconfiguration of interface section
Thanks to totemip_getifaddrs infrastructure it's now possible to use
nodelist information to autoconfigure interface bindnetaddr. Together
with cluster_name, interface section can be completely omitted.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
ba13537471 totemconfig: ensure suffix for ringX_addr
The patch makes sure that the ringX_addr key really has the _addr suffix.
Previously, it was possible to enter ringXanything and it was
interpreted as ringX_addr.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
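The stricter key check from ba13537471 amounts to matching the whole key rather than just its ringX prefix. Below is a small sketch in Python for illustration (the real check lives in corosync's C totemconfig code). The names RING_ADDR_KEY and ring_number are hypothetical.

```python
import re

# Accept only keys of the exact form ring<number>_addr.
RING_ADDR_KEY = re.compile(r"ring([0-9]+)_addr")


def ring_number(key):
    """Return the ring number for a valid ringX_addr key, else None."""
    match = RING_ADDR_KEY.fullmatch(key)
    return int(match.group(1)) if match else None
```

With a prefix-only match, a key such as ring1anything would be read as ring 1; fullmatch rejects it.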
Jan Friesse
8cde53aa99 cmap: Handle NULL in [i]cmap_set_string value
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
88ddaecfe9 Create solaris specific getifaddrs
This not only makes it possible to use the generic totemip_iface_check, but
also fixes some problems with the previous implementation (fixed mask,
poorly supported ipv6, ...)

Tested on OpenIndiana 151a

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
fd47fddcaf Add totemip_iface_check based on totemip_getifaddrs
Also Linux and BSD/Darwin specific bits are no longer needed, so they
are gone.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
27e9988486 Add generic implementation of getifaddrs
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:56 +01:00
Angus Salkeld
023c4fa0cc Move hdb_error_to_cs to corotypes.h
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-14 11:10:14 +11:00
Steven Dake
415ef892ad Remove empty testquorum.c file
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-02-13 17:05:04 -07:00
Steven Dake
2ad0cdc832 Update copyright header dates in exec directory
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-02-13 17:05:04 -07:00
Steven Dake
4ee9550f80 Remove jhash.h since it is not used
We would use libqb for hashing now if we needed hashing.
cpg no longer uses jhash.h.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-02-13 17:05:04 -07:00
Steven Dake
815375411e Remove unused or unimplemented CFG apis
Remove:
cfg_statetrack
cfg_statetrackstop
cfg_administrativestateset
cfg_administrativestateget
cfg_serviceload
cfg_serviceunload

Rev SO to 5.0.0

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-02-13 17:04:49 -07:00
Fabio M. Di Nitto
8840113704 votequorum: fix variable init
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 16:49:25 +01:00
Fabio M. Di Nitto
e3ba920307 votequorum: fix possible memory corruption
nodeid = 0 is a valid nodeid and the node associated with it should
not be freed

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 16:49:25 +01:00
Fabio M. Di Nitto
939a7b2d66 quorum: don't leak memory on error
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 16:49:25 +01:00
Angus Salkeld
6cd576b0f5 move hdb_error_to_cs to common_lib
Note the previous inconsistent implementation.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 10:45:56 +11:00
Angus Salkeld
da483b8121 Add a common library that can be shared between libs and corosync
We have always had this problem and worked around it by copying code
or using inline functions. Both not good IMO.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 10:45:56 +11:00
Steven Dake
7592e3b61e Remove include/engine/quorum and integrate it into exec/engine.h
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-08 08:31:10 -07:00
Steven Dake
01c63ca17c Free state variable allocated in wd_resource_state_is_ok
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-07 08:42:58 -07:00
Steven Dake
c05cbb65bc Remove leaked resource error from wd_resource_state_is_ok
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-07 08:42:58 -07:00
Steven Dake
190dba3933 Remove use after free and free of uninit value in mainconfig error path
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-07 08:42:58 -07:00
Steven Dake
46a2b1a297 Remove use after free in corosync_main_config_set in error path
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-07 08:42:58 -07:00
Fabio M. Di Nitto
cff57430d6 votequorum: fix quorum_ringid setting before any delivery occurs
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-02-07 14:07:09 +01:00
Angus Salkeld
8992acb815 LOG: add libqb as a "subsys"
So we can see libqb internal logs

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-07 10:53:56 +11:00
Jan Friesse
546aea23cf cmap: Check RO flag in adjust int function
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-06 16:37:00 +01:00
Jiaju Zhang
dd9e177af7 CPG: Send CPG_REASON_PROCDOWN when really needed
This patch fixes the issue that in some cases where cpg_finalize()
was called just after cpg_leave() was called, CPG_REASON_PROCDOWN
might also be sent while CPG_REASON_LEAVE had already been sent.
This behavior is not aligned with what the man page has described:
"CPG_REASON_PROCDOWN - the process left a group without calling
cpg_leave()."
And it will confuse CPG's clients, because one process leaving results
in two different reasons being sent.

The root cause of this issue is that cpg_leave() returns after
adding the LEAVE message to the sending queue, while the cpg's group
name has not been cleared yet. If cpg_finalize() is called at that
point, it decides whether cpg_leave() has already been called only
by checking the cpg's group name, and that check is not sufficient.

Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-06 08:07:54 -07:00
Fabio M. Di Nitto
3b77dd9d83 votequorum: fix expected votes manual override from quorumtools
votequorum internal quorum/expected_vote check was slightly too
conservative and was not done correctly when leave_remove feature
is enabled.

this fix allows admins to effectively override expected_votes
and drive ev_barrier as expected.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-02-03 10:33:33 +01:00
Jan Friesse
0929dcb68c Better checks of integer values in coroparse
Instead of atoi, strtol is used. This allows detection of typical
problems like empty value of key and incorrectly entered numbers.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-03 09:16:43 +01:00
Fabio M. Di Nitto
230231fedb votequorum: add runtime internal data to icmap runtime.votequorum.*
specifically ev_barrier, two_node, lowest_node_id and wait_for_all_status
are values that change internally at runtime and keeping track
of those can make debugging rather easy, especially when LOG_DEBUG is not
set.

Also track our node id.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-By: Christine Caulfield <ccaulfie@redhat.com>
2012-02-02 16:36:57 +01:00
Jan Friesse
33e5ce8d56 Show correct error when open of logfile failed
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-02 09:30:49 +01:00
Jan Friesse
a80febda7e Store error str if can't open logfile
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-02 09:30:49 +01:00
Angus Salkeld
af9cfc7b55 IPC: reference count the connection whilst flushing the outq
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-02 11:34:26 +11:00
Angus Salkeld
45cb05f1ad IPC: allow for failures in the connection_created callback
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-01 08:51:13 +11:00
Fabio M. Di Nitto
46b7b155a4 votequorum: add leave_remove option
this also cleans up NODESTATE for good. JOINING was never used

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-31 16:58:08 +01:00
Fabio M. Di Nitto
c16086bead votequorum: honor onwire node flags change
internal flags were not propagated correctly in the node status

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-31 10:20:32 +01:00
Fabio M. Di Nitto
9fa83dabbe quorum: fix load/unload priority for quorum services
all main services are loaded at priority 1.
vsf_quorum and votequorum did not specify a priority and
automatically defaulted to 0, which has the special meaning
of being loaded last and unloaded last.

this is not correct behavior and limits what votequorum
can do at shutdown, for example notify other nodes that
it is leaving (something that cannot be gathered by
totem membership change callback).

fix vsf_quorum to load at priority 1 as the other
default services and bump votequorum to 2 (needs to
unload before everything else currently known).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-31 10:16:52 +01:00
Fabio M. Di Nitto
a2b960d109 service: fix service unload regression introduced by lcrso dropping
service exec_exit_fn was not honored because the loop was looking
into the wrong icmap key

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-31 10:16:16 +01:00
Fabio M. Di Nitto
fc61b20a8a votequorum: drop unnecessary flags
code inspection shows that those internal flags are never used

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-31 10:14:19 +01:00
Steven Dake
007e5c9458 Honor exec_init_fn call
exec_init_fn now either returns NULL (success) or a string which indicates
the error that occurred during service engine initialization.  If an error
occurs, corosync will exit.  This patch adds ykd and makes other suggestions
from Fabio Di Nitto.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-01-30 14:05:09 -07:00
Fabio M. Di Nitto
ccd36af00e votequorum: rename qdisk to qdevice
a quorum device is not necessarily a disk and this also aligns
various names to be generic

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-By: Christine Caulfield <ccaulfie@redhat.com>
2012-01-27 11:17:02 +01:00
Fabio M. Di Nitto
769fc913f3 quorum: drop quorum.quorate config option
it's unused / unnecessary

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-By: Christine Caulfield <ccaulfie@redhat.com>
2012-01-27 11:16:36 +01:00
Fabio M. Di Nitto
b05477859f votequorum: fix expected_votes propagation
it is not correct to randomly accept expected_votes from any node in
the cluster. We can only allow expected_votes from quorate nodes.

A quorate cluster is "always" right and has the correct expected_votes.

One of the different bug triggers:

quorum {
  expected_votes: 8
  auto_tie_breaker: 1
  last_man_standing: 1
}

start all 8 nodes.
clean shut down 2 nodes.
wait for lms to kick in.
kill 3 nodes with highest nodeid
(we want to retain a quorate partition of 3 nodes)
start one node again -> cluster will be unquorate

This happens because the node rebooting/rejoining with
non current cluster status will propagate an expected_votes of 8,
while in reality the cluster is down to expected_votes: 3.

4 nodes are still < 5 (quorum for 8 nodes/votes).

In order to avoid this condition, we need to exchange expected_votes
information among nodes but we cannot randomly trust everybody.

1) Allow expected_votes to be changed cluster-wide only if the
   information is coming from a quorate node.
2) Fix node->expected_votes based on quorate status
3) allow a joining node to decrease quorum and expected_votes
   if the node is not yet quorate, but it's joining a quorate
   cluster

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-26 14:32:54 +01:00
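The arithmetic behind the b05477859f bug trigger can be checked with a few lines. This is a Python sketch for illustration, not corosync code. It assumes quorum is a simple majority of expected_votes, which matches the "5 (quorum for 8 nodes/votes)" figure in the commit message, and that every node has one vote. The function names are hypothetical.

```python
def quorum_for(expected_votes):
    """Votes needed for quorum, assuming a simple majority."""
    return expected_votes // 2 + 1


def is_quorate(total_votes, expected_votes):
    return total_votes >= quorum_for(expected_votes)


# 3 surviving nodes plus 1 rejoining node, one vote each.
total_votes = 4
# Rejoining node's stale expected_votes of 8 is accepted cluster-wide.
with_stale_value = is_quorate(total_votes, 8)
# The quorate partition's expected_votes of 3 is kept instead.
with_current_value = is_quorate(total_votes, 3)
```

Under these assumptions the first case is unquorate (4 votes against a quorum of 5) and the second stays quorate, which is why expected_votes is only accepted from quorate nodes.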
Fabio M. Di Nitto
88e6830df1 votequorum: fix auto_tie_breaker design and simplify code a lot
auto_tie_breaker needs to know the lowest node id in the currently
quorate partition and not of the whole cluster.

this allows us to determine the lowest node id as soon as we are quorate
and removes the complexity of reading it from WFA or nodelist. At
the same time it adds the flexibility for dynamic nodeids in a cluster.

drop requirement on WFA if nodelist is not specified

update man page

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-26 14:32:54 +01:00
Fabio M. Di Nitto
40aa40ed84 votequorum: drop NODESTATE_LEAVING
this is another leftover from cman compatibility layer

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-26 14:32:54 +01:00
Fabio M. Di Nitto
269e0c4970 votequorum: change quorum.expected_votes override behavior
as agreed on the mailing list, quorum.expected_votes should override
automatically calculated expected_votes from nodelist.

Also simplify the code to handle expected_votes. "silly defaults" is now
unnecessary because votequorum does config sanity checks upfront.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-25 14:06:27 +01:00
Fabio M. Di Nitto
efbf5282f9 votequorum: two_node should enable wait_for_all by default
This avoids fencing races at startup of a cluster.

It is still possible to override WFA by explicitly setting
wait_for_all: 0

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-01-25 07:04:24 +01:00
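The default introduced by efbf5282f9 can be sketched as a lookup with a fallback. This is a Python illustration, not corosync code. The function name and the dict view of the quorum section are hypothetical.

```python
def effective_wait_for_all(quorum):
    """Resolve wait_for_all from a hypothetical dict of the quorum {} section."""
    if "wait_for_all" in quorum:
        # An explicit setting always wins, including wait_for_all: 0.
        return bool(quorum["wait_for_all"])
    # Otherwise two_node enables wait_for_all by default.
    return bool(quorum.get("two_node", 0))
```

For example, {"two_node": 1} resolves to enabled, while {"two_node": 1, "wait_for_all": 0} keeps it disabled.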
Angus Salkeld
14fd1c927a Add debug log messages to corosync for join/leave
This is needed by cts.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Angus Salkeld
3698b78de9 LOG: make sure that debug works to syslog
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Jan Friesse
e89201b9c9 totemiba: Remove unused wthread.h include
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-01-24 16:28:55 +01:00
Fabio M. Di Nitto
78edc1f24b votequorum: add support for nodelist config bits
expected votes is now calculated automatically and quorum.expected_votes
can be used to override nodelist calculation. The highest of the two
values is used for runtime.

quorum_votes can be specified either in the node list or in quorum.votes.
The node list has priority over global.

propagate votequorum initialization errors (due to config inconsistencies)
back to vsf_quorum.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-23 11:46:34 +01:00
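The vote rules described in 78edc1f24b can be sketched as two small functions. This is a Python illustration of that commit's behavior only, not corosync code. The function names and dict layout are hypothetical, and the default of one vote per node when nothing is configured is an assumption.

```python
def node_votes(node, quorum):
    """Votes for one nodelist entry; the node list has priority over quorum.votes."""
    if "quorum_votes" in node:
        return node["quorum_votes"]
    # Assumption: a node gets 1 vote when neither value is configured.
    return quorum.get("votes", 1)


def runtime_expected_votes(nodelist, quorum):
    """Highest of the nodelist calculation and quorum.expected_votes."""
    calculated = sum(node_votes(node, quorum) for node in nodelist)
    return max(calculated, quorum.get("expected_votes", 0))
```

For a three-node list where one node sets quorum_votes: 2, the calculated value is 4; a configured quorum.expected_votes of 8 would raise the runtime value to 8, while a configured 3 would leave it at 4.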
Angus Salkeld
3131601ce2 Remove all unnecessary "\n" from log messages
These look ugly, are inconsistently done and just have
to be removed later in libqb before calling syslog.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:23 +11:00
Angus Salkeld
61c0995e1c Shorten some really long lines in main.c
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:23 +11:00
Jan Friesse
0c2e3c8408 Make local_node ring0 address read-only
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:09:37 +01:00
Jan Friesse
d6cbdd9b84 Support for dynamic nodelist udpu member change
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:35 +01:00
Jan Friesse
16007acbef Use nodeid provided in nodelist
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:35 +01:00
Jan Friesse
de70c0007c Support udpu members in nodelist
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:35 +01:00
Jan Friesse
c8a62d8b3c Add local_node_pos icmap key
Key contains local node position in nodelist

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:35 +01:00