Commit Graph

3073 Commits

Author SHA1 Message Date
Angus Salkeld
df06e98298 CTS: handle socket exceptions better
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-27 17:41:21 +11:00
Angus Salkeld
1331c43075 CTS: fix shell script variable name
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-27 17:41:21 +11:00
Fabio M. Di Nitto
5e4c02bd36 update TODO list
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-26 14:32:54 +01:00
Fabio M. Di Nitto
b05477859f votequorum: fix expected_votes propagation
it is not correct to randomly accept expected_votes from any node in
the cluster. We can only allow expected_votes from quorate nodes.

A quorate cluster is "always" right and has the correct expected_votes.

One of the different bug triggers:

quorum {
  expected_votes: 8
  auto_tie_breaker: 1
  last_man_standing: 1
}

start all 8 nodes.
clean shut down 2 nodes.
wait for lms to kick in.
kill 3 nodes with highest nodeid
(we want to retain a quorate partition of 3 nodes)
start one node again -> cluster will be unquorate

This happens because the node rebooting/rejoining with
non-current cluster status will propagate an expected_votes of 8,
while in reality the cluster is down to expected_votes: 3.

4 nodes are still < 5 (quorum for 8 nodes/votes).

In order to avoid this condition, we need to exchange expected_votes
information among nodes but we cannot randomly trust everybody.

1) Allow expected_votes to be changed cluster-wide only if the
   information is coming from a quorate node.
2) Fix node->expected_votes based on quorate status
3) allow a joining node to decrease quorum and expected_votes
   if the node is not yet quorate, but it's joining a quorate
   cluster
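
The arithmetic in the scenario above can be sketched as follows. This is an
illustration only, not the votequorum code: it assumes quorum is a simple
majority of expected_votes, and the function names are hypothetical.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for quorum, assuming a simple majority."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True when the votes present reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


def accept_expected_votes(local_ev: int, remote_ev: int,
                          remote_quorate: bool) -> int:
    """Rule 1 from the message: only trust expected_votes sent by a quorate node."""
    return remote_ev if remote_quorate else local_ev


# Scenario from the message: 3 surviving nodes, one stale node rejoins.
stale_ev = 8      # value the rejoining node still believes in
current_ev = 3    # value the quorate partition has converged on
votes_present = 4

# Trusting the stale value would drop quorum (4 < 5).
broken = is_quorate(votes_present, stale_ev)

# Ignoring the non-quorate sender keeps the cluster quorate.
kept_ev = accept_expected_votes(current_ev, stale_ev, remote_quorate=False)
fixed = is_quorate(votes_present, kept_ev)
```

Here `broken` evaluates to False and `fixed` to True, which matches the
"4 nodes are still < 5" observation above.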

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-26 14:32:54 +01:00
Fabio M. Di Nitto
88e6830df1 votequorum: fix auto_tie_breaker design and simplify code a lot
auto_tie_breaker needs to know the lowest node id in the currently
quorate partition, not in the whole cluster.

this allows us to determine the lowest node id as soon as we are quorate
and removes the complexity of reading it from WFA or nodelist. At
the same time it adds the flexibility for dynamic nodeids in a cluster.
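
A minimal sketch of the idea, not the actual implementation: the lowest node
id is taken from the members of the quorate partition, and after an even
split the half that still contains that node keeps quorum. The function
names and the sample node ids are hypothetical.

```python
def lowest_nodeid(quorate_members: set) -> int:
    """Lowest node id among the members of the currently quorate partition."""
    return min(quorate_members)


def survives_even_split(partition: set, tie_breaker_id: int) -> bool:
    """After a 50/50 split, the half holding the tie-breaker node stays quorate."""
    return tie_breaker_id in partition


# Node ids need not start at 1 or be contiguous (dynamic nodeids).
members = {12, 15, 20, 31, 42, 57}
tie_breaker = lowest_nodeid(members)   # recorded while quorate

half_a = {12, 20, 42}
half_b = {15, 31, 57}
a_survives = survives_even_split(half_a, tie_breaker)
b_survives = survives_even_split(half_b, tie_breaker)
```

Because the value is computed from live membership, nothing has to be read
from the configuration before the cluster becomes quorate.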

drop requirement on WFA if nodelist is not specified

update man page

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-26 14:32:54 +01:00
Fabio M. Di Nitto
40aa40ed84 votequorum: drop NODESTATE_LEAVING
this is another leftover from cman compatibility layer

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-26 14:32:54 +01:00
Fabio M. Di Nitto
f25d5829f2 update TODO list
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-25 14:06:27 +01:00
Fabio M. Di Nitto
284ca61865 votequorum: add documentation and man pages
fix a few typos on the way and separate config / library bits

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-25 14:06:27 +01:00
Fabio M. Di Nitto
269e0c4970 votequorum: change quorum.expected_votes override behavior
as agreed on the mailing list, quorum.expected_votes should override
automatically calculated expected_votes from nodelist.

Also simplify the code to handle expected_votes. "silly defaults" is now
unnecessary because votequorum does config sanity checks upfront.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-25 14:06:27 +01:00
Fabio M. Di Nitto
efbf5282f9 votequorum: two_node should enable wait_for_all by default
This avoids fencing races at startup of a cluster.

It is still possible to override WFA by explicitly setting
wait_for_all: 0
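
The resulting precedence can be sketched like this. It is an assumption-laden
illustration: the quorum section is modelled as a plain dict of parsed
values, and `effective_wait_for_all` is a hypothetical helper, not a
corosync function.

```python
def effective_wait_for_all(quorum_cfg: dict) -> bool:
    """An explicit wait_for_all wins; otherwise two_node implies it."""
    if "wait_for_all" in quorum_cfg:
        return bool(int(quorum_cfg["wait_for_all"]))
    return bool(int(quorum_cfg.get("two_node", 0)))


default_two_node = effective_wait_for_all({"two_node": 1})
overridden = effective_wait_for_all({"two_node": 1, "wait_for_all": 0})
plain_cluster = effective_wait_for_all({})
```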

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-01-25 07:04:24 +01:00
Angus Salkeld
3b678a8f14 CTS: make basic tests config-generic
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Angus Salkeld
c01ed3dbfa CTS: fix starting/stopping of test_agents
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Angus Salkeld
ccaef16ee0 CTS: tidy up the shutdown of cpg_test_agent
it is not necessary to close the fd and remove it
from the mainloop

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Angus Salkeld
02dca3d823 CTS: temp comment out quorum tests
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Angus Salkeld
1ba1a9ce34 CTS: fix quorum command
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Angus Salkeld
12ab7f7b7d CTS: fix up the format strings
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Angus Salkeld
14fd1c927a Add debug log messages to corosync for join/leave
This is needed by cts.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Angus Salkeld
3698b78de9 LOG: make sure that debug works to syslog
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-25 11:33:09 +11:00
Jan Friesse
e89201b9c9 totemiba: Remove unused wthread.h include
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-01-24 16:28:55 +01:00
Jan Friesse
cdff97a044 Make xmlconf in SPEC conditional
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-01-24 16:28:44 +01:00
Angus Salkeld
a516a1f9e4 Change the last references from objctl to cmapctl
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-01-24 09:47:51 +11:00
Fabio M. Di Nitto
329a10f6d4 Update TODO list
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-23 11:46:34 +01:00
Fabio M. Di Nitto
78edc1f24b votequorum: add support for nodelist config bits
expected votes is now calculated automatically and quorum.expected_votes
can be used to override nodelist calculation. The higher of the two
values is used for runtime.

quorum_votes can be specified either in the node list or in quorum.votes.
The node list has priority over global.

propagate votequorum initialization errors (due to config inconsistencies)
back to vsf_quorum.
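
The expected_votes and quorum_votes rules described in this message could be
sketched as below. This is not the votequorum code: the node list is
modelled as a list of dicts, the default of one vote per node is an
assumption, and both function names are hypothetical.

```python
from typing import Optional


def node_votes(node: dict, global_votes: int = 1) -> int:
    """Per-node quorum_votes has priority over the global quorum.votes."""
    return int(node.get("quorum_votes", global_votes))


def runtime_expected_votes(nodelist: list,
                           configured_ev: Optional[int] = None,
                           global_votes: int = 1) -> int:
    """Sum the node votes, then take the higher of that and quorum.expected_votes."""
    calculated = sum(node_votes(n, global_votes) for n in nodelist)
    if configured_ev is None:
        return calculated
    return max(calculated, configured_ev)


nodes = [
    {"nodeid": 1},                      # falls back to the global value
    {"nodeid": 2, "quorum_votes": 3},   # per-node value wins
    {"nodeid": 3},
]
auto_ev = runtime_expected_votes(nodes)
raised_ev = runtime_expected_votes(nodes, configured_ev=8)
lower_cfg_ev = runtime_expected_votes(nodes, configured_ev=2)
```

With these inputs `auto_ev` is 5, `raised_ev` is 8, and `lower_cfg_ev` stays
at 5 because the calculated value is the higher of the two.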

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-23 11:46:34 +01:00
Angus Salkeld
3131601ce2 Remove all unnecessary "\n" from log messages
These look ugly, are inconsistently done and just have
to be removed later in libqb before calling syslog.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:23 +11:00
Angus Salkeld
61c0995e1c Shorten some really long lines in main.c
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:23 +11:00
Angus Salkeld
605204ebb4 cmap: add a delete with prefix (needed by cts)
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:23 +11:00
Angus Salkeld
1b585b831a cmap: change -t and -T around (capital == with prefix)
I want to add a prefix delete option and then these will
not be consistent.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:22 +11:00
Angus Salkeld
aedaa07823 cmap: tweak the usage text
1) It wasn't obvious to me what -b did
2) -a has been removed

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:22 +11:00
Angus Salkeld
2e0fcf932b cmap: add a load option for cts "-p"
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:22 +11:00
Angus Salkeld
fa9f239d06 CTS: convert the test agents to use qb logging
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:22 +11:00
Angus Salkeld
3b94612322 CTS: fix the corosync start/stop settings
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:22 +11:00
Angus Salkeld
4f99bef8b6 CTS: set the syslog restart commands up correctly
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:22 +11:00
Jan Friesse
9e6ea08925 Add nodelist information to manual page
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:09:40 +01:00
Jan Friesse
0c2e3c8408 Make local_node ring0 address read-only
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:09:37 +01:00
Jan Friesse
d6cbdd9b84 Support for dynamic nodelist udpu member change
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:35 +01:00
Jan Friesse
16007acbef Use nodeid provided in nodelist
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:35 +01:00
Jan Friesse
de70c0007c Support udpu members in nodelist
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:35 +01:00
Jan Friesse
c8a62d8b3c Add local_node_pos icmap key
Key contains local node position in nodelist

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:35 +01:00
Jan Friesse
6d0b0b1493 Parse nodelist in coroparse
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-20 11:08:34 +01:00
Jan Friesse
a10e229de9 mon: Remove leftover print of debug output to err
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-01-19 15:02:52 +01:00
Jan Friesse
2acf8920a3 Set default multicast port if not specified
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-19 15:02:24 +01:00
Fabio M. Di Nitto
de2e5be755 votequorum: drop protocol versioning in favour of extra space on the wire
protocol needs to stay compatible across a corosync MAJOR release.
Implementing internal protocol version compat is at best suicidal.

Add extra space to the net struct and we can use flags to determine
feature sets.
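
A rough sketch of the "flags plus spare space" approach, using Python's
struct module purely for illustration. The field layout, the amount of
padding, and the feature names are all hypothetical and do not reflect the
real votequorum wire structs.

```python
import struct

# Hypothetical layout: nodeid, votes, expected_votes, flags (network byte
# order), followed by 16 reserved bytes that future features can claim.
WIRE = struct.Struct("!IIII16x")

FEATURE_A = 1 << 0   # hypothetical feature bits
FEATURE_B = 1 << 1
KNOWN_FEATURES = FEATURE_A | FEATURE_B


def encode(nodeid: int, votes: int, expected_votes: int, flags: int) -> bytes:
    """Pack a message; the reserved bytes are always sent as zeros."""
    return WIRE.pack(nodeid, votes, expected_votes, flags)


def decode(payload: bytes) -> dict:
    """Unpack a message and keep only the feature bits this node understands."""
    nodeid, votes, expected_votes, flags = WIRE.unpack(payload)
    return {
        "nodeid": nodeid,
        "votes": votes,
        "expected_votes": expected_votes,
        "features": flags & KNOWN_FEATURES,
    }


# A newer peer sets a bit this node does not know; it is simply ignored.
msg = encode(1, 1, 8, FEATURE_A | (1 << 7))
info = decode(msg)
```

The message size stays fixed, so peers built at different points in a MAJOR
release can still parse each other's packets without a version field.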

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-19 14:37:35 +01:00
Angus Salkeld
34e37f130f autobuild: make sure systemd is enabled on f15+
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-19 22:06:20 +11:00
Angus Salkeld
280a71a33c autobuild: ssh into node as root
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-19 22:06:09 +11:00
Angus Salkeld
2678f0a0db Fix spec file when passed "--with-testagents"
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-19 22:06:00 +11:00
Angus Salkeld
0f7526e694 CTS: remove the test service agent
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-19 22:04:29 +11:00
Fabio M. Di Nitto
2cd6ad9922 votequorum: ifdef qdiskd API out
as agreed, the API has not been tested yet. Adding later is better than
removing it.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-18 14:23:06 +01:00
Fabio M. Di Nitto
9150921ad4 votequorum: be slightly more efficient and consistent
reduce req_exec_quorum_nodeinfo from 40 to 16 bytes (onwire)

add 4 bytes to req_exec_quorum_reconfigure to be consistent
with feature/version checking (onwire data)

make all nodeid definitions "unsigned int" instead of some random mix.

reduce size of different vars

remove lots of unnecessary swab due to reducing size of data

drop join_time from cluster_node, it's never used

fix printing of nodeids from random mix to uint for consistency

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-18 13:50:25 +01:00
Jan Friesse
aed3970fdf Update TODO with recent mcast addr changes
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-18 09:12:30 +01:00
Jan Friesse
eef9028465 Store auto generated mcast addr and port to icmap
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-18 09:12:26 +01:00