1535 Commits

Author SHA1 Message Date
Fabio M. Di Nitto
692fd72468 votequorum: incorporate static config into dynamic
no functional changes or extra features yet

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
f960d0a342 votequorum: move all configuration in votequorum_readconfig
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
3a717fc8e9 votequorum: start moving from static to fully dynamic config
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
f12bfc5ad8 votequorum: disallow wait_for_all and qdevice operations
The problem is that user expectations for using both modes at the
same time have not been defined yet. There are two or three options
that need investigation.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
4a93ff267f votequorum: improve debugging output
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Jan Friesse
25381738c2 Always set interface_up in totemip_iface_check
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-02 09:41:36 +01:00
Fabio M. Di Nitto
e34f095551 votequorum: fix node->flags type when receiving nodeinfo messages
old_flags was set to uint16_t but it needs to be uint32_t.
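
To illustrate why the width matters, here is a minimal sketch; it is not
the votequorum code itself. The flag name and its value are hypothetical.
The only point is that any flag bit above bit 15 is silently lost when a
32-bit flags word is saved in a uint16_t.

```c
#include <stdint.h>

/* Hypothetical flag that lives above bit 15; not a real votequorum constant. */
#define EXAMPLE_FLAG_HIGH (UINT32_C(1) << 16)

/* Buggy pattern: saving 32-bit flags in a 16-bit variable drops the high bits. */
static int high_flag_survives_u16(uint32_t flags)
{
    uint16_t old_flags = (uint16_t)flags; /* truncates to the low 16 bits */
    return (old_flags & EXAMPLE_FLAG_HIGH) != 0;
}

/* Fixed pattern: the saved copy has the same width as the flags word. */
static int high_flag_survives_u32(uint32_t flags)
{
    uint32_t old_flags = flags;
    return (old_flags & EXAMPLE_FLAG_HIGH) != 0;
}
```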

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto
6a9e4760da votequorum: fix segfault in wfa status update
this is a regression introduced by cb5fd775

when reading the static config, us->flags does not exist yet and
therefore setting it will cause a segfault.

Move the settings after cluster_node *us is created, with the long
term plan to simply kill the whole _static readconfig bits
in favour of dynamic (runtime changeable) bits.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto
cb5fd77501 votequorum: major rework to fix qdevice API and integration with core
qdevice is a very special node in the cluster and it adds a certain
amount of complexity and special cases across the code.

most of the qdevice data are shared across the cluster (name/votes)
but effectively each node has a different view of the qdevice
(registered/unregistered/voting/etc.)

with this change, we align the qdevice view across the nodes by
exchanging more data between them, and we fix how qdevice behaves
and how it is configured.

The only side effect is that the amount of data transmitted on wire
is slightly higher.

The qdevice API is still disabled by default. This means that
the amount of real change in the current code is a lot smaller
than this patch makes it appear.

TODO: documentation/man pages need to be updated once
      this change is in (and behavior finalized).

User visible changes:

- configuration (coroparse, exec/votequorum):
  the quorum device section is now standalone within the quorum.

  quorum {
    provider: corosync_votequorum
    device {
      model: (name)
      timeout: (millisec)
      votes:
    }
  }

  the keyword "model:" is mandatory to enable qdevice in configuration
  and should express the name of the script/daemon that will provide
  the qdevice. Looking into the future, an init script or systemd
  service will look for that name in /path/to/be/decided/name
  and start/stop qdevice.

  timeout: defines the maximum interval the qdevice implementation
  has available between polls (see votequorum_qdevice_poll.3) before
  the device is considered dead and its votes are discarded

  votes: is now a configuration parameter and not an API call.
  quorum devices don't care what they need to vote.
  votes is autocalculated when a nodelist is available and all
  nodes in the list vote 1. Otherwise this parameter is mandatory.

- configuration (exec/votequorum):
  startup and runtime configuration changes have been improved.
  errors at startup are considered fatal. errors at runtime
  have different exit paths.

  startup:

  * quorum.two_node and qdevice are incompatible.
  * quorum.expected_votes requires quorum.device.votes.
  * quorum.expected_votes - quorum.device.votes cannot be lower
    than 2.
  * qdevice and last_man_standing are mutually exclusive.
  * qdevice and auto_tie_breaker are mutually exclusive.
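
The startup rules above can be sketched as a single validation function.
This is an illustration only: the struct, its field names and the function
are hypothetical and do not mirror the actual exec/votequorum code. A value
of 0 is assumed to mean "not configured", and the rules are assumed to
apply only when a qdevice is configured.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the relevant quorum.* settings. */
struct example_quorum_cfg {
    int two_node;
    int last_man_standing;
    int auto_tie_breaker;
    int qdevice_enabled;     /* quorum.device.model is set */
    uint32_t expected_votes; /* 0 = not configured (assumption) */
    uint32_t device_votes;   /* 0 = not configured (assumption) */
};

/* Returns NULL if the combination is acceptable, or a message naming the
 * first violated rule. Without a qdevice none of the rules are checked. */
static const char *example_check_startup(const struct example_quorum_cfg *c)
{
    if (!c->qdevice_enabled)
        return NULL;
    if (c->two_node)
        return "quorum.two_node and qdevice are incompatible";
    if (c->expected_votes != 0 && c->device_votes == 0)
        return "quorum.expected_votes requires quorum.device.votes";
    if (c->expected_votes != 0 &&
        (c->expected_votes < c->device_votes ||
         c->expected_votes - c->device_votes < 2))
        return "quorum.expected_votes - quorum.device.votes cannot be lower than 2";
    if (c->last_man_standing)
        return "qdevice and last_man_standing are mutually exclusive";
    if (c->auto_tie_breaker)
        return "qdevice and auto_tie_breaker are mutually exclusive";
    return NULL;
}
```

Since errors at startup are considered fatal, a caller would log the
returned message and exit when the result is not NULL.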

  runtime config changes:

  * quorum.two_node and qdevice are incompatible:
    if quorum device is alive, two_node is disabled.
    if quorum device is not alive and node count is 2, two_node is
       enabled, and quorum device cannot be registered

  * if either last_man_standing or auto_tie_breaker were enabled
    at startup, and at runtime quorum device is configured,
    quorum device registration will be blocked.

  * if quorum.expected_votes is configured but not quorum.device.votes,
    quorum device registration will be blocked.

  * if quorum.device.votes is not configured and we cannot
    automatically calculate it, quorum device registration will be blocked.

  * An error in configuring quorum.expected_votes and quorum.device.votes
    will block quorum device registration.

blocking quorum device registration also means dropping its votes.

quorum.device.votes (either set or automatically calculated) is now
used to determine current expected_votes in the cluster.

- logging (exec/votequorum):

  all errors from configuration are treated as WARNING/CRITICAL.

  lots of extra DEBUG output is added (see internal changes too).

- corosync-quorumtool (tools/corosync-quorumtool):

  * added option to forcefully kick out a quorum device from the local
    node. This is for emergency recovery only and it is only
    available when qdevice API is built-in.

  * Improved status output, specifically add node state and qdevice
    information

[root@fedora-master-node2 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          132
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)

  * allow printing the status of any node in the cluster known to
    the local node.

[root@fedora-master-node1 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net

[root@fedora-master-node1 coro]# corosync-quorumtool -s -n 2
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)
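
As a cross-check of the status output above: with "Expected votes" at 3 the
tool reports a "Quorum" of 2, which is what a simple-majority rule gives. A
minimal sketch of that arithmetic, assuming plain integer majority (the
function name is hypothetical, and the real votequorum calculation may take
other inputs into account):

```c
/* Smallest number of votes that is strictly more than half of expected_votes. */
static unsigned int example_majority_quorum(unsigned int expected_votes)
{
    return expected_votes / 2 + 1;
}
```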

Internal changes:

- change qdevice timer to not run all time, but only when necessary.
- change votequorum_nodeinfo on wire data to use flags instead of uint8_t
  and add QDEVICE status.
- allocate nodeid 0 to qdevice since it's the only real
  nodeid that can be reserved.
- change send_nodeinfo to allow sending nodeinfo for any node
  so that we can share qdevice info across the cluster
  (and this might be useful in future if we need to sync
   internal cluster view).
- add votequorum api call to update qdevice name
- add runtime data if quorum device has been forcefully disabled
  by config error
- add qdevice votes to expected_votes calculation (this
  is probably the biggest difference vs cman)
- change votequorum_read_nodelist_configuration so that
  we can autocalculate votes for qdevice (we need the nodecount
  vs votes).
- add all checks for startup/runtime config (see above).
- do not make qdevice part of the membership_list received from
  totem. None of our users care about it and it is not a real node.
- change onwire message handlers to deal with "data for this node from any node"
  case and understand nodeid 0 for qdevice info
- always allocate qdevice at startup. this simplifies code a lot.
- dispatch qdevice nodeinfo on membership changes.
- inform libvotequorum users when a qdevice is registered
- substantially improve the qdevice API and add a simple
  barrier based on qdevice name.
- add qdevice API barrier at cluster level. This feature allows
  only one qdevice name to be active in the cluster at any time.
- qdevice getinfo can now report status for qdevice on any node.
- slightly change the way the qdevice API is built in/out:
  only the libvotequorum calls are #ifdef'ed out now. Doing so in
  the core is too complex and would make the code unreadable,
  with the risk of missing a bit or two and effectively introducing
  an on-wire incompatibility if we ever turn the API on.
- probably added some bugs on the way...

TODO: update qdevice_* API once the above is settled and test
      qdevice integration with other features.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com> (only second part)
2012-02-27 09:30:26 +01:00
Jan Friesse
c30c088597 Tweak nodeid warning
Nodeid warning now appears only when both totem.nodeid and nodelist
nodeid exists. When nodelist nodeid is not defined, totem.nodeid is
used.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-21 16:33:56 +01:00
Jan Friesse
04720649ba iba: Use configured node id
Corosync was ignoring nodeid for iba transport and always used
autogenerated one.

Original patch by: Jason Dillaman <jdillama@redhat.com>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-21 16:27:16 +01:00
Angus Salkeld
40727bd6a3 Convert the common lib into a shared lib.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-21 20:26:08 +11:00
Jan Friesse
88ae75d6c2 Allow autoconfiguration of interface section
Thanks to the totemip_getifaddrs infrastructure it's now possible to use
nodelist information to autoconfigure interface bindnetaddr. Together
with cluster_name, the interface section can be completely omitted.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
ba13537471 totemconfig: ensure suffix for ringX_addr
The patch makes sure that the ringX_addr key really has the _addr suffix.
Previously, it was possible to enter ringXanything and have it
interpreted as ringX_addr.
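
A strict check of this kind could look like the sketch below. It is not the
totemconfig code; the function name is hypothetical, and only the standard
C library is used.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Returns 0 and stores the ring number if key is exactly "ring<N>_addr",
 * otherwise returns -1. */
static int example_parse_ring_addr_key(const char *key, unsigned long *ring_no)
{
    char *end;
    unsigned long n;

    if (strncmp(key, "ring", 4) != 0)
        return -1;
    /* Require a digit so strtoul cannot accept a sign or leading blanks. */
    if (!isdigit((unsigned char)key[4]))
        return -1;
    errno = 0;
    n = strtoul(key + 4, &end, 10);
    if (errno != 0)
        return -1;
    /* The remainder must be exactly "_addr"; "ring0anything" fails here. */
    if (strcmp(end, "_addr") != 0)
        return -1;
    *ring_no = n;
    return 0;
}
```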

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
8cde53aa99 cmap: Handle NULL in [i]cmap_set_string value
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
88ddaecfe9 Create solaris specific getifaddrs
This not only makes it possible to use the generic totemip_iface_check,
but also fixes some problems with the previous implementation (fixed
mask, poorly supported ipv6, ...)

Tested on OpenIndiana 151a

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
fd47fddcaf Add totemip_iface_check based on totemip_getifaddrs
Also Linux and BSD/Darwin specific bits are no longer needed, so they
are gone.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
27e9988486 Add generic implementation of getifaddrs
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:56 +01:00
Angus Salkeld
023c4fa0cc Move hdb_error_to_cs to corotypes.h
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-14 11:10:14 +11:00
Steven Dake
415ef892ad Remove empty testquorum.c file
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-02-13 17:05:04 -07:00
Steven Dake
2ad0cdc832 Update copyright header dates in exec directory
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-02-13 17:05:04 -07:00
Steven Dake
4ee9550f80 Remove jhash.h since it is not used
We would use libqb for hashing now if we needed hashing.
cpg no longer uses jhash.h.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-02-13 17:05:04 -07:00
Steven Dake
815375411e Remove unused or unimplemented CFG apis
Remove:
cfg_statetrack
cfg_statetrackstop
cfg_administrativestateset
cfg_administrativestateget
cfg_serviceload
cfg_serviceunload

Rev SO to 5.0.0

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-02-13 17:04:49 -07:00
Fabio M. Di Nitto
8840113704 votequorum: fix variable init
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 16:49:25 +01:00
Fabio M. Di Nitto
e3ba920307 votequorum: fix possible memory corruption
nodeid = 0 is a valid nodeid and the node associated with it should
not be freed

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 16:49:25 +01:00
Fabio M. Di Nitto
939a7b2d66 quorum: don't leak memory on error
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 16:49:25 +01:00
Angus Salkeld
6cd576b0f5 move hdb_error_to_cs to common_lib
Note the previous inconsistent implementation.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 10:45:56 +11:00
Angus Salkeld
da483b8121 Add a common library that can be shared between libs and corosync
We have always had this problem and worked around it by copying code
or using inline functions. Neither is good IMO.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 10:45:56 +11:00
Steven Dake
7592e3b61e Remove include/engine/quorum and integrate it into exec/engine.h
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-08 08:31:10 -07:00
Steven Dake
01c63ca17c Free state variable allocated in wd_resource_state_is_ok
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-07 08:42:58 -07:00
Steven Dake
c05cbb65bc Remove leaked resource error from wd_resource_state_is_ok
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-07 08:42:58 -07:00
Steven Dake
190dba3933 Remove use after free and free of uninit value in mainconfig error path
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-07 08:42:58 -07:00
Steven Dake
46a2b1a297 Remove use after free in corosync_main_config_set in error path
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-07 08:42:58 -07:00
Fabio M. Di Nitto
cff57430d6 votequorum: fix quorum_ringid setting before any delivery occurs
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-02-07 14:07:09 +01:00
Angus Salkeld
8992acb815 LOG: add libqb as a "subsys"
So we can see libqb internal logs

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-07 10:53:56 +11:00
Jan Friesse
546aea23cf cmap: Check RO flag in adjust int function
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-06 16:37:00 +01:00
Jiaju Zhang
dd9e177af7 CPG: Send CPG_REASON_PROCDOWN when really needed
This patch fixes the issue that in some cases where cpg_finalize()
was called just after cpg_leave() was called, CPG_REASON_PROCDOWN
might also be sent while CPG_REASON_LEAVE had already been sent.
This behavior is not aligned with what the man page has described:
"CPG_REASON_PROCDOWN - the process left a group without calling
cpg_leave()."
It also confuses CPG's clients, because one process leaving results
in two different reasons being sent.

The root cause is that cpg_leave() returns as soon as the LEAVE
message has been added to the sending queue, while the cpg group
name has not been cleared yet. If cpg_finalize() is called at that
moment, it decides whether cpg_leave() has already been called only
by checking the group name, and that check alone is not
sufficient.
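
One way to close such a gap is to track an explicit per-connection state
instead of inferring it from the group name. The sketch below illustrates
that idea under stated assumptions: all type, constant and function names
are hypothetical, and it is not a description of what this patch changes in
the cpg service.

```c
/* Hypothetical per-connection state; not the actual corosync cpg structures. */
enum example_cpd_state {
    EXAMPLE_UNJOINED,
    EXAMPLE_JOINED,
    EXAMPLE_LEAVE_STARTED /* LEAVE queued, group name not yet cleared */
};

enum example_reason {
    EXAMPLE_NONE,
    EXAMPLE_REASON_LEAVE,
    EXAMPLE_REASON_PROCDOWN
};

struct example_conn {
    enum example_cpd_state state;
};

/* Leave path: queue the LEAVE message and remember that we did so. */
static enum example_reason example_leave(struct example_conn *c)
{
    if (c->state != EXAMPLE_JOINED)
        return EXAMPLE_NONE;
    c->state = EXAMPLE_LEAVE_STARTED;
    return EXAMPLE_REASON_LEAVE;
}

/* Finalize path: report PROCDOWN only if no LEAVE is already in flight. */
static enum example_reason example_finalize(struct example_conn *c)
{
    enum example_reason r = EXAMPLE_NONE;

    if (c->state == EXAMPLE_JOINED)
        r = EXAMPLE_REASON_PROCDOWN;
    c->state = EXAMPLE_UNJOINED;
    return r;
}
```

With this state, a finalize that follows a leave reports nothing further,
while a finalize without a prior leave still reports PROCDOWN.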

Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-06 08:07:54 -07:00
Fabio M. Di Nitto
3b77dd9d83 votequorum: fix expected votes manual override from quorumtools
votequorum internal quorum/expected_vote check was slightly too
conservative and was not done correctly when leave_remove feature
is enabled.

this fix allows admins to effectively override expected_votes
and drive ev_barrier as expected.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-02-03 10:33:33 +01:00
Jan Friesse
0929dcb68c Better checks of integer values in coroparse
Instead of atoi, strtol is used. This allows detection of typical
problems like empty value of key and incorrectly entered numbers.
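
For reference, atoi("") returns 0 and atoi("12abc") returns 12, so neither
mistake is detectable with it. A minimal sketch of strict parsing with
strtol follows; the function name is hypothetical and this is not the
coroparse implementation.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Strictly parse a decimal integer. Returns 0 on success, -1 on error. */
static int example_parse_int(const char *str, int *res)
{
    char *end;
    long val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
        return -1; /* out of range */
    if (end == str)
        return -1; /* empty value or no digits at all */
    if (*end != '\0')
        return -1; /* trailing garbage such as "12abc" */
    *res = (int)val;
    return 0;
}
```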

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-03 09:16:43 +01:00
Fabio M. Di Nitto
230231fedb votequorum: add runtime internal data to icmap runtime.votequorum.*
specifically ev_barrier, two_node, lowest_node_id and wait_for_all_status
are values that change internally at runtime and keeping track
of those can make debugging rather easy, especially when LOG_DEBUG is
not set.

Also track our node id.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-By: Christine Caulfield <ccaulfie@redhat.com>
2012-02-02 16:36:57 +01:00
Jan Friesse
33e5ce8d56 Show correct error when open of logfile failed
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-02 09:30:49 +01:00
Jan Friesse
a80febda7e Store error str if can't open logfile
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-02 09:30:49 +01:00
Angus Salkeld
af9cfc7b55 IPC: reference count the connection whilst flushing the outq
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-02 11:34:26 +11:00
Angus Salkeld
45cb05f1ad IPC: allow for failures in the connection_created callback
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-01 08:51:13 +11:00
Fabio M. Di Nitto
46b7b155a4 votequorum: add leave_remove option
this also cleans up NODESTATE for good. JOINING was never used

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-31 16:58:08 +01:00
Fabio M. Di Nitto
c16086bead votequorum: honor onwire node flags change
internal flags were not propagated correctly in the node status

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-31 10:20:32 +01:00
Fabio M. Di Nitto
9fa83dabbe quorum: fix load/unload priority for quorum services
all main services are loaded at priority 1.
vsf_quorum and votequorum did not specify a priority and
so defaulted to 0, which has the special meaning
of being loaded last and unloaded last.

this is not correct behavior and limits what votequorum
can do at shutdown, for example notify other nodes that
it is leaving (something that cannot be gathered by
totem membership change callback).

fix vsf_quorum to load at priority 1 as the other
default services and bump votequorum to 2 (needs to
unload before everything else currently known).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-31 10:16:52 +01:00
Fabio M. Di Nitto
a2b960d109 service: fix service unload regression introduced by lcrso dropping
service exec_exit_fn was not honored because the loop was looking
into the wrong icmap key

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-31 10:16:16 +01:00
Fabio M. Di Nitto
fc61b20a8a votequorum: drop unnecessary flags
code inspection shows that those internal flags are never used

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-31 10:14:19 +01:00
Steven Dake
007e5c9458 Honor exec_init_fn call
exec_init_fn now either returns NULL (success) or a string which indicates
the error that occurred during service engine initialization.  If an error
occurs, corosync will exit.  This patch adds ykd and incorporates other
suggestions from Fabio Di Nitto.
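
The convention described above (NULL on success, an error string otherwise)
can be sketched as follows. The callback type, the example callbacks and
the wrapper are hypothetical and simplified; the real exec_init_fn
signature is not reproduced here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical init callback type: NULL means success, otherwise an error text. */
typedef const char *(*example_init_fn)(void);

static const char *example_init_ok(void)
{
    return NULL;
}

static const char *example_init_fail(void)
{
    return "example: cannot initialise";
}

/* Runs the callback; returns 0 on success, -1 if the caller should exit. */
static int example_service_init(const char *name, example_init_fn fn)
{
    const char *err = fn();

    if (err != NULL) {
        fprintf(stderr, "Service engine '%s' failed to load: %s\n", name, err);
        return -1;
    }
    return 0;
}
```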

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-01-30 14:05:09 -07:00