Commit Graph

3073 Commits

Author SHA1 Message Date
Fabio M. Di Nitto
c00502a70a build: drop another leftover from the past
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:13:04 +01:00
Fabio M. Di Nitto
b654661b4c build: drop obsoleted SOCKETDIR option
yet another leftover from the past that can go away

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:12:48 +01:00
Fabio M. Di Nitto
fd79118110 build: drop last LCRSO references
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:12:20 +01:00
Fabio M. Di Nitto
eb3d49ef7d pload: make it a test service and not a public one
pload is a performance benchmark that measures the onwire
speed of corosync.

problem is that once pload has been executed, the cluster
is basically dead.

turn pload into a test tool, by removing corosync-pload tool
and user library.

cleanup pload code to make it more readable and drop lots
of unnecessary stuff.

add test/ploadstart tool that can configure and start pload
via cmap calls.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:11:51 +01:00
Fabio M. Di Nitto
142ce8c3a1 totem: drop crypt_accept: concept/option
this was another old onwire compat mode that is not useful anylonger.

we can safely move the new model by default.

According to Honza (real hardware 1 node testing) there are no
performance impact.

My tests (8 nodes VM cluster), there is up to 10/12% performance
improvements up to 1M packet size where old and new models are equal.

As a side note, nss still shows to be a performance loss on both
real and virtual hw (without any kind of nss hw acceleration).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-10 07:08:30 +01:00
Angus Salkeld
03b32d7fad Fix typo in stats key name.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-09 21:54:51 +11:00
Angus Salkeld
41b4416bd4 Remove unused function logsys_priority_name_get()
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-09 21:54:51 +11:00
Angus Salkeld
f628ccba8b Add pid, hostname and process name to the logfile
Note this is only for file targets not stderr or syslog.

https://bugzilla.redhat.com/show_bug.cgi?id=789925

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-09 21:54:51 +11:00
Fabio M. Di Nitto
a6ffed0a52 drop last references to compatibility: whitetank
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-03-09 11:38:54 +01:00
Fabio M. Di Nitto
e0e27e3d12 utils: cleanup main daemon exit codes
some of them are not in use anymore and can be dropped.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-09 11:15:44 +01:00
Fabio M. Di Nitto
8f6e5ff530 sync: kill evil and syncv1 in one shot
this change breaks onwire compatibility.

cpg is the only user of sync_* interface and it's the only
service that will require extra testing.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-09 11:15:08 +01:00
Jan Friesse
da878290d9 man: Add cmap pages to index.html
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-06 10:04:26 +01:00
Jan Friesse
500ae491e3 man: Add description of cpg_iteration_* functions
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-06 10:04:22 +01:00
Jan Friesse
eaa34a6255 man: Fix cmap_iter_finalize typo
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-06 10:04:18 +01:00
Angus Salkeld
b3f940e6fc Treat ENOMSG as TRY_AGAIN.
ENOMSG is returned by the ringbuffer when you attempt to read
a message and there is nothing there to read.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-06 07:58:09 +11:00
Angus Salkeld
e6aab06573 Add common IPC errors.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-06 07:58:09 +11:00
Fabio M. Di Nitto
e996e36083 quorumtool: improve display of status data
always display membership data from the local node

display when a node is unknown to the local node instead of an error
from IPC.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto
64fd946086 votequorum: move last malloc/alloca buf to static
this should guarantee that votequorum won't fail under high memory
pressure. Price is 3500 bytes extra preallocated at startup.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto
90c602902c votequorum: fix node allocation memory leak
stop using malloc for each new node, because we cannot free the memory
easily. Move to a static allocated buffer that can contain
PROCESSOR_MAX + qdevice cluster_node instead.

We can never have more than PROCESSOR_MAX nodes anyway and the memory
footprint is small enough compared to memory leaks (those can
effectively happen only in very dynamic clusters with tons of different
nodes joining/leaveing with different nodeids).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto
2d7a8ab29a votequorum: rename leave_remove to allow_downscale
pointed out that leave_remove can be easily confused with the old
cman leave_remove behavior. The two are substantially different
and we need to avoid confusion both for users and our support team.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
75b3dc0f4e votequorum: fix handling of config updates
cmap changes are local to the node only and should not be broadcasted
as configuration changes.

if any change has happened to us, we will inform other nodes via
send_nodeinfo.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
861d2c90ef votequorum: free our data and lists on exit
this is mostly to avoid valgrind errors on exit and make the output
more readable.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
edf0728323 votequorum: disallow special features vs qdevice
simply taking the safest path here since integration of qdevice is not
fully complete

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
33ea03f426 votequorum: fix node check based on reconfig parameter
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
43e08bb143 votequorum: make a common function to calculate votes and cluster members
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
692fd72468 votequorum: incorporate static config into dynamic
no functional changes or extra features yet

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
f960d0a342 votequorum: move all configuration in votequorum_readconfig
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
3a717fc8e9 votequorum: start moving from static to fully dynamic config
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
f12bfc5ad8 votequorum: disallow wait_for_all and qdevice operations
The problem here is that user expectations, when using both modes
at the same time, have not been set yet. There are 2/3 options
that need investigation.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Fabio M. Di Nitto
4a93ff267f votequorum: improve debugging output
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:47 +01:00
Jan Friesse
25381738c2 Always set interface_up in totemip_iface_check
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-02 09:41:36 +01:00
Fabio M. Di Nitto
e34f095551 votequorum: fix node->flags type when receiving nodeinfo messages
old_flags was set to uint16_t but it needs to be uint32_t.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto
6a9e4760da votequorum: fix segfault in wfa status update
this is a regression introduced by cb5fd775

when reading static config us->flags does not exists yet and therefor
setting it will cause a segfault.

Move the settings after cluster_node *us is created, with the long
term plan to simply kill the whole _static readconfig bits
in favour of dynamic (runtime changeable) bits.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-29 09:58:10 +01:00
Fabio M. Di Nitto
66466906a1 quorumtool: improve Membership information output
align nodeid, votes and name to make it all more readable

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto
d8471ee873 quorumtool: make output more human friendly and retain machine parsable bits
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto
197ea4ade0 quorumtools: fix typo in man page
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto
cbd4bb93c1 quorumtools: drop unused option parsing
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto
4345328a7d quorumtool: fix version display info
we don't need that on every run

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-02-28 11:34:50 +01:00
Fabio M. Di Nitto
725b7a61d9 quorumtool: swap node state and node votes output
there is no point to show the votes if the node is dead

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-27 12:41:04 +01:00
Fabio M. Di Nitto
03c76be696 votequorum: fix votequorum_getinfo man page and align struct name
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-27 12:41:04 +01:00
Fabio M. Di Nitto
c439bc3aa8 quorumtool: update man page and help text
improve error output since this is more than a debugging tool now

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-27 12:41:04 +01:00
Fabio M. Di Nitto
cb5fd77501 votequorum: major rework to fix qdevice API and integration with core
qdevice is a very special node in the cluster and it adds a certain
amount of complexity and special cases across the code.

most of the qdevice data are shared across the cluster (name/votes)
but effectively each node has a different view of the qdevice
(registered/unregistered/voting/etc.)

with this change, we align the qdevice view across the node,
exchanging more data between nodes and we fix how qdevice behaves
and it is configured.

The only side effect is that the amount of data transmitted on wire
is slightly higher.

The qdevice API is still disabled by default. This means that
the amount of real changes in current code are a lot smaller
than it appears by this patch.

TODO: documentation/man pages needs to be updated once
      this change is in (and behavior finalized).

User visible changes:

- configuration (coroparse, exec/votequorum):
  the quorum device section is now standalone within the quorum.

  quorum {
    provider: corosync_votequorum
    device {
      model: (name)
      timeout: (millisec)
      votes:
    }
  }

  the keyword "model:" is mandatory to enable qdevice in configuration
  and should express the name of the script/daemon that will provide
  the qdevice. Looking into the future, an init script or systemd
  service will look for that name in /path/to/be/decided/name
  and start/stop qdevice.

  timeout: defines the maximum interval the qdevice implementation
  has available between poll (see votequorum_qdevice_poll.3) before
  the device is considered dead and votes discarded

  votes: is now a configuration parameter and not an API call.
  quorum devices don't care what they need to vote.
  votes is autocalculated when a nodelist is available and all
  nodes in the list vote 1. Otherwise this parameter is mandatory.

- configuration (exec/votequorum):
  startup and runtime configuration changes have been improved.
  errors at startup are considered fatal. errors at runtime
  have different exit paths.

  startup:

  * quorum.two_node and qdevice are incompatible.
  * quorum.expected_votes requires quorum.device.votes.
  * quorum.expected_votes - quorum.device.votes cannot be lower
    than 2.
  * qdevice and last_man_standing are mutually exclusive.
  * qdevice and auto_tie_breaker are mutually exclusive.

  runtime config changes:

  * quorum.two_node and qdevice are incompatible:
    if quorum device is alive, two_node is disabled.
    if quorum device is not alive and node count is 2, two_node is
       enabled, and quorum device cannot be registered

  * if either last_man_standing or auto_tie_breaker were enabled
    at startup, and at runtime quorum device is configured,
    quorum device registration will be blocked.

  * if quorum.expected_votes is configured but not quorum.device.votes,
    quorum device registration will be blocked.

  * if quorum.device.votes is not configured and we cannot
    automatically calculate it, quorum device registration will be blocked.

  * An error in configuring quorum.expected_votes and quorum.device.votes
    will block quorum device registration.

blocking quorum device registation, also means dropping the votes.

quorum.device.votes (either set or automatically calculated) is now
used to determine current expected_votes in the cluster.

- logging (exec/votequorum):

  all errors from configuration are treated as WARNING/CRITICAL.

  lots of extra DEBUG output is added (see internal changes too).

- corosync-quorumtool (tools/corosync-quorumtool):

  * added option to forcefully kick out a quorum device from the local
    node. This is for emergency recovery only and it is only
    available when qdevice API is built-in.

  * Improved status output, specifically add node state and qdevice
    information

[root@fedora-master-node2 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          132
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)

  * allow to print status for any node in the cluster known to
    local node.

[root@fedora-master-node1 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net

[root@fedora-master-node1 coro]# corosync-quorumtool -s -n 2
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
         0     1  QDEVICE (Voting)

Internal changes:

- change qdevice timer to not run all time, but only when necessary.
- change votequorum_nodeinfo on wire data to use flags instead of uint8_t
  and add QDEVICE status.
- allocate nodeid 0 to qdevice since it's the only real
  nodeid that be reserved.
- change send_nodeinfo to allow to send nodeinfo for any node
  so that we can share qdevice info across the cluster
  (and this might be useful in future if we need to sync
   internal cluster view).
- add votequorum api call to update qdevice name
- add runtime data if quorum device has been forcefully disabled
  by config error
- add qdevice votes to expected_votes calculation (this
  is probably the biggest difference vs cman)
- change votequorum_read_nodelist_configuration so that
  we can autocalculate votes for qdevice (we need the nodecount
  vs votes).
- add all checks for startup/runtime config (see above).
- do not make qdevice part of the membership_list received from
  totem. None of our users care about it and it is not a real node.
- change onwire message handlers to deal with "data for this node from any node"
  case and undersand nodeid 0 for qdevice info
- always allocate qdevice at startup. this simplifies code a lot.
- dispatch qdevice nodeinfo on membership changes.
- inform libvotequorum users when a qdevice is registered
- improve substantially qdevice api and add a simple
  barrier based on qdevice name.
- add qdevice API barrier at cluster level. This feature allow
  only one qdevice name to be active in the cluster at any time.
- qdevice getinfo can now report status for qdevice on any node.
- change slightly the way the qdevice API is built-in/out:
  only the libvotequorum calls are #ifdef'out now. Doing so in
  the core is too complex and would make the code unreadable
  with the risk of missing a bit or two effectively introducing
  an on-wire incompatibility if we will ever turn the API on.
- probably added some bugs on the way...

TODO: update qdevice_* API once the above is settled and test
      qdevice integration with other features.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com> (only second part)
2012-02-27 09:30:26 +01:00
Fabio M. Di Nitto
9c7d1d3096 build: fix fallout from swithing to common shared lib
when building corosync on a clean system or for the very first
time, corosync_common needs to be visible both via -L for link
and for the LD_PATH, otherwise the linker cannot resolve
normal library dependencies.

This issue does NOT affect corosync users, but it's confined
to internal corosync only.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-22 09:43:47 +01:00
Jan Friesse
c92225393b Document SAM_RECOVERY_POLICY_CMAP
Also all irelevant references for SAM_RECOVERY_POLICY_CONFDB are
corrected.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-21 16:33:56 +01:00
Jan Friesse
c30c088597 Tweak nodeid warning
Nodeid warning now appears only when both totem.nodeid and nodelist
nodeid exists. When nodelist nodeid is not defined, totem.nodeid is
used.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-21 16:33:56 +01:00
Jan Friesse
61b2a85ebe spec: Add optional xmlconf
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-21 16:27:16 +01:00
Jan Friesse
04720649ba iba: Use configured node id
Corosync was ignoring nodeid for iba transport and always used
autogenerated one.

Original patch by: Jason Dillaman <jdillama@redhat.com>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-21 16:27:16 +01:00
Angus Salkeld
40727bd6a3 Convert the common lib into a shared lib.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-21 20:26:08 +11:00
Jan Friesse
88ae75d6c2 Allow autoconfiguration of interface section
Thanks to totemip_getifaddrs infrastructure it's now possible to use
nodelist informations to autoconfigure interface bindnetaddr. Together
with cluster_name, interface section can be completely omitted.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Jan Friesse
ba13537471 totemconfig: ensure suffix for ringX_addr
Patch makes sure, that ringX_addr key has really _addr suffix.
Previously, it was possible to enter ringXanything and it was
interpreted as ringX_addr.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00