Commit Graph

381 Commits

Author SHA1 Message Date
Fabio M. Di Nitto
2dae49e54a votequorum: remove last instance of state and rename it to cast_vote
also align naming of vote to cast_vote for info calls

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto
43d1439600 votequorum: add qdevice CAST_VOTE status/flag
this is a preparation commit for the next changes. right now it is
no more than an alias to ALIVE.

CAST_VOTE is required to support master/slave feature from qdevice.

Effectively a quorum device can be:

Not registered / registered (connected to API but nothing else is happening)

if registered:

Not alive / alive (quorum device is petting the API via poll and timer is running)

if alive:

Not voting (slave) / voting (master)
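
The three nested states above can be modelled as bit flags. This is a minimal sketch only: the flag names and values below are hypothetical, chosen for illustration, and are not the real votequorum definitions.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only; the real
 * votequorum bit layout is not shown in this log. */
#define EX_QDEVICE_REGISTERED 0x1u
#define EX_QDEVICE_ALIVE      0x2u
#define EX_QDEVICE_CAST_VOTE  0x4u

/* Walk the hierarchy from the outermost state to the innermost one:
 * a device must be registered before it can be alive, and alive
 * before it can cast a vote. */
static const char *ex_qdevice_state(uint32_t flags)
{
    if (!(flags & EX_QDEVICE_REGISTERED))
        return "not registered";
    if (!(flags & EX_QDEVICE_ALIVE))
        return "registered, not alive";
    if (!(flags & EX_QDEVICE_CAST_VOTE))
        return "alive, not voting";
    return "alive, voting";
}
```

Checking the outer state first means a stale inner bit (for example CAST_VOTE set on an unregistered device) is never reported.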

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto
987e26f8d1 votequorum: rename NODE_FLAGS_QDEVICE_STATE to NODE_FLAGS_QDEVICE_ALIVE
STATE is a confusing and overloaded term in votequorum as it's used for nodes
and other bits.

make the name unique and ALIVE means that the qdevice is heartbeating
to votequorum.

improve display of the status in tools and tests.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto
06e75d0b22 votequorum: re-enable qdevice api
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto
514f2a13bd testcpg: fix build warning
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-06-01 08:47:44 +02:00
Dan Clark
88dd3e1eea Improve testcpg to handle change of node identity
Signed-off-by: Dan Clark <2clarkd@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-04-26 09:01:21 +02:00
Jan Friesse
8cdd2fc493 Remove libtomcrypt
Tomcrypt in corosync has not been updated for a long time. Because we have
support for libnss, libtomcrypt can be removed.

Also a few leftovers (AES is 256 bits, not 128, ...) are removed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-13 09:19:47 +01:00
Fabio M. Di Nitto
20a5289074 drop evs service
there are several reasons for this:

1) evs is only partially implemented with no plans to complete it

typedef enum {
       EVS_TYPE_UNORDERED, /* not implemented */
       EVS_TYPE_FIFO,          /* same as agreed */
       EVS_TYPE_AGREED,
       EVS_TYPE_SAFE           /* not implemented */
} evs_guarantee_t;

2) evs has no users in any upstream distribution and no search
   engine can find any other upstream using it.

3) the only reason (I was told) to carry around evs was that evs
   receives the full ring_id struct from totem. This is only
   partially correct because while the structures are prepared
   to carry around those data, they are never transmitted from
   corosync engine down the IPC line to the user.
   CPG ring_id contains the exact same information and it's
   actually less buggy (due to prototyping of the info).

in the worst case scenario, where a user really absolutely needs libevs,
it can be easily reimplemented as a libcpg wrapper, avoiding
lots of code duplication.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 15:51:50 +01:00
Fabio M. Di Nitto
eb3d49ef7d pload: make it a test service and not a public one
pload is a performance benchmark that measures the onwire
speed of corosync.

problem is that once pload has been executed, the cluster
is basically dead.

turn pload into a test tool, by removing corosync-pload tool
and user library.

cleanup pload code to make it more readable and drop lots
of unnecessary stuff.

add test/ploadstart tool that can configure and start pload
via cmap calls.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:11:51 +01:00
Fabio M. Di Nitto
cb5fd77501 votequorum: major rework to fix qdevice API and integration with core
qdevice is a very special node in the cluster and it adds a certain
amount of complexity and special cases across the code.

most of the qdevice data are shared across the cluster (name/votes)
but effectively each node has a different view of the qdevice
(registered/unregistered/voting/etc.)

with this change, we align the qdevice view across the nodes,
exchanging more data between nodes, and we fix how qdevice behaves
and how it is configured.

The only side effect is that the amount of data transmitted on wire
is slightly higher.

The qdevice API is still disabled by default. This means that
the amount of real changes in current code is a lot smaller
than this patch makes it appear.

TODO: documentation/man pages need to be updated once
      this change is in (and behavior finalized).

User visible changes:

- configuration (coroparse, exec/votequorum):
  the quorum device section is now standalone within the quorum.

  quorum {
    provider: corosync_votequorum
    device {
      model: (name)
      timeout: (millisec)
      votes:
    }
  }

  the keyword "model:" is mandatory to enable qdevice in configuration
  and should express the name of the script/daemon that will provide
  the qdevice. Looking into the future, an init script or systemd
  service will look for that name in /path/to/be/decided/name
  and start/stop qdevice.

  timeout: defines the maximum interval the qdevice implementation
  has available between polls (see votequorum_qdevice_poll.3) before
  the device is considered dead and its votes are discarded

  votes: is now a configuration parameter and not an API call.
  quorum devices don't care what they need to vote.
  votes is autocalculated when a nodelist is available and all
  nodes in the list vote 1. Otherwise this parameter is mandatory.

- configuration (exec/votequorum):
  startup and runtime configuration changes have been improved.
  errors at startup are considered fatal. errors at runtime
  have different exit paths.

  startup:

  * quorum.two_node and qdevice are incompatible.
  * quorum.expected_votes requires quorum.device.votes.
  * quorum.expected_votes - quorum.device.votes cannot be lower
    than 2.
  * qdevice and last_man_standing are mutually exclusive.
  * qdevice and auto_tie_breaker are mutually exclusive.
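
  The startup rules listed above can be sketched as a single validation
  function. This is only an illustration under assumptions: the struct, its
  field names and the function name are hypothetical and are not taken from
  exec/votequorum.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the relevant quorum {} settings. */
struct ex_quorum_config {
    bool two_node;
    bool last_man_standing;
    bool auto_tie_breaker;
    bool qdevice_configured;     /* quorum.device.model is present */
    bool expected_votes_set;
    unsigned int expected_votes; /* quorum.expected_votes */
    bool device_votes_set;
    unsigned int device_votes;   /* quorum.device.votes */
};

/* Returns NULL when the configuration passes the startup checks,
 * otherwise a short reason string. */
static const char *ex_check_startup(const struct ex_quorum_config *c)
{
    if (!c->qdevice_configured)
        return NULL; /* the rules only apply when a qdevice is in use */
    if (c->two_node)
        return "quorum.two_node and qdevice are incompatible";
    if (c->last_man_standing)
        return "qdevice and last_man_standing are mutually exclusive";
    if (c->auto_tie_breaker)
        return "qdevice and auto_tie_breaker are mutually exclusive";
    if (c->expected_votes_set) {
        if (!c->device_votes_set)
            return "quorum.expected_votes requires quorum.device.votes";
        /* written as an addition to avoid unsigned underflow */
        if (c->expected_votes < c->device_votes + 2)
            return "expected_votes - device.votes cannot be lower than 2";
    }
    return NULL;
}
```

  Since errors at startup are considered fatal, a caller would typically log
  the returned reason and exit when the result is not NULL.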

  runtime config changes:

  * quorum.two_node and qdevice are incompatible:
    if quorum device is alive, two_node is disabled.
    if quorum device is not alive and node count is 2, two_node is
       enabled, and quorum device cannot be registered

  * if either last_man_standing or auto_tie_breaker were enabled
    at startup, and at runtime quorum device is configured,
    quorum device registration will be blocked.

  * if quorum.expected_votes is configured but not quorum.device.votes,
    quorum device registration will be blocked.

  * if quorum.device.votes is not configured and we cannot
    automatically calculate it, quorum device registration will be blocked.

  * An error in configuring quorum.expected_votes and quorum.device.votes
    will block quorum device registration.

blocking quorum device registration also means dropping the votes.

quorum.device.votes (either set or automatically calculated) is now
used to determine current expected_votes in the cluster.

- logging (exec/votequorum):

  all errors from configuration are treated as WARNING/CRITICAL.

  lots of extra DEBUG output is added (see internal changes too).

- corosync-quorumtool (tools/corosync-quorumtool):

  * added option to forcefully kick out a quorum device from the local
    node. This is for emergency recovery only and it is only
    available when qdevice API is built-in.

  * Improved status output, specifically add node state and qdevice
    information

[root@fedora-master-node2 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          132
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)

  * allow printing status for any node in the cluster known to
    the local node.

[root@fedora-master-node1 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net

[root@fedora-master-node1 coro]# corosync-quorumtool -s -n 2
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)

Internal changes:

- change qdevice timer to not run all the time, but only when necessary.
- change votequorum_nodeinfo on wire data to use flags instead of uint8_t
  and add QDEVICE status.
- allocate nodeid 0 to qdevice since it's the only real
  nodeid that can be reserved.
- change send_nodeinfo to allow to send nodeinfo for any node
  so that we can share qdevice info across the cluster
  (and this might be useful in future if we need to sync
   internal cluster view).
- add votequorum api call to update qdevice name
- add runtime data if quorum device has been forcefully disabled
  by config error
- add qdevice votes to expected_votes calculation (this
  is probably the biggest difference vs cman)
- change votequorum_read_nodelist_configuration so that
  we can autocalculate votes for qdevice (we need the nodecount
  vs votes).
- add all checks for startup/runtime config (see above).
- do not make qdevice part of the membership_list received from
  totem. None of our users care about it and it is not a real node.
- change onwire message handlers to deal with "data for this node from any node"
  case and understand nodeid 0 for qdevice info
- always allocate qdevice at startup. this simplifies code a lot.
- dispatch qdevice nodeinfo on membership changes.
- inform libvotequorum users when a qdevice is registered
- improve substantially qdevice api and add a simple
  barrier based on qdevice name.
- add qdevice API barrier at cluster level. This feature allows
  only one qdevice name to be active in the cluster at any time.
- qdevice getinfo can now report status for qdevice on any node.
- change slightly the way the qdevice API is built-in/out:
  only the libvotequorum calls are #ifdef'ed out now. Doing so in
  the core is too complex and would make the code unreadable,
  with the risk of missing a bit or two and effectively introducing
  an on-wire incompatibility if we ever turn the API on.
- probably added some bugs on the way...

TODO: update qdevice_* API once the above is settled and test
      qdevice integration with other features.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com> (only second part)
2012-02-27 09:30:26 +01:00
Fabio M. Di Nitto
9c7d1d3096 build: fix fallout from switching to common shared lib
when building corosync on a clean system or for the very first
time, corosync_common needs to be visible both via -L for link
and for the LD_PATH, otherwise the linker cannot resolve
normal library dependencies.

This issue does NOT affect corosync users, but it's confined
to internal corosync only.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-22 09:43:47 +01:00
Angus Salkeld
40727bd6a3 Convert the common lib into a shared lib.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-21 20:26:08 +11:00
Angus Salkeld
11fb909036 TEST: remove unused code.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-14 11:10:14 +11:00
Angus Salkeld
a2c1408620 TEST: add logging to testcpg and testevs
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-14 11:10:14 +11:00
Angus Salkeld
0379faca4e TEST: Use pacemaker repeat macro
This is to simulate the way pacemaker uses the cpg api.
With this you can run testcpg directly after corosync
starts and it should initialise ok.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-14 11:10:14 +11:00
Fabio M. Di Nitto
68a0105fd0 testvotequorum: fix test loop to break if votequorum goes away
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 16:49:25 +01:00
Fabio M. Di Nitto
dfb3cd693a testquorum: check for quorum_dispatch return code
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 16:49:25 +01:00
Fabio M. Di Nitto
62bbe076a8 corotype: drop deprecated CPG_ defines
the only users of those obsoleted defines are dlm master (already ported
to use CS_) and cmirror (that needs full porting to new corosync either way).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-08 13:37:46 +01:00
Fabio M. Di Nitto
4120a2c1cb corotypes: drop deprecated EVS_ defines
none of our current dependencies use it.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-08 13:37:46 +01:00
Angus Salkeld
ac498ca97a Remove deprecated function qb_util_set_log_function()
Use the standard qb_log api.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-07 10:53:56 +11:00
Fabio M. Di Nitto
46b7b155a4 votequorum: add leave_remove option
this also cleans up NODESTATE for good. JOINING was never used

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-31 16:58:08 +01:00
Fabio M. Di Nitto
ccd36af00e votequorum: rename qdisk to qdevice
a quorum device is not necessarily a disk and this also aligns
various names to be generic

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-By: Christine Caulfield <ccaulfie@redhat.com>
2012-01-27 11:17:02 +01:00
Fabio M. Di Nitto
40aa40ed84 votequorum: drop NODESTATE_LEAVING
this is another leftover from cman compatibility layer

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-01-26 14:32:54 +01:00
Fabio M. Di Nitto
2cd6ad9922 votequorum: ifdef qdiskd API out
as agreed, the API has not been tested yet. Adding later is better than
removing it.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-18 14:23:06 +01:00
Fabio M. Di Nitto
1cf165e776 votequorum: display flags for all features
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-13 09:25:47 +01:00
Fabio M. Di Nitto
f464038b17 votequorum: drop HASSTATE/SETSTATE
this is a leftover from killing DISALLOWED

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-13 09:25:47 +01:00
Fabio M. Di Nitto
9589611dc4 votequorum: drop concept of DISALLOWED
this is a very old leftover from the RHEL5 timeframe, not used in RHEL6.

Also change votequorum soname since this change implies an ABI change.

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-01-10 15:48:10 +01:00
Fabio M. Di Nitto
e34d509df7 quorum: change API to return quorum type at initialization time
corosync internal theory of operation is that without a quorum provider
the cluster is always quorate. This is fine for membership free clusters
but it does pose a problem for applications that need membership and
"real" quorum.

this change adds quorum_type to the quorum_initialize call to return QUORUM_FREE
or QUORUM_SET. Applications can then make their own decisions to error out
or continue operating.

The only other way to know if a quorum provider is enabled/configured is
to poke at confdb/objdb, but that adds an unnecessary burden to applications
that really don't need to use an entire library for a boolean value.
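
The decision an application might take on the returned quorum type can be sketched as below. This is a hedged illustration only: the EX_ constants are local stand-ins with assumed values, not the definitions from the corosync quorum header, and ex_can_operate is a hypothetical helper rather than part of the API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins with assumed values; a real application would use
 * QUORUM_FREE / QUORUM_SET from the corosync quorum header. */
#define EX_QUORUM_FREE 0u
#define EX_QUORUM_SET  1u

/* Decide whether to keep running given the quorum type reported at
 * initialization time. An application that needs "real" quorum
 * refuses to run when no quorum provider is configured. */
static bool ex_can_operate(uint32_t quorum_type, bool needs_real_quorum)
{
    if (needs_real_quorum && quorum_type == EX_QUORUM_FREE)
        return false;
    return true;
}
```

A membership-free application would pass needs_real_quorum as false and continue in either case.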

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-01-10 15:47:24 +01:00
Steven Dake
8ad583a54c Move logsys.c into corosync binary instead of a shared object
Our preferred shared logging system is exported via the libqb library.  As
a result, the corosync project no longer needs to export logsys.so and the
code can be directly included in the binary.  The header file can also be
removed.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-06 18:19:59 -07:00
Jan Friesse
7c250a5147 Remove objdb and confdb
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-12-15 09:19:18 +01:00
Jan Friesse
120531cddb Move SAM to use CMAP service
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-12-15 09:19:18 +01:00
Angus Salkeld
5aa44cd20b Tweak the increment in cpgbench so the message size gets to 1M
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-12-15 10:04:45 +11:00
Steven Dake
bdd03a4bb7 Remove unchecked return problem in test code
Signed-off-by: Steven Dake <sdake@redhat.com>
2011-11-26 08:50:25 -07:00
Steven Dake
aa76b79f24 Remove unchecked return warning
Signed-off-by: Steven Dake <sdake@redhat.com>
2011-11-26 08:50:25 -07:00
Angus Salkeld
0290297b42 Fix last warnings so we can build with --enable-fatal-warnings
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-11-15 09:42:26 +11:00
Angus Salkeld
3ade35ca01 TEST: make cpgbench go to 1M
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-10-21 20:20:17 +11:00
Angus Salkeld
37e17e7a94 libqb: logging & trace
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:16 +10:00
Angus Salkeld
25751d12d2 TEST: fix the print out when cpg_finalize() fails
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:15 +10:00
Angus Salkeld
c6895faa05 libqb: change ipc -> qb_ipc
IPC: return 0/-ENOBUFS from message handler
IPC: use the new rate_limit API to improve perf.
CPG: add send_async API & hook up flow control
IPC: Fix flow control getting stuck.
IPC: Port the remaining libs to use libqb IPC
IPC: remove libqb flowcontrol API
TEST: put cpg_dispatch() in its own thread
IPC: cleanup ipc_glue.c name everything cs_ipcs_*()
IPC: add back statistics
IPC: remove coroipcc_ symbols from lib*.versions
IPC: init each se's IPC as it is loaded.
IPC: use the new connection_closed() event to free the context.
IPC: re-add zero copy functionality back
IPC: remove cpg_mcast_joined_async() and make it the default
 -> now cpg_mcast_joined() == cpg_mcast_joined_async()
libqb: expose a libqb error converter
libqb: add missing error conversions
libqb: remove repeat try loop in lib/cpg.c
CPG: fix zero copy mcast
CPG: use newer return codes
Add ENOTCONN to qb_to_cs_error()
libqb: fix error conversion from errno to cs_error_t in confdb
libqb: change errno_to_cs to qb_to_cs_error
libqb: add a cs_strerror() to get a more meaningful message
libqb: fix some confusing error conversions.
libqb: set the timeout on recv's to -1 (wait forever)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Angus Salkeld
fce8a3c3b6 libqb: convert coropoll calls to qb_loop calls.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Jan Friesse
aa23d20125 testcpgzc: fgets buffer to really allocated size
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 11:11:28 +02:00
Russell Bryant
a609f79f1f Ensure that strings are null terminated after strncpy().
From the strcpy(3) man page, the following warning is given:
  The strncpy() function is similar, except that at most n bytes of src
  are  copied.  Warning: If there is no null byte among the first n bytes
  of src, the string placed in dest will not be null-terminated.

The current corosync code base does not take this warning into account
when using strncpy, potentially resulting in non-null terminated strings.
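
A common way to address the warning quoted above is to copy at most size - 1 bytes and then terminate explicitly. The helper below is a minimal sketch; its name is hypothetical and it is not the code from this commit.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst, which has room for dst_size bytes, and always
 * leave dst null-terminated. Input longer than dst_size - 1 bytes
 * is truncated. */
static void ex_copy_string(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return; /* no room even for the terminator */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

The explicit terminator is what matters here: strncpy alone leaves the buffer unterminated whenever src has dst_size or more bytes before its null byte.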

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-07 08:30:03 -06:00
Angus Salkeld
29755d4526 Add missing entries into .gitignore
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-12 09:42:24 +11:00
Angus Salkeld
6f098bba1d fix timersub warning on freebsd
Make them all protected by #ifndef timersub
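
A guarded fallback of the kind described might look like the sketch below. The macro body is a conventional definition written for illustration, not copied from the corosync sources, and ex_elapsed_usec is a hypothetical helper.

```c
#include <sys/time.h>

/* Only define timersub when the platform headers do not already
 * provide it, so there is no redefinition warning. */
#ifndef timersub
#define timersub(a, b, result)                               \
    do {                                                     \
        (result)->tv_sec = (a)->tv_sec - (b)->tv_sec;        \
        (result)->tv_usec = (a)->tv_usec - (b)->tv_usec;     \
        if ((result)->tv_usec < 0) {                         \
            --(result)->tv_sec;                              \
            (result)->tv_usec += 1000000;                    \
        }                                                    \
    } while (0)
#endif

/* Elapsed time between two timestamps, in microseconds. */
static long ex_elapsed_usec(const struct timeval *start,
                            const struct timeval *end)
{
    struct timeval diff;

    timersub(end, start, &diff);
    return (long)diff.tv_sec * 1000000L + (long)diff.tv_usec;
}
```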

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-12 09:42:24 +11:00
Angus Salkeld
5e43f750e1 Add -i <num-iterations> to cpgverify
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-10-21 17:27:40 -07:00
Angus Salkeld
f0104b6d31 Add .gitignore files.
Otherwise "git status" is a pain.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2010-10-21 07:43:46 -07:00
Angus Salkeld
83b24b660b WD/SAM integration.
- timestamps -> uint64_t and in nanosecs
- use clock_gettime
- common object naming
- common state names
- timeouts in milliseconds



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3054 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 21:13:15 +00:00
Jan Friesse
1a32fc4a6c SAM Confdb integration
The patch adds support for Confdb integration with SAM. It's now possible to
use SAM_RECOVERY_POLICY_CONFDB as a flag in addition to the previous policies.

A new function, sam_mark_failed, is also added so that the RECOVERY
policy can be used together with confdb and give the expected results
(especially when integrated with the corosync watchdog).

Patch also makes SAM thread safe.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3050 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 07:34:21 +00:00
Angus Salkeld
a5d6c5f151 makefile: add -lquorum -lcoroipcc to sam test programs
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2905 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-21 08:53:52 +00:00
Steven Dake
0e9f0bfeb4 Make cpg_membership_get() functional.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2855 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-19 05:03:52 +00:00
Angus Salkeld
4ff33854ad add __attribute__((noreturn)) to functions that always exit.
we had some __attribute__((__noreturn__))
and some    __attribute__((noreturn))

I made them all: __attribute__((noreturn))
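
For reference, the chosen spelling is used like this with GCC-compatible compilers. The functions below are hypothetical examples, not code from the corosync tree.

```c
#include <stdio.h>
#include <stdlib.h>

/* Tell the compiler this function never returns to its caller
 * (GCC/Clang attribute syntax). */
static void ex_die(const char *msg) __attribute__((noreturn));

static void ex_die(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Because ex_die() is marked noreturn, the compiler knows control
 * only reaches the division when b is non-zero. */
static int ex_checked_div(int a, int b)
{
    if (b == 0)
        ex_die("division by zero");
    return a / b;
}
```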



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2853 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-19 04:34:53 +00:00
Jan Friesse
d5884cd714 SAM integration of quorum
Patch adds integration of SAM and quorum, so it's now possible to use
SAM_RECOVERY_POLICY_QUORUM_QUIT or SAM_RECOVERY_POLICY_QUORUM_RESTART
recovery policy. With these policies, sam_start will block until
corosync is quorate. If quorum is lost during health checking, recovery
action is taken.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2822 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-13 11:20:49 +00:00
Jan Friesse
088a2a0f17 Allow call sam_warn_signal_set after sam_register
The patch fixes the situation where a user wants to change the warn signal
after calling the sam_register function. This was not possible, because the
parent process never got the new value from the child.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2821 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-13 11:19:52 +00:00
Steven Dake
5e8f1a730f logsys rework to deal with memory corruption around debug:on configurations.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2816 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-13 04:36:42 +00:00
Angus Salkeld
1df8675995 testcpg: fix a format string compile warning.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2805 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-05 04:37:43 +00:00
Jan Friesse
e8b143595c CPG model_initialize and ringid + members callback
The patch adds a new function to initialize cpg, cpg_model_initialize. A model
is a set of callbacks. With this function, future additions of models
should be possible without changing the ABI.

Patch also contains callback in CPG_MODEL_V1 for notification about
Totem membership changes.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2770 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-20 12:40:48 +00:00
Jan Friesse
da6fce352b Support for store user data in SAM
Adds the ability to store user data in memory so that it survives between
instances of the process.

The bi-directional communication between child and parent that this
needs is also added.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2769 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-20 10:32:07 +00:00
Angus Salkeld
ec09a97867 Fix some "make lint" problems
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2674 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-03-03 21:52:08 +00:00
Angus Salkeld
9b5b7872b6 Correct testcpg's groupname.length
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2661 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-02-18 20:10:36 +00:00
Jan Friesse
0ed4d53083 SAM implementation merge
The SAM library provides a tool to check the health
of an application. The main purpose of SAM is to restart
a local process when it fails to respond to a healthcheck
request in a configured time interval.



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2570 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-12-07 17:06:53 +00:00
Steven Dake
692b723ae1 Remove unused variable in stress_cpgfdget.c.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2479 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-09-25 06:03:50 +00:00
Steven Dake
b130f5e434 Add test program that finds limits of cpg message size.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2389 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-08-17 18:22:51 +00:00
Steven Dake
4e3e77eb13 Always keep autogenerated node ids in totem as LE even on BE arches.
Have testcpg print out autogenerated nodeid properly on BE arch.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2377 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-08-04 00:04:13 +00:00
Jan Friesse
49136748b0 Add lcoroipcc as build requirement for stress_cpgcontext
This is needed to build without errors on a system where
lcoroipcc is not installed yet.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2363 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-07-16 12:16:06 +00:00
Steven Dake
55ac5ee8e9 add stress_cpgcontext.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2359 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-07-10 12:54:56 +00:00
Steven Dake
cdea22993f Add stress_cpgfdget test case.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2358 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-07-10 12:29:50 +00:00
Steven Dake
c9de702374 Add stress test case for cpg and coroipcc zero copy buffer alloc and free
operations.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2357 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-07-10 12:05:48 +00:00
Steven Dake
2a31caedd3 Add ring id field to evs.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2341 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-07-01 20:57:37 +00:00
Steven Dake
453ef211c1 Pass handle in evs callback functions.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2339 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-07-01 20:21:48 +00:00
Steven Dake
3e0ab804cb Code cleanup for evs service from Wojtek.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2333 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-07-01 19:19:06 +00:00
Steven Dake
1402dac1ee Remove timersub redefine.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2274 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-06-20 17:36:04 +00:00
Steven Dake
b8e3951ca1 Add (void *) casts for iovector assignments to remove compile warnings.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2270 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-06-19 20:43:12 +00:00
Steven Dake
2d7937de26 Warn user of missing dirs and exit gracefully.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2262 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-06-19 01:53:24 +00:00
Fabio M. Di Nitto
b31f150b5f logsys: remove leftover files from running tests
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2252 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-06-18 05:39:28 +00:00
Fabio M. Di Nitto
6d5ce092a1 logsys: port to new packed rec_ident version
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2250 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-06-18 05:32:56 +00:00
Fabio M. Di Nitto
5597a2381f logsys: merge tags into rec_ident
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2246 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-06-18 05:15:10 +00:00
Jan Friesse
8605bbc7b2 Fix coroipcc linking
Fixes rhbz#499918
	
Functions from the ckpt library (like saCkptCheckpointOpen,
saCkptSectionIterationInitialize, ...) internally use the corosync functions
reply_receive, reply_receive_in_buf, ... These functions are included in the
coroipcc.c source file and use the global static variable ipc_hdb.

Without the patch, coroipcc is linked into a shared library (libcoroipcc.so) AND linked
into every corosync library (like cpg, ....), so the global variable ipc_hdb is
included not only in libcoroipcc.so, but also in libcpg.so, ...

dlm_controld has function retrieve_plocks, and whole binary is linked with
libcoroipcc and libcpg. So ipc_hdb is included TWICE (so has TWO addresses).

The main problem causing the bug was that reply_receive uses the address from one
library, and reply_receive_in_buf uses the other. This confuses the check in the hdb_get
function.

After removing the linking of coroipcc.o into cpg and using the dynamic version instead
(which means there is only one instance of ipc_hdb), the problem disappeared.



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2203 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-06-01 08:46:10 +00:00
Jim Meyering
1f40a10983 don't include <signal.h> when it's not used
* exec/coroparse.c: Likewise.
* exec/quorum.c: Likewise.
* exec/sync.c: Likewise.
* exec/totemmrp.c: Likewise.
* exec/totemnet.c: Likewise.
* exec/totemrrp.c: Likewise.
* exec/totemsrp.c: Likewise.
* exec/vsf_quorum.c: Likewise.
* exec/vsf_ykd.c: Likewise.
* lcr/uic.c: Likewise.
* lcr/uis.c: Likewise.
* lib/cfg.c: Likewise.
* services/cfg.c: Likewise.
* services/cpg.c: Likewise.
* services/evs.c: Likewise.
* services/pload.c: Likewise.
* services/testquorum.c: Likewise.
* services/votequorum.c: Likewise.
* test/testconfdb.c: Likewise.
* test/testcpg.c: Likewise.
* test/testcpgzc.c: Likewise.
* test/testzcgc.c: Likewise.
* tools/corosync-cfgtool.c: Likewise.
* tools/corosync-objctl.c: Likewise.
* tools/corosync-pload.c: Likewise.

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2193 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-05-18 16:41:46 +00:00
Jim Meyering
d44aad2eea don't include <assert.h> when it's not used
* exec/apidef.c: Likewise.
* exec/mainconfig.c: Likewise.
* exec/service.c: Likewise.
* exec/timer.c: Likewise.
* exec/totemconfig.c: Likewise.
* exec/totemmrp.c: Likewise.
* exec/vsf_quorum.c: Likewise.
* services/testquorum.c: Likewise.
* test/cpgbench.c: Likewise.
* test/cpgbenchzc.c: Likewise.
* tools/corosync-fplay.c: Likewise.

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2192 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-05-18 16:41:37 +00:00
Jim Meyering
84f1fbb53f always include <config.h> before any other file
* test/cpgbench.c: Include <config.h> before any other file.
* test/cpgbenchzc.c: Ditto.

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2191 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-05-18 16:41:28 +00:00
Jim Meyering
621d734e7c avoid 'incompatible pointer type' compiler warning
* test/cpgverify.c (cpg_deliver_fn): Remove const.

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2168 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-05-04 12:37:27 +00:00
Steven Dake
2274c6c961 Add cpgverify program to test directory.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2162 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-28 06:45:26 +00:00
Steven Dake
d72f0cc03c Remove const from delivery callback to allow in-place endian changes of
message contents.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2150 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-26 02:38:46 +00:00
Steven Dake
65f8490350 use uint64_t for hdb_handle_t type and also specify some formatting
strings for printing handles out of the handle database.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2126 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-23 10:03:01 +00:00
Jim Meyering
904a10ed38 remove all trailing blanks
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2117 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-22 08:03:55 +00:00
Steven Dake
75c4bc0d71 Zero copy feature for IPC transmits. Also integrated into CPG library
service.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2114 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-21 23:37:49 +00:00
Fabio M. Di Nitto
c3c75acfd2 Add logsys v3
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2091 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-20 04:28:33 +00:00
Steven Dake
d830a52db5 Remove declaration of data struct inside code.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2085 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-18 07:43:33 +00:00
Steven Dake
832b6cb7e6 Remove warning in evsbench.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2084 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-18 07:42:25 +00:00
Steven Dake
043de4d80a check result of fgets in testcpg.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2082 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-18 07:25:04 +00:00
Steven Dake
0969721db3 Rework how dispatch functions so service engines work properly.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2079 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-18 07:06:14 +00:00
Jim Meyering
e5962b419d testvotequorum1.c: don't shadow file-scoped global, "handle"
* test/testvotequorum1.c (main): Rename: s/handle/g_handle/.

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2073 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-15 19:12:56 +00:00
Jim Meyering
7d457e121b don't shadow file-scoped global, "handle"
* test/testquorum.c: Rename: s/handle/g_handle/.

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2069 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-15 19:12:25 +00:00
Jim Meyering
09eeca1299 testevsth.c: const+size_t: evs_deliver_fn, evs_confchg_fn
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2046 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-09 12:32:49 +00:00
Jim Meyering
3df307a36d sync the rest of the code with previous header changes
* exec/coroipcs.c (coroipcs_response_send)
(coroipcs_dispatch_send):
* exec/coroipcs.h (handler_fn_get):
* include/corosync/cpg.h (cpg_deliver_fn_t, cpg_confchg_fn_t):
* test/cpgbench.c (cpg_bm_confchg_fn, cpg_bm_deliver_fn):
* test/testcpg.c (print_cpgname, DeliverCallback)
(ConfchgCallback):
* test/testcpg2.c (deliver, confch):

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2044 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-08 17:29:53 +00:00
Jim Meyering
00db317b82 sync the rest of the code with previous header changes
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2042 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-08 17:29:37 +00:00
Jim Meyering
cf2c12a988 evs.h: s/int/size_t; const-correctness changes
* exec/sync.c (barrier_data_confchg_entries):
* include/corosync/evs.h (evs_deliver_fn_t, evs_confchg_fn_t):
(evs_callbacks_t):
* lib/evs.c (MIN, evs_join, evs_leave, evs_mcast_joined):
(evs_mcast_groups, evs_membership_get):
* test/evsbench.c (evs_deliver_fn, evs_confchg_fn):
* test/evsverify.c (evs_deliver_fn, evs_confchg_fn, main):
* test/testevs.c (evs_deliver_fn, evs_confchg_fn, main):

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2023 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-08 06:43:17 +00:00
Jim Meyering
6b9505992f confdb.h (confdb_reload) Add errbuf_len parameter and propagate.
* include/corosync/confdb.h (confdb_callbacks_t):
* lib/confdb.c (confdb_reload):
* lib/sa-confdb.c (confdb_sa_reload):
* lib/sa-confdb.h:
* test/testconfdb.c (main):

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2004 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-03 20:31:38 +00:00
Jim Meyering
1308223dab confdb.h: error_text vs. buflen
* lib/confdb.c (MIN): Define.
(confdb_write): Use new errbuf_len parameter.
Also note bugs (Chrissie confirms) that error_text is not
set in two error-return cases.
* test/testconfdb.c (do_write_tests): Update use of confdb_write.

git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2002 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-03 20:31:12 +00:00
Steven Dake
870046d065 Patch to use snprintf where appropriate to avoid buffer overrun.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1990 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-04-02 18:49:24 +00:00