Commit Graph

560 Commits

Author SHA1 Message Date
Jan Friesse
b5f7dcefeb cfg: remove crypto_set
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 12:23:10 +01:00
Fabio M. Di Nitto
20a5289074 drop evs service
there are several reasons for this:

1) evs is only partially implemented with no plans to complete it

typedef enum {
       EVS_TYPE_UNORDERED, /* not implemented */
       EVS_TYPE_FIFO,          /* same as agreed */
       EVS_TYPE_AGREED,
       EVS_TYPE_SAFE           /* not implemented */
} evs_guarantee_t;

2) evs has no users in any upstream distribution and no search
   engine can find any other upstream using it.

3) the only reason (I was told) to carry around evs was that evs
   receives the full ring_id struct from totem. This is only
   partially correct because while the structures are prepared
   to carry around those data, they are never transmitted from
   corosync engine down the IPC line to the user.
   CPG ring_id contains the exact same information and it's
   actually less buggy (due to prototying of the info).

worst case scenario where a user really absolutely need libevs,
it can be easily reimplemented as libcpg wrapper and avoid
lots of code duplication.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 15:51:50 +01:00
Fabio M. Di Nitto
eb3d49ef7d pload: make it a test service and not a public one
pload is a performance benchmark that measures the onwire
speed of corosync.

problem is that once pload has been executed, the cluster
is basically dead.

turn pload into a test tool, by removing corosync-pload tool
and user library.

cleanup pload code to make it more readable and drop lots
of unnecessary stuff.

add test/ploadstart tool that can configure and start pload
via cmap calls.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:11:51 +01:00
Fabio M. Di Nitto
03c76be696 votequorum: fix votequorum_getinfo man page and align struct name
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-02-27 12:41:04 +01:00
Fabio M. Di Nitto
cb5fd77501 votequorum: major rework to fix qdevice API and integration with core
qdevice is a very special node in the cluster and it adds a certain
amount of complexity and special cases across the code.

most of the qdevice data are shared across the cluster (name/votes)
but effectively each node has a different view of the qdevice
(registered/unregistered/voting/etc.)

with this change, we align the qdevice view across the node,
exchanging more data between nodes and we fix how qdevice behaves
and it is configured.

The only side effect is that the amount of data transmitted on wire
is slightly higher.

The qdevice API is still disabled by default. This means that
the amount of real changes in current code are a lot smaller
than it appears by this patch.

TODO: documentation/man pages needs to be updated once
      this change is in (and behavior finalized).

User visible changes:

- configuration (coroparse, exec/votequorum):
  the quorum device section is now standalone within the quorum.

  quorum {
    provider: corosync_votequorum
    device {
      model: (name)
      timeout: (millisec)
      votes:
    }
  }

  the keyword "model:" is mandatory to enable qdevice in configuration
  and should express the name of the script/daemon that will provide
  the qdevice. Looking into the future, an init script or systemd
  service will look for that name in /path/to/be/decided/name
  and start/stop qdevice.

  timeout: defines the maximum interval the qdevice implementation
  has available between poll (see votequorum_qdevice_poll.3) before
  the device is considered dead and votes discarded

  votes: is now a configuration parameter and not an API call.
  quorum devices don't care what they need to vote.
  votes is autocalculated when a nodelist is available and all
  nodes in the list vote 1. Otherwise this parameter is mandatory.

- configuration (exec/votequorum):
  startup and runtime configuration changes have been improved.
  errors at startup are considered fatal. errors at runtime
  have different exit paths.

  startup:

  * quorum.two_node and qdevice are incompatible.
  * quorum.expected_votes requires quorum.device.votes.
  * quorum.expected_votes - quorum.device.votes cannot be lower
    than 2.
  * qdevice and last_man_standing are mutually exclusive.
  * qdevice and auto_tie_breaker are mutually exclusive.

  runtime config changes:

  * quorum.two_node and qdevice are incompatible:
    if quorum device is alive, two_node is disabled.
    if quorum device is not alive and node count is 2, two_node is
       enabled, and quorum device cannot be registered

  * if either last_man_standing or auto_tie_breaker were enabled
    at startup, and at runtime quorum device is configured,
    quorum device registration will be blocked.

  * if quorum.expected_votes is configured but not quorum.device.votes,
    quorum device registration will be blocked.

  * if quorum.device.votes is not configured and we cannot
    automatically calculate it, quorum device registration will be blocked.

  * An error in configuring quorum.expected_votes and quorum.device.votes
    will block quorum device registration.

blocking quorum device registation, also means dropping the votes.

quorum.device.votes (either set or automatically calculated) is now
used to determine current expected_votes in the cluster.

- logging (exec/votequorum):

  all errors from configuration are treated as WARNING/CRITICAL.

  lots of extra DEBUG output is added (see internal changes too).

- corosync-quorumtool (tools/corosync-quorumtool):

  * added option to forcefully kick out a quorum device from the local
    node. This is for emergency recovery only and it is only
    available when qdevice API is built-in.

  * Improved status output, specifically add node state and qdevice
    information

[root@fedora-master-node2 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          132
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
   0     1  QDEVICE (Voting)

  * allow to print status for any node in the cluster known to
    local node.

[root@fedora-master-node1 coro]# corosync-quorumtool -s
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net

[root@fedora-master-node1 coro]# corosync-quorumtool -s -n 2
Version:          1.99.4.12-9c7d-dirty
Quorum type:      corosync_votequorum
Nodes:            2
Ring ID:          144
Quorate:          Yes
Node votes:       1
Node state:       Member
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice
Nodeid     Votes  Name
   1     1  fedora-master-node1.int.fabbione.net
   2     1  fedora-master-node2.int.fabbione.net
         0     1  QDEVICE (Voting)

Internal changes:

- change qdevice timer to not run all time, but only when necessary.
- change votequorum_nodeinfo on wire data to use flags instead of uint8_t
  and add QDEVICE status.
- allocate nodeid 0 to qdevice since it's the only real
  nodeid that be reserved.
- change send_nodeinfo to allow to send nodeinfo for any node
  so that we can share qdevice info across the cluster
  (and this might be useful in future if we need to sync
   internal cluster view).
- add votequorum api call to update qdevice name
- add runtime data if quorum device has been forcefully disabled
  by config error
- add qdevice votes to expected_votes calculation (this
  is probably the biggest difference vs cman)
- change votequorum_read_nodelist_configuration so that
  we can autocalculate votes for qdevice (we need the nodecount
  vs votes).
- add all checks for startup/runtime config (see above).
- do not make qdevice part of the membership_list received from
  totem. None of our users care about it and it is not a real node.
- change onwire message handlers to deal with "data for this node from any node"
  case and undersand nodeid 0 for qdevice info
- always allocate qdevice at startup. this simplifies code a lot.
- dispatch qdevice nodeinfo on membership changes.
- inform libvotequorum users when a qdevice is registered
- improve substantially qdevice api and add a simple
  barrier based on qdevice name.
- add qdevice API barrier at cluster level. This feature allow
  only one qdevice name to be active in the cluster at any time.
- qdevice getinfo can now report status for qdevice on any node.
- change slightly the way the qdevice API is built-in/out:
  only the libvotequorum calls are #ifdef'out now. Doing so in
  the core is too complex and would make the code unreadable
  with the risk of missing a bit or two effectively introducing
  an on-wire incompatibility if we will ever turn the API on.
- probably added some bugs on the way...

TODO: update qdevice_* API once the above is settled and test
      qdevice integration with other features.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com> (only second part)
2012-02-27 09:30:26 +01:00
Angus Salkeld
40727bd6a3 Convert the common lib into a shared lib.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-02-21 20:26:08 +11:00
Jan Friesse
8cde53aa99 cmap: Handle NULL in [i]cmap_set_string value
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-16 10:47:57 +01:00
Angus Salkeld
aa5de3e2f4 CPG: fix membership_get()
1) remove BUSY loop from membership get
   Note only cpg_join and cpg_leave ever set the
   BUSY error code.
2) set the size correctly
3) copy the name in correctly

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-14 11:10:14 +11:00
Steven Dake
03c33697f2 Update copyright dates in util directory
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-02-13 17:05:04 -07:00
Steven Dake
815375411e Remove unused or unimplemented CFG apis
Remove:
cfg_statetrack
cfg_statetrackstop
cfg_administrativestateste
cfg_administrativestateget
cfg_serviceload
cfg_serviceunload

Rev SO to 5.0.0

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-02-13 17:04:49 -07:00
Angus Salkeld
6cd576b0f5 move hdb_error_to_cs to common_lib
Note the previous inconsistent implementation.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 10:45:56 +11:00
Angus Salkeld
da483b8121 Add a common library that can be shared between libs and corosync
We have always had this problem and worked around it by coping code
or using inline functions. Both not good IMO.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-09 10:45:56 +11:00
Jan Friesse
9260efdf47 Add CS_DISPATCH_ONE_NONBLOCKING dispatch type
Add missing option for dispatch, which fills gap in combination of
block/nonblock and one/all dispatch types. New type doesn't mask
CS_ERR_TRY_AGAIN, and it means "no message was processed".

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-08 16:03:46 +01:00
Fabio M. Di Nitto
62bbe076a8 corotype: drop deprecated CPG_ defines
the only user of those obsoleted defines is dlm master (already ported)
to use CS_ and cmirror (that needs full porting to new corosync either way).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-08 13:37:46 +01:00
Fabio M. Di Nitto
4120a2c1cb corotypes: drop deprecated EVS_ defines
none of our current dependencies use it.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-08 13:37:46 +01:00
Fabio M. Di Nitto
811c536653 votequorum: fix possible string overflow (-1) in qdevice_register
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-01-31 10:15:47 +01:00
Fabio M. Di Nitto
ccd36af00e votequorum: rename qdisk to qdevice
a quorum device is not necessarely a disk and this also aligns
various names to be generic

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-By: Christine Caulfield <ccaulfie@redhat.com>
2012-01-27 11:17:02 +01:00
Fabio M. Di Nitto
2cd6ad9922 votequorum: ifdef qdiskd API out
as agreed, the API has not been tested yet. Adding later is better than
removing it.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-18 14:23:06 +01:00
Steven Dake
75bc06d916 Remove lcr directory, files, and references since it is no longer needed
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-01-16 09:30:40 -07:00
Steven Dake
08b635f8da Move cs_error into global header so that third party applications can use it
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Andrew Beekhof <abeekhof@redhat.com>
2012-01-16 07:32:40 -07:00
Fabio M. Di Nitto
23ea4f0f11 votequorum: drop votequorum_leave
this was a compatibility function for cman_tool only.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-13 09:25:47 +01:00
Fabio M. Di Nitto
f464038b17 votequorum: drop HASSTATE/SETSTATE
this is a leftover from killing DISALLOWED

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-13 09:25:47 +01:00
Steven Dake
7c8e83ac34 Change all ais references to corosync
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-01-12 07:29:15 -07:00
Fabio M. Di Nitto
9589611dc4 votequorum: drop concept of DISALLOWED
this is a very old leftover from the RHEL5 timeframe, not used in RHEL6.

Also change votequorum soname since this change implies an ABI change.

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-01-10 15:48:10 +01:00
Fabio M. Di Nitto
4306101510 quorum: bump soname for libquorum to reflect API change
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-01-10 15:48:07 +01:00
Fabio M. Di Nitto
e34d509df7 quorum: change API to return quorum type at initialization time
corosync internal theory of operation is that without a quorum provider
the cluster is always quorate. This is fine for membership free clusters
but it does pose a problem for applications that need membership and
"real" quorum.

this change add quorum_type to quorum_initialize call to return QUORUM_FREE
or QUORUM_SET. Applications can then make their own decisions to error out
or continue operating.

The only other way to know if a quorum provider is enabled/configured is
to poke at confdb/objdb, but adds an unnecessary burden to applications
that really don't need to use an entire library for a boolean value.

Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-01-10 15:47:24 +01:00
Angus Salkeld
9e36255b8e IPC: don't block forever on a recv msg as corosync might be gone.
This at least will not make the client hang forever.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-10 08:32:31 +11:00
Jan Friesse
7c250a5147 Remove objdb and confdb
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-12-15 09:19:18 +01:00
Jan Friesse
120531cddb Move SAM to use CMAP service
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-12-15 09:19:18 +01:00
Jan Friesse
8a45e2b152 Move corosync core to use icmap
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-12-15 09:19:17 +01:00
Jan Friesse
b3c99977de Add user library to use cmap service
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-12-15 09:19:17 +01:00
Angus Salkeld
a748700cde Be more flexible (correct) with flowcontrol.
Many functions do not require flowcontrol and are two-way
so they can get failures from corosync.
Only cpg_mcast_joined() _really_ needs the current level
of flowcontrol.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-12-14 12:03:42 +11:00
Jan Friesse
e5952176d6 hdb* functions already returns -error value
So it's wrong to define hdb_error_to_cs and pass -error value, because
this creates --error = error = CS_OK.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-12-01 08:52:32 +01:00
Steven Dake
73a0adf10e Correct typing in memory_map function in lib/cpg.c
Signed-off-by: Steven Dake <sdake@redhat.com>
2011-11-26 08:50:25 -07:00
Steven Dake
b793135834 Remove default from cpg_model_initialize - atm there is only one model
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-10-21 03:01:07 -07:00
Steven Dake
3ad0979dc1 Remove dead code in evs service
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-10-21 03:01:01 -07:00
Steven Dake
589da8f0e1 Remove dead code in votequorum
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-10-21 03:00:41 -07:00
Steven Dake
d05ddc0342 Remove dead code in cfg.c
Signed-off-by: Steven Dake <sdake@redhat.com>
2011-10-21 02:15:39 -07:00
Angus Salkeld
a716f13bf9 Fix some compiler warnings
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:16 +10:00
Angus Salkeld
af29d5bde3 Use PATH_MAX for file path size
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:16 +10:00
Angus Salkeld
f717bc60e1 libqb: make timer api a wrapper around qb_loop timers.
- change timeout value to nano seconds
- fix timer handles (don't alloc on stack)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Angus Salkeld
c6895faa05 libqb: change ipc -> qb_ipc
IPC: return 0/-ENOBUFS from message handler
IPC: use the new rate_limit API to improve perf.
CPG: add send_async API & hook up flow control
IPC: Fix flow control getting stuck.
IPC: Port the remaining libs to use libqb IPC
IPC: remove libqb flowcontrol API
TEST: put cpg_dispatch() in it's own thread
IPC: cleanup ipc_glue.c name everything cs_ipcs_*()
IPC: add back statistics
IPC: remove coroipcc_ symbols from lib*.versions
IPC: init each se's IPC as it is loaded.
IPC: use the new connection_closed() event to free the context.
IPC: re-add zero copy functionality back
IPC: remove cpg_mcast_joined_async() and make it the default
 -> now cpg_mcast_joined() == cpg_mcast_joined_async()
libqb: expose a libqb error converter
libqb: add missing error conversions
libqb: remove repeat try loop in lib/cpg.c
CPG: fix zero copy mcast
CPG: use newer return codes
Add ENOTCONN to qb_to_cs_error()
libqb: fix error conversion from errno to cs_error_t in confdb
libqb: change errno_to_cs to qb_to_cs_error
libqb: add a cs_strerror() to get a more meaningful message
libqb: fix some confusing error conversions.
libqb: set the timeout on recv's to -1 (wait forever)

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-08-09 10:37:14 +10:00
Jan Friesse
94d934e0e0 coroipcc: Test _SC_PAGESIZE result
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-29 15:17:49 +02:00
Jan Friesse
2e5dc5f322 coroipcc: check recvmsg result in socket_recv
According specification recvmsg can return 0, which means that
connection is closed. We had this check, but limited only for systems
other then Linux. recvmsg can return 0 even on Linux, so check is now
applied on all systems.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-06-10 12:33:19 +02:00
Jan Friesse
0273c54054 coroipcc: proper path size in coroipcc_zcb_alloc
memory_map function internally limits maximum path size to
PATH_MAX but coroipcc_zcb_alloc passed smaller buffer.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 10:57:42 +02:00
Jan Friesse
6af98e79ee libquorum: memset/memcpy proper size of callbacks
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-06-03 10:57:18 +02:00
Jerome Flesch
795aa5e24c coroipcc_dispatch_get(): Fix --enable-small-memory-footprint support
Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2011-05-27 13:42:42 +02:00
Jerome Flesch
76426d7901 coroipcc: Fix unhandled BSD EOF in coroipcc_dispatch_get()
Signed-off-by: Jerome Flesch <jerome.flesch@netasq.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-05-27 13:35:02 +02:00
Steven Dake
6a752ba1b1 Align ipc on 8 byte boundaries
Align all ipc messages on 8 byte boundaries.  This alignment will remove bus
errors on systems that can't access non-byte aligned data and should improve
performance.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2011-04-14 17:25:08 -07:00
Jan Friesse
033f7ced10 cfg_get_node_addrs: Return correct addresses
Zero element array behavior is very different from normal array or
pointer. This behavior is root of problem in not returning correctly
filled array of addresses. This appeared only in rrp mode, where more
then one address is returned.

All memcpy's are now correctly converted to copy pointer to char.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-24 17:42:08 +01:00
Angus Salkeld
0ad2494ae7 Fix some "set but not used" warnings [-Wunused-but-set-variable]
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-16 07:13:42 +11:00
Russell Bryant
e5456008d0 Resolve a couple of doxygen warnings.
This resolves a couple of doxygen warnings.  First, the group needed a
name.  Second, all of the functions in the file were added to the group
but doxygen complained about the lack of an end to the grouping.

Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-03-07 08:39:58 -06:00
Angus Salkeld
89e4c1c048 CONFDB: add confdb_object_name_get()
This is useful when tracking object changes.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Seven Dake <sdake@redhat.com>
2011-02-04 09:47:15 -07:00
Angus Salkeld
e0cce2c907 CPG: make sure coroipcc_service_disconnect() is always called.
This prevents a shared mem leak if corosync dies while clients
are connected.

Calling cpg_finalize() did not release the shared mem as
coroipcc_msg_send_reply_receive() returned an error and
thus coroipcc_service_disconnect() did not get called.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2011-01-03 21:29:01 +11:00
Angus Salkeld
83b24b660b WD/SAM integration.
- timestamps -> uint64_t and in nanosecs
- use clock_gettime
- common object naming
- common state names
- timeouts in milliseconds



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3054 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 21:13:15 +00:00
Jan Friesse
1a32fc4a6c SAM Confdb integration
Patch add support for Confdb integration with SAM. It's now possible to
use SAM_RECOVERY_POLICY_CONFDB as flag to previous policies.
    
Also new function sam_mark_failed is added for ability to use RECOVERY
policy together with confdb and get expected results (specially with
integration with corosync watchdog)

Patch also makes SAM thread safe.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3050 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-27 07:34:21 +00:00
Steven Dake
4ac55e52e4 Patch from Kacper Kowalik to support honoring user defined LDFLAGS.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3042 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-09-14 18:10:12 +00:00
Angus Salkeld
3b320c17ae IPC: return CS_ERR_NO_RESOURCES to library when low on fds.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3029 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-25 01:13:14 +00:00
Steven Dake
5a3c285fbd Properly detect shutdown of corosync process
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3022 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-08-17 18:08:13 +00:00
Steven Dake
1135c911cd Fix problem where flow control could lock up ipc under very heavy load in very
rare circumstances.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3001 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-21 17:03:36 +00:00
Steven Dake
10fcf89591 Speed up IPC connection process.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2986 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-07-07 21:43:15 +00:00
Steven Dake
cf0d63aa3f Patch to fix stack protector sig abort that occurs when ipc buffer is too
short.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2974 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-06-29 18:15:20 +00:00
Angus Salkeld
b116eaca00 ipc: Fix error handling of mmap util functions.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2972 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-06-29 05:31:44 +00:00
Jan Friesse
1d1c3059ad coroipcc - don't loop forever on EINTR
This patch unify behaviour of SYS V semaphores and POSIX semaphores.
POSIX semaphores never return CS_ERR_TRY_AGAIN on EINTR and keeps
waiting. This was fixed for SYS V semaphores in rev. 2303.

Another change is to remove very small probability of hung forever in
coroipcc_dispatch_put.

Last change is removal of duplicate code by adding ipc_sem_wait function.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2907 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-27 08:05:31 +00:00
Steven Dake
0e9f0bfeb4 Make cpg_membership_get() functional.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2855 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-19 05:03:52 +00:00
Steven Dake
02cddf6854 Fix free of ring status information when memory allocation fails during
allocation of the ring status information.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2852 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-18 17:20:05 +00:00
Angus Salkeld
b1d84f9ecc cov 10373: check poll return value
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2844 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-16 23:35:45 +00:00
Angus Salkeld
b2a304d8f8 cov (10387, 10397): cleanup memory mapping functions
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2840 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-16 21:35:52 +00:00
Angus Salkeld
0886c03881 cov 10374: check sam_hc_send() before counter++
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2839 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-16 21:34:41 +00:00
Angus Salkeld
d146ae8ec9 cov 10396: prevent a leak under error conditions (lib/sam.c)
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2830 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-14 02:08:30 +00:00
Jan Friesse
d5884cd714 SAM integration of quorum
Patch adds integration of SAM and quorum, so it's now possible to use
SAM_RECOVERY_POLICY_QUORUM_QUIT or SAM_RECOVERY_POLICY_QUORUM_RESTART
recovery policy. With these policies, sam_start will block until
corosync is quorate. If quorum is lost during health checking, recovery
action is taken.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2822 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-13 11:20:49 +00:00
Jan Friesse
088a2a0f17 Allow call sam_warn_signal_set after sam_register
Patch fixes situation, when user want to change warn signal after
call of sam_register function. This was not possible, because parent
process never got new value from child.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2821 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-13 11:19:52 +00:00
Angus Salkeld
1cdad3104c Fix "mock --with testagents"
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2798 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-05-04 00:50:24 +00:00
Jan Friesse
450d821bdf Fix parallel build of libs in lib directory
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2790 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-28 16:00:55 +00:00
Jan Friesse
171f65578c Handle POLLNVAL in coroipcc
Old code in coroipcc doesn't handle POLLNVAL. It can happen, that some
applications (for example fenced) will stuck forever.

Also poll result is now handled more correctly.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2789 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-26 16:16:20 +00:00
Jan Friesse
e8b143595c CPG model_initialize and ringid + members callback
Patch adds new function to initialize cpg, cpg_model_initialize. Model
is set of callbacks. With this function, future addions of models
should  be possible without changing the ABI.

Patch also contains callback in CPG_MODEL_V1 for notification about
Totem membership changes.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2770 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-20 12:40:48 +00:00
Jan Friesse
da6fce352b Support for store user data in SAM
Ability to in-memory storing of user data which survives between
instances of process.

Also ability needed ability for bi-directional communication between
child and parent is added.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2769 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-20 10:32:07 +00:00
Jan Friesse
2a12dafffb Fix confdb linking (add -ldl)
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2768 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-20 10:08:15 +00:00
Steven Dake
5408399b23 Remove problem where NULL dispatch handler functions would result in lockup
of the dispatch if they were sent by a service engine.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2754 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-02 00:10:43 +00:00
Jan Friesse
a568462601 Support for user configurable warning signal
Allow developer configure a signal to be send as a warning signal
before real SIGKILL.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2753 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-01 12:51:07 +00:00
Jan Friesse
4b18364c61 Support for specific libraries version
Patch adds support for changing version number of library simply by edit
lib$(LIB).verso file.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2752 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-04-01 12:35:31 +00:00
Jrme Flesch
52acd736d0 Coroipcc: Make sure that coroipcc_service_connect() always return a valid cs_error_t
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2748 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-03-30 07:24:59 +00:00
Christine Caulfield
1baa7b2ab3 Add a reload callback to libconfdb.
This also increments the libconfdb version to 4.1.0



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2683 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-03-16 09:51:30 +00:00
Jan Friesse
009dfc090e Support for lib_cpg_finalize
Add support for MESSAGE_REQ_CPG_FINALIZE message. This will allow us
remove cpg_pd from list of active connections, and remove problem, when
cpg_finalize + cpg_initialize + cpg_join can result in CPG_ERR_EXIST
error.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2676 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-03-04 12:17:47 +00:00
Steven Dake
3dd78c7aec Fix error handling to avoid segfaults/leaks on error in coroipcc_service_connect.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2672 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-03-03 19:19:12 +00:00
Angus Salkeld
5dd31150ee convert strtok to strtok_r
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2663 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-02-25 19:25:20 +00:00
Jan Friesse
1649931691 Fix freeze of IPC library connection on sem_wait
This patch solves library waiting on sem_wait. It doesn't
solve all other problems, which can make corosync not
to exit (malloc race, global lock deadlock, ...)

RHBZ#547511


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2643 fd59a12c-fef9-0310-b244-a6a79926bd2f
2010-01-14 08:39:06 +00:00
Jan Friesse
0ed4d53083 SAM implementation merge
The SAM library provide a tool to check the health
of an application. The main purpose of SAM is to restart
a local process when it fails to respond to a healthcheck
request in a configured time interval.



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2570 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-12-07 17:06:53 +00:00
Angus Salkeld
663f894498 make sure key_names past from confdb are null terminated.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2561 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-12-02 22:14:00 +00:00
Angus Salkeld
ce8046f353 coroipcs: add logging for flow control state changes.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2551 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-11-29 18:25:51 +00:00
Angus Salkeld
3848fc2069 COVERITY 4: remove dead code in XYZ_dispatch().
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2549 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-11-23 00:32:31 +00:00
Angus Salkeld
870e4549df COVERITY 11: remove dead code from cpg_iteration_next().
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2545 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-11-22 06:29:46 +00:00
Angus Salkeld
73b7aa19bb Add value types to objdb keys.
This allows you to create a key with a know type.
And then get the type with the key value.



git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2511 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-10-10 03:20:38 +00:00
Steven Dake
45d68cab47 Fix merge error in previous commit.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2472 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-09-21 20:08:38 +00:00
Steven Dake
b3f661d295 Patch from Jerome to fix segfault in dispatch functions.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2468 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-09-21 19:03:16 +00:00
Steven Dake
77e63f0044 Remove warning in coroipcc.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2462 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-09-20 06:53:10 +00:00
Steven Dake
f170a431ce Fix error with revision 2415.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2416 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-09-14 20:27:45 +00:00
Steven Dake
c14c130df5 Fix dispatch returning TRY_AGAIN when using DISPATCH_ALL parameter because of
regression caused by revision 2046:lib/coroipcc.c.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2415 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-09-14 18:58:08 +00:00
Angus Salkeld
b13a7aba13 add small-memory-footprint option to configure
This adds the following option to configure:
--enable-small-memory-footprint

When enabled the following defines are set
to reduce the overall memory footprint.

MESSAGE_SIZE_MAX=1024*64
MESSAGE_QUEUE_MAX=512
PROCESSOR_COUNT_MAX=16
IPC_REQUEST_SIZE=1024*64
IPC_RESPONSE_SIZE=1024*64
IPC_DISPATCH_SIZE=1024*64




git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2410 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-09-09 18:58:38 +00:00
Christine Caulfield
6f3fabe398 Add corosync-quorumtool
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2406 fd59a12c-fef9-0310-b244-a6a79926bd2f
2009-09-08 08:14:58 +00:00