defaults in services.c) and can load another module to do the quorum
work (eg YKD which I've made more compliant too). All the quorum code
has been removed from sync.c. quorum.c is simply a shim layer for the
coroapi; the main module is in vsf_quorum.c.
There are coroapi calls to query quorate status and also to get
notifications when it changes.
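The query and notification calls described above could be sketched roughly
as follows. This is only an illustration: every name here (quorum_state,
quorum_is_quorate, quorum_register_callback, quorum_set_quorate,
example_count_changes) is hypothetical and is not the actual coroapi.

```c
#include <stddef.h>

/* Hypothetical sketch; these names are NOT the real coroapi. */
#define MAX_QUORUM_CALLBACKS 8

typedef void (*quorum_callback_fn)(int quorate, void *context);

struct quorum_state {
    int quorate;                                  /* 1 = quorate, 0 = not */
    size_t callback_count;
    quorum_callback_fn callbacks[MAX_QUORUM_CALLBACKS];
    void *contexts[MAX_QUORUM_CALLBACKS];
};

/* Query call: returns 1 if the cluster is quorate, 0 otherwise. */
static int quorum_is_quorate(const struct quorum_state *qs)
{
    return qs->quorate;
}

/* Register for change notifications; returns 0 on success, -1 on failure. */
static int quorum_register_callback(struct quorum_state *qs,
                                    quorum_callback_fn fn, void *context)
{
    if (fn == NULL || qs->callback_count >= MAX_QUORUM_CALLBACKS) {
        return -1;
    }
    qs->callbacks[qs->callback_count] = fn;
    qs->contexts[qs->callback_count] = context;
    qs->callback_count++;
    return 0;
}

/* Called by the quorum provider; callbacks fire only when the state changes. */
static void quorum_set_quorate(struct quorum_state *qs, int quorate)
{
    size_t i;

    quorate = quorate ? 1 : 0;
    if (qs->quorate == quorate) {
        return;
    }
    qs->quorate = quorate;
    for (i = 0; i < qs->callback_count; i++) {
        qs->callbacks[i](quorate, qs->contexts[i]);
    }
}

/* Example callback: counts how many state changes were delivered. */
static void example_count_changes(int quorate, void *context)
{
    int *counter = context;

    (void)quorate;
    (*counter)++;
}
```

A service would register a callback once at startup and use the query call
wherever it needs the current state without waiting for a change.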
I've included the testquorum.lcrso module in this patch because I think
it's really helpful for testing. It sets the quorum state based on an
objdb variable, which can be set or cleared using corosync-cfgtool.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1704 fd59a12c-fef9-0310-b244-a6a79926bd2f
module doesn't provide quorum itself, merely a framework for setting and
querying it. I envisage YKD plugging into this rather than straight into
sync() eventually.
I've plugged this into the sync() routines rather than replacing them so
that quorum is itself a VSF, rather than a replacement - I'm not sure if
that is best or not. Opinions are welcome.
I've added an extra enum member to the service_handler so that we can
send IPC messages when the cluster isn't quorate. This will default to
NO (as now) but allows us to query and set quorum when we don't have it,
which is a useful feature!
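As a rough illustration of the extra service_handler member, the IPC gate
could look something like the sketch below. The enum, struct and function
names are made up for this example and do not claim to match the real
definitions; the only thing taken from the description above is that the
default is NO.

```c
#include <stddef.h>

/* Hypothetical sketch; names do not claim to match the real headers. */
enum allow_inquorate {
    ALLOW_INQUORATE_NO = 0,   /* default: zero-initialised handlers say NO */
    ALLOW_INQUORATE_YES = 1
};

struct service_handler_sketch {
    const char *name;
    enum allow_inquorate allow_inquorate;
};

/*
 * Returns 1 if an IPC message for this service may be processed,
 * 0 if it must be refused because the cluster is not quorate.
 */
static int ipc_message_permitted(const struct service_handler_sketch *svc,
                                 int cluster_quorate)
{
    if (cluster_quorate) {
        return 1;
    }
    return svc->allow_inquorate == ALLOW_INQUORATE_YES;
}
```

Because NO is the zero value, existing services that never set the member
keep their current behaviour, and only a service that opts in (such as one
used to query or set quorum) accepts messages while inquorate.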
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1674 fd59a12c-fef9-0310-b244-a6a79926bd2f
confdb subsystems.
This is useful to provide atomic counters (eg handle numbers) for
long-running (though not persistent) connections. It's not currently
possible via confdb to atomically get a new number from objdb due to the
lack of locking. Doing it via increment operations in the IPC thread
provides enough atomicity to make it useful. Fabio has already
identified a use for these calls.
It could also provide some form of basic co-operative locking mechanism
for IPC-using processes (not direct objdb calls).
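A minimal sketch of the idea, assuming all requests are serialised through
a single IPC thread so that a plain read-modify-write is atomic from the
clients' point of view. The counter_store type and counter_increment
function are hypothetical stand-ins for the objdb key and the new calls.

```c
#include <string.h>

/* Hypothetical sketch; a tiny stand-in for integer keys held in objdb. */
#define MAX_KEYS 16
#define MAX_KEY_LEN 32

struct counter_entry {
    char key[MAX_KEY_LEN];
    unsigned int value;
    int in_use;
};

struct counter_store {
    struct counter_entry entries[MAX_KEYS];
};

/*
 * Increment the named counter and return the new value via new_value.
 * A missing key is created and starts at 1. No locking is done here: the
 * assumption is that only the single IPC thread ever calls this, so no
 * other request can run between the read and the write.
 * Returns 0 on success, -1 if the key is too long or the store is full.
 */
static int counter_increment(struct counter_store *store, const char *key,
                             unsigned int *new_value)
{
    struct counter_entry *free_slot = NULL;
    size_t i;

    if (strlen(key) >= MAX_KEY_LEN) {
        return -1;
    }
    for (i = 0; i < MAX_KEYS; i++) {
        struct counter_entry *entry = &store->entries[i];

        if (entry->in_use && strcmp(entry->key, key) == 0) {
            entry->value++;
            *new_value = entry->value;
            return 0;
        }
        if (!entry->in_use && free_slot == NULL) {
            free_slot = entry;
        }
    }
    if (free_slot == NULL) {
        return -1;
    }
    strcpy(free_slot->key, key);   /* safe: length checked above */
    free_slot->in_use = 1;
    free_slot->value = 1;
    *new_value = 1;
    return 0;
}
```

Two clients asking for a new handle number through this path get distinct
values, which is the property a separate get followed by a set cannot
guarantee without locking.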
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1662 fd59a12c-fef9-0310-b244-a6a79926bd2f
Use a 2 phase "commit" operation:
1) Invoke verifyconfig that should catch the errors before the reload operation
2) Invoke reloadconfig that performs the operation and should _never_ fail
Implementation note: if step 2 fails, there is no fallback at the moment.
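The two phases above can be sketched as follows. This is an illustration
only: config_sketch, its token_timeout_ms field, and the three functions
are hypothetical and are not the real verifyconfig/reloadconfig
entry points.

```c
#include <stddef.h>

/* Hypothetical sketch; the struct and its field are placeholders. */
struct config_sketch {
    int token_timeout_ms;
};

/* Phase 1: catch every error before anything is changed. */
static int verify_config(const struct config_sketch *candidate,
                         const char **error_string)
{
    if (candidate->token_timeout_ms <= 0) {
        *error_string = "token timeout must be positive";
        return -1;
    }
    *error_string = NULL;
    return 0;
}

/* Phase 2: apply the change; written so that it has no failure path. */
static void reload_config(struct config_sketch *active,
                          const struct config_sketch *candidate)
{
    *active = *candidate;
}

/* Run both phases; the active config is untouched if verification fails. */
static int apply_config(struct config_sketch *active,
                        const struct config_sketch *candidate,
                        const char **error_string)
{
    if (verify_config(candidate, error_string) != 0) {
        return -1;
    }
    reload_config(active, candidate);
    return 0;
}
```

Keeping all validation in the first phase is what lets the second phase
avoid needing a rollback.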
Fix the IPC table for confdb:
MESSAGE_REQ_CONFDB_XPATH_EVAL_EXPRESSION = 12 was added to include/ipc_confdb.h
without an associated call. Thanks Chrissie for spotting this.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1629 fd59a12c-fef9-0310-b244-a6a79926bd2f
This call causes a complete list of active groups and their
membership lists to be sent to a callback function.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1571 fd59a12c-fef9-0310-b244-a6a79926bd2f
- AMF handles a component report of injurious health.
- AMF handles saAmfHealthcheckConfirm() SA_AIS_ERR_FAILED_OPERATION:
if a recovery is already ongoing, AMF does nothing; if no recovery is
in progress, AMF invokes the recovery action specified by the
component when the health check was started. If that
recommendation was SA_AMF_NO_RECOMMENDATION, then AMF uses the
configured recovery action for the component
(saAmfCompRecoveryOnError). If this recommendation is also
SA_AMF_NO_RECOMMENDATION, then AMF performs a component restart or
a component/SU fail over, depending on the values of
saAmfCompDisableRestart and saAmfSUFailover.
- Handling of cleanup of a component and health check response hardened.
- Time supervision of the clc-cli CLEANUP command, and checking of its
return value.
- Handle 'recommended recovery' specified by a component in an error
report. The recovery actions implemented so far are 'component restart'
and 'node fail over'.
- The attribute saAmfCompDisableRestart is now recognized, which means
that if the component specifies 'Component restart' and restart is
disabled, then the SU in which the component is contained shall fail over.
- The attribute saAmfSUFailover will not be recognized. SU will always
fail over as a single entity.
- A component can report an error on a component other than itself.
- Implementation of 'Instantiation Level' according to chapter 3.9.2 in the
AMF specification.
- Implementation of the escalation levels, component restart, SU
restart, SU fail over and Node fail over.
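One way to read the recovery selection rules listed above is sketched
below. This is an interpretation, not the actual AMF code: the enums,
the component_sketch struct and choose_recovery() are hypothetical, and
only the two implemented recommendations (component restart and node fail
over) are modelled. saAmfSUFailover is deliberately left out since it is
not recognized.

```c
/* Hypothetical sketch; these are NOT the SA Forum or openais types. */
enum recovery_sketch {
    RECOVERY_NO_RECOMMENDATION,
    RECOVERY_COMPONENT_RESTART,
    RECOVERY_NODE_FAILOVER
};

enum action_sketch {
    ACTION_COMPONENT_RESTART,
    ACTION_SU_FAILOVER,
    ACTION_NODE_FAILOVER
};

struct component_sketch {
    enum recovery_sketch recovery_on_error; /* models saAmfCompRecoveryOnError */
    int disable_restart;                    /* models saAmfCompDisableRestart */
};

/*
 * Pick the action for an error report:
 *  1. use the recovery recommended in the report, if any;
 *  2. otherwise use the component's configured recovery;
 *  3. a restart (explicit, or the fallback when nothing is recommended)
 *     becomes an SU fail over when restart is disabled.
 */
static enum action_sketch choose_recovery(const struct component_sketch *comp,
                                          enum recovery_sketch recommended)
{
    enum recovery_sketch effective = recommended;

    if (effective == RECOVERY_NO_RECOMMENDATION) {
        effective = comp->recovery_on_error;
    }
    if (effective == RECOVERY_NODE_FAILOVER) {
        return ACTION_NODE_FAILOVER;
    }
    if (comp->disable_restart) {
        return ACTION_SU_FAILOVER;
    }
    return ACTION_COMPONENT_RESTART;
}
```

The escalation levels (component restart, SU restart, SU fail over, node
fail over) would sit on top of this choice and are not modelled here.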
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1321 fd59a12c-fef9-0310-b244-a6a79926bd2f
One of type 'AMF invoked' and one of type 'component invoked'. testamf1.c
code got a bit restructured at the same time.
Changes in amf.conf to complement testamf1
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1274 fd59a12c-fef9-0310-b244-a6a79926bd2f
2- On Solaris, the SA components executed have no names.
3- When killing the testamf1 component, it makes the aisexec process
crash on both of my nodes.
4- max priority for RR on Solaris is 59.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1247 fd59a12c-fef9-0310-b244-a6a79926bd2f