this flag (0|1) can be configured via quorum.last_man_standing and, when
enabled, allows expected_votes to be recalculated dynamically.
Assume an 8-node cluster where every node votes 1 (a mandatory
requirement for this feature).
In the first event, 3 nodes are lost.
The remaining partition of 5 is barely quorate.
After a configurable timeout (quorum.last_man_standing_window, default 10
seconds) the quorate partition is allowed to recalculate expected_votes
based on the remaining nodes.
This operation will bring expected_votes to 5 and quorum to 3.
Repeating the above loop, 2 more nodes are allowed to die in the next
event, and so on.
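For illustration, a corosync.conf stanza enabling this might look as
follows (a sketch only: the provider line is an assumption, not part of
this change, and the window value assumes the unit is milliseconds):

    quorum {
        # provider line assumed; votequorum supplies these options
        provider: corosync_votequorum
        expected_votes: 8
        last_man_standing: 1
        # 10 seconds, assuming the window is given in milliseconds
        last_man_standing_window: 10000
    }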
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
this is a very old leftover from the RHEL5 timeframe, not used in RHEL6.
Also change the votequorum soname, since this change implies an ABI change.
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
this is necessary to reset the wait_for_all data when a node is joining a cluster
that has already seen all nodes once.
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
this flag (0|1) can be configured via quorum.auto_tie_breaker and, when
enabled, turns on support for surviving a perfect even split.
If 50% of the votes are lost in one single transition, the partition
containing the node with the lowest node id will remain quorate.
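For illustration, a minimal stanza enabling the tie breaker might look as
follows (a sketch only; the provider line is an assumption, not part of
this change):

    quorum {
        # provider line assumed; votequorum supplies this option
        provider: corosync_votequorum
        auto_tie_breaker: 1
    }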
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
this flag (0|1) can be configured via quorum.wait_for_all and changes
behavior when granting quorum for the first time.
Normal behavior (default / 0) grants quorum as soon as enough nodes
are available in a cluster.
Setting this value to 1 will grant quorum only after all cluster
members are part of the cluster at the same time.
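For illustration, a minimal stanza enabling this behavior might look as
follows (a sketch only; the provider line is an assumption, not part of
this change):

    quorum {
        # provider line assumed; votequorum supplies this option
        provider: corosync_votequorum
        wait_for_all: 1
    }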
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
corosync's internal theory of operation is that, without a quorum provider,
the cluster is always quorate. This is fine for membership-free clusters,
but it does pose a problem for applications that need membership and
"real" quorum.
this change adds a quorum_type argument to the quorum_initialize call,
returning QUORUM_FREE or QUORUM_SET. Applications can then make their own
decisions to error out or continue operating.
The only other way to know whether a quorum provider is enabled/configured
is to poke at confdb/objdb, but that adds an unnecessary burden to
applications that really shouldn't need an entire library just to read a
boolean value.
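A minimal sketch of how an application might use the new argument (error
handling trimmed; passing NULL callbacks is assumed to be acceptable when
no notifications are wanted):

    #include <stdio.h>
    #include <corosync/corotypes.h>
    #include <corosync/quorum.h>

    int main(void)
    {
        quorum_handle_t handle;
        uint32_t quorum_type = 0;
        cs_error_t err;

        err = quorum_initialize(&handle, NULL, &quorum_type);
        if (err != CS_OK) {
            fprintf(stderr, "quorum_initialize failed: %d\n", err);
            return 1;
        }
        if (quorum_type == QUORUM_FREE) {
            /* No quorum provider configured: corosync will always
             * report the cluster as quorate. An application that
             * needs real quorum can bail out here. */
            fprintf(stderr, "no quorum provider configured\n");
            quorum_finalize(handle);
            return 1;
        }
        quorum_finalize(handle);
        return 0;
    }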
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Our preferred shared logging system is exported via the libqb library. As
a result, the corosync project no longer needs to export logsys.so and the
code can be directly included in the binary. The header file can also be
removed.
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
switch from LOG_INFO to LOG_DEBUG for some basic operations
Reviewed-by: Steven Dake <sdake@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Included are the following parts:
- XSLT template with the actual conversion
- a simple wrapper on top of xsltproc, called corosync-xmlproc
- an example XML file
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
corosync-cmapctl is a direct replacement for corosync-objctl
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
The cmap service is the application developer's interface to icmap and
a direct replacement for confdb.
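A minimal sketch of reading one value through cmap (totem.config_version
is just an example key and may not be set; any icmap key is read the same
way):

    #include <stdio.h>
    #include <stdint.h>
    #include <corosync/cmap.h>

    int main(void)
    {
        cmap_handle_t handle;
        uint64_t config_version = 0;

        if (cmap_initialize(&handle) != CS_OK) {
            fprintf(stderr, "cmap_initialize failed\n");
            return 1;
        }
        /* Returns an error if the key does not exist; ignored here. */
        if (cmap_get_uint64(handle, "totem.config_version",
                            &config_version) == CS_OK) {
            printf("totem.config_version = %llu\n",
                   (unsigned long long)config_version);
        }
        cmap_finalize(handle);
        return 0;
    }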
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Icmap is a replacement for objdb, based on the libqb map (trie).
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Currently:
- send_reserve() adds to the reserve
- msg_count_send_ok() tests ((avail - totempg_reserved) > msg_count)
So essentially we are checking to see if 2 * msg_count can fit in
the q.
So instead I am using byte_count_send_ok(size) to see if the message
will fit, and then calling send_reserve().
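A toy model of the difference (illustrative names and numbers only, not
the real totempg symbols):

    #include <stdbool.h>
    #include <stdio.h>

    static int avail = 100;   /* free space in the toy queue */
    static int reserved = 0;  /* outstanding reservations    */

    /* Old scheme: reserve first, then test against the space left,
     * which effectively demands room for 2 * msg_count. */
    static bool old_send_ok(int msg_count)
    {
        reserved += msg_count;                  /* send_reserve() first */
        return (avail - reserved) > msg_count;  /* then the 2x check    */
    }

    /* New scheme: check that the message fits, then reserve once. */
    static bool new_send_ok(int msg_count)
    {
        if (avail - reserved <= msg_count)      /* byte_count_send_ok() */
            return false;
        reserved += msg_count;                  /* send_reserve()       */
        return true;
    }

    int main(void)
    {
        reserved = 0;
        printf("old, 60 into 100 free: %s\n", old_send_ok(60) ? "ok" : "blocked");
        reserved = 0;
        printf("new, 60 into 100 free: %s\n", new_send_ok(60) ? "ok" : "blocked");
        return 0;
    }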
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
To allow async cpg messages of 1M we need to:
1) increase the totem queue size 4 times
2) align the critical level to leave one large message free
There are a number of reasons for doing this:
We can't let cpg_mcast_joined() fail, because the user will not see it
and will assume it has succeeded.
The reason we are getting good performance is by providing a negative
feedback loop from the totem q to the IPC/poll system. This relies
on 4 q states: low/med/high/crit. With messages of size 1M you
now have a q of size one, so you go from level low to crit instantly
and back to low as messages are put on and taken off. I don't think
this is the best behaviour. Having a q size of 4 allows the system
to utilize the q better and gives us time to respond to changes in
the q level.
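A toy illustration of that level argument (the thresholds are invented;
the real levels live in the totem code):

    #include <stdio.h>

    enum q_level { Q_LOW, Q_MED, Q_HIGH, Q_CRIT };

    /* Map queue fullness to one of the four levels. */
    static enum q_level level(int used, int capacity)
    {
        int pct = used * 100 / capacity;
        if (pct >= 90) return Q_CRIT;
        if (pct >= 60) return Q_HIGH;
        if (pct >= 30) return Q_MED;
        return Q_LOW;
    }

    int main(void)
    {
        static const char *name[] = { "low", "med", "high", "crit" };

        /* Room for one 1M message: the level jumps low -> crit. */
        printf("capacity 1: %s -> %s\n", name[level(0, 1)], name[level(1, 1)]);

        /* Room for four: the intermediate levels become usable. */
        for (int used = 0; used <= 4; used++)
            printf("capacity 4, used %d: %s\n", used, name[level(used, 4)]);
        return 0;
    }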
Achieving effective flow control with a q of size 1 would require
all the clients to request space on the q, as is done in
totempg_groups_joined_reserve(), but probably in shared memory.
This would take quite a bit of re-work.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>