There were too much false positives with passive mode rrp when high
number of messages were received.
Patch adds new configurable variable rrp_problem_count_mcast_threshold
which is by default 10 times rrp_problem_count_threshold and this is
used as threshold for multicast packets in passive mode. Variable is
unused in active mode.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed by: Steven Dake <sdake@redhat.com>
IPC: return 0/-ENOBUFS from message handler
IPC: use the new rate_limit API to improve perf.
CPG: add send_async API & hook up flow control
IPC: Fix flow control getting stuck.
IPC: Port the remaining libs to use libqb IPC
IPC: remove libqb flowcontrol API
TEST: put cpg_dispatch() in it's own thread
IPC: cleanup ipc_glue.c name everything cs_ipcs_*()
IPC: add back statistics
IPC: remove coroipcc_ symbols from lib*.versions
IPC: init each se's IPC as it is loaded.
IPC: use the new connection_closed() event to free the context.
IPC: re-add zero copy functionality back
IPC: remove cpg_mcast_joined_async() and make it the default
-> now cpg_mcast_joined() == cpg_mcast_joined_async()
libqb: expose a libqb error converter
libqb: add missing error conversions
libqb: remove repeat try loop in lib/cpg.c
CPG: fix zero copy mcast
CPG: use newer return codes
Add ENOTCONN to qb_to_cs_error()
libqb: fix error conversion from errno to cs_error_t in confdb
libqb: change errno_to_cs to qb_to_cs_error
libqb: add a cs_strerror() to get a more meaningful message
libqb: fix some confusing error conversions.
libqb: set the timeout on recv's to -1 (wait forever)
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Previous default (50) was too low for most modern switch hardware. This
may trigger abort because the aru doesn't increase for 50 token
rotations combined with a defect in how failed to recv conditions are
handled. By increasing this tunable, the condition should no longer
trigger the errant code.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
This patch automatically recovers redundant ring failures.
Please note that this patch introduced rrp_autorecovery_check_timeout
in totem config hence breaks internal ABI. The internal ABI users
of totem.h need to rebuild their binaries.
Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Signed-off-by: Steven Dake <sdake@redhat.com>
Tested-by: Jan Friesse <jfriesse@redhat.com>
Tested-by: Florian Haas <florian.haas@linbit.com>
Tested-by: Jiaju Zhang <jjzhang@suse.de>
1) both IPv4 and IPv6 mcast should default to ttl=1
2) the range should be 0..255
0 is valid meaning localhost only (cluster of one)
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
This option (-l or --less-secure) causes corosync-keygen to read from
/dev/urandom instead of /dev/random to ensure that no input is required
from the user. It may be useful when this command is used from a
script.
Signed-off-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Steven Dake <sdake@redhat.com>
This is to send dbus events on major cluster events:
- membership changes
- application connect/dissconnet from corosync
- quorum changes
dbus events can then be converted into snmp traps by foghorn or
corosync-notifyd can be run to directly send snmp traps.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Signed-off-by: Lon Hohberger <lhh@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Russell Bryant <russell@russellbryant.net>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Some switches delay multicast packets vs the unicast token. This patch works
around that problem by providing a new tuneable called miss_count_const. This
tuneable works by counting the number of times a message is found missing
and once reaching the const value, marks it as missing in the retransmit list.
This improves performance and doesn't display warning messages about missed
multicast messages when operating in these switching environments.
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
This adds a per-interface config option to
adjust the TTL.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
The UDPU transport is useful for those deployments which can't use multicast.
UDPU works by using UDP unicast, which is fully supported by every switch
manufacturer by default and doesn't rely on a functional IGMP implementation.
An example of the UDPU transport is contained in the corosync.conf.example.udpu
file which shows a 16 node cluster. This file should be copied to each node
in the cluster and IP addresses changed as appropriate.
Amended to remove dead udpu REUSEADDR socket option.
Signed-off-by: Steven Dake <sdake@redhat.com>
- timestamps -> uint64_t and in nanosecs
- use clock_gettime
- common object naming
- common state names
- timeouts in milliseconds
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3054 fd59a12c-fef9-0310-b244-a6a79926bd2f
Patch add support for Confdb integration with SAM. It's now possible to
use SAM_RECOVERY_POLICY_CONFDB as flag to previous policies.
Also new function sam_mark_failed is added for ability to use RECOVERY
policy together with confdb and get expected results (specially with
integration with corosync watchdog)
Patch also makes SAM thread safe.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@3050 fd59a12c-fef9-0310-b244-a6a79926bd2f
in corosync.conf to explicitly state that you need to use
the network address, as opposed to "should always end in
zero", which is only correct for class C networks.
Regards,
Tim
--
Tim Serong <tserong@novell.com>
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2856 fd59a12c-fef9-0310-b244-a6a79926bd2f
Patch adds integration of SAM and quorum, so it's now possible to use
SAM_RECOVERY_POLICY_QUORUM_QUIT or SAM_RECOVERY_POLICY_QUORUM_RESTART
recovery policy. With these policies, sam_start will block until
corosync is quorate. If quorum is lost during health checking, recovery
action is taken.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2822 fd59a12c-fef9-0310-b244-a6a79926bd2f
Patch adds new function to initialize cpg, cpg_model_initialize. Model
is set of callbacks. With this function, future addions of models
should be possible without changing the ABI.
Patch also contains callback in CPG_MODEL_V1 for notification about
Totem membership changes.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2770 fd59a12c-fef9-0310-b244-a6a79926bd2f
Ability to in-memory storing of user data which survives between
instances of process.
Also ability needed ability for bi-directional communication between
child and parent is added.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2769 fd59a12c-fef9-0310-b244-a6a79926bd2f
- allows white characters before #
- new function to parse log destinations (remove code duplicity)
- clarify man page
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2642 fd59a12c-fef9-0310-b244-a6a79926bd2f
nodes from executing a token timeout in the COMMIT state while another node
executes a consensus timeout, showing to applications as a temporary network
partition.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@2567 fd59a12c-fef9-0310-b244-a6a79926bd2f