totem: Add cancel_token_hold_on_retransmit config option

Previously, the presence of retransmit messages canceled holding of
the token, so the representative was never allowed to enter the token
hold state.

This makes the token rotate at maximum speed and keeps the processor
resending messages over and over again, overloading the network and
reducing the chance of successfully delivering the messages.

There were also reports of various antivirus / IPS / IDS products that
slow down delivery of packets of certain sizes (packets bigger than the
token), which makes Corosync retransmit messages over and over again.

The proposed solution is to allow the representative to enter the token
hold state when there are only retransmit messages. This allows the
network to handle the overload and/or gives the antivirus/IPS/IDS enough
time to scan and deliver packets without Corosync entering the
"FAILED TO RECEIVE" state and adding more load to the network. The
previous behavior remains available by setting
cancel_token_hold_on_retransmit to yes (the default is no).
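
The decision described above can be sketched in C as follows. This is a
minimal illustration under stated assumptions, not the actual patch:
struct hold_state and cancel_hold_for_retransmit() are hypothetical names
introduced here, and only the field names mirror those that appear in the
token handler.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the state the real handler consults. */
struct hold_state {
	bool cancel_token_hold_on_retransmit; /* config option, false means "no" */
	int my_token_held;                    /* 1 while the token is being held */
	unsigned int rtr_list_entries;        /* retransmit requests on the token */
	int mcasted_retransmit;               /* retransmits multicast on this visit */
};

/*
 * Returns true when a held token has to be released and forwarded at once
 * because of retransmit traffic. On release, my_token_held is cleared.
 * With the option disabled, retransmits never cancel the hold.
 */
static bool cancel_hold_for_retransmit(struct hold_state *st)
{
	bool retransmit_pending =
	    st->rtr_list_entries > 0 || st->mcasted_retransmit > 0;

	if (st->cancel_token_hold_on_retransmit &&
	    st->my_token_held == 1 &&
	    retransmit_pending) {
		st->my_token_held = 0;
		return true;
	}

	return false;
}
```

With the option left at its default (no), a representative that holds the
token keeps holding it even while retransmits are pending. Setting
cancel_token_hold_on_retransmit to yes in the totem section makes any
pending retransmit release the token immediately, as before this change.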

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Jan Friesse 2021-08-11 17:34:05 +02:00
parent 23db6cba49
commit cdf72925db
4 changed files with 25 additions and 3 deletions


@@ -81,6 +81,7 @@
 #define MAX_MESSAGES 17
 #define MISS_COUNT_CONST 5
 #define BLOCK_UNLISTED_IPS 1
+#define CANCEL_TOKEN_HOLD_ON_RETRANSMIT 0
 /* This constant is not used for knet */
 #define UDP_NETMTU 1500
@@ -144,6 +145,8 @@ static void *totem_get_param_by_name(struct totem_config *totem_config, const ch
 		return totem_config->knet_compression_model;
 	if (strcmp(param_name, "totem.block_unlisted_ips") == 0)
 		return &totem_config->block_unlisted_ips;
+	if (strcmp(param_name, "totem.cancel_token_hold_on_retransmit") == 0)
+		return &totem_config->cancel_token_hold_on_retransmit;
 	return NULL;
 }
@@ -365,6 +368,9 @@ void totem_volatile_config_read (struct totem_config *totem_config, icmap_map_t
 	totem_volatile_config_set_boolean_value(totem_config, temp_map, "totem.block_unlisted_ips", deleted_key,
 	    BLOCK_UNLISTED_IPS);
+	totem_volatile_config_set_boolean_value(totem_config, temp_map, "totem.cancel_token_hold_on_retransmit",
+	    deleted_key, CANCEL_TOKEN_HOLD_ON_RETRANSMIT);
 }
 int totem_volatile_config_validate (


@@ -3981,8 +3981,9 @@ static int message_handler_orf_token (
 		transmits_allowed = fcc_calculate (instance, token);
 		mcasted_retransmit = orf_token_rtr (instance, token, &transmits_allowed);
-		if (instance->my_token_held == 1 &&
-			(token->rtr_list_entries > 0 || mcasted_retransmit > 0)) {
+		if (instance->totem_config->cancel_token_hold_on_retransmit &&
+		    instance->my_token_held == 1 &&
+		    (token->rtr_list_entries > 0 || mcasted_retransmit > 0)) {
 			instance->my_token_held = 0;
 			forward_token = 1;
 		}


@@ -244,6 +244,8 @@ struct totem_config {
 	unsigned int block_unlisted_ips;
+	unsigned int cancel_token_hold_on_retransmit;
 	void (*totem_memb_ring_id_create_or_load) (
 	    struct memb_ring_id *memb_ring_id,
 	    unsigned int nodeid);


@@ -32,7 +32,7 @@
 .\" * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
 .\" * THE POSSIBILITY OF SUCH DAMAGE.
 .\" */
-.TH COROSYNC_CONF 5 2021-07-23 "corosync Man Page" "Corosync Cluster Engine Programmer's Manual"
+.TH COROSYNC_CONF 5 2021-08-11 "corosync Man Page" "Corosync Cluster Engine Programmer's Manual"
 .SH NAME
 corosync.conf - corosync executive configuration file
@@ -584,6 +584,19 @@ with an old configuration.
 The default value is yes.
+.TP
+cancel_token_hold_on_retransmit
+Allows Corosync to hold the token by the representative when there are
+too many retransmit messages. This allows the network to process
+increased load without being overloaded. The mechanism used is the same
+as described for the
+.B hold
+directive.
+Some deployments may prefer to never hold the token when there are
+retransmit messages. If so, this option should be set to yes.
+The default value is no.
 .PP
 Within the
 .B logging