bgpd: intelligently adjust coalesce timer

The subgroup coalesce timer controls how long updates to a particular subgroup are delayed in order to allow additional peers to join the subgroup. Presently the timer value is 200 ms. Increase the base value to 1 second and add 50 ms for each configured peer, with an upper cap of 10 seconds. This cuts convergence time by a factor of 3 at large scale (300+ peers, 1000+ prefixes per peer).

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2025-12-18 11:36:11 +00:00 · 2017-11-30 14:11:12 -05:00
commit 4961a5a2eb
parent 5561f52343
3 changed files with 26 additions and 15 deletions
--- a/bgpd/bgp_packet.c
+++ b/bgpd/bgp_packet.c
@@ -2156,22 +2156,8 @@ int bgp_process_packet(struct thread *thread)
 	int mprc;		  // message processing return code
 	peer = THREAD_ARG(thread);
-/*
- * This functionality is presently disabled. Unfortunately due to the
- * way bgpd is structured, reading more than one packet per input cycle
- * severely impacts convergence time. This is because advancing the
- * state of the routing table based on prefixes learned from one peer
- * prior to all (or at least most) peers being established and placed
- * into an update-group will make UPDATE generation starve
- * bgp_accept(), delaying convergence. This is a deficiency that needs
- * to be fixed elsewhere in the codebase, but for now our hand is
- * forced.
- */
-#if 0
 	rpkt_quanta_old = atomic_load_explicit(&peer->bgp->rpkt_quanta,
 					       memory_order_relaxed);
-#endif
-	rpkt_quanta_old = 1;
 	fsm_update_result = 0;
 	/* Guard against scheduled events that occur after peer deletion. */
--- a/bgpd/bgp_updgrp.h
+++ b/bgpd/bgp_updgrp.h
@@ -29,7 +29,27 @@
 #include "bgp_advertise.h"
-#define BGP_DEFAULT_SUBGROUP_COALESCE_TIME 200
+/*
+ * The following three heuristic constants determine how long advertisement to
+ * a subgroup will be delayed after it is created. The intent is to allow
+ * transient changes in peer state (primarily session establishment) to settle,
+ * so that more peers can be grouped together and benefit from sharing
+ * advertisement computations with the subgroup.
+ *
+ * These values have a very large impact on initial convergence time; any
+ * changes should be accompanied by careful performance testing at all scales.
+ *
+ * The coalesce time 'C' for a new subgroup within a particular BGP instance
+ * 'B' with total number of known peers 'P', established or not, is computed as
+ * follows:
+ *
+ * C = MIN(BGP_MAX_SUBGROUP_COALESCE_TIME,
+ *         BGP_DEFAULT_SUBGROUP_COALESCE_TIME +
+ *         (P*BGP_PEER_ADJUST_SUBGROUP_COALESCE_TIME))
+ */
+#define BGP_DEFAULT_SUBGROUP_COALESCE_TIME 1000
+#define BGP_MAX_SUBGROUP_COALESCE_TIME 10000
+#define BGP_PEER_ADJUST_SUBGROUP_COALESCE_TIME 50
 #define PEER_UPDGRP_FLAGS                                                      \
 	(PEER_FLAG_LOCAL_AS_NO_PREPEND | PEER_FLAG_LOCAL_AS_REPLACE_AS)
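
To make the formula in the header comment concrete, here is a minimal standalone sketch that evaluates it for a given peer count. The three constants are copied from the hunk above. The helper name `sketch_coalesce_time` and the `SKETCH_MIN` macro are hypothetical stand-ins for illustration only; they are not part of the commit, which computes the value inline in `peer_create()` using the `MIN` macro.

```c
#include <assert.h>

/* Values copied from the bgp_updgrp.h hunk (all in milliseconds). */
#define BGP_DEFAULT_SUBGROUP_COALESCE_TIME 1000
#define BGP_MAX_SUBGROUP_COALESCE_TIME 10000
#define BGP_PEER_ADJUST_SUBGROUP_COALESCE_TIME 50

/* Local stand-in for the MIN macro used by the diff (assumption). */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper, not part of the commit: returns the coalesce time in
 * milliseconds for a BGP instance with peer_count known peers.
 */
static long sketch_coalesce_time(long peer_count)
{
	assert(peer_count >= 0); /* a peer list cannot have negative length */

	/* Base delay plus a fixed increment per known peer. */
	long ct = BGP_DEFAULT_SUBGROUP_COALESCE_TIME
		  + (peer_count * BGP_PEER_ADJUST_SUBGROUP_COALESCE_TIME);

	/* Clamp to the upper bound so very large configs stop growing. */
	return SKETCH_MIN(BGP_MAX_SUBGROUP_COALESCE_TIME, ct);
}
```

With these constants the cap is reached at 180 peers, since 1000 + 180 * 50 = 10000. An instance at the 300-peer scale quoted in the commit message would therefore use the full 10 seconds.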
--- a/bgpd/bgpd.c
+++ b/bgpd/bgpd.c
@@ -1474,6 +1474,11 @@ struct peer *peer_create(union sockunion *su, const char *conf_if,
 	listnode_add_sort(bgp->peer, peer);
 	hash_get(bgp->peerhash, peer, hash_alloc_intern);
+	/* Adjust update-group coalesce timer heuristics for # peers. */
+	long ct = BGP_DEFAULT_SUBGROUP_COALESCE_TIME
+		  + (bgp->peer->count * BGP_PEER_ADJUST_SUBGROUP_COALESCE_TIME);
+	bgp->coalesce_time = MIN(BGP_MAX_SUBGROUP_COALESCE_TIME, ct);
 	active = peer_active(peer);
 	/* Last read and reset time set */