bgpd: Allow extending peer timeout in rare case

Currently the I/O pthread handles incoming/outgoing data communication with all peers. It makes no attempt to modify the hold timers; its sole goal is to read/write data to the appropriate channels. All of this data is handled as *events* on the master pthread in BGP.

The problem is that if the master pthread is extremely busy, a packet that has already been read and would be treated as a keepalive event may not be processed until after the hold timer pops, due to the way thread events are handled in lib/thread.c.

As a last-gasp attempt, if we notice that there is incoming data still to process on the peer's input queue when the hold timer pops, delay the hold timer by re-arming it instead of expiring the session.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
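The decision described above can be sketched in isolation. This is a minimal model under stated assumptions, not FRR code: `struct peer_model`, `peer_model_init` and `holdtime_expired` are hypothetical names, `inq_count` stands in for `peer->ibuf->count`, and incrementing `rearm_count` stands in for restarting the timer with `BGP_TIMER_ON`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a BGP peer; only what the sketch needs. */
struct peer_model {
	atomic_size_t inq_count;  /* packets read by the I/O pthread, not yet processed */
	unsigned int rearm_count; /* number of times the hold timer was re-armed */
	bool expired;             /* true once the hold timer was allowed to fire */
};

/* Put the model into a known state: empty queue, timer never re-armed. */
static void peer_model_init(struct peer_model *peer)
{
	assert(peer != NULL);
	atomic_init(&peer->inq_count, 0);
	peer->rearm_count = 0;
	peer->expired = false;
}

/*
 * Model of the hold timer callback.
 * Returns true if the session would be torn down, false if the timer
 * was re-armed because unprocessed input is still queued.
 */
static bool holdtime_expired(struct peer_model *peer)
{
	size_t pending;

	assert(peer != NULL);

	/* A relaxed load is enough here: only the current count matters. */
	pending = atomic_load_explicit(&peer->inq_count, memory_order_relaxed);
	if (pending) {
		/* Stand-in for restarting the hold timer. */
		peer->rearm_count++;
		return false;
	}

	peer->expired = true;
	return true;
}
```

With two packets queued, `holdtime_expired()` returns false and bumps `rearm_count`; once the queue is drained to zero, the next call returns true. Note that in this model the timer keeps being re-armed for as long as the queue is non-empty each time it pops.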
2025-08-15 02:43:41 +00:00 · 2020-06-15 10:35:50 -04:00 · 2020-06-15 10:35:50 -04:00 · d0874d195d
commit d0874d195d
parent 1a5fc72066
1 changed files with 20 additions and 0 deletions
--- a/bgpd/bgp_fsm.c
+++ b/bgpd/bgp_fsm.c
@@ -512,6 +512,7 @@ static int bgp_connect_timer(struct thread *thread)
 /* BGP holdtime timer. */
 static int bgp_holdtime_timer(struct thread *thread)
 {
+	atomic_size_t inq_count;
 	struct peer *peer;

 	peer = THREAD_ARG(thread);
@@ -521,6 +522,25 @@ static int bgp_holdtime_timer(struct thread *thread)
 		zlog_debug("%s [FSM] Timer (holdtime timer expire)",
 			   peer->host);

+	/*
+	 * Given that we do not have any expectation of ordering
+	 * for handling packets from a peer -vs- handling
+	 * the hold timer for a peer as that they are both
+	 * events on the peer.  If we have incoming
+	 * data on the peers inq, let's give the system a chance
+	 * to handle that data.  This can be especially true
+	 * for systems where we are heavily loaded for one
+	 * reason or another.
+	 */
+	inq_count = atomic_load_explicit(&peer->ibuf->count,
+					 memory_order_relaxed);
+	if (inq_count) {
+		BGP_TIMER_ON(peer->t_holdtime, bgp_holdtime_timer,
+			     peer->v_holdtime);
+
+		return 0;
+	}
+
 	THREAD_VAL(thread) = Hold_Timer_expired;
 	bgp_event(thread); /* bgp_event unlocks peer */