bgpd: BGP asserts when keepalive tries to access a peer whose socket is closed.

Problem: Due to a race condition, a BGP peer pointer can remain in
the keepalive hash table even after its socket has been closed.
When the keepalive thread tries to access this peer, it asserts.

RCA: The following sequence of events causes the assert.
1. The config node peer goes down due to a TCP reset and
   its FD is set to -1.
2. The doppelganger peer goes to the established state. It was
   already added to the keepalive peer hash table when it
   was in the openconfirm state.
3. The config node's parameters, including the FD, are exchanged
   with the doppelganger. The doppelganger now has FD -1.
4. The doppelganger is deleted, and as part of the deletion it
   is removed from the keepalive peer hash table.
5. Removing it from the hash table requires acquiring a lock.
6. During this time the keepalive thread holds the lock and is
   looping over the peers in the hash table to send keepalives.
7. It tries to send a keepalive for the doppelganger peer, whose
   fd is set to -1, and asserts.
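
The ordering problem in steps 3 to 7 can be illustrated with a small
standalone sketch. This is not bgpd code: toy_peer, ka_peers, ka_lock and
the toy_* functions are hypothetical stand-ins for the peer structure, the
keepalive peer hash and its mutex. The sketch is single-threaded, so it only
models the order of operations, not the actual thread interleaving.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PEERS 8

/* Hypothetical stand-in for struct peer; only the fd matters here. */
struct toy_peer {
    int fd;
};

/* Hypothetical stand-in for the keepalive peer hash and its lock. */
static struct toy_peer *ka_peers[MAX_PEERS];
static pthread_mutex_t ka_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register a peer for keepalives (first free slot). */
static void toy_keepalives_on(struct toy_peer *p)
{
    assert(p != NULL);
    pthread_mutex_lock(&ka_lock);
    for (size_t i = 0; i < MAX_PEERS; i++) {
        if (ka_peers[i] == NULL) {
            ka_peers[i] = p;
            break;
        }
    }
    pthread_mutex_unlock(&ka_lock);
}

/* Remove a peer from the keepalive set. */
static void toy_keepalives_off(struct toy_peer *p)
{
    pthread_mutex_lock(&ka_lock);
    for (size_t i = 0; i < MAX_PEERS; i++) {
        if (ka_peers[i] == p)
            ka_peers[i] = NULL;
    }
    pthread_mutex_unlock(&ka_lock);
}

/*
 * One pass of the keepalive loop. Returns how many registered peers
 * have fd -1; this is the situation that trips the assert in the
 * scenario described above.
 */
static int toy_keepalive_pass(void)
{
    int bad = 0;

    pthread_mutex_lock(&ka_lock);
    for (size_t i = 0; i < MAX_PEERS; i++) {
        if (ka_peers[i] != NULL && ka_peers[i]->fd < 0)
            bad++;
    }
    pthread_mutex_unlock(&ka_lock);
    return bad;
}

/*
 * Exchange fds between the config peer and the doppelganger.
 * With deregister_first set, the doppelganger leaves the keepalive
 * set before the exchange, so no pass can see it with fd -1.
 */
static void toy_xfer_conn(struct toy_peer *conf, struct toy_peer *from_peer,
                          bool deregister_first)
{
    if (deregister_first)
        toy_keepalives_off(from_peer);

    int tmp = conf->fd;
    conf->fd = from_peer->fd;
    from_peer->fd = tmp;
}
```

With deregister_first false, a keepalive pass that runs between the exchange
and the later removal sees the doppelganger with fd -1. With it true, which
is the order this change adopts, the pass never sees that peer.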

Signed-off-by: Santosh P K <sapk@vmware.com>
Santosh P K 2019-11-25 08:49:38 -08:00
parent bb2d775cca
commit 74e00a55c1


@@ -164,6 +164,14 @@ static struct peer *peer_xfer_conn(struct peer *from_peer)
bgp_writes_off(from_peer);
bgp_reads_off(from_peer);
/*
* Before exchanging FD remove doppelganger from
* keepalive peer hash. It could be possible conf peer
* fd is set to -1. If blocked on lock then keepalive
* thread can access peer pointer with fd -1.
*/
bgp_keepalives_off(from_peer);
BGP_TIMER_OFF(peer->t_routeadv);
BGP_TIMER_OFF(peer->t_connect);
BGP_TIMER_OFF(peer->t_connect_check_r);