When bgp_stop finishes and it deletes the peer it is sending
back a return code stating that the peer was deleted, but
the code was operating like it was not deleted and continued
to access the data structure. Fix.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The return code from a event handling perspective
is an enum. Let's intentionally make it a switch
so that all cases are ensured to be covered now
and in the future.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Straighten out the code to not mix the two. Especially
since bgp was assigning non enum values to the enum.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The BGP FSM was using the peer as the unit of work
but the FSM is connection focused. So let's switch
it over to using that.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
As part of the conversion to a `struct peer_connection` it will
be desirable to have 2 pointers one for when we open a connection
and one for when we receive a connection. Start this actual
conversion over to this in `struct peer`. If this sounds confusing
take a look at the bgp state machine for connections and how
it resolves the processing of this router opening -vs- this
router receiving an open. At some point in time the state
machine decides that we are keeping one of the two connections.
Future commits will allow us to untangle the peer/doppelganger
duality with this abstraction.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The status and ostatus are a function of the `struct peer_connection`
move it into that data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Move PEER_THREAD_WRITES_ON and PEER_THREAD_READS_ON to
be a part of the `struct peer_connection` since this is
a connection oriented bit of data.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Move the peer->t_write and peer->t_read into `struct peer_connection`
as that these are properties of the connection.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
P# Please enter the commit message for your changes. Lines starting
BGP tracks connections based upon the peer. But the problem
with this is that the doppelganger structure for it is being
created. This has introduced a bunch of fragileness in that
the peer exists independently of the connections to it.
The whole point of the doppelganger structure was to allow
BGP to both accept and initiate tcp connections and then
when we get one to a `good` state we collapse into the
appropriate one. The problem with this is that having
2 peer structures for this creates a situation where
we have to make sure we are configing the `right` one
and also make sure that we collapse the two independent
peer structures into 1 acting peer. This makes no sense
let's abstract out the peer into having 2 connection
one for incoming connections and one for outgoing connections
then we can easily collapse down without having to do crazy
stuff. In addition people adding new features don't need
to have to go touch a million places in the code.
This is the start of this abstraction. In this commit
we'll just pull out the fd and input/output buffers
into a connection data structure. Future commits
will abstract further.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a state machine transition fails, bgpd would output
data about what happened, but not necessarily give the
reason why. Add that data to the output.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Imagine this scenario:
A peer has very large hold/keepalive timers of 600/200. This peer is
using the DataCenter default time. As such the open will cause
the t_holdtime to be negotiated to 600 seconds. Now also imagine
that both peers are in update-delay. If we do not restart the
timers and both peers are in Update Delay, we will continously
reset the peer because the hold time will be hit( since the peer
is not sending us any data ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
gcc 12.2.0 complains `error: ‘%s’ directive argument is null`, even
though all enum values are covered with a string. Let's just go with a
`???` default.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
a) Make it legible what type of message is being passed
back and forth instead of having to guess it from
the insufficient debugs
b) Make it explicit which bgp instance is sending this
data
c) Cleanup bgp_zebra_update to have a cleaner api
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Implement: https://datatracker.ietf.org/doc/html/draft-abraitis-bgp-version-capability
Tested with GoBGP:
```
% ./gobgp neighbor 192.168.10.124
BGP neighbor is 192.168.10.124, remote AS 65001
BGP version 4, remote router ID 200.200.200.202
BGP state = ESTABLISHED, up for 00:01:49
BGP OutQ = 0, Flops = 0
Hold time is 3, keepalive interval is 1 seconds
Configured hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
multiprotocol:
ipv4-unicast: advertised and received
ipv6-unicast: advertised
route-refresh: advertised and received
extended-nexthop: advertised
Local: nlri: ipv4-unicast, nexthop: ipv6
UnknownCapability(6): received
UnknownCapability(9): received
graceful-restart: advertised and received
Local: restart time 10 sec
ipv6-unicast
ipv4-unicast
Remote: restart time 120 sec, notification flag set
ipv4-unicast, forward flag set
4-octet-as: advertised and received
add-path: received
Remote:
ipv4-unicast: receive
enhanced-route-refresh: received
long-lived-graceful-restart: advertised and received
Local:
ipv6-unicast, restart time 10 sec
ipv4-unicast, restart time 20 sec
Remote:
ipv4-unicast, restart time 0 sec, forward flag set
fqdn: advertised and received
Local:
name: donatas-pc, domain:
Remote:
name: spine1-debian-11, domain:
software-version: advertised and received
Local:
GoBGP/3.10.0
Remote:
FRRouting/8.5-dev-MyOwnFRRVersion-gdc92f44a45-dirt
cisco-route-refresh: received
Message statistics:
```
FRR side:
```
root@spine1-debian-11:~# vtysh -c 'show bgp neighbor 192.168.10.17 json' | \
> jq '."192.168.10.17".neighborCapabilities.softwareVersion.receivedSoftwareVersion'
"GoBGP/3.10.0"
root@spine1-debian-11:~#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before this, if the peer disables sending FQDN capability, the old hostname
still (STALE) exists and is misleading in the outputs of `show bgp ...`.
Especially when using with `bgp default show-hostname`, etc.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The bgp_fsm_change_status function takes an int
for the new bgp state, which is an `enum bgp_fsm_status status`
let's convert over to being explicit.bgpd: use the enum instead of an int
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The BGP fsm uses return codes to pass event success/fail
as well as some extra data to the bgp_event_update function.
Convert this to use a enum instead of an int to track the
changes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The bgp->peerhash is made up of the sockunion and the CONFIG_NODE
flag. If the CONFIG_NODE flag is moved around or changed then
we get into a situation where both the doppelganger and the peer
actually hash to the exact same thing. Leading to wrongful deletion
and pointers being used after freed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Three fixes:
a) When calling bgp_fsm_change_status with `Deleted` do
not add a new event to the peer's t_event because
we are already in the process of deleting everything
b) When bgp_stop decides to delete a peer return a notification
that it is happening to bgp_event_update so that it does not
set the peer state back to idle or do other processing.
c) bgp_event_update can cause a peer deletion, because
the peer can be deleted in the fsm function but the peer
data structure is still pointed to and used after words.
So lock the peer before entering and prevent a use after
free.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Simulated latency with:
```
tc qdisc add dev eth3 root netem delay 100ms
```
```
donatas-laptop# sh ip bgp summary failed
IPv4 Unicast Summary (VRF default):
BGP router identifier 192.0.2.252, local AS number 65000 vrf-id 0
BGP table version 28
RIB entries 0, using 0 bytes of memory
Peers 1, using 724 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.10.65 2 2 00:00:17 Admin. shutdown (RTT)
Displayed neighbors 1
Total number of neighbors 1
donatas-laptop#
```
Another end received:
```
%NOTIFICATION: received from neighbor 192.168.10.17 6/2 (Cease/Administrative Shutdown) "shutdown due to high round-trip-time (104ms > 5ms, hit 21 times)"
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Currently the bgp mib specifies two traps:
a) Into established state
b) transition backwards from a state
b) really is an interesting case. It means transitioning
from say established to starting over. It can also
mean when bgp is trying to connect and that fails and
the state transitions backwards.
Now let's imagine 500 peers with tight timers (say a data center)
and there is network trauma you have just created an inordinately
large number of traps for each peer.
Let's limit FRR to changing from the old status as Established
to something else. This will greatly limit the trap but it
will also be something end operators are actually interested in.
I actually had several operators say they had to write special code
to ignore all the backward state transitions that they didn't care
about.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When waiting on a path to reach the peer, modify the debug/show
output to give a better understanding to the operator about what
they should be looking for.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Also, make sure we check if the advertisement table changed using FROM peer,
not TO peer.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
We are allocating temporary memory for information about
what to process in this thread, which is not being cleaned
up on thread cancelling.
Signed-off-by: Samanvitha B Bhargav <bsmanvitha@vmware.com>
==1179738== 48 (40 direct, 8 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 29
==1179738== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1179738== by 0x493C8D5: qcalloc (memory.c:116)
==1179738== by 0x208F0C: ecommunity_dup (bgp_ecommunity.c:267)
==1179738== by 0x2B300C: conf_copy (bgp_updgrp.c:170)
==1179738== by 0x2B35BF: peer2_updgrp_copy (bgp_updgrp.c:277)
==1179738== by 0x2B5189: update_group_find (bgp_updgrp.c:826)
==1179738== by 0x2B70D0: update_group_adjust_peer (bgp_updgrp.c:1769)
==1179738== by 0x23DB7D: update_group_adjust_peer_afs (bgp_updgrp.h:519)
==1179738== by 0x243B21: bgp_establish (bgp_fsm.c:2129)
==1179738== by 0x244B94: bgp_event_update (bgp_fsm.c:2597)
==1179738== by 0x26B0E6: bgp_process_packet (bgp_packet.c:2895)
==1179738== by 0x498F5FD: thread_call (thread.c:2008)
==1179738== by 0x49253DA: frr_run (libfrr.c:1198)
==1179738== by 0x1EEC38: main (bgp_main.c:520)
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP SoO is a tag that is appended on BGP updates to allow a peer to mark
a particular peer as belonging to a particular site. In certain MPLS L3 VPN
configurations, the BGP AS-Path may not provide the granularity needed
prevent a loop in the control-plane. With this in mind, BGP SoO is designed
to fill this gap and prevent a routing loop that may occur.
If we configure for example, `neighbor soo 65000:1` at PEs, routes won't be
announced between CPEs if soo matches. This is especially needed when using
as-override or allowas-in.
Also, this is the automated way of the same behavior as configuring route-maps
for each peer like:
```
bgp extcommunity-list cpe permit soo 65000:1
!
route-map cpe permit 10
set extcommunity soo 65000:1
...
route-map cpe deny 10
match extcommunity cpe
route-map cpe permit 20
...
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>