When a state machine transition fails, bgpd would output
data about what happened, but not necessarily give the
reason why. Add that data to the output.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Imagine this scenario:
A peer has very large hold/keepalive timers of 600/200. This peer is
using the DataCenter default time. As such the open will cause
the t_holdtime to be negotiated to 600 seconds. Now also imagine
that both peers are in update-delay. If we do not restart the
timers and both peers are in Update Delay, we will continously
reset the peer because the hold time will be hit( since the peer
is not sending us any data ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
gcc 12.2.0 complains `error: ‘%s’ directive argument is null`, even
though all enum values are covered with a string. Let's just go with a
`???` default.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
a) Make it legible what type of message is being passed
back and forth instead of having to guess it from
the insufficient debugs
b) Make it explicit which bgp instance is sending this
data
c) Cleanup bgp_zebra_update to have a cleaner api
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Implement: https://datatracker.ietf.org/doc/html/draft-abraitis-bgp-version-capability
Tested with GoBGP:
```
% ./gobgp neighbor 192.168.10.124
BGP neighbor is 192.168.10.124, remote AS 65001
BGP version 4, remote router ID 200.200.200.202
BGP state = ESTABLISHED, up for 00:01:49
BGP OutQ = 0, Flops = 0
Hold time is 3, keepalive interval is 1 seconds
Configured hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
multiprotocol:
ipv4-unicast: advertised and received
ipv6-unicast: advertised
route-refresh: advertised and received
extended-nexthop: advertised
Local: nlri: ipv4-unicast, nexthop: ipv6
UnknownCapability(6): received
UnknownCapability(9): received
graceful-restart: advertised and received
Local: restart time 10 sec
ipv6-unicast
ipv4-unicast
Remote: restart time 120 sec, notification flag set
ipv4-unicast, forward flag set
4-octet-as: advertised and received
add-path: received
Remote:
ipv4-unicast: receive
enhanced-route-refresh: received
long-lived-graceful-restart: advertised and received
Local:
ipv6-unicast, restart time 10 sec
ipv4-unicast, restart time 20 sec
Remote:
ipv4-unicast, restart time 0 sec, forward flag set
fqdn: advertised and received
Local:
name: donatas-pc, domain:
Remote:
name: spine1-debian-11, domain:
software-version: advertised and received
Local:
GoBGP/3.10.0
Remote:
FRRouting/8.5-dev-MyOwnFRRVersion-gdc92f44a45-dirt
cisco-route-refresh: received
Message statistics:
```
FRR side:
```
root@spine1-debian-11:~# vtysh -c 'show bgp neighbor 192.168.10.17 json' | \
> jq '."192.168.10.17".neighborCapabilities.softwareVersion.receivedSoftwareVersion'
"GoBGP/3.10.0"
root@spine1-debian-11:~#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before this, if the peer disables sending FQDN capability, the old hostname
still (STALE) exists and is misleading in the outputs of `show bgp ...`.
Especially when using with `bgp default show-hostname`, etc.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The bgp_fsm_change_status function takes an int
for the new bgp state, which is an `enum bgp_fsm_status status`
let's convert over to being explicit.bgpd: use the enum instead of an int
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The BGP fsm uses return codes to pass event success/fail
as well as some extra data to the bgp_event_update function.
Convert this to use a enum instead of an int to track the
changes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The bgp->peerhash is made up of the sockunion and the CONFIG_NODE
flag. If the CONFIG_NODE flag is moved around or changed then
we get into a situation where both the doppelganger and the peer
actually hash to the exact same thing. Leading to wrongful deletion
and pointers being used after freed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Three fixes:
a) When calling bgp_fsm_change_status with `Deleted` do
not add a new event to the peer's t_event because
we are already in the process of deleting everything
b) When bgp_stop decides to delete a peer return a notification
that it is happening to bgp_event_update so that it does not
set the peer state back to idle or do other processing.
c) bgp_event_update can cause a peer deletion, because
the peer can be deleted in the fsm function but the peer
data structure is still pointed to and used after words.
So lock the peer before entering and prevent a use after
free.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Simulated latency with:
```
tc qdisc add dev eth3 root netem delay 100ms
```
```
donatas-laptop# sh ip bgp summary failed
IPv4 Unicast Summary (VRF default):
BGP router identifier 192.0.2.252, local AS number 65000 vrf-id 0
BGP table version 28
RIB entries 0, using 0 bytes of memory
Peers 1, using 724 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.10.65 2 2 00:00:17 Admin. shutdown (RTT)
Displayed neighbors 1
Total number of neighbors 1
donatas-laptop#
```
Another end received:
```
%NOTIFICATION: received from neighbor 192.168.10.17 6/2 (Cease/Administrative Shutdown) "shutdown due to high round-trip-time (104ms > 5ms, hit 21 times)"
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Currently the bgp mib specifies two traps:
a) Into established state
b) transition backwards from a state
b) really is an interesting case. It means transitioning
from say established to starting over. It can also
mean when bgp is trying to connect and that fails and
the state transitions backwards.
Now let's imagine 500 peers with tight timers (say a data center)
and there is network trauma you have just created an inordinately
large number of traps for each peer.
Let's limit FRR to changing from the old status as Established
to something else. This will greatly limit the trap but it
will also be something end operators are actually interested in.
I actually had several operators say they had to write special code
to ignore all the backward state transitions that they didn't care
about.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When waiting on a path to reach the peer, modify the debug/show
output to give a better understanding to the operator about what
they should be looking for.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Also, make sure we check if the advertisement table changed using FROM peer,
not TO peer.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
We are allocating temporary memory for information about
what to process in this thread, which is not being cleaned
up on thread cancelling.
Signed-off-by: Samanvitha B Bhargav <bsmanvitha@vmware.com>
==1179738== 48 (40 direct, 8 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 29
==1179738== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1179738== by 0x493C8D5: qcalloc (memory.c:116)
==1179738== by 0x208F0C: ecommunity_dup (bgp_ecommunity.c:267)
==1179738== by 0x2B300C: conf_copy (bgp_updgrp.c:170)
==1179738== by 0x2B35BF: peer2_updgrp_copy (bgp_updgrp.c:277)
==1179738== by 0x2B5189: update_group_find (bgp_updgrp.c:826)
==1179738== by 0x2B70D0: update_group_adjust_peer (bgp_updgrp.c:1769)
==1179738== by 0x23DB7D: update_group_adjust_peer_afs (bgp_updgrp.h:519)
==1179738== by 0x243B21: bgp_establish (bgp_fsm.c:2129)
==1179738== by 0x244B94: bgp_event_update (bgp_fsm.c:2597)
==1179738== by 0x26B0E6: bgp_process_packet (bgp_packet.c:2895)
==1179738== by 0x498F5FD: thread_call (thread.c:2008)
==1179738== by 0x49253DA: frr_run (libfrr.c:1198)
==1179738== by 0x1EEC38: main (bgp_main.c:520)
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP SoO is a tag that is appended on BGP updates to allow a peer to mark
a particular peer as belonging to a particular site. In certain MPLS L3 VPN
configurations, the BGP AS-Path may not provide the granularity needed
prevent a loop in the control-plane. With this in mind, BGP SoO is designed
to fill this gap and prevent a routing loop that may occur.
If we configure for example, `neighbor soo 65000:1` at PEs, routes won't be
announced between CPEs if soo matches. This is especially needed when using
as-override or allowas-in.
Also, this is the automated way of the same behavior as configuring route-maps
for each peer like:
```
bgp extcommunity-list cpe permit soo 65000:1
!
route-map cpe permit 10
set extcommunity soo 65000:1
...
route-map cpe deny 10
match extcommunity cpe
route-map cpe permit 20
...
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
convert:
frr_with_mutex(..)
to:
frr_with_mutex (..)
To make all our code agree with what clang-format is going to produce
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
Originally commit: 8761cd6ddb
implemented the ability for LL nexthop tracking to be
interface based. During some more testing the code
to make the interface event happen was noticed to
not properly run in some cases. This was because
it was originally assumed that the connect could
not succeed. Testing has shown that the connect
can succeed and FRR needs to handle the nht tracking
in this case too.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If we receive CEASE Notification or HOLDTIME notification, retain STALE
routes if it's not a CEASE/Hard Reset.
When doing `clear ip bgp PEER`, we can control if this would be CEASE/Hard Reset
or not by using `bgp hard-administrative-reset` knob.
When `bgp graceful-restart notification` is disabled, STALE routes won't be
retained when receiving Notification message.
Co-authored-by: Biswajit Sadhu <biswajit.sadhu@gmail.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.
Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.
Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.
Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
ensure it is a created node, not a found node.
Refer to `isis_vertex_queue_insert()` of isisd, there
are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
is a found node, then free <searching_data>.
Refer to `aspath_intern()` of bgpd, there are many
examples of this case in bgpd.
Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Also, add N-Bit (Notification) flag for Graceful Restart.
This is a preparation for RFC8538.
More information: https://datatracker.ietf.org/doc/html/rfc8538
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>