Capability was unset, but forgot to unset the role.
Fixes: 5ad080d37a ("bgpd: Handle Role capability via dynamic capabilities for SET/UNSET properly")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
It was missed to handle UNSET Role capability using dynamic capabilities.
Also move length check before actually handling Role capability.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
```
3 0x00007f423aa42476 in __GI_raise (sig=sig@entry=11) at ../sysdeps/posix/raise.c:26
4 0x00007f423aef9740 in core_handler (signo=11, siginfo=0x7fffc414deb0, context=<optimized out>) at lib/sigevent.c:246
5 <signal handler called>
6 0x0000564dea2fc71e in route_set_aspath_prepend (rule=0x564debd66d50, prefix=0x7fffc414ea30, object=0x7fffc414e400)
at bgpd/bgp_routemap.c:2258
7 0x00007f423aeec7e0 in route_map_apply_ext (map=<optimized out>, prefix=prefix@entry=0x7fffc414ea30,
match_object=match_object@entry=0x7fffc414e400, set_object=set_object@entry=0x7fffc414e400, pref=pref@entry=0x0) at lib/routemap.c:2690
8 0x0000564dea2d277e in bgp_input_modifier (peer=peer@entry=0x7f4238f59010, p=p@entry=0x7fffc414ea30, attr=attr@entry=0x7fffc414e770,
afi=afi@entry=AFI_IP, safi=safi@entry=SAFI_UNICAST, rmap_name=rmap_name@entry=0x0, label=0x0, num_labels=0, dest=0x564debdd5130)
at bgpd/bgp_route.c:1772
9 0x0000564dea2df762 in bgp_update (peer=peer@entry=0x7f4238f59010, p=p@entry=0x7fffc414ea30, addpath_id=addpath_id@entry=0,
attr=0x7fffc414eb50, afi=afi@entry=AFI_IP, safi=<optimized out>, safi@entry=SAFI_UNICAST, type=9, sub_type=0, prd=0x0, label=0x0,
num_labels=0, soft_reconfig=0, evpn=0x0) at bgpd/bgp_route.c:4374
10 0x0000564dea2e2047 in bgp_nlri_parse_ip (peer=0x7f4238f59010, attr=attr@entry=0x7fffc414eb50, packet=0x7fffc414eaf0)
at bgpd/bgp_route.c:6249
11 0x0000564dea2c5a58 in bgp_nlri_parse (peer=peer@entry=0x7f4238f59010, attr=attr@entry=0x7fffc414eb50,
packet=packet@entry=0x7fffc414eaf0, mp_withdraw=mp_withdraw@entry=false) at bgpd/bgp_packet.c:339
12 0x0000564dea2c5d66 in bgp_update_receive (peer=peer@entry=0x7f4238f59010, size=size@entry=109) at bgpd/bgp_packet.c:2024
13 0x0000564dea2c901d in bgp_process_packet (thread=<optimized out>) at bgpd/bgp_packet.c:2933
14 0x00007f423af0bf71 in event_call (thread=thread@entry=0x7fffc414ee40) at lib/event.c:1995
15 0x00007f423aebb198 in frr_run (master=0x564deb73c670) at lib/libfrr.c:1213
16 0x0000564dea261b83 in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:505
```
With the configuration:
```
frr version 9.1-dev-MyOwnFRRVersion
frr defaults traditional
hostname ip-172-31-13-140
log file /tmp/debug.log
log syslog
service integrated-vtysh-config
!
debug bgp keepalives
debug bgp neighbor-events
debug bgp updates in
debug bgp updates out
!
router bgp 100
bgp router-id 9.9.9.9
no bgp ebgp-requires-policy
bgp bestpath aigp
neighbor 172.31.2.47 remote-as 200
!
address-family ipv4 unicast
neighbor 172.31.2.47 default-originate
neighbor 172.31.2.47 route-map RM_IN in
exit-address-family
exit
!
route-map RM_IN permit 10
set as-path prepend 200
exit
!
```
The issue is that we try to process NLRIs even if the attribute length is 0.
Later bgp_update() will handle route-maps and a crash occurs because all the
attributes are NULL, including aspath, where we dereference.
According to the RFC 4271:
A value of 0 indicates that neither the Network Layer
Reachability Information field nor the Path Attribute field is
present in this UPDATE message.
But with a fuzzed UPDATE message this can be faked. I think it's reasonable
to skip processing NLRIs if both update_len and attribute_len are 0.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
As part of the conversion to a `struct peer_connection` it will
be desirable to have 2 pointers one for when we open a connection
and one for when we receive a connection. Start this actual
conversion over to this in `struct peer`. If this sounds confusing
take a look at the bgp state machine for connections and how
it resolves the processing of this router opening -vs- this
router receiving an open. At some point in time the state
machine decides that we are keeping one of the two connections.
Future commits will allow us to untangle the peer/doppelganger
duality with this abstraction.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The status and ostatus are a function of the `struct peer_connection`
move it into that data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP tracks connections based upon the peer. But the problem
with this is that the doppelganger structure for it is being
created. This has introduced a bunch of fragileness in that
the peer exists independently of the connections to it.
The whole point of the doppelganger structure was to allow
BGP to both accept and initiate tcp connections and then
when we get one to a `good` state we collapse into the
appropriate one. The problem with this is that having
2 peer structures for this creates a situation where
we have to make sure we are configing the `right` one
and also make sure that we collapse the two independent
peer structures into 1 acting peer. This makes no sense
let's abstract out the peer into having 2 connection
one for incoming connections and one for outgoing connections
then we can easily collapse down without having to do crazy
stuff. In addition people adding new features don't need
to have to go touch a million places in the code.
This is the start of this abstraction. In this commit
we'll just pull out the fd and input/output buffers
into a connection data structure. Future commits
will abstract further.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add this logic inside bgp_capability_send() instead of repeating the whole
logic before calling bgp_capability_send().
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When setting local-role for the neighbor, force sending ROLE capability via
dynamic capability if it's enabled.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Gonna be covered later with further PRs. Now adding them to avoid compiler
errors due to uncovered switch/cases.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
We have dynamic capability support, but it handles only MP capability.
With this change, we can enable software version capability dynamicaly, without
resetting the session.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Modify bgp to not allow a v6 peer to come up if the v6 afi is negotiated
and the outgoing interface has no v6 address as well as zebra does
not support the v6 with v4 nexthop capabilities that some dataplanes
allow.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The last_reset_cause is a plain old BGP_MAX_PACKET_SIZE buffer
that is really enlarging the peer data structure. Let's just
copy the stream that failed and only allocate how ever much
the packet size actually was. While it's likely that we have
a reset reason, the packet typically is not going to be 65k
in size. Let's save space.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
More details: https://www.rfc-editor.org/rfc/rfc8810.html
Not sure if we want to maintain the old code more.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Implement: https://datatracker.ietf.org/doc/html/draft-abraitis-bgp-version-capability
Tested with GoBGP:
```
% ./gobgp neighbor 192.168.10.124
BGP neighbor is 192.168.10.124, remote AS 65001
BGP version 4, remote router ID 200.200.200.202
BGP state = ESTABLISHED, up for 00:01:49
BGP OutQ = 0, Flops = 0
Hold time is 3, keepalive interval is 1 seconds
Configured hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
multiprotocol:
ipv4-unicast: advertised and received
ipv6-unicast: advertised
route-refresh: advertised and received
extended-nexthop: advertised
Local: nlri: ipv4-unicast, nexthop: ipv6
UnknownCapability(6): received
UnknownCapability(9): received
graceful-restart: advertised and received
Local: restart time 10 sec
ipv6-unicast
ipv4-unicast
Remote: restart time 120 sec, notification flag set
ipv4-unicast, forward flag set
4-octet-as: advertised and received
add-path: received
Remote:
ipv4-unicast: receive
enhanced-route-refresh: received
long-lived-graceful-restart: advertised and received
Local:
ipv6-unicast, restart time 10 sec
ipv4-unicast, restart time 20 sec
Remote:
ipv4-unicast, restart time 0 sec, forward flag set
fqdn: advertised and received
Local:
name: donatas-pc, domain:
Remote:
name: spine1-debian-11, domain:
software-version: advertised and received
Local:
GoBGP/3.10.0
Remote:
FRRouting/8.5-dev-MyOwnFRRVersion-gdc92f44a45-dirt
cisco-route-refresh: received
Message statistics:
```
FRR side:
```
root@spine1-debian-11:~# vtysh -c 'show bgp neighbor 192.168.10.17 json' | \
> jq '."192.168.10.17".neighborCapabilities.softwareVersion.receivedSoftwareVersion'
"GoBGP/3.10.0"
root@spine1-debian-11:~#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before this patch, we always passed `struct attr` for NLRI_UPDATE, but if we
have a situation with treat-as-withdraw (for example: malformed attribute, or
using a command like `neighbor path-attribute treat-as-withdraw`) the route
MUST be withdrawn form the BGP table.
Hence, we MUST pass attr as NULL, in this case we already have this check
under NLRI_ATTR_ARG() macro, just reuse it properly.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Consider this scenario:
Lots of peers with a bunch of route information that is changing
fast. One of the peers happens to be really slow for whatever
reason. The way the output queue is filled is that bgpd puts
64 packets at a time and then reschedules itself to send more
in the future. Now suppose that peer has hit it's input Queue
limit and is slow. As such bgp will continue to add data to
the output Queue, irrelevant if the other side is receiving
this data.
Let's limit the Output Queue to the same limit as the Input
Queue. This should prevent bgp eating up large amounts of
memory as stream data when under severe network trauma.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Before this patch, data is flushed, and we can't see the data after we send
the notification.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Simulated latency with:
```
tc qdisc add dev eth3 root netem delay 100ms
```
```
donatas-laptop# sh ip bgp summary failed
IPv4 Unicast Summary (VRF default):
BGP router identifier 192.0.2.252, local AS number 65000 vrf-id 0
BGP table version 28
RIB entries 0, using 0 bytes of memory
Peers 1, using 724 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.10.65 2 2 00:00:17 Admin. shutdown (RTT)
Displayed neighbors 1
Total number of neighbors 1
donatas-laptop#
```
Another end received:
```
%NOTIFICATION: received from neighbor 192.168.10.17 6/2 (Cease/Administrative Shutdown) "shutdown due to high round-trip-time (104ms > 5ms, hit 21 times)"
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If a operator receives an invalid packet that is of insufficient size
then it is possible for BGP to assert during reading of the packet
instead of gracefully resetting the connection with the peer.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If the timer is not explicitly configured for a peer, the default timer
is not taken into account and SendHoldTimer mechanism does not work at all.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
We are allocating temporary memory for information about
what to process in this thread, which is not being cleaned
up on thread cancelling.
Signed-off-by: Samanvitha B Bhargav <bsmanvitha@vmware.com>
See the BGP message sequence:
R1 R2
| updates |
|------------------>|
| |
| refresh request |
x<------------------|
| |
| updates cont. |
|------------------>|
| |
| end-of-rib |
|------------------>|
| |
When R1 and R2 establish BGP session, R1 begins to send initial updates.
If R2 sends a route-refresh request before EoR, it's silently ignored
by R1, and routes received earlier have no chance to be processed again.
RFC7313 says, "for a BGP speaker that supports the BGP Graceful Restart,
it MUST NOT send a BoRR for an <AFI, SAFI> to a neighbor before it sends
the EoR for the <AFI, SAFI> to the neighbor." But it doesn't forbid
route-refresh request to be sent before receiving EoR.
To handle this scenario, postpone response to refresh request until EoR
is sent.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
The "bgp_notify_" apis in bgp_packet.c generate a notification
to a peer, usually during error handling. The io pthread wants
to send notifications in a couple of cases during early
received-packet validation - but the existing api interacts
with the peer struct itself, and that's not safe.
Add a new api for use by the io pthread, and adjust the main
notify api so that it can avoid touching the peer struct.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Handle ORF REMOVE_ALL events as well, because now we just silently return, and
a stale dynamic prefix-list is used instead of the new one.
Before this, soft clear/route refresh was needed. Don't know the reason, but
we didn't send updates when modifying the filters.
Probably due to a massive change of filters and to avoid automatic updates :/
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
convert:
frr_with_mutex(..)
to:
frr_with_mutex (..)
To make all our code agree with what clang-format is going to produce
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
==175785== 0 bytes in 1 blocks are definitely lost in loss record 1 of 88
==175785== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==175785== by 0x492EB8E: qcalloc (in /usr/local/lib/libfrr.so.0.0.0)
==175785== by 0x269823: bgp_notify_decapsulate_hard_reset (in /usr/lib/frr/bgpd)
==175785== by 0x26C85D: bgp_notify_receive (in /usr/lib/frr/bgpd)
==175785== by 0x26E94E: bgp_process_packet (in /usr/lib/frr/bgpd)
==175785== by 0x4985349: thread_call (in /usr/local/lib/libfrr.so.0.0.0)
==175785== by 0x491D521: frr_run (in /usr/local/lib/libfrr.so.0.0.0)
==175785== by 0x1EBEE8: main (in /usr/lib/frr/bgpd)
==175785==
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Related: https://datatracker.ietf.org/doc/html/draft-ietf-idr-bfd-subcode
When BFD Down notification comes and BGP is configured to track on BFD events,
send BGP Cease/BFD Down notification to the peer.
If RFC 8538 is enabled (Notification support for Graceful-Restart), notification
should be encapsulated into Hard Reset message.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>