Commit Graph

7786 Commits

Author SHA1 Message Date
Donatas Abraitis
b29169073b bgpd: Check the actual remaining stream length before taking TLV value
```
    0 0xb50b9f898028 in __sanitizer_print_stack_trace (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x368028) (BuildId: 3292703ed7958b20076550c967f879db8dc27ca7)
    1 0xb50b9f7ed8e4 in fuzzer::PrintStackTrace() (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x2bd8e4) (BuildId: 3292703ed7958b20076550c967f879db8dc27ca7)
    2 0xb50b9f7d4d9c in fuzzer::Fuzzer::CrashCallback() (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x2a4d9c) (BuildId: 3292703ed7958b20076550c967f879db8dc27ca7)
    3 0xe0d12d7469cc  (linux-vdso.so.1+0x9cc) (BuildId: 1a77697e9d723fe22246cfd7641b140c427b7e11)
    4 0xe0d12c88f1fc in __pthread_kill_implementation nptl/pthread_kill.c:43:17
    5 0xe0d12c84a678 in gsignal signal/../sysdeps/posix/raise.c:26:13
    6 0xe0d12c83712c in abort stdlib/abort.c:79:7
    7 0xe0d12d214724 in _zlog_assert_failed /home/ubuntu/frr-public/frr_public_private-libfuzzer/lib/zlog.c:789:2
    8 0xe0d12d1285e4 in stream_get /home/ubuntu/frr-public/frr_public_private-libfuzzer/lib/stream.c:324:3
    9 0xb50b9f8e47c4 in bgp_attr_encap /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_attr.c:2758:3
    10 0xb50b9f8dcd38 in bgp_attr_parse /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_attr.c:3783:10
    11 0xb50b9faf74b4 in bgp_update_receive /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:2383:20
    12 0xb50b9faf1dcc in bgp_process_packet /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:4075:11
    13 0xb50b9f8c90d0 in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_main.c:582:3
```

Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 0998b38e4d)
2024-07-31 12:17:41 +00:00
Donatas Abraitis
6485e39417 bgpd: Do not process VRF import/export to/from auto created VRF instances
Fixes the crash:

```
(gdb) bt
0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=124583315603008) at ./nptl/pthread_kill.c:44
1  __pthread_kill_internal (signo=11, threadid=124583315603008) at ./nptl/pthread_kill.c:78
2  __GI___pthread_kill (threadid=124583315603008, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
3  0x0000714ed0242476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
4  0x0000714ed074cfb7 in core_handler (signo=11, siginfo=0x7ffe6d9792b0, context=0x7ffe6d979180) at lib/sigevent.c:258
5  <signal handler called>
6  0x000060f55e33ffdd in route_table_get_info (table=0x0) at ./lib/table.h:177
7  0x000060f55e340053 in bgp_dest_table (dest=0x60f56dabb840) at ./bgpd/bgp_table.h:156
8  0x000060f55e340c9f in is_route_injectable_into_vpn (pi=0x60f56dbc4a60) at ./bgpd/bgp_mplsvpn.h:331
9  0x000060f55e34507c in vpn_leak_from_vrf_update (to_bgp=0x60f56da52070, from_bgp=0x60f56da75af0, path_vrf=0x60f56dbc4a60) at bgpd/bgp_mplsvpn.c:1575
10 0x000060f55e346657 in vpn_leak_from_vrf_update_all (to_bgp=0x60f56da52070, from_bgp=0x60f56da75af0, afi=AFI_IP) at bgpd/bgp_mplsvpn.c:2028
11 0x000060f55e340c10 in vpn_leak_postchange (direction=BGP_VPN_POLICY_DIR_TOVPN, afi=AFI_IP, bgp_vpn=0x60f56da52070, bgp_vrf=0x60f56da75af0) at ./bgpd/bgp_mplsvpn.h:310
12 0x000060f55e34a692 in vpn_leak_postchange_all () at bgpd/bgp_mplsvpn.c:3737
13 0x000060f55e3d91fc in router_bgp (self=0x60f55e5cbc20 <router_bgp_cmd>, vty=0x60f56e2d7660, argc=3, argv=0x60f56da19830) at bgpd/bgp_vty.c:1601
14 0x0000714ed069ddf5 in cmd_execute_command_real (vline=0x60f56da32a80, vty=0x60f56e2d7660, cmd=0x0, up_level=0) at lib/command.c:1002
15 0x0000714ed069df6e in cmd_execute_command (vline=0x60f56da32a80, vty=0x60f56e2d7660, cmd=0x0, vtysh=0) at lib/command.c:1061
16 0x0000714ed069e51e in cmd_execute (vty=0x60f56e2d7660, cmd=0x60f56dbf07d0 "router bgp 100\n", matched=0x0, vtysh=0) at lib/command.c:1227
17 0x0000714ed076faa0 in vty_command (vty=0x60f56e2d7660, buf=0x60f56dbf07d0 "router bgp 100\n") at lib/vty.c:616
18 0x0000714ed07719c4 in vty_execute (vty=0x60f56e2d7660) at lib/vty.c:1379
19 0x0000714ed07740f0 in vtysh_read (thread=0x7ffe6d97c700) at lib/vty.c:2374
20 0x0000714ed07685c4 in event_call (thread=0x7ffe6d97c700) at lib/event.c:1995
21 0x0000714ed06e3351 in frr_run (master=0x60f56d1d2e40) at lib/libfrr.c:1232
22 0x000060f55e2c4b44 in main (argc=7, argv=0x7ffe6d97c978) at bgpd/bgp_main.c:555
(gdb)
```

Fixes https://github.com/FRRouting/frr/issues/16484

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 04f9372409)
2024-07-29 09:20:17 +00:00
Chirag Shah
36a70b5d20 bgpd: backpressure - fix evpn route sync to zebra
In scaled EVPN + ipv4/ipv6 uni route sync to zebra,
some of the ipv4/ipv6 routes skipped reinstallation
due to incorrect local variable's stale value.

Once the local variable value reset in each loop
iteration all skipped routes synced to zebra properly.

Ticket: #3948828

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
2024-07-25 21:17:02 +03:00
Rajasekar Raja
d4e8279adc bgpd: backpressure - log error for evpn when route install to zebra fails.
log error for evpn in case route install to zebra fails.

Ticket :#3992392

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2024-07-25 21:15:20 +03:00
Rajasekar Raja
f9a9d424f9 bgpd: backpressure - fix ret value evpn_route_select_install
The return value of evpn_route_select_install is ignored in all cases
except during vni route table install/uninstall and based on the
returned value, an error is logged. Fixing this.

Ticket :#3992392

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2024-07-25 21:15:17 +03:00
Rajasekar Raja
4c68489d2f bgpd: backpressure - Avoid use after free
Coverity complains there is a use after free (1598495 and 1598496)
At this point, most likely dest->refcount cannot go 1 and free up
the dest, but there might be some code path where this can happen.

Fixing this with a simple order change (no harm fix).

Ticket :#4001204

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
(cherry picked from commit 40965e5999)
2024-07-23 14:47:38 +00:00
Donatas Abraitis
2a06d1c5e0
Merge pull request #16404 from FRRouting/mergify/bp/dev/10.1/pr-16368
bgpd: backpressure - fix to properly remove dest for bgp under deletion (backport #16368)
2024-07-16 22:08:00 -07:00
Rajasekar Raja
25d515cb25 bgpd: backpressure - Improve debuggability
Improve debuggability in backpressure code.

Ticket :#3980988

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
(cherry picked from commit 186db96c06)
2024-07-16 16:15:32 +00:00
Rajasekar Raja
9f9c0bc244 bgpd: backpressure - fix to properly remove dest for bgp under deletion
In case of imported routes (L3vni/vrf leaks), when a bgp instance is
being deleted, the peer->bgp comparision with the incoming bgp to remove
the dest from the pending fifo is wrong. This can lead to the fifo
having stale entries resulting in crash.

Two changes are done here.
 - Instead of pop/push items in list if the struct bgp doesnt match,
   simply iterate the list and remove the expected ones.

 - Corrected the way bgp is fetched from dest rather than relying on
   path_info->peer so that it works for all kinds of routes.

Ticket :#3980988

Signed-off-by: Chirag Shah <chirag @nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
(cherry picked from commit 4395fcd8e1)
2024-07-16 16:15:31 +00:00
Donatas Abraitis
2ab249e580 bgpd: Skip empty (auto created) VRF instances when deleting a default BGP instance
Auto created VRF instances does not have any config, so it's not relevant
depending on them.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit bfedb38110)
2024-07-16 14:10:07 +00:00
Donatas Abraitis
b5db1e7352 bgpd: Skip automatically created BGP instances for show CMDs
When using e.g. `adverise-all-vni`, and/or `import vrf ...`, the VRF instance
is created with a default's VRF ASN and tagged as AUTO_VRF. We MUST skip them
here also.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 03c086866b)
2024-07-16 14:10:07 +00:00
Donatas Abraitis
6c573ce305 bgpd: Mark VRF instance as auto created if import vrf is configured for this instance
If we create a new BGP instance (in this case VRF instance), it MUST be marked
as auto created, to avoid bgpd changing VRF instance's ASN to the default VRF's.

That's because of the ordering when FRR reload is happening.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 80a4f87c9a)
2024-07-16 14:10:06 +00:00
Donatas Abraitis
8eb7d0ac1a bgpd: Ignore RFC8212 for BGP Confederations
RFC 8212 should be restricted for eBGP peers.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit fa2cc09d45)
2024-07-01 14:19:50 +00:00
Piotr Suchy
5bd66b695d bgpd: Ignore routes from evpn if VRF is unknown
Fix for a bug, where FRR fails to install route received for an unknown but later-created VRF - detailed description can be found here https://github.com/FRRouting/frr/issues/13708

Signed-off-by: Piotr Suchy <psuchy@akamai.com>
(cherry picked from commit 8044d73300)
2024-06-28 08:34:57 +00:00
Loïc Sang
73b18210f2 bgpd: avoid clearing routes for peers that were never established
Under heavy system load with many peers in passive mode and a large
number of routes, bgpd can enter an infinite loop. This occurs while
processing timeout BGP_OPEN messages, which prevents it from accepting
new connections. The following log entries illustrate the issue:
>bgpd[6151]: [VX6SM-8YE5W][EC 33554460] 3.3.2.224: nexthop_set failed, resetting connection - intf 0x0
>bgpd[6151]: [P790V-THJKS][EC 100663299] bgp_open_receive: bgp_getsockname() failed for peer: 3.3.2.224
>bgpd[6151]: [HTQD2-0R1WR][EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 3.3.2.224
... repeating

The issue occurs when bgpd handles a massive number of routes in the RIB
while receiving numerous BGP_OPEN packets. If bgpd is overloaded, it
fails to process these packets promptly, leading the remote peer to
close the connection and resend BGP_OPEN packets.

When bgpd eventually starts processing these timeout BGP_OPEN packets,
it finds the TCP connection closed by the remote peer, resulting in
"bgp_stop()" being called. For each timeout peer, bgpd must iterate
through the routing table, which is time-consuming and causes new
incoming BGP_OPEN packets to timeout, perpetuating the infinite loop.

To address this issue, the code is modified to check if the peer has
been established at least once before calling "bgp_clear_route_all()".
This ensures that routes are only cleared for peers that had a
successful session, preventing unnecessary iterations over the routing
table for peers that never established a connection.

With this change, BGP_OPEN timeout messages may still occur, but in the
worst case, bgpd will stabilize. Before this patch, bgpd could enter a
loop where it was unable to accpet any new connections.

Signed-off-by: Loïc Sang <loic.sang@6wind.com>
(cherry picked from commit e0ae285eb8)
2024-06-26 20:46:36 +00:00
Russ White
6ea0a93ceb
Merge pull request #16285 from FRRouting/mergify/bp/dev/10.1/pr-15838
bgpd: fix "bgp as-pah access-list" with "set aspath exclude" set/unset issue (backport #15838)
2024-06-25 08:42:02 -04:00
Russ White
37451922d7
Merge pull request #16291 from FRRouting/mergify/bp/dev/10.1/pr-16214
bgpd: A couple more fixes for Tunnel encapsulation handling (backport #16214)
2024-06-25 07:30:45 -04:00
Donatas Abraitis
c01d60b037 bgpd: Check if we have real stream data for tunnel encapsulation sub-tlvs
When the packet is malformed it can use whatever values it wants. Let's check
what the real data we have in a stream instead of relying on malformed values.

Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 9929486d6b)
2024-06-25 11:27:43 +00:00
Donatas Abraitis
c105f9acd9 bgpd: Adjust the length of tunnel encap sub-tlv by sub-tlv type
Fixes: 79563af564 ("bgpd: Get 1 or 2 octets for Sub-TLV length (Tunnel Encap attr)")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 34b209f0ae)
2024-06-25 11:27:43 +00:00
Donatas Abraitis
26fd1f8167 bgpd: Relax OAD (One-Administration-Domain) for RFC8212
RFC 8212 defines leak prevention for eBGP peers, but BGP-OAD defines a new
peering type One Administrative Domain (OAD), where multiple ASNs could be used
inside a single administrative domain. OAD allows sending non-transitive attributes,
so this prevention should be relaxed too.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 3b98ddf501)
2024-06-25 11:25:33 +00:00
Donatas Abraitis
48b0d338e3
Merge pull request #16281 from FRRouting/mergify/bp/dev/10.1/pr-16213
bgpd: Check if we have really enough data before doing memcpy for FQDN capability (backport #16213)
2024-06-25 13:48:18 +03:00
Donatas Abraitis
a83e5971d5
Merge pull request #16278 from FRRouting/mergify/bp/dev/10.1/pr-16211
bgpd: Check if we have really enough data before doing memcpy for software version (backport #16211)
2024-06-25 13:47:50 +03:00
Francois Dumontet
c05a11df9e bgpd: fix "bgp as-pah access-list" with "set aspath exclude" set/unset issues
whith the following config

router bgp 65001
 no bgp ebgp-requires-policy
 neighbor 192.168.1.2 remote-as external
 neighbor 192.168.1.2 timers 3 10
 !
 address-family ipv4 unicast
  neighbor 192.168.1.2 route-map r2 in
 exit-address-family
exit
!
bgp as-path access-list FIRST seq 5 permit ^65
bgp as-path access-list SECOND seq 5 permit 2$
!
route-map r2 permit 6
 match ip address prefix-list p2
 set as-path exclude as-path-access-list SECOND
exit
!
route-map r2 permit 10
 match ip address prefix-list p1
 set as-path exclude 65003
exit
!
route-map r2 permit 20
 match ip address prefix-list p3
 set as-path exclude all
exit

making some
no bgp as-path access-list SECOND permit 2$
bgp as-path access-list SECOND permit 3$

clear bgp *

no bgp as-path access-list SECOND permit 3$
bgp as-path access-list SECOND permit 2$

clear bgp *

will induce some crashes

thus  we rework the links between aslists and aspath_exclude

Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
(cherry picked from commit 094dcc3cda)
2024-06-25 05:12:09 +00:00
Donatas Abraitis
b98a3a1fa8 bgpd: Check if we have really enough data before doing memcpy for FQDN capability
We advance data pointer (data++), but we do memcpy() with the length that is 1-byte
over, which is technically heap overflow.

```
==411461==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x50600011da1a at pc 0xc4f45a9786f0 bp 0xffffed1e2740 sp 0xffffed1e1f30
READ of size 4 at 0x50600011da1a thread T0
    0 0xc4f45a9786ec in __asan_memcpy (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x3586ec) (BuildId: e794c5f796eee20c8973d7efb9bf5735e54d44cd)
    1 0xc4f45abf15f8 in bgp_dynamic_capability_fqdn /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3457:4
    2 0xc4f45abdd408 in bgp_capability_msg_parse /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3911:4
    3 0xc4f45abdbeb4 in bgp_capability_receive /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3980:9
    4 0xc4f45abde2cc in bgp_process_packet /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:4109:11
    5 0xc4f45a9b6110 in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_main.c:582:3
```

Found by fuzzing.

Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit b685ab5e1b)
2024-06-24 20:41:15 +00:00
Donatas Abraitis
9f1e1eb142 bgpd: Check if we have really enough data before doing memcpy for software version
If we receive CAPABILITY message (software-version), we SHOULD check if we really
have enough data before doing memcpy(), that could also lead to buffer overflow.

(data + len > end) is not enough, because after this check we do data++ and later
memcpy(..., data, len). That means we have one more byte.

Hit this through fuzzing by

```
    0 0xaaaaaadf872c in __asan_memcpy (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x35872c) (BuildId: 9c6e455d0d9a20f5a4d2f035b443f50add9564d7)
    1 0xaaaaab06bfbc in bgp_dynamic_capability_software_version /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3713:3
    2 0xaaaaab05ccb4 in bgp_capability_msg_parse /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3839:4
    3 0xaaaaab05c074 in bgp_capability_receive /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3980:9
    4 0xaaaaab05e48c in bgp_process_packet /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:4109:11
    5 0xaaaaaae36150 in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_main.c:582:3
```

Hit this again by Iggy \m/

Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 5d7af51c4f)
2024-06-24 20:40:00 +00:00
Donatas Abraitis
dc98037015 bgpd: Remove redundant whitespace before printing the reason of the failed peer
Before:

```
Neighbor        EstdCnt DropCnt ResetTime Reason
127.0.0.1             0       0     never  Waiting for peer OPEN (n/a)
```

After:

```
Neighbor        EstdCnt DropCnt ResetTime Reason
127.0.0.1             0       0     never Waiting for peer OPEN (n/a)
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit b5bd626a82)
2024-06-24 19:41:41 +00:00
Donatas Abraitis
348063bbde bgpd: Set last reset reason to admin shutdown if it was manually
Before this patch, we always printed the last reason "Waiting for OPEN", but
if it's a manual shutdown, then we technically are not waiting for OPEN.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit c25c7e929d)
2024-06-24 19:41:41 +00:00
Donatas Abraitis
325c63c6ad
Merge pull request #16255 from FRRouting/mergify/bp/dev/10.1/pr-16059
bgpd: fixed failing to remove VRF if there is a stale l3vni (backport #16059)
2024-06-21 17:51:43 +03:00
Philippe Guibert
7bb66b83b0 bgpd: fix do not use api.backup_nexthop in ZAPI message
The backup_nexthop entry list has been populated by mistake,
and should not. Fix this by reverting the introduced behavior.

Fixes: 237ebf8d45 ("bgpd: rework bgp_zebra_announce() function, separate nexthop handling")

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit d4390fc217)
2024-06-21 06:44:39 +00:00
Kacper Kwaśny
7749253e8c bgpd: fixed failing remove of vrf if there is a stale l3vni
Problem statement:
==================
When a vrf is deleted from the kernel, before its removed from the FRR
config, zebra gets to delete the the vrf and assiciated state.

It does so by sending a request to delete the l3 vni associated with the
vrf followed by a request to delete the vrf itself.

2023/10/06 06:22:18 ZEBRA: [JAESH-BABB8] Send L3_VNI_DEL 1001 VRF
testVRF1001 to bgp
2023/10/06 06:22:18 ZEBRA: [XC3P3-1DG4D] MESSAGE: ZEBRA_VRF_DELETE
testVRF1001

The zebra client communication is asynchronous and about 1/5 cases the
bgp client process them in a different order.

2023/10/06 06:22:18 BGP: [VP18N-HB5R6] VRF testVRF1001(766) is to be
deleted.
2023/10/06 06:22:18 BGP: [RH4KQ-X3CYT] VRF testVRF1001(766) is to be
disabled.
2023/10/06 06:22:18 BGP: [X8ZE0-9TS5H] VRF disable testVRF1001 id 766
2023/10/06 06:22:18 BGP: [X67AQ-923PR] Deregistering VRF 766
2023/10/06 06:22:18 BGP: [K52W0-YZ4T8] VRF Deletion:
testVRF1001(4294967295)
.. and a bit later :
2023/10/06 06:22:18 BGP: [MRXGD-9MHNX] DJERNAES: process L3VNI 1001 DEL
2023/10/06 06:22:18 BGP: [NCEPE-BKB1G][EC 33554467] Cannot process L3VNI
1001 Del - Could not find BGP instance

When the bgp vrf config is removed later it fails on the sanity check if
l3vni is removed.

        if (bgp->l3vni) {
            vty_out(vty, "%% Please unconfigure l3vni %u\n",
                bgp->l3vni);
            return CMD_WARNING_CONFIG_FAILED;
        }

Solution:
=========
The solution is to make bgp cleanup the l3vni a bgp instance is going
down.

The fix:
========
The fix is to add a function in bgp_evpn.c to be responsible for for
deleting the local vni, if it should be needed, and call the function
from bgp_instance_down().

Testing:
========
Created a test, which can run in container lab that remove the vrf on
the host before removing the vrf and the bgp config form frr. Running
this test in a loop trigger the problem 18 times of 100 runs. After the
fix it did not fail.

To verify the fix a log message (which is not in the code any longer)
were used when we had a stale l3vni and needed to call
bgp_evpn_local_l3vni_del() to do the cleanup. This were hit 20 times in
100 test runs.

Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>

bgpd: braces {} are not necessary for single line block

Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
(cherry picked from commit 171d2583d0)
2024-06-20 07:51:56 +00:00
Donatas Abraitis
e7bc47b501 bgpd: Check against extended community unit size for link bandwidth
If we receive a malformed packets, this could lead ptr_get_be64() reading
the packets more than needed (heap overflow).

```
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
    0 0xaaaaaadf86ec in __asan_memcpy (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x3586ec) (BuildId: 78123cd26ada92b8b59fc0d74d292ba70c9d2e01)
    1 0xaaaaaaeb60fc in ptr_get_be64 /home/ubuntu/frr-public/frr_public_private-libfuzzer/./lib/stream.h:377:2
    2 0xaaaaaaeb5b90 in ecommunity_linkbw_present /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_ecommunity.c:1895:10
    3 0xaaaaaae50f30 in bgp_attr_ext_communities /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_attr.c:2639:8
    4 0xaaaaaae49d58 in bgp_attr_parse /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_attr.c:3776:10
    5 0xaaaaab063260 in bgp_update_receive /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:2371:20
    6 0xaaaaab05df00 in bgp_process_packet /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:4063:11
    7 0xaaaaaae36110 in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_main.c:582:3
```

This is triggered when receiving such a packet (malformed):

```
(gdb) bt
0  ecommunity_linkbw_present (ecom=0x555556287990, bw=bw@entry=0x7fffffffda68)
    at bgpd/bgp_ecommunity.c:1802
1  0x000055555564fcac in bgp_attr_ext_communities (args=0x7fffffffd840) at bgpd/bgp_attr.c:2619
2  bgp_attr_parse (peer=peer@entry=0x55555628cdf0, attr=attr@entry=0x7fffffffd960, size=size@entry=20,
    mp_update=mp_update@entry=0x7fffffffd940, mp_withdraw=mp_withdraw@entry=0x7fffffffd950)
    at bgpd/bgp_attr.c:3755
3  0x00005555556aa655 in bgp_update_receive (connection=connection@entry=0x5555562aa030,
    peer=peer@entry=0x55555628cdf0, size=size@entry=41) at bgpd/bgp_packet.c:2324
4  0x00005555556afab7 in bgp_process_packet (thread=<optimized out>) at bgpd/bgp_packet.c:3897
5  0x00007ffff7ac2f73 in event_call (thread=thread@entry=0x7fffffffdc70) at lib/event.c:2011
6  0x00007ffff7a6fb90 in frr_run (master=0x555555bc7c90) at lib/libfrr.c:1212
7  0x00005555556457e1 in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:543
(gdb) p *ecom
$1 = {refcnt = 1, unit_size = 8 '\b', disable_ieee_floating = false, size = 2, val = 0x555556282150 "",
  str = 0x5555562a9c30 "UNK:0, 255 UNK:2, 6"}
```

Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-06-11 10:03:17 +03:00
Donald Sharp
2a00a648f1
Merge pull request #15900 from mikemallin/v6-vtep-lib-upstream
lib, bgpd, tests, zebra: prefix_sg changes for V6 VTEP
2024-06-07 14:34:11 -04:00
Russ White
84af49b0ae
Merge pull request #15434 from louis-6wind/labels-hash
bgpd: move labels from extra to extra->labels and add them to adj-rib-in and adj-rib-out
2024-06-06 16:27:38 -04:00
Philippe Guibert
2b6bcda64b bgpd: fix label in adj-rib-out
After modifying the "label vpn export value", the vpn label information
of the VRF is not updated to the peers.

For example, the 192.168.0.0/24 prefix is announced to the peer with a
label value of 222.

> router bgp 65500
> [..]
>  neighbor 192.0.2.2 remote-as 65501
>  address-family ipv4-vpn
>   neighbor 192.0.2.2 activate
>  exit-address-family
> exit
> router bgp 65500 vrf vrf2
>  address-family ipv4 unicast
>   network 192.168.0.0/24
>   label vpn export 222
>   rd vpn export 444:444
>   rt vpn both 53:100
>   export vpn
>   import vpn
>  exit-address-family

Changing the label with "label vpn export" does not update the label
value to the peer unless the BGP sessions is re-established.

No labels are stored are stored struct bgp_adj_out so that it is
impossible to compare the current value with the previous value
in adj-RIB-out.

Reference the bgp_labels pointer in struct bgp_adj_out and compare the
values when updating adj-RIB-out.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 13:11:29 +02:00
Philippe Guibert
9fd26c1aa5 bgpd: fix labels in adj-rib-in
In a BGP L3VPN context using ADJ-RIB-IN (ie. enabled with
'soft-reconfiguration inbound'), after applying a deny route-map and
removing it, the remote MPLS label information is lost. As a result, BGP
is unable to re-install the related routes in the RIB.

For example,

> router bgp 65500
> [..]
>  neighbor 192.0.2.2 remote-as 65501
>  address-family ipv4 vpn
>   neighbor 192.0.2.2 activate
>   neighbor 192.0.2.2 soft-reconfiguration inbound

The 192.168.0.0/24 prefix has a remote label value of 102 in the BGP
RIB.

> # show bgp ipv4 vpn 192.168.0.0/24
>  BGP routing table entry for 444:1:192.168.0.0/24, version 2
>  [..]
>      192.168.0.0 from 192.0.2.2
>        Origin incomplete, metric 0, valid, external, best (First path received)
>        Extended Community: RT:52:100
>        Remote label: 102

A route-map now filter all incoming BGP updates:

> route-map rmap deny 1
> router bgp 65500
>  address-family ipv4 vpn
>   neighbor 192.0.2.2 route-map rmap in

The prefix is now filtered:

> # show bgp ipv4 vpn 192.168.0.0/24
> #

The route-map is detached:

> router bgp 65500
>  address-family ipv4 vpn
>   no neighbor 192.168.0.1 route-map rmap in

The BGP RIB entry is present but the remote label is lost:

> # show bgp ipv4 vpn 192.168.0.0/24
>  BGP routing table entry for 444:1:192.168.0.0/24, version 2
>  [..]
>      192.168.0.0 from 192.0.2.2
>        Origin incomplete, metric 0, valid, external, best (First path received)
>        Extended Community: RT:52:100

The reason for the loose is that labels are stored within struct attr ->
struct extra -> struct bgp_labels but not in the struct bgp_adj_in.

Reference the bgp_labels pointer in struct bgp_adj_in and use its values
when doing a soft reconfiguration of the BGP table.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 13:11:29 +02:00
Louis Scalbert
b804993394 bgpd: get rid of has_valid_label in bgp_update()
Get rid of has_valid_label in bgp_update() to prepare the next commits.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 13:11:29 +02:00
Louis Scalbert
ca32945b1f bgpd: move labels from extra to extra->labels
Move labels from extra to extra->labels. Labels are now stored in a hash
list.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 13:11:29 +02:00
Louis Scalbert
3c86f776f0 bgpd: add bgp_labels hash
Add bgp_labels type and hash list.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 13:11:29 +02:00
Louis Scalbert
a6de910448 bgpd: store number of labels with 8 bits
8 bits are sufficient to store the number of labels because the current
maximum is 2.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 13:11:29 +02:00
Louis Scalbert
ae54c9e455 bgpd: fix too leading tabs in vnc_import_bgp
Small rework to fix the following checkpatch warning:

> < WARNING: Too many leading tabs - consider code refactoring
> < #2142: FILE: /tmp/f1-1616988/vnc_import_bgp.c:2142:

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 13:11:22 +02:00
Louis Scalbert
64fe15fd28 bgpd: add bgp_path_info_num_labels()
Add bgp_path_info_num_labels() to get the number of labels stored in
a path_info structure.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 11:08:46 +02:00
Louis Scalbert
1fcedd00d1 bgpd: rework vni printing in route_vty_out_detail()
In route_vty_out_detail(), tag_buf stores a string representation of
the VNI label.

Rename tag_buf to vni_buf for clarity and rework the code a little bit
to prepare the following commits.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 11:08:46 +02:00
Louis Scalbert
c27ad43694 bgpd: num_labels cannot be greater than BGP_MAX_LABELS
num_labels cannot be greater than BGP_MAX_LABELS by design.

Remove the check and the override.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 11:08:46 +02:00
Louis Scalbert
04748e36a5 bgpd: add bgp_path_info_labels_same()
Add bgp_path_info_labels_same() to compare labels with labels from
path_info. Remove labels_same() that was used for mplsvpn only.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 11:08:46 +02:00
Louis Scalbert
9771f1a03e bgpd: optimize label copy for new path_info
In bgp_update(), path_info *new has just been created and has void
labels. bgp_labels_same() is always false.

Do not compare previous labels before setting them.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 11:08:46 +02:00
Louis Scalbert
e93fa4daf0 bgpd: do not init labels in extra
No need to init labels at extra allocation. num_labels is the number
of set labels in label[] and is initialized to 0 by default.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 11:08:46 +02:00
Louis Scalbert
7a513e3361 bgpd: add bgp_path_info_has_valid_label()
Add bgp_path_has_valid_label to check that a path_info has a valid
label.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 11:08:46 +02:00
Louis Scalbert
747da057e8 bgpd: check and set extra num_labels
The handling of MPLS labels in BGP faces an issue due to the way labels
are stored in memory. They are stored in bgp_path_info but not in
bgp_adj_in and bgp_adj_out structures. As a consequence, some
configuration changes result in losing labels or even a bgpd crash. For
example, when retrieving routes from the Adj-RIB-in table
("soft-reconfiguration inbound" enabled), labels are missing.

bgp_path_info stores the MPLS labels, as shown below:

> struct bgp_path_info {
>   struct bgp_path_info_extra *extra;
>   [...]
> struct bgp_path_info_extra {
>	mpls_label_t label[BGP_MAX_LABELS];
>	uint32_t num_labels;
>   [...]

To solve those issues, a solution would be to set label data to the
bgp_adj_in and bgp_adj_out structures in addition to the
bgp_path_info_extra structure. The idea is to reference a common label
pointer in all these three structures. And to store the data in a hash
list in order to save memory.

However, an issue in the code prevents us from setting clean data
without a rework. The extra->num_labels field, which is intended to
indicate the number of labels in extra->label[], is not reliably checked
or set. The code often incorrectly assumes that if the extra pointer is
present, then a label must also be present, leading to direct access to
extra->label[] without verifying extra->num_labels. This assumption
usually works because extra->label[0] is set to MPLS_INVALID_LABEL when
a new bgp_path_info_extra is created, but it is technically incorrect.

Cleanup the label code by setting num_labels each time values are set in
extra->label[] and checking extra->num_labels before accessing the
labels.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-06-05 11:08:46 +02:00
Donatas Abraitis
f153b9a9b6 bgpd: Ignore auto created VRF BGP instances
Configuration:

```
vtysh <<EOF
configure

vrf vrf100
 vni 10100
exit-vrf

router bgp 50
 address-family l2vpn evpn
  advertise-all-vni
 exit-address-family
exit

router bgp 100 vrf vrf100
exit
EOF
```

TL;DR; When we configure `advertise-all-vni` (in this case), a new BGP instance
is created with the name vrf100, and ASN 50. Next, when we create
`router bgp 100 vrf vrf100`, we look for the BGP instance with the same name
and we found it, but ASNs are different 50 vs. 100.

Every such a new auto created instance is flagged with BGP_VRF_AUTO.

After the fix:

```
router bgp 50
 !
 address-family l2vpn evpn
  advertise-all-vni
 exit-address-family
exit
!
router bgp 100 vrf vrf100
exit
!
end
donatas.net(config)# router bgp 51
BGP is already running; AS is 50
donatas.net(config)# router bgp 50
donatas.net(config-router)# router bgp 101 vrf vrf100
BGP is already running; AS is 100
donatas.net(config)# router bgp 100 vrf vrf100
donatas.net(config-router)#
```

Fixes: https://github.com/FRRouting/frr/issues/16152
Fixes: https://github.com/FRRouting/frr/issues/9537

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-06-04 18:03:22 +03:00
David Ward
172dd682d9 bgpd: Adjust terminology related to DSCP
The default DSCP used for BGP connections is CS6. The DSCP value is
not part of the TCP header.

When setting the IP_TOS or IPV6_TCLASS socket options, the argument
is not the 6-bit DSCP value, but an 8-bit value for the former IPv4
Type of Service field or IPv6 Traffic Class field, respectively.

Fixes: 425bd64be8 ("bgpd: Allow bgp to control the DSCP session TOS value")
Signed-off-by: David Ward <david.ward@ll.mit.edu>
2024-06-02 06:44:59 -04:00