The bgp labelpool code is grabbing the vpn policy data structure.
This vpn_policy has a pointer to the bgp data structure. If
a item placed on the bgp label pool workqueue happens to sit
there for the microsecond or so and the operator issues a
`no router bgp...` command that corresponds to the vpn_policy
bgp pointer, when the workqueue is run it will crash because
the bgp pointer is now freed and something else owns it.
Modify the labelpool code to store the vrf id associated
with the request on the workqueue. When you wake up
if the vrf id still has a bgp pointer allow the request
to continue, else drop it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When leak_update() rechecks an existing path, it considers nothing to
update if the attributes and labels are not changed. However, it does
not take into account the nexthop validity.
Perform a leak update if the nexthop validity has changed.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Mark a nexthop as invalid if the origin VRF is unusable, either because
it does not exist or its interface is down.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Route aggregation should be checked after a route is added, and
not before. This is for code flow and consistency.
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
If the peer which has allowas-in enabled and then reimports the routes to another
local VRF, respect that value.
This was working with < 10.2 releases.
Fixes: d4426b62d2 ("bgpd: copy source vrf ASN to leaked route and block loops")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
For JSON output we don't need newline to be printed.
Before:
```
"lastUpdate":{"epoch":1734490463,"string":"Wed Dec 18 04:54:23 2024\n"
```
After:
```
"lastUpdate":{"epoch":1734678560,"string":"Fri Dec 20 09:09:20 2024"
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Releasing the vpn label from label pool chunk using bgp_lp_release routine whenever vpn session is removed.
bgp_lp_release will clear corresponding bit in the allocated map of the label pool chunk and increases nfree by 1
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Previously we couldn't install VPN routes with self AS in the path because
we never checked if we have allowas-in enabled, or not.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When a prefix is imported using the "network" command under a vrf, which
is a connected prefix, and in the context of label allocation per
nexthop:
..
>router bgp 1 vrf vrf1
> address-family ipv4 unicast
> redistribute static
> network 172.16.0.1/32 <--- connected network
> network 192.168.106.0/29
> label vpn export auto
> label vpn export allocation-mode per-nexthop
..
We encounter an MPLS entry where the nexthop is the prefix itself:
> 18 BGP 172.16.0.1 -
Actually, when using the "network" command, a bnc context is used, but
it is filled by using the prefix itself instead of the nexthop for other
BGP updates. Consequently, when picking up the original nexthop for
label allocation, the function behaves incorrectly. Instead ensure that
the nexthop type of bnc->nexthop is not a nexthop_ifindex; otherwise
fallback to the per vrf label.
Update topotests.
Signed-off-by: Loïc Sang <loic.sang@6wind.com>
When we get a update on a route that we already have information on
from another router and that route has been leaked ensure that
we do not crash when trying to releak the code when we may want
to modify the as path.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When we leak routes and are using a different ASN in the
source vrf from the target vrf, it's possible we could
create loops because of an incomplete as-path (missing
the source vrf ASN). This fix adds the source vrf ASN and
stops the importing of a BGP prefix that has the target
ASN in the as-path in the source vrf.
Signed-off-by: Don Slice <dslice@nvidia.com>
1. bgp coredump is observed when we delete default bgp instance
when we have multi-vrf; and route-leaking is enabled between
default, non-default vrfs.
Removing default router bgp when routes leaked between non-default vrfs.
- Routes are leaked from VRF-A to VRF-B
- VPN table is created with auto RD/RT in default instance.
- Default instance is deleted, we try to unimport the routes from all VRFs
- non-default VRF schedules a work-queue to process deleted routes.
- Meanwhile default bgp instance clears VPN tables and free the route
entries as well, which are still referenced by non-default VRFs which
have imported routes.
- When work queue process starts to delete imported route in VRF-A it cores
as it accesses freed memory.
- Whenever we delete bgp in default vrf, we skip deleting routes in the vpn
table, import and export lists.
- The default hidden bgp instance will not be listed in any of the show
commands.
- Whenever we create new default instance, handle it with AS number change
i.e. old hidden default bgp's AS number is updated and also changing
local_as for all peers.
2. A default instance is created with ASN of the vrf with the import
statement.
This may not be the ASN desired for the default table
- First problem with current behavior.
Define two vrfs with different ASNs and then add import between.
starting without any bgp config (no default instance)
A default instance is created with ASN of the vrf with the import
statement.
This may not be the ASN desired for the default table
- Second related problem. Start with a default instance and a vrf in a
different ASN. Do an import statement in the vrf for a bgp vrf instance
not yet defined and it auto-creates that bgp/vrf instance and it inherits
the ASN of the importing vrf
- Handle bgp instances with different ASNs and handle ASN for auto created
BGP instance
Signed-off-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Include SID structure information when removing an SRv6 End.DT4 or End.DT6 SID
from the forwarding plane.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Include SID structure information when installing an SRv6 End.DT6 or End.DT4 SID
in the forwarding plane.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Make the `sid_register()` function non-static to allow other BGP modules
(e.g. bgp_zebra.c) to register SIDs.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Currently, BGP allocates SIDs without interacting with Zebra.
Recently, the SRv6 implementation has been improved. Now, the daemons
need to interact with Zebra through ZAPI to obtain and release SIDs.
This commit extends BGP to request SIDs from Zebra instead of allocating
the SIDs on its own.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
When SRv6 VPN is unconfigured in BGP, BGP needs to interact with SID Manager to
release the SID and make it available to other daemons
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Currently, when SRv6 is enabled in BGP, BGP requests a locator chunk
from Zebra. Zebra assigns a locator chunk to BGP, and then BGP can
allocate SIDs from the locator chunk.
Recently, the implementation of SRv6 in Zebra has been improved, and a
new API has been introduced for obtaining/releasing the SIDs.
Now, the daemons no longer need to request a chunk.
Instead, the daemons interact with Zebra to obtain information about the
locator and subsequently to allocate/release the SIDs.
This commit extends BGP to use the new SRv6 API. In particular, it
removes the chunk throughout the BGP code and modifies BGP to
request/save/advertise the locator instead of the chunk.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
The "if_is_vrf" check is unnecessary because it’s already handled by
"if_get_vrf_loopback". Additionally, it ignores the default loopback and
could introduce potential bugs.
Fixes: 8b81f32e97 ("bgpd: fix label lost when vrf loopback comes back")
Signed-off-by: Loïc Sang <loic.sang@6wind.com>
Fix static-analyser warnings with BGP labels:
> $ scan-build make -j12
> bgpd/bgp_updgrp_packet.c:819:10: warning: Access to field 'extra' results in a dereference of a null pointer (loaded from variable 'path') [core.NullDereference]
> ? &path->extra->labels->label[0]
> ^~~~~~~~~
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Fixes the crash:
```
(gdb) bt
0 __pthread_kill_implementation (no_tid=0, signo=11, threadid=124583315603008) at ./nptl/pthread_kill.c:44
1 __pthread_kill_internal (signo=11, threadid=124583315603008) at ./nptl/pthread_kill.c:78
2 __GI___pthread_kill (threadid=124583315603008, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
3 0x0000714ed0242476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
4 0x0000714ed074cfb7 in core_handler (signo=11, siginfo=0x7ffe6d9792b0, context=0x7ffe6d979180) at lib/sigevent.c:258
5 <signal handler called>
6 0x000060f55e33ffdd in route_table_get_info (table=0x0) at ./lib/table.h:177
7 0x000060f55e340053 in bgp_dest_table (dest=0x60f56dabb840) at ./bgpd/bgp_table.h:156
8 0x000060f55e340c9f in is_route_injectable_into_vpn (pi=0x60f56dbc4a60) at ./bgpd/bgp_mplsvpn.h:331
9 0x000060f55e34507c in vpn_leak_from_vrf_update (to_bgp=0x60f56da52070, from_bgp=0x60f56da75af0, path_vrf=0x60f56dbc4a60) at bgpd/bgp_mplsvpn.c:1575
10 0x000060f55e346657 in vpn_leak_from_vrf_update_all (to_bgp=0x60f56da52070, from_bgp=0x60f56da75af0, afi=AFI_IP) at bgpd/bgp_mplsvpn.c:2028
11 0x000060f55e340c10 in vpn_leak_postchange (direction=BGP_VPN_POLICY_DIR_TOVPN, afi=AFI_IP, bgp_vpn=0x60f56da52070, bgp_vrf=0x60f56da75af0) at ./bgpd/bgp_mplsvpn.h:310
12 0x000060f55e34a692 in vpn_leak_postchange_all () at bgpd/bgp_mplsvpn.c:3737
13 0x000060f55e3d91fc in router_bgp (self=0x60f55e5cbc20 <router_bgp_cmd>, vty=0x60f56e2d7660, argc=3, argv=0x60f56da19830) at bgpd/bgp_vty.c:1601
14 0x0000714ed069ddf5 in cmd_execute_command_real (vline=0x60f56da32a80, vty=0x60f56e2d7660, cmd=0x0, up_level=0) at lib/command.c:1002
15 0x0000714ed069df6e in cmd_execute_command (vline=0x60f56da32a80, vty=0x60f56e2d7660, cmd=0x0, vtysh=0) at lib/command.c:1061
16 0x0000714ed069e51e in cmd_execute (vty=0x60f56e2d7660, cmd=0x60f56dbf07d0 "router bgp 100\n", matched=0x0, vtysh=0) at lib/command.c:1227
17 0x0000714ed076faa0 in vty_command (vty=0x60f56e2d7660, buf=0x60f56dbf07d0 "router bgp 100\n") at lib/vty.c:616
18 0x0000714ed07719c4 in vty_execute (vty=0x60f56e2d7660) at lib/vty.c:1379
19 0x0000714ed07740f0 in vtysh_read (thread=0x7ffe6d97c700) at lib/vty.c:2374
20 0x0000714ed07685c4 in event_call (thread=0x7ffe6d97c700) at lib/event.c:1995
21 0x0000714ed06e3351 in frr_run (master=0x60f56d1d2e40) at lib/libfrr.c:1232
22 0x000060f55e2c4b44 in main (argc=7, argv=0x7ffe6d97c978) at bgpd/bgp_main.c:555
(gdb)
```
Fixes https://github.com/FRRouting/frr/issues/16484
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
VRF-label association drops when the VRF loopback goes down, however, it
does not return once the interface is enabled again.
Logs show that when VRF loopback goes down, a label drop message is sent
to zebra and immediately resent label installation to zebra, trigged by
"vpn_leak_postchange_all()":
2024/07/16 13:26:29 BGP: [RVJ1J-J2T22] ifp down r1-cust1 vrf id 7
2024/07/16 13:26:29 BGP: [WA2QY-06STJ] vpn_leak_zebra_vrf_label_withdraw: deleting label for vrf VRF r1-cust1 (id=7)
2024/07/16 13:26:30 BGP: [S82AC-6YAC8] vpn_leak_zebra_vrf_label_update: vrf VRF r1-cust1: afi IPv4: setting label 80 for vrf id 7
Since the interface is down, the netlink message is not send to kernel.
Once the interface comes back, zebra ignore the installation assuming
the label is already seen.
To fix this, add a check for the interface status before attempting to
reinstall the label.
Signed-off-by: Loïc Sang <loic.sang@6wind.com>
num_labels cannot be greater than BGP_MAX_LABELS by design.
Remove the check and the override.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add bgp_path_info_labels_same() to compare labels with labels from
path_info. Remove labels_same() that was used for mplsvpn only.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The handling of MPLS labels in BGP faces an issue due to the way labels
are stored in memory. They are stored in bgp_path_info but not in
bgp_adj_in and bgp_adj_out structures. As a consequence, some
configuration changes result in losing labels or even a bgpd crash. For
example, when retrieving routes from the Adj-RIB-in table
("soft-reconfiguration inbound" enabled), labels are missing.
bgp_path_info stores the MPLS labels, as shown below:
> struct bgp_path_info {
> struct bgp_path_info_extra *extra;
> [...]
> struct bgp_path_info_extra {
> mpls_label_t label[BGP_MAX_LABELS];
> uint32_t num_labels;
> [...]
To solve those issues, a solution would be to set label data to the
bgp_adj_in and bgp_adj_out structures in addition to the
bgp_path_info_extra structure. The idea is to reference a common label
pointer in all these three structures. And to store the data in a hash
list in order to save memory.
However, an issue in the code prevents us from setting clean data
without a rework. The extra->num_labels field, which is intended to
indicate the number of labels in extra->label[], is not reliably checked
or set. The code often incorrectly assumes that if the extra pointer is
present, then a label must also be present, leading to direct access to
extra->label[] without verifying extra->num_labels. This assumption
usually works because extra->label[0] is set to MPLS_INVALID_LABEL when
a new bgp_path_info_extra is created, but it is technically incorrect.
Cleanup the label code by setting num_labels each time values are set in
extra->label[] and checking extra->num_labels before accessing the
labels.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Leaked route from the l3VRF are installed with the loopback as the
nexthop interface instead of the real interface.
> B>* 10.0.0.0/30 [20/0] is directly connected, lo (vrf default), weight 1, 00:21:01
Routing of packet from a L3VRF to the default L3VRF destined to a leak
prefix fails because of the default routing rules on Linux.
> 0: from all lookup local
> 1000: from all lookup [l3mdev-table]
> 32766: from all lookup main
> 32767: from all lookup default
When the packet is received in the loopback interface, the local rules
are checked without match, then the l3mdev-table says to route to the
loopback. A routing loop occurs (TTL is decreasing).
> 12:26:27.928748 ens37 In IP (tos 0x0, ttl 64, id 26402, offset 0, flags [DF], proto ICMP (1), length 84)
> 10.0.0.2 > 10.0.1.2: ICMP echo request, id 47463, seq 1, length 64
> 12:26:27.928784 red Out IP (tos 0x0, ttl 63, id 26402, offset 0, flags [DF], proto ICMP (1), length 84)
> 10.0.0.2 > 10.0.1.2: ICMP echo request, id 47463, seq 1, length 64
> 12:26:27.928797 ens38 Out IP (tos 0x0, ttl 63, id 26402, offset 0, flags [DF], proto ICMP (1), length 84)
> 10.0.0.2 > 10.0.1.2: ICMP echo request, id 47463, seq 1, length 64
Do not set the lo interface as a nexthop interface. Keep the real
interface where possible.
Fixes: db7cf73a33 ("bgpd: fix interface on leaks from redistribute connected")
Fixes: 067fbab4e4 ("bgpd: fix interface on leaks from network statement")
Fixes: 8a02d9fe1e ("bgpd: Set nh ifindex to VRF's interface, not the real")
Fixes: https://github.com/FRRouting/frr/issues/15909
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Today, with the following bgp instance configured, the
local VRF label is allocated even if it is not used.
> router bgp 65500 vrf vrf1
> address-family ipv4 unicast
> label vpn export allocation-mode per-nexthop
> label vpn export auto
> rd vpn export 444:1
> rt vpn both 52:100
> export vpn
> import vpn
The 'show mpls table' indicates that the 16 label value
is allocated, but never used in the exported prefixes.
> r1# show mpls table
> Inbound Label Type Nexthop Outbound Label
> -----------------------------------------------------
> 16 BGP vrf1 -
> 17 BGP 192.168.255.13 -
> 18 BGP 192.0.2.12 -
> 19 BGP 192.0.2.11 -
Fix this by only allocating new label values when really
used. Consequently, only 3 labels will be allocated instead
of previously 4.
> r1# show mpls table
> Inbound Label Type Nexthop Outbound Label
> -----------------------------------------------------
> 16 BGP 192.168.255.13 -
> 17 BGP 192.0.2.11 -
> 18 BGP 192.0.2.12 -
Fixes: 577be36a41 ("bgpd: add support for l3vpn per-nexthop label")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Currently bgp_path_info's are stored in reverse order
received. Sort them by the best path ordering.
This will allow for optimizations in the future on
how multipath is done.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This will allow a consistency of approach to adding/removing
pi's to from the workqueue for processing as well as properly
handling the dest->info pi list more appropriately.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The asan memory leak has been detected:
> Direct leak of 16 byte(s) in 1 object(s) allocated from:
> #0 0x7f9066dadd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
> #1 0x7f9066779b5d in qcalloc lib/memory.c:105
> #2 0x556d6ca527c2 in vpn_leak_zebra_vrf_sid_update_per_af bgpd/bgp_mplsvpn.c:389
> #3 0x556d6ca530e1 in vpn_leak_zebra_vrf_sid_update bgpd/bgp_mplsvpn.c:451
> #4 0x556d6ca64b3b in vpn_leak_postchange bgpd/bgp_mplsvpn.h:311
> #5 0x556d6ca64b3b in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3751
> #6 0x556d6cb9f116 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3337
> #7 0x7f906685a6b6 in zclient_read lib/zclient.c:4490
> #8 0x7f9066826a32 in event_call lib/event.c:2011
> #9 0x7f906675c444 in frr_run lib/libfrr.c:1217
> #10 0x556d6c980d52 in main bgpd/bgp_main.c:545
> #11 0x7f9065784c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Fix this by freeing the previous memory chunk.
Fixes: b72c9e1475 ("bgpd: cli for SRv6 SID alloc to redirect to vrf (step4)")
Fixes: 527588aa78 ("bgpd: add support for per-VRF SRv6 SID")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Locally leaked routes remain active after the nexthop VRF interface goes
down.
Update route leaking when the loopback or a VRF interface state change is
received from zebra.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Leaked routes from prefixes defined with 'network <prefix>' are inactive
because they have no valid nexthop interface.
> vrf r1-cust1
> ip route 172.16.29.0/24 192.168.1.2
> router bgp 5227 vrf r1-cust1
> no bgp network import-check
> address-family ipv4 unicast
> network 172.16.29.0/24
> rd vpn export 10:1
> rt vpn import 52:100
> rt vpn export 52:101
> export vpn
> import vpn
> exit-address-family
> exit
> !
> router bgp 5227 vrf r1-cust4
> bgp router-id 192.168.1.1
> !
> address-family ipv4 unicast
> network 192.0.2/24
> rd vpn export 10:1
> rt vpn import 52:101
> rt vpn export 52:100
> export vpn
> import vpn
> exit-address-family
> exit
Extract from the routing table:
> VRF r1-cust1:
> S>* 172.16.29.0/24 [1/0] via 192.168.1.2, r1-eth4, weight 1, 00:47:53
>
> VRF r1-cust4:
> B 172.16.29.0/24 [20/0] is directly connected, unknown (vrf r1-cust1) inactive, weight 1, 00:03:40
Routes imported through the "network" command, as opposed to those
redistributed from the routing table, do not associate with any specific
interface.
When leaking prefix from other VRFs, if the route was imported from the
network statement (ie. static sub-type), set nh_ifindex to the index of
the VRF master interface of the incoming BGP instance.
The result is:
> VRF r1-cust4:
> B>* 172.16.29.0/24 [20/0] is directly connected, r1-cust1 (vrf r1-cust1), weight 1, 00:00:08
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
If 'bgp network import-check' is defined on the source BGP session,
prefixes that are defined with the network command cannot be leaked to
the other VRFs BGP table even if they are present in the origin VRF RIB
if the 'rt import' statement is defined after the 'network <prefix>'
ones.
When a prefix nexthop is updated, update the prefix route leaking. The
current state of nexthop validation is now stored in the attributes of
the bgp path info. Attributes are compared with the previous ones at
route leaking update so that a nexthop validation change now triggers
the update of destination VRF BGP table.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
If 'bgp network import-check' is defined on the source BGP session,
prefixes that are defined with the network command cannot be leaked to
the other VRFs BGP table even if they are present in the origin VRF RIB.
Always validate the nexthop of BGP static routes (i.e. defined with the
network statement) if 'network import-check' is defined on the source
BGP session and the prefix is present in source RIB.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>