When you have suppress-fib-pending turned on it is possible
to end up in a situation where the prefix is not withdrawn
from downstream peers.
Here is the timing that I believe is happening:
a) have 2 paths to a peer.
b) receive a withdrawal from 1 path, set BGP_NODE_FIB_INSTALL_PENDING
and send the route install to zebra.
c) receive a withdrawal from the other path.
d) At this point we have a dest->flags set BGP_NODE_FIB_INSTALL_PENDING
old_select the path_info going away, new_select is NULL
e) A bit further down we call group_announce_route() which calls
the code to see if we should advertise the path. It sees the
BGP_NODE_FIB_INSTALL_PENDING flag and says, nope.
f) the route is sent to zebra to withdraw, which unsets the
BGP_NODE_FIB_INSTALL_PENDING.
g) This function winds up and deletes the path_info. Dest now
has no path infos.
h) BGP receives the route install(from step b) and unsets the
BGP_NODE_FIB_INSTALL_PENDING flag
i) BGP receives the route removed from zebra (from step f) and
unsets the flag again.
We know if there is no new_select, let's go ahead and just
unset the PENDING flag to allow the withdrawal to go out
at the time when the second withdrawal is received.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The json output of advertised-routes is incorrect, as there is a missing
brace with route-distinguisher:
observed with the bgp_vpnv4_noretain test:
> "bgpTableVersion":0,"bgpLocalRouterId":"192.0.2.1","defaultLocPrf":100,"localAS":65500,
> "advertisedRoutes": "192.0.2.1:1":{"rd":"192.0.2.1:1","10.101.0.0/24":{"prefix":"10.101.0.0/24",
expected:
> "bgpTableVersion":0,"bgpLocalRouterId":"192.0.2.1","defaultLocPrf":100,"localAS":65500,
> "advertisedRoutes": { "192.0.2.1:1":{"rd":"192.0.2.1:1","10.101.0.0/24":{"prefix":"10.101.0.0/24",
> ^
> missing brace
Fix this by adding the missing braces.
Fixes: 4838bac033 ("bgpd: neighbors received-routes/advertised-routes stringify changes")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When aggregate is used, the suppressed information is not displayed in
the json attributes of a given path. To illustrate, the dump of the
192.168.2.1/32 path in the bgp_aggregate_address_topo1 topotest:
> # show bgp ipv4
> [..]
> s> 192.168.2.1/32 10.0.0.2 0 65001 i
>
> # show bgp ipv4 detail
> [..]
> BGP routing table entry for 192.168.2.1/32, version 17
> Paths: (1 available, best #1, table default, vrf (null), Advertisements suppressed by an aggregate.)
> Not advertised to any peer
> 65001 <---- missing suppressed flag
> 10.0.0.2 from 10.0.0.2 (10.254.254.3)
> Origin IGP, valid, external, best (First path received)
> Last update: Fri Jan 24 13:11:41 2025
>
> # show bgp ipv4 detail json
> [..]
> ,"192.168.2.1/32": [{"aspath":{"string":"65001","segments":[{"type":"as-sequence","list":[65001]}],"length":1},"origin":"IGP","valid":true,"version":17,
> "bestpath":{"overall":true,"selectionReason":"First path received"}, <---- missing suppressed flag
> "lastUpdate":{"epoch":1737720700,"string":"Fri Jan 24 13:11:40 2025\n"},
> "nexthops":[{"ip":"10.0.0.2","afi":"ipv4","metric":0,"accessible":true,"used":true}],
> "peer":{"peerId":"10.0.0.2","routerId":"10.254.254.3","type":"external"}}]
Fix this by adding the json information.
> # show bgp ipv4 detail
> [..]
> BGP routing table entry for 192.168.2.1/32, version 17
> Paths: (1 available, best #1, table default, vrf (null), Advertisements suppressed by an aggregate.)
> Not advertised to any peer
> 65001, (suppressed)
> 10.0.0.2 from 10.0.0.2 (10.254.254.3)
> Origin IGP, valid, external, best (First path received)
> Last update: Fri Jan 24 13:11:41 2025
>
> # show bgp ipv4 detail json
> [..]
> ,"192.168.2.1/32": [{"aspath":{"string":"65001","segments":[{"type":"as-sequence","list":[65001]}],"length":1},"suppressed":true,"origin":"IGP","valid":true,"version":17,
> "bestpath":{"overall":true,"selectionReason":"First path received"},
> "lastUpdate":{"epoch":1737720991,"string":"Fri Jan 24 13:16:31 2025"},
> "nexthops":[{"ip":"10.0.0.2","afi":"ipv4","metric":0,"accessible":true,"used":true}],"peer":{"peerId":"10.0.0.2","routerId":"10.254.254.3","type":"external"}}]
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
EVPN imported routes AF is not AF_EVPN, in that case
the path info extra with EVPN is nver created.
In order to create evpn info under path extra, define
new api which does not specifically check for AF_EVPN
afi type.
Signed-off-by: Chirag Shah <chirag@nvidia.com>
draft-uttaro-idr-bgp-oad says:
Extended communities which are non-transitive across an AS boundary MAY be
advertised over an EBGP-OAD session if allowed by explicit policy configuration.
If allowed, all the members of the OAD SHOULD be configured to use the same
criteria.
For example, the Origin Validation State Extended Community, defined as
non-transitive in [RFC8097], can be advertised to peers in the same OAD.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Some bgp evpn memory contexts are not freed at the end of the bgp
process.
> =================================================================
> ==1208677==ERROR: LeakSanitizer: detected memory leaks
>
> Direct leak of 96 byte(s) in 2 object(s) allocated from:
> #0 0x7f93ad4b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7f93ace77233 in qcalloc lib/memory.c:106
> #2 0x563bb68f4df1 in process_type5_route bgpd/bgp_evpn.c:5084
> #3 0x563bb68fb663 in bgp_nlri_parse_evpn bgpd/bgp_evpn.c:6302
> #4 0x563bb69ea2a9 in bgp_nlri_parse bgpd/bgp_packet.c:347
> #5 0x563bb69f7716 in bgp_update_receive bgpd/bgp_packet.c:2482
> #6 0x563bb6a04d3b in bgp_process_packet bgpd/bgp_packet.c:4091
> #7 0x7f93acf8082d in event_call lib/event.c:1996
> #8 0x7f93ace48931 in frr_run lib/libfrr.c:1232
> #9 0x563bb6880ae1 in main bgpd/bgp_main.c:557
> #10 0x7f93ac829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Actually, the bgp evpn context may noy be used if adj rib in is unused.
This may lead to memory leaks. Fix this by freeing the context in those
corner cases.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The 'show bgp ipv[4,6] json' command does not display the interface
value of the nexthop, when BGP sessions are unnumbered, whereas the
non json output displays it correctly. The below example indicates
'r1-eth0' wheras in json, the value is not displayed.
> r1# show bgp ipv4
> BGP table version is 3, local router ID is 10.254.254.1, vrf id 0
> Default local pref 100, local AS 101
> Status codes: s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath,
> i internal, r RIB-failure, S Stale, R Removed
> Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
> Origin codes: i - IGP, e - EGP, ? - incomplete
> RPKI validation codes: V valid, I invalid, N Not found
>
> Network Next Hop Metric LocPrf Weight Path
> *> 10.254.254.1/32 0.0.0.0 0 32768 ?
> *> 10.254.254.2/32 r1-eth0 0 0 102 ?
>
> Displayed 2 routes and 2 total paths
Fix this by adding an 'interface' keyword in the json attributes.
> "nexthops":[{"ip":"2001:db8:1::2","hostname":"r2","afi":"ipv6",
> "scope":"global"},{"interface":"r1-eth0","ip":"fe80::1868:d7ff:fe66:45ae",
> "hostname":"r2","afi":"ipv6","scope":"link-local","used":true}]}]
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Route aggregation should be checked after a route is added, and
not before. This is for code flow and consistency.
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
Currently when an aggregate-address is configured, the existing
entry is always removed regardless whether any change is involved.
This would create unnecessary churn of the aggregate route when
the same config is applied for the aggregate address.
The fix is to check for duplicate aggregate-address config.
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
su_local and su_remote in the peer can change based upon
if we are initiating the remote connection or receiving it.
As such we need to treat it as a property of the connection.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently when re-evaluating an aggregate route, the full attribute of
the aggregate route is not compared with the existing one in the BGP
table. That can result in unnecessary churns (un-install and then
install) of the aggregate route when a more specific route is added or
deleted, or when the route-map for the aggregate changes. The churn
would impact route installation and route advertisement.
The fix is to apply the route-map for the aggregate first and then
compare the attribute.
Here is an example of the churn:
debug bgp aggregate prefix 5.5.5.0/24
!
route-map set-comm permit 10
set community 65004:200
!
router bgp 65001
address-family ipv4 unicast
redistribute static
aggregate-address 5.5.5.0/24 route-map set-comm
!
Step 1:
ip route 5.5.5.1/32 Null0
Jan 8 10:28:49 enke-vm1 bgpd[285786]: [J7PXJ-A7YA2] bgp_aggregate_install: aggregate 5.5.5.0/24, count 1
Jan 8 10:28:49 enke-vm1 bgpd[285786]: [Y444T-HEVNG] aggregate 5.5.5.0/24: installed
Step 2:
ip route 5.5.5.2/32 Null0
Jan 8 10:29:03 enke-vm1 bgpd[285786]: [J7PXJ-A7YA2] bgp_aggregate_install: aggregate 5.5.5.0/24, count 2
Jan 8 10:29:03 enke-vm1 bgpd[285786]: [S2EH5-EQSX6] aggregate 5.5.5.0/24: existing, removed
Jan 8 10:29:03 enke-vm1 bgpd[285786]: [Y444T-HEVNG] aggregate 5.5.5.0/24: installed
---
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
If the peer which has allowas-in enabled and then reimports the routes to another
local VRF, respect that value.
This was working with < 10.2 releases.
Fixes: d4426b62d2 ("bgpd: copy source vrf ASN to leaked route and block loops")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This reverts commit ee1986f1b5.
The fix is incomplete, and is no longer needed with the fix that applies
the route-map for an aggregate and then compares the attribute.
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
When stopping and restarting BGP daemon part of the configuration
remains. It should be cleared.
Particulary those are address-family parametes, like: distance,
ead-es-frag, disable-ead-evi-rx, disable-ead-evi-tx.
Signed-off-by: Yaroslav Kholod <y.kholod@vyos.io>
The rpki current state was ignored when calling for the 'show bgp ipvX
detail' command. Handle the support for this, with json support too.
> "rpkiValidationState" : "valid",
> "rpkiValidationState" : "invalid",
> "rpkiValidationState" : "not used",
> "rpkiValidationState" : "not found",
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
For JSON output we don't need newline to be printed.
Before:
```
"lastUpdate":{"epoch":1734490463,"string":"Wed Dec 18 04:54:23 2024\n"
```
After:
```
"lastUpdate":{"epoch":1734678560,"string":"Fri Dec 20 09:09:20 2024"
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Add missing json attribute to BGP path.
Fixes: 82c298be73 ("bgpd: Show RPKI short state in `show bgp <afi> <safi>`")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If we have a route-map that sets some attributes e.g. community or large-community,
and the route-map is applied for outgoing direction, everything is fine, but
we missed the point that `advertised-routes detail` was not using the applied
attributes to display and instead it uses what is received from the peer (original).
Let's fix this, and use what's already applied (advertise attributes), and
we can now see:
```
route-map r3 permit 10
match ip address prefix-list p1
set community 65001:65002
set extcommunity bandwidth 100
set large-community 65001:65002:65003
exit
!
...
address-family ipv4 unicast
neighbor 192.168.2.3 route-map r3 out
exit-address-family
...
```
The output:
```
r2# show bgp ipv4 neighbors 192.168.2.3 advertised-routes detail
BGP table version is 1, local router ID is 192.168.2.2, vrf id 0
Default local pref 100, local AS 65002
BGP routing table entry for 10.10.10.1/32, version 1
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.1.1 192.168.2.3
65001
0.0.0.0 from 192.168.1.1 (192.168.1.1)
Origin IGP, valid, external, best (First path received)
Community: 65001:65002
Extended Community: LB:65002:12500000 (100.000 Mbps)
Large Community: 65001:65002:65003
Last update: Thu Dec 19 17:00:40 2024
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This commit introduces meta queue to the BGP process_queue which is
helpful in having a priority of lists where some routes can be processed
earlier than 'other' routes. This is similar to how meta queue is
present in zebra.
After Fix:
---------
For testing, note that all 100.x routes are marked as Early routes which
got enqueued and dequeued first before Other routes in every batch of
updates. Also, the items are dequeued in FIFO order.
switch# cat /var/log/frr/bgpd.log | grep sub-queue
2024/12/06 19:19:42.788014 BGP: [V64FH-G6883] 88.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:42.856127 BGP: [V64FH-G6883] 100.90.9.186/32 queued into sub-queue Early Route
2024/12/06 19:19:42.856138 BGP: [V64FH-G6883] 100.90.9.187/32 queued into sub-queue Early Route
2024/12/06 19:19:42.886715 BGP: [V64FH-G6883] 66.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.022835 BGP: [V64FH-G6883] 33.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.058842 BGP: [V64FH-G6883] 44.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.092365 BGP: [V64FH-G6883] 55.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.540770 BGP: [ZAPXS-9754G] 100.90.9.186/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.541233 BGP: [ZAPXS-9754G] 100.90.9.187/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.541523 BGP: [ZAPXS-9754G] 88.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:43.602094 BGP: [V64FH-G6883] 88.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.649083 BGP: [V64FH-G6883] 100.90.9.186/32 queued into sub-queue Early Route
2024/12/06 19:19:43.649092 BGP: [V64FH-G6883] 100.90.9.187/32 queued into sub-queue Early Route
2024/12/06 19:19:43.649148 BGP: [V64FH-G6883] 77.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.712282 BGP: [V64FH-G6883] 100.90.9.138/32 queued into sub-queue Early Route
2024/12/06 19:19:43.712314 BGP: [V64FH-G6883] 100.90.9.139/32 queued into sub-queue Early Route
2024/12/06 19:19:43.817194 BGP: [V64FH-G6883] 100.90.8.58/32 queued into sub-queue Early Route
2024/12/06 19:19:43.817205 BGP: [V64FH-G6883] 100.90.8.59/32 queued into sub-queue Early Route
2024/12/06 19:19:43.942464 BGP: [ZAPXS-9754G] 100.90.9.186/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942530 BGP: [ZAPXS-9754G] 100.90.9.187/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942550 BGP: [ZAPXS-9754G] 100.90.9.138/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942738 BGP: [ZAPXS-9754G] 100.90.9.139/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942763 BGP: [ZAPXS-9754G] 100.90.8.58/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942788 BGP: [ZAPXS-9754G] 100.90.8.59/32 dequeued from sub-queue Early Route
2024/12/06 19:19:44.558611 BGP: [ZAPXS-9754G] 66.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:44.893541 BGP: [ZAPXS-9754G] 33.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:45.171794 BGP: [ZAPXS-9754G] 44.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:45.453137 BGP: [ZAPXS-9754G] 55.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:45.685269 BGP: [ZAPXS-9754G] 88.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:45.764752 BGP: [ZAPXS-9754G] 77.0.0.9/32 dequeued from sub-queue Other Route
With 'update-delay' feature (EOIU marker):
------------------------------------------
switch# vtysh -c "show run bgp" | grep update-delay
update-delay 40
switch# cat /var/log/frr/bgpd.log | grep sub-queue
2024/12/06 23:27:46.124461 BGP: [V64FH-G6883] 22.0.0.9/32 queued into sub-queue Other Route
2024/12/06 23:27:46.160224 BGP: [V64FH-G6883] 100.90.8.11/32 queued into sub-queue Early Route
2024/12/06 23:27:46.219663 BGP: [W9QTR-P4REP] EOIU Marker queued into sub-queue EOIU Marker
2024/12/06 23:27:46.269711 BGP: [ZAPXS-9754G] 100.90.8.11/32 dequeued from sub-queue Early Route
2024/12/06 23:27:46.270980 BGP: [ZAPXS-9754G] 22.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 23:27:46.404868 BGP: [RBX2V-K33CZ] EOIU Marker dequeued from sub-queue EOIU Markera
Ticket: #4200787
Signed-off-by: Karthikeya Venkat Muppalla <kmuppalla@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If we have this construct:
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next) {
...
bgp_process();
}
This can induce an infinite loop. This happens because bgp_process
will move the unsorted items to the top of the list for handling,
as such it is necessary to hold the next pointer to the side
to actually look at each possible bgp_path_info.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If you have a bestpath list that looks something like this:
<local evpn mac route>
<learned from peer out swp60>
<learned from peer out swp57>
And a network event happens that causes the peer out swp60
to not be in an established state, yet we still have the
path_info for the destination for swp60, bestpath
will currently end up with this order:
<learned from peer out swp60>
<local evpn mac route>
<learned from peer out swp57>
This causes the local evpn mac route to be deleted in zebra( Wrong! ).
This is happening because swp60 is skipped in bestpath calculation and
not considered to be a path yet it stays at the front of the list.
Modify bestpath calculation such that when pulling the unsorted_list
together to pull path info's into that list when they are also
not in a established state.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The json display of the version attribute is originally an
integer. It has changed, most probably mistakenly.
> {
> "vrfId": 7,
> "vrfName": "vrf1",
> "tableVersion": 3,
> "routerId": "192.0.2.1",
> "defaultLocPrf": 100,
> "localAS": 65500,
> "routes": {
> "172.31.0.1/32": {
> "prefix": "172.31.0.1/32",
> "version": "1", <--- int or string ??
Let us fix it, by using the integer display instead.
Fixes: f9f2d188e3 ("bgpd: fix 'json detail' output structure")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If we have (default enabled) enabled `bgp ebgp-require-policy`, then first check
it before applying the route-maps.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we receive an IPv6 prefix e.g.: 2001:db8:100::/64 with nextop: 0.0.0.0, and
mp_nexthop: fc00::2, we should not treat this with an invalid nexthop because
of 0.0.0.0. We MUST check for MP_REACH attribute also and decide later if we
have at least one a valid nexthop.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>