"show bgp <afi> <safi> json detail" was incorrectly displaying header
information from route_vty_out_detail_header() as an element of the
"paths" array. This corrects the behavior for 'json detail' so that a
route holds a dictionary with keys for "paths" and header info, which
aligns with how we structure the output for a specific prefix, e.g.
"show bgp <afi> <safi> <prefix> json".
Before:
```
ub20# show ip bgp json detail
{
"vrfId": 0,
"vrfName": "default",
"tableVersion": 3,
"routerId": "100.64.0.222",
"defaultLocPrf": 100,
"localAS": 1,
"routes": { "2.2.2.2/32": [
{ <<<<<<<<< should be outside the array
"prefix":"2.2.2.2/32",
"version":1,
"advertisedTo":{
"192.168.122.12":{
"hostname":"ub20-2"
}
}
},
{
"aspath":{
"string":"Local",
"segments":[
],
"length":0
},
<snip>
```
After:
```
ub20# show ip bgp json detail
{
"vrfId": 0,
"vrfName": "default",
"tableVersion": 3,
"routerId": "100.64.0.222",
"defaultLocPrf": 100,
"localAS": 1,
"routes": { "2.2.2.2/32": {
"prefix": "2.2.2.2/32",
"version": "1",
"advertisedTo": {
"192.168.122.12":{
"hostname":"ub20-2"
}
}
,"paths": [
{
"aspath":{
"string":"Local",
"segments":[
],
"length":0
},
```
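For scripts that consume this output, the new shape can be walked as sketched below. This is only an illustration: the sample is a trimmed-down document shaped like the "After" output, and the helper name `iter_paths` is hypothetical.

```python
import json

# Trimmed sample shaped like the "After" output: each route is a dictionary
# holding the header keys next to a "paths" array.
SAMPLE = """
{
  "routes": {
    "2.2.2.2/32": {
      "prefix": "2.2.2.2/32",
      "advertisedTo": {"192.168.122.12": {"hostname": "ub20-2"}},
      "paths": [
        {"aspath": {"string": "Local", "segments": [], "length": 0}}
      ]
    }
  }
}
"""


def iter_paths(doc):
    """Yield (prefix, path) pairs from 'json detail' output (hypothetical helper)."""
    for prefix, route in doc["routes"].items():
        # Header info lives next to "paths", not inside the array.
        for path in route["paths"]:
            yield prefix, path


doc = json.loads(SAMPLE)
pairs = list(iter_paths(doc))
print(pairs[0][0], pairs[0][1]["aspath"]["string"])
```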
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Ensure that a multipath set is fully comprised of EVPN paths (i.e.,
paths imported into the VRF from EVPN address-family) or non-EVPN
paths. This is actually a condition that existed already in the code
but was not properly enforced.
This change, as a side effect, eliminates the known trigger condition
for bad or missing RMAC programming in an EVPN deployment, described
in tickets CM-29043 and CM-31222. Routes (actually, paths) in a VRF
routing table that require VXLAN tunneling to the next hop currently
need some special handling in zebra to deal with the nexthop (neigh)
and RMAC programming, and this is implemented for the entire route
(prefix), not per-path. This can lead to the bad or missing RMAC
situation, which is now eliminated by ensuring all paths in the route
are 'similar'.
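The "all paths are similar" condition can be sketched roughly as below. This is not the bgpd code: the `Path` type and `multipath_set` helper are hypothetical, the candidates are assumed to be already equal-cost with the best path, and following the best path's kind is an assumption made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    nexthop: str
    evpn_imported: bool  # True if the path was imported into the VRF from EVPN


def multipath_set(best, candidates):
    """Keep only candidates of the same kind (EVPN or non-EVPN) as the best path."""
    return [best] + [
        p for p in candidates
        if p is not best and p.evpn_imported == best.evpn_imported
    ]


best = Path("10.0.0.1", evpn_imported=True)
others = [
    Path("10.0.0.2", evpn_imported=True),
    Path("192.168.1.1", evpn_imported=False),  # non-EVPN path, must be left out
]
mpaths = multipath_set(best, others)
print([p.nexthop for p in mpaths])
```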
The longer-term solution in CL 5.x will be to deal with the special
programming by means of explicit communication between bgpd and zebra.
This is already implemented for EVPN-MH via CM-31398. These changes
will be extended to non-MH also and the special code in zebra removed
or refined.
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
Acked-by: Trey Aspelund <taspelund@nvidia.com>
Acked-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Acked-by: Chirag Shah <chirag@nvidia.com>
Ticket: CM-29043
Testing Done:
1. Manual testing
2. precommit on both MLX and BCM platforms
3. evpn-smoke - BCM and VX
Results described in the ticket
When performing deterministic MED processing, ensure that the peer
status is not checked when we encounter a stale path. Otherwise, this
path will be skipped from the DMED consideration leading to it potentially
not being installed.
Test scenario: Consider a prefix with 2 (multi)paths. The peer that
announces the path with the winning DMED undergoes a graceful-restart.
Before it comes back up, the other path goes away. Prior to the fix, a
third router that receives both these paths would have ended up not
having any path installed to the prefix after the above events.
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
For internal use:
Ticket: CM-32032
Testing done: Multiple manual testing
Add a keyword "self-originate" to extend the current CLI commands so that they show only self-originated routes.
a) CLI to show ipv4/ipv6 self-originated routes
"show [ip] bgp [afi] [safi] [all] self-originate [wide|json]"
b) CLI to show evpn self-originated routes
"show bgp l2vpn evpn route [detail] [type <ead|macip|multicast|es|prefix|1|2|3|4|5>] self-originate [json]"
Signed-off-by: Karl Quan <kquan@nvidia.com>
The bgp network command creates static routes with an optional
route-distinguisher parameter for VPN and EVPN address families.
Store the rd parameter in those static routes. This will be used
by the 'show running-config' later.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The route-distinguisher string can be expressed in different
ways when the AS number is part of the RD. And the configured
string value has to be kept intact.
The following vty commands store the string value internally:
- router bgp / address-family ipv4 unicast / rd vpn export <>
- router bgp / address-family l2vpn evpn / rd <>
- router bgp / address-family l2vpn evpn / vni <> / rd <>
The vty commands that configure an RD in the places below are
not covered:
- router bgp / rfapi related commands
- router bgp / address-family xxx xxx / network .. rd <>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RD may be built based on an AS number. Like for the AS, the RD
may use the AS notation. The two below examples can illustrate:
RD 1.1:20 stands for an AS4B:NN RD with AS4B=65537 in dot format.
RD 0.1:20 stands for an AS2B:NNNN RD with AS2B=0.1 in dot+ format.
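The arithmetic behind these two examples can be checked with a small sketch. It only handles AS-based RDs (not IP:NN), and the helper names `parse_as` and `parse_rd` are hypothetical, not part of the bgpd API.

```python
def parse_as(text):
    """Parse an AS number in plain, dot or dot+ notation into an integer."""
    if "." in text:
        high, low = (int(part) for part in text.split("."))
        if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
            raise ValueError(f"invalid AS number: {text}")
        # high.low means high * 65536 + low
        return high * 65536 + low
    value = int(text)
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"invalid AS number: {text}")
    return value


def parse_rd(text):
    """Split an 'AS:NN' route distinguisher into (AS number, assigned number)."""
    admin, assigned = text.rsplit(":", 1)
    return parse_as(admin), int(assigned)


print(parse_rd("1.1:20"))  # (65537, 20)
print(parse_rd("0.1:20"))  # (1, 20)
```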
This commit adds the asnotation mode to prefix_rd2str() API so as
to pick up the relevant display.
Two new printfrr extensions are available to display the RD with
the two above display methods.
- The pRDD extension stands for dot asnotation format
- The pRDE extension stands for dot+ asnotation format.
- The pRD extension has been renamed to pRDP extension
Every call site of the '%pRD' printf extension is updated. Since
the asnotation may change the output, a macro selects the asnotation
mode to use. A side effect of building the mode into the format is
that the RD can no longer be concatenated with other strings in a
single vty_out or snprintfrr call, so those functions are now called
multiple times. When zlog_debug needs to display the RD with some other
string, the prefix_rd2str() old API is used instead of the printf extension.
Some code has been kept untouched:
- code related to running-config. Actually, wherever an RD is displayed,
its configured name should be dumped.
- bgp rfapi code
- bgp evpn multihoming code (partially done), since the logic is
missing to get the asnotation of 'struct bgp_evpn_es'.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A json AS number API is created in order to output a
given AS number. In order to keep backward compatibility,
if the as-notation uses a number, then the json is encoded
as an integer, otherwise the encoding will be a string.
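The integer-or-string rule can be sketched as follows. This is an illustration of the behaviour described above, not the actual API: the function names are hypothetical, and treating "a bare number" as "digits only after formatting" is an assumption.

```python
import json


def format_as(asn, notation):
    """Render a 32-bit AS number in 'plain', 'dot' or 'dot+' notation."""
    if notation not in ("plain", "dot", "dot+"):
        raise ValueError(f"unknown notation: {notation}")
    high, low = divmod(asn, 65536)
    # dot+ always uses high.low; dot only does so above the 16-bit range.
    if notation == "dot+" or (notation == "dot" and high):
        return f"{high}.{low}"
    return str(asn)


def as_json_value(asn, notation):
    """Return an int when the rendered AS is a bare number, else the string."""
    text = format_as(asn, notation)
    return int(text) if text.isdigit() else text


print(json.dumps({"localAS": as_json_value(65537, "plain")}))  # {"localAS": 65537}
print(json.dumps({"localAS": as_json_value(65537, "dot")}))    # {"localAS": "1.1"}
```

Consumers that previously read the value as an integer keep working as long as the instance uses the plain notation.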
For what is not relevant to running-configuration, the
as-notation mode is the one used for the BGP instance.
Also, the vty completion gets the configured 'as_pretty'
string value, when a user wants to get the available
BGP instances.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Each BGP prefix may have an as-path list attached. A forged
string is stored in the BGP attribute and shows the as-path
list output.
Before this commit, the as-path list output was expressed as
a list of AS values in plain format. Now, if a given BGP instance
uses a specific asnotation, then the output is changed:
new output:
router bgp 1.1 asnotation dot
!
address-family ipv4 unicast
network 10.200.0.0/24 route-map rmap
network 10.201.0.0/24 route-map rmap
redistribute connected route-map rmap
exit-address-family
exit
!
route-map rmap permit 1
set as-path prepend 1.1 5433.55 264564564
exit
ubuntu2004# do show bgp ipv4
BGP table version is 2, local router ID is 10.0.2.15, vrf id 0
Default local pref 100, local AS 1.1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 4.4.4.4/32 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
*> 10.0.2.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
10.200.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
10.201.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
The changes include:
- the aspath structure has a new field: asnotation type
The ashash list will differentiate 2 aspaths using a different
asnotation.
- 3 new printf extensions display the AS number in the desired
format: pASP, pASD, pASE for plain, dot, or dot+ format (extended).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This introduces the option for a user to look up one specific prefix in
the advertised-routes or received-routes table of a peer.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Initial commit: 23b2a7ef52
changed the json output of `show bgp <afi> <safi> json` to
not be pretty-printed, because with a large number of routes
and large-scale ECMP the show output was taking forever;
that commit cut 2 minutes out of vtysh run time.
Subsequent commit: f4ec52f7cc
changed this back.
After upgrading to the latest version, the long run time was
noticed again during testing. Let's add back this functionality so
that FRR can have reduced run times with vtysh when it's really
needed.
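The size difference between the two encodings is easy to see with synthetic data. The sketch below does not reproduce the 2 minute figure; the route and nexthop values are made up purely to give each prefix a large ECMP set.

```python
import json

# Synthetic table: 1000 prefixes, each with one path carrying 16 nexthops.
routes = {
    f"10.{i // 256}.{i % 256}.0/24": [
        {"nexthops": [{"ip": f"192.168.{n}.1"} for n in range(16)]}
    ]
    for i in range(1000)
}

pretty = json.dumps({"routes": routes}, indent=2)
compact = json.dumps({"routes": routes}, separators=(",", ":"))

# Same data either way; only whitespace and newlines differ.
same_data = json.loads(pretty) == json.loads(compact)
print(len(pretty), len(compact), same_data)
```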
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
These two functions always return 0. As such, any and all
tests against their return value make no sense. Change the
return type to void and follow the chain, logically, to remove
all the dead code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This change updates the nexthop attribute length
according to the SAFI used. With the previous
commit, the calculated length did not match the
real nexthop length. The resulting packet was seen
as malformed by the remote peer, which broke
vpnv6 peering.
Fix this by setting the nexthop length to its
real value.
Fixes: 35ac9b53f2 ("bgpd: fix vpnv6 nexthop encoding")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
MPLS VPN networks can either peer with iBGP or eBGP. When
calculating the distance to send to zebra, the imported prefix
is never sent with distance information, even if the vty
command is used under the ipv4 unicast address family:
router bgp 65505 vrf vrf1
address-family ipv4 unicast
distance bgp 26 27 28
[vpn config]
The observation is that the distance sent to zebra for an
imported prefix is still 20:
[..]
VRF vrf1:
B> 192.168.0.0/24 [20/0] via 2.2.2.2 (vrf default) (recursive), label 20, weight 1, 00:00:12
* via 10.125.0.6, ntfp3 (vrf default), label implicit-null/20, weight 1, 00:00:12
The expectation is that the incoming prefix has to follow the
distance that is configured, or the distance derived from the peer
relationship established by the parent prefix.
In the case where an iBGP relationship is established and no distance
is configured, the below show output is expected:
[..]
VRF vrf1:
B*> 192.168.0.0/24 [200/0] via 192.168.0.2, r1-gre0 (vrf default), label 20, weight 1, 00:00:12
In the case where an iBGP relationship is established and the distance
is configured as below:
[..]
distance bgp 21 201 41
[..]
Then the below show is expected:
[..]
VRF vrf1:
B*> 192.168.0.0/24 [201/0] via 192.168.0.2, r1-gre0 (vrf default), label 20, weight 1, 00:00:12
To get this behaviour, get the peer origin where the prefix is coming
from.
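The expected selection can be sketched as follows. This is an illustration only: `route_distance` is a hypothetical helper, the eBGP and iBGP defaults (20 and 200) come from the outputs above, and the default for local routes is an assumption.

```python
# Defaults when 'distance bgp' is not configured. 20 and 200 match the
# outputs above; the value for local routes is an assumption.
DEFAULT_DISTANCE = {"ebgp": 20, "ibgp": 200, "local": 200}


def route_distance(origin, configured=None):
    """Pick the distance for an imported prefix.

    origin: 'ebgp', 'ibgp' or 'local', taken from the peer of the parent path.
    configured: optional (ebgp, ibgp, local) tuple from 'distance bgp'.
    """
    if origin not in DEFAULT_DISTANCE:
        raise ValueError(f"unknown origin: {origin}")
    if configured is None:
        return DEFAULT_DISTANCE[origin]
    ebgp, ibgp, local = configured
    return {"ebgp": ebgp, "ibgp": ibgp, "local": local}[origin]


print(route_distance("ibgp"))                # 200
print(route_distance("ibgp", (21, 201, 41)))  # 201
```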
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Before this patch, we needed to explicitly define a neighbor to be SOLO
(= separate update-group). Let's make this easier for an operator, to
avoid confusion.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Introduce a "detail" keyword for per-neighbor/per-afi-safi
advertised-routes and received-routes show commands.
Includes json support.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Shutdown of bgp results in the bgp_path_info,
bgp_dest and bgp_table not being freed because
the bgp_path_info remains locked.
Static routes are scheduled for deletion, but bgp_process
skips the work because the work queue sees that the bgp router
is marked for deletion. As a result, no work is done and
data is left on the floor.
Modify the code that puts work into the work queue to
notice this case and just unlock the path info instead.
This is effectively the same as what goes on for normal peering,
which also checks for shutdown and just calls bgp_path_info_free.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
```
unet> sh pe2 vtysh -c 'sh ip bgp ipv4 vpn detail-routes'
BGP table version is 4, local router ID is 10.10.10.20, vrf id 0
Default local pref 100, local AS 65001
Route Distinguisher: 192.168.2.2:2
BGP routing table entry for 192.168.2.2:2:10.0.0.0/24, version 1
not allocated
Paths: (1 available, best #1)
Not advertised to any peer
65000
192.168.2.1 from 0.0.0.0 (10.10.10.20) vrf RED(4) announce-nh-self
Origin incomplete, metric 0, localpref 50, valid, sourced, local, best (First path received)
Extended Community: RT:192.168.2.2:2
Originator: 10.10.10.20
Remote label: 2222
Last update: Tue Dec 20 13:01:20 2022
BGP routing table entry for 192.168.2.2:2:172.16.255.1/32, version 2
not allocated
Paths: (1 available, best #1)
Not advertised to any peer
65000
192.168.2.1 from 0.0.0.0 (10.10.10.20) vrf RED(4) announce-nh-self
Origin incomplete, localpref 50, valid, sourced, local, best (First path received)
Extended Community: RT:192.168.2.2:2
Originator: 10.10.10.20
Remote label: 2222
Last update: Tue Dec 20 13:01:20 2022
BGP routing table entry for 192.168.2.2:2:192.168.1.0/24, version 3
not allocated
Paths: (1 available, best #1)
Not advertised to any peer
65000
192.168.2.1 from 0.0.0.0 (10.10.10.20) vrf RED(4) announce-nh-self
Origin incomplete, localpref 50, valid, sourced, local, best (First path received)
Extended Community: RT:192.168.2.2:2
Originator: 10.10.10.20
Remote label: 2222
Last update: Tue Dec 20 13:01:20 2022
BGP routing table entry for 192.168.2.2:2:192.168.2.0/24, version 4
not allocated
Paths: (1 available, best #1)
Not advertised to any peer
65000
192.168.2.1 from 0.0.0.0 (10.10.10.20) vrf RED(4) announce-nh-self
Origin incomplete, metric 0, localpref 50, valid, sourced, local, best (First path received)
Extended Community: RT:192.168.2.2:2
Originator: 10.10.10.20
Remote label: 2222
Last update: Tue Dec 20 13:01:20 2022
Displayed 4 routes and 4 total paths
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
It is absolutely not possible to read the output or even distinguish the prefix
we are looking for.
Before:
```
donatas-pc# show ip bgp detail
BGP table version is 12, local router ID is 192.168.10.17, vrf id 0
Default local pref 100, local AS 65002
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
65001
2a02:4780:abc::2 from 2a02:4780:abc::2 (200.200.200.202)
(fe80::a00:27ff:fe5e:d19e) (used)
Origin incomplete, metric 0, valid, external, multipath
Last update: Tue Dec 13 22:53:16 2022
65001
192.168.10.124 from 192.168.10.124 (200.200.200.202)
Origin incomplete, metric 0, valid, external, otc 65001, multipath, best (Neighbor IP)
Last update: Tue Dec 13 22:53:16 2022
65001
2a02:4780:abc::2 from 2a02:4780:abc::2 (200.200.200.202)
(fe80::a00:27ff:fe5e:d19e) (used)
Origin IGP, metric 0, valid, external, multipath
Last update: Tue Dec 13 22:53:16 2022
65001
192.168.10.124 from 192.168.10.124 (200.200.200.202)
Origin IGP, metric 0, valid, external, otc 65001, multipath, best (Neighbor IP)
Last update: Tue Dec 13 22:53:16 2022
```
After:
```
donatas-pc# show ip bgp detail
BGP table version is 12, local router ID is 192.168.10.17, vrf id 0
Default local pref 100, local AS 65002
BGP routing table entry for 10.0.2.0/24, version 1
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
2a02:4780:abc::2
65001
2a02:4780:abc::2 from 2a02:4780:abc::2 (200.200.200.202)
(fe80::a00:27ff:fe5e:d19e) (used)
Origin incomplete, metric 0, valid, external, multipath
Last update: Tue Dec 13 22:47:16 2022
BGP routing table entry for 10.0.2.0/24, version 1
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
2a02:4780:abc::2
65001
192.168.10.124 from 192.168.10.124 (200.200.200.202)
Origin incomplete, metric 0, valid, external, otc 65001, multipath, best (Neighbor IP)
Last update: Tue Dec 13 22:47:16 2022
BGP routing table entry for 10.10.100.0/24, version 2
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
2a02:4780:abc::2
65001
2a02:4780:abc::2 from 2a02:4780:abc::2 (200.200.200.202)
(fe80::a00:27ff:fe5e:d19e) (used)
Origin IGP, metric 0, valid, external, multipath
Last update: Tue Dec 13 22:47:16 2022
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Previous commits have introduced a new 8 bits nh_flag in the attr
struct that has increased the memory footprint.
Move the mp_nexthop_prefer_global boolean in the attr structure that
takes 8 bits to the new nh_flag in order to go back to the previous
memory utilization.
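The saving can be illustrated with a reduced model of the two fields. The sketch uses ctypes only to compute C-like sizes; the struct names and the bit position are hypothetical, and the effect on the real attr struct depends on its other members and padding.

```python
import ctypes


class AttrBefore(ctypes.Structure):
    # Reduced model: a flag byte plus a separate boolean.
    _fields_ = [
        ("nh_flag", ctypes.c_uint8),
        ("mp_nexthop_prefer_global", ctypes.c_bool),
    ]


class AttrAfter(ctypes.Structure):
    # The boolean becomes one bit of the existing flag byte.
    _fields_ = [("nh_flag", ctypes.c_uint8)]


NH_FLAG_MP_PREFER_GLOBAL = 1 << 2  # hypothetical bit position

attr = AttrAfter()
attr.nh_flag |= NH_FLAG_MP_PREFER_GLOBAL
prefer_global = bool(attr.nh_flag & NH_FLAG_MP_PREFER_GLOBAL)

print(ctypes.sizeof(AttrBefore), ctypes.sizeof(AttrAfter), prefer_global)
```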
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
At bgpd startup, VRF instances are sent from zebra before the
interfaces. When importing a l3vpn prefix from another local VRF
instance, the interfaces are not known yet. The prefix nexthop interface
cannot be set to the loopback or the VRF interface, which causes setting
invalid routes in zebra.
Update route leaking when the loopback or a VRF interface is received
from zebra.
At a VRF interface deletion, zebra voluntarily sends a
ZEBRA_INTERFACE_ADD message to move it to VRF_DEFAULT. Do not update if
such a message is received. VRF destruction will destroy all the related
routes without additional code.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Display IGP metric of the ultimate path in the command
"show bgp vrf X ipv(4|6)".
Fixes: da0c0ef70c ("bgpd: VRF-Lite fix best path selection")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
When the withdraw message and the announcement message for a prefix are received back to back within 50ms, the reference count of the aggregate route may become incorrect.
Steps to reproduce:
--------------------------
step1:
local config aggregate route 111.0.0.0/24
received route:111.0.0.1/32 111.0.0.02/32
ref_count:2
step2:
peer withdraw 111.0.0.1/32 and network 111.0.0.1/32 in 50ms
received route:111.0.0.1/32 111.0.0.02/32
ref_count:1
step3:
peer withdraw 111.0.0.1/32
received route:111.0.0.02/32
ref_count:0
aggregate route will be withdrawn abnormally
Signed-off-by: liuze03 <liuze03@baidu.com>
```
Direct leak of 112 byte(s) in 1 object(s) allocated from:
0 0x7feb66337a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
1 0x7feb660cbcc3 in qcalloc lib/memory.c:116
2 0x55cc3cba02d1 in info_make bgpd/bgp_route.c:3831
3 0x55cc3cbab4f1 in bgp_update bgpd/bgp_route.c:4733
4 0x55cc3cbb0620 in bgp_nlri_parse_ip bgpd/bgp_route.c:6111
5 0x55cc3cb79473 in bgp_update_receive bgpd/bgp_packet.c:2020
6 0x55cc3cb7c34a in bgp_process_packet bgpd/bgp_packet.c:2929
7 0x7feb6610ecc5 in thread_call lib/thread.c:2006
8 0x7feb660bfb77 in frr_run lib/libfrr.c:1198
9 0x55cc3cb17232 in main bgpd/bgp_main.c:520
10 0x7feb65ae5082 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: 112 byte(s) leaked in 1 allocation(s).
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Commit eaeba5e868 changed the formatting a bit,
but this part was missed. Let's fix it.
An example before the patch:
```
r3# sh ip bgp ipv4 labeled-unicast neighbors 192.168.34.4 advertised-routes
BGP table version is 3, local router ID is 192.168.34.3, vrf id 0
Default local pref 100, local AS 65003
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 10.0.0.1/32 0.0.0.0 0 65001 ?
Total number of prefixes 1
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Use %pI4/%pI6 where possible, otherwise at least adjust stack buffer sizes
for inet_ntop() calls.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
bgp_pcount_adjust() is called only when calling bgp_path_info_set_flag().
Before this patch the pcount is not advanced before checking for overflow.
Additionally, print:
```
[RZMGQ-A03CG] 192.168.255.1(r1) rcvd UPDATE about 172.16.255.254/32 IPv4 unicast -- DENIED due to: maximum-prefix overflow
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
currently the following configuration
dut:
!
interface ntfp2
ip router isis 1
!
router bgp 200
no bgp ebgp-requires-policy
bgp confederation identifier 300
bgp confederation peers 300
neighbor 192.168.1.1 remote-as 100
neighbor 192.168.2.2 remote-as 300
!
address-family ipv4 unicast
neighbor 192.168.2.2 default-originate
exit-address-family
!
router isis 1
is-type level-2-only
net 49.0001.0002.0002.0002.00
redistribute ipv4 connected level-2
!
end
router:
!
interface ntfp2
ip router isis 1
isis circuit-type level-2-only
!
router bgp 300
no bgp ebgp-requires-policy
bgp confederation identifier 300
bgp confederation peers 200
neighbor 192.168.2.1 remote-as 200
neighbor 192.168.3.2 remote-as 400
!
address-family ipv4 unicast
network 3.3.3.0/24
exit-address-family
!
router isis 1
is-type level-2-only
net 49.0001.0003.0003.0003.00
redistribute ipv4 connected level-2
!
end
on dut result of show bgp ipv4 unicast command is:
show bgp ipv4 unicast
BGP table version is 1, local router ID is 192.168.2.1, vrf id 0
Default local pref 100, local AS 200
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 192.168.1.1 0 0 100 i
instead of
sho bgp ipv4 unicast
BGP table version is 3, local router ID is 192.168.2.1, vrf id 0
Default local pref 100, local AS 200
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 192.168.1.1 0 0 100 i
*> 3.3.3.0/24 192.168.2.2 0 100 0 (300) i
*> 4.4.4.0/24 192.168.3.2 0 100 0 (300) 400 i
Displayed 3 routes and 3 total paths
According to RFC 5065, using one of the member AS numbers as the
confederation identifier is not forbidden.
The fixes are the following:
in bgp_route.c:
in bgp_update, remove the test for the presence of the confederation id in
the as_path, since this case is allowed;
in bgp_vty.c
bgp_confederation_peers, remove the test on peer as value
in bgpd.c
bgp_confederation_peers_add
remove the test on peer as value
invert the order of setting peer->sort value and peer->local_as,
since peer->sort depends on the current peer->local_as value
bgp_confederation_peers_remove
invert the order of setting peer->sort value and peer->local_as,
since peer->sort depends on the current peer->local_as value
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
We already have a global knob for graceful-shutdown, but it's handy having
a per-neighbor knob as well.
Especially when a single neighbor needs to be restarted/shut down gracefully.
We can do this with route-maps, but this is a faster/cleaner way of doing the same
for an operator.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This commit addresses an issue that happens when using bgp
peering with a rr client, with a received prefix which is the
local ip address of the bgp session.
When using bgp ipv4 unicast session, the local prefix is
received by a peer, and finds out that the proposed prefix
and its next-hop are the same. To avoid a route loop locally,
no nexthop entry is referenced for that prefix, and the route
will not be selected.
When the received peer is a route reflector, the prefix has
to be selected, even if the route can not be installed locally.
Fixes: fb8ae704615c ("bgpd: prevent routes loop through itself")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Not all places were checking to see if soft reconfiguration
was turned on before calling into it to do all that work.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When deciding whether to apply "neighbor soo" filtering towards a peer,
we were only looking for SoO ecoms that use either AS or AS4 encoding.
This makes sure we also check for IPv4 encoding, since we allow a user
to configure that encoding style against the peer.
Config:
```
router bgp 1
address-family ipv4 unicast
network 100.64.0.2/32 route-map soo-foo
neighbor 192.168.122.12 soo 3.3.3.3:20
exit-address-family
!
route-map soo-foo permit 10
set extcommunity soo 3.3.3.3:20
exit
```
Before:
```
ub20# show ip bgp neighbors 192.168.122.12 advertised-routes
BGP table version is 5, local router ID is 100.64.0.222, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 2.2.2.2/32 0.0.0.0 0 100 32768 i
*> 100.64.0.2/32 0.0.0.0 0 100 32768 i
Total number of prefixes 2
```
After:
```
ub20# show ip bgp neighbors 192.168.122.12 advertised-routes
BGP table version is 5, local router ID is 100.64.0.222, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 2.2.2.2/32 0.0.0.0 0 100 32768 i
Total number of prefixes 1
```
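The encoding check can be sketched as below. The type and sub-type values follow the extended community layout (two-octet AS 0x00, IPv4 address 0x01, four-octet AS 0x02, Site of Origin sub-type 0x03). The constant and function names are illustrative, and the real filter compares against the peer's configured SoO rather than only testing for presence.

```python
import ipaddress
import struct

ENCODE_AS, ENCODE_IP, ENCODE_AS4 = 0x00, 0x01, 0x02
SUBTYPE_SITE_ORIGIN = 0x03


def encode_soo(text):
    """Encode 'ADMIN:NN' as an 8-byte SoO extended community."""
    admin, value = text.rsplit(":", 1)
    value = int(value)
    if admin.count(".") == 3:  # IPv4 address administrator
        packed = ipaddress.IPv4Address(admin).packed
        return struct.pack("!BB4sH", ENCODE_IP, SUBTYPE_SITE_ORIGIN, packed, value)
    asn = int(admin)
    if asn <= 0xFFFF:  # two-octet AS administrator
        return struct.pack("!BBHI", ENCODE_AS, SUBTYPE_SITE_ORIGIN, asn, value)
    return struct.pack("!BBIH", ENCODE_AS4, SUBTYPE_SITE_ORIGIN, asn, value)


def has_soo(ecoms, encodings):
    """True if any community is an SoO using one of the given encodings."""
    return any(e[0] in encodings and e[1] == SUBTYPE_SITE_ORIGIN for e in ecoms)


route_ecoms = [encode_soo("3.3.3.3:20")]
before = has_soo(route_ecoms, {ENCODE_AS, ENCODE_AS4})
after = has_soo(route_ecoms, {ENCODE_AS, ENCODE_IP, ENCODE_AS4})
print(before, after)  # False True
```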
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Rather than running selected source files through the preprocessor and a
bunch of perl regex'ing to get the list of all DEFUNs, use the data
collected in frr.xref.
This not only eliminates issues we've been having with preprocessor
failures due to nonexistent header files, but is also much faster.
Where extract.pl would take 5s, this now finishes in 0.2s. And since
this is a non-parallelizable build step towards the end of the build
(dependent on a lot of other things being done already), the speedup is
actually noticeable.
Also files containing CLI no longer need to be listed in `vtysh_scan`
since the .xref data covers everything. `#ifndef VTYSH_EXTRACT_PL`
checks are equally obsolete.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Column headers in BGP routes table are not aligned with data when
RPKI status is available. This was fixed to insert a space at the
beginning of the header and at the beginning of lines that do not
have RPKI status.
This fix requires that several testing templates be adjusted to
match the new output.
Signed-off-by: Wayne Morrison <wmorrison@netgate.com>
Ensure that un-configuring allowas-in for a peer or group
clears the related flags and integer value. Tighten the use
of the integer counter so that it's only used when the config
flag is set. Add show output if allowas-in is enabled.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Fix vni table output broken by 8304dabfab
The "Imported from" output was not getting delimited by a newline
after the mentioned commit. This fixes that and retains the output
wanted by the original change.
Before:
```
Route [2]:[0]:[48]:[b6:3a:cc:d5:a1:cd]:[128]:[fe80::b43a:ccff:fed5:a1cd] VNI 30/10 Imported from 2.2.2.2:4:[2]:[0]:[48]:[b6:3a:cc:d5:a1:cd]:[128]:[fe80::b43a:ccff:fed5:a1cd], VNI 30/10
2
2.2.2.2(alfred) from alfred(veth1) (2.2.2.2)
Origin IGP, valid, external, bestpath-from-AS 2, best (First path received)
Extended Community: RT:2:30 ET:8
Last update: Fri Oct 7 16:04:59 2022
```
After:
```
Route [2]:[0]:[48]:[b2:cf:96:70:4f:b6]:[128]:[fe80::b0cf:96ff:fe70:4fb6] VNI 30/10
Imported from 2.2.2.2:4:[2]:[0]:[48]:[b2:cf:96:70:4f:b6]:[128]:[fe80::b0cf:96ff:fe70:4fb6], VNI 30/10
2
2.2.2.2(alfred) from alfred(veth1) (2.2.2.2)
Origin IGP, valid, external, bestpath-from-AS 2, best (First path received)
Extended Community: RT:2:30 ET:8
Last update: Fri Oct 7 17:02:28 2022
```
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Use the IP addr of type2/macip routes only for the hash/key
of the VNI table and carry the MAC in a path_info_extra attribute.
There exist situations that can be hit during extended MAC mobility events
where two MACs could be pointing to the same IP in our global table. It
requires very specific timing.
When that happens, BGP would (because we keyed on both MAC and IP)
install both into its VNI table as separate entries, but zebra only
knows/needs to know about a single IP -> MAC relationship for its VNI
table's type2 routes. So it was completely nondeterministic which one
zebra would end up with in these timing situations.
With these changes, we move BGP's VNI table to be keyed the same as Zebra's
and now a single IP will have multiple path_infos, each with a path_info_extra
that carries the MAC info for that path.
BGP will then run best path to deterministically decide which one to send to
zebra on the occasions where there exist two possible MACs.
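The change of key can be sketched with two dictionaries. The route records, the `seq` field, and the tie-breaking rule in `best_path` are all hypothetical stand-ins; the real decision is made by BGP best path selection.

```python
from collections import defaultdict

# Two MACs pointing at the same IP, as can happen during MAC mobility events.
routes = [
    {"mac": "aa:aa:aa:aa:aa:01", "ip": "10.0.0.5", "seq": 1},
    {"mac": "aa:aa:aa:aa:aa:02", "ip": "10.0.0.5", "seq": 2},
]

# Old keying: (MAC, IP) gives two separate entries for the same IP.
old_table = {(r["mac"], r["ip"]): r for r in routes}

# New keying: IP only, with each path carrying its MAC as extra data.
new_table = defaultdict(list)
for r in routes:
    new_table[r["ip"]].append(r)


def best_path(paths):
    """Deterministic stand-in for best path selection (hypothetical rule)."""
    return max(paths, key=lambda p: (p["seq"], p["mac"]))


winner = best_path(new_table["10.0.0.5"])
print(len(old_table), len(new_table), winner["mac"])
```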
Signed-off-by: Stephen Worley <sworley@nvidia.com>
There is code that sets the pi based upon matching it against
the same peer. In this code the type and sub-type are also
compared to the passed in type and sub-type. Let's just use
type and sub-type as that if we have a pi we know type and sub-type
are already correct. This should also make the first iteration
work correctly when the pi has not been created yet when we call
the martian_update function.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
```
donatas-pc# show ip bgp 100.100.100.0/24 longer-prefixes
BGP table version is 13, local router ID is 10.10.10.10, vrf id 0
Default local pref 100, local AS 65000
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
100.100.100.0/24 0.0.0.0 0 32768 i
Displayed 1 routes and 15 total paths
donatas-pc# show ip bgp 100.100.100.0/24
BGP routing table entry for 100.100.100.0/24, version 0
Paths: (1 available, no best path)
Not advertised to any peer
Local
0.0.0.0 (inaccessible, import-check enabled) from 0.0.0.0 (10.10.10.10)
Origin IGP, metric 0, weight 32768, invalid, sourced, local
Last update: Tue Oct 4 11:31:44 2022
donatas-pc# show ip bgp 100.100.100.0/24 json
{
"prefix":"100.100.100.0\/24",
"version":0,
"paths":[
{
"aspath":{
"string":"Local",
"segments":[
],
"length":0
},
"origin":"IGP",
"metric":0,
"weight":32768,
"valid":false,
"version":0,
"sourced":true,
"local":true,
"lastUpdate":{
"epoch":1664872304,
"string":"Tue Oct 4 11:31:44 2022\n"
},
"nexthops":[
{
"ip":"0.0.0.0",
"hostname":"donatas-pc",
"afi":"ipv4",
"accessible":false,
"importCheckEnabled":true,
"used":true
}
],
"peer":{
"peerId":"0.0.0.0",
"routerId":"10.10.10.10"
}
}
]
}
donatas-pc#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before returning an error, unlock bgp dest which is locked by
bgp_node_lookup()/bgp_node_get().
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
==536197== 400 (160 direct, 240 indirect) bytes in 4 blocks are definitely lost in loss record 19 of 21
==536197== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==536197== by 0x491C753: qcalloc (memory.c:116)
==536197== by 0x303FA9: aspath_dup (bgp_aspath.c:698)
==536197== by 0x304B2A: aspath_replace_specific_asn (bgp_aspath.c:1219)
==536197== by 0x256840: bgp_peer_as_override (bgp_route.c:1781)
==536197== by 0x256840: subgroup_announce_check (bgp_route.c:2216)
==536197== by 0x258345: subgroup_process_announce_selected (bgp_route.c:2804)
==536197== by 0x27F2CA: group_announce_route_walkcb (bgp_updgrp_adv.c:199)
==536197== by 0x4905A51: hash_walk (hash.c:285)
==536197== by 0x27E8D1: update_group_af_walk (bgp_updgrp.c:1866)
==536197== by 0x2809D3: group_announce_route (bgp_updgrp_adv.c:1022)
==536197== by 0x257DC4: bgp_process_main_one (bgp_route.c:3189)
==536197== by 0x257DC4: bgp_process_main_one (bgp_route.c:2975)
==536197== by 0x2581F7: bgp_process_wq (bgp_route.c:3330)
==536197== by 0x4961787: work_queue_run (workqueue.c:285)
==536197== by 0x4957745: thread_call (thread.c:2008)
==536197== by 0x4910B77: frr_run (libfrr.c:1198)
==536197== by 0x1ED6AC: main (bgp_main.c:520)
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before, it worked only when configured initially via CLI. Later, when we
receive a new route that should match a decent MED, we just skip it, because
the MED mismatch is not recalculated.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When redistributing connected addresses, the address family has
to be figured out. The calculation was not done and the next-hop
address length was not set. As a consequence, the nexthop
was displayed as if it were an IPv6 address, which is wrong for
IPv4 addresses.
Calculate the family for connected addresses.
Change the topotests accordingly.
Fixes: 7226bc40d606 ("bgpd: ignore NEXT_HOP for MP_REACH_NLRI")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RFC4364 describes peerings between multiple AS domains, to ease
the continuity of VPN services across multiple SPs. This commit
implements a sub-set of IETF option b) described in chapter 10 b.
The ASBR to ASBR approach is taken, with an EBGP peering between
the two routers. The EBGP peering must be directly connected to
the outgoing interface used. In those conditions, the next hop
is directly connected, and there is no need to have a transport
label to convey the VPN label. A new vty command is added on a
per interface basis:
This command, if enabled, permits conveying BGP VPN labels
without any transport labels (i.e. with implicit-null label).
Restriction:
This command is used only for EBGP directly connected peerings.
Other use cases are not covered.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
```
spine1-debian-11# sh ip bgp 100.100.100.101/32
BGP routing table entry for 100.100.100.101/32, version 21
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:invalid
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11# sh ip bgp 100.100.100.100/32
BGP routing table entry for 100.100.100.100/32, version 17
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:not-found
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before:
```
$ vtysh -c 'show bgp l2vpn evpn route detail json'
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
{
...
"numPrefix":4,
"numPaths":4 <<<<< four paths = four empty lines
}
```
The output contained as many empty lines before the JSON string as the
number of paths displayed.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP SoO is a tag that is appended on BGP updates to allow a peer to mark
a particular peer as belonging to a particular site. In certain MPLS L3 VPN
configurations, the BGP AS-Path may not provide the granularity needed
to prevent a loop in the control-plane. With this in mind, BGP SoO is designed
to fill this gap and prevent a routing loop that may occur.
If we configure for example, `neighbor soo 65000:1` at PEs, routes won't be
announced between CPEs if soo matches. This is especially needed when using
as-override or allowas-in.
Also, this automates the same behavior as configuring route-maps
for each peer, like:
```
bgp extcommunity-list cpe permit soo 65000:1
!
route-map cpe permit 10
set extcommunity soo 65000:1
...
route-map cpe deny 10
match extcommunity cpe
route-map cpe permit 20
...
```
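The SoO check described above can be sketched in C roughly as follows. This is a minimal illustration, not the bgpd code: `struct soo` and `soo_permits_announce()` are hypothetical names, and the SoO value is reduced to an ASN plus a site number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified Site-of-Origin value (e.g. 65000:1). */
struct soo {
	uint32_t asn;
	uint32_t site;
};

static bool soo_equal(const struct soo *a, const struct soo *b)
{
	return a->asn == b->asn && a->site == b->site;
}

/*
 * Returns true when a route carrying the given SoO values may be
 * announced to a peer. peer_soo is NULL when no SoO is configured
 * for that peer, in which case nothing is filtered.
 */
static bool soo_permits_announce(const struct soo *route_soos, size_t count,
				 const struct soo *peer_soo)
{
	size_t i;

	assert(count == 0 || route_soos != NULL);

	if (peer_soo == NULL)
		return true;

	for (i = 0; i < count; i++) {
		/* The route already belongs to the peer's site: skip it. */
		if (soo_equal(&route_soos[i], peer_soo))
			return false;
	}
	return true;
}
```

A route tagged 65000:1 would be suppressed toward any peer configured with SoO 65000:1 and announced to everyone else.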
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before:
```
donatas-laptop# show bgp ipv4 unicast community-list testas
% testas is not a valid community-list name
donatas-laptop# con
donatas-laptop(config)# bgp community-list standard testas permit internet
donatas-laptop(config)# do show bgp ipv4 unicast community-list testas
donatas-laptop(config)#
```
`is not a valid community-list name` is a misleading warning message.
Doing the same for filter-list, access-list, prefix-list, route-map.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we have conditional advertisement enabled, and conditionally withdrew
some prefixes, and then we do a 'clear bgp', those routes were getting
advertised again, and then withdrawn the next time the conditional
advertisement scanner executed.
When we go to advertise, check the prefix against the conditional
advertisement status so we don't do that.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.
Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes
2) we check mp_nexthop_len when choosing the nexthop to use for nht
3) we check mp_nexthop_len when choosing the nexthop to send to zebra
4) we check mp_nexthop_len when picking the nexthop to be shown by vtysh
Reported-by: Binon Gorbutt <binon@aervivo.com>
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The same as with prefix-list/route-maps/etc.
```
donatas-pc# show ip access-list spine
ZEBRA:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
BGP:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
PIM:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
BABELD:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
donatas-pc# show bgp ipv4 unicast access-list
ACCESSLIST_NAME Access-list name
spine
donatas-pc# show bgp ipv4 unicast access-list spine
BGP table version is 9, local router ID is 172.17.0.3, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 200.200.200.200/32
enp3s0 0 0 65000 3456 ?
Displayed 1 routes and 10 total paths
donatas-pc#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Just convert all uses of thread_cancel to THREAD_OFF. Additionally
use THREAD_ARG instead of t->arg to get the argument. Individual
files should never be accessing thread private data like this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A new command is available under SAFI_MPLS_VPN:
With this command, the BGP vpnvx prefixes received are
not kept if there is no VRF interested in importing
those vpn entries.
A soft refresh is performed if there is a change of
configuration: retain cmd, vrf import settings, or
route-map change.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RFC 9234 mandates that role rules apply only to IPv4/IPv6 unicast bgp
sessions. If the OTC attribute appears in other sessions, it will remain
untouched.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
RFC 9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
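The role check performed on OPEN can be sketched as follows. The role values and the allowed pairs follow RFC 9234; the identifiers (`enum bgp_role`, `role_pair_ok()`) and the exact handling of a missing capability are illustrative assumptions rather than the names or logic used in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Role values as assigned by RFC 9234; ROLE_UNDEFINED is local. */
enum bgp_role {
	ROLE_PROVIDER = 0,
	ROLE_RS = 1,
	ROLE_RS_CLIENT = 2,
	ROLE_CUSTOMER = 3,
	ROLE_PEER = 4,
	ROLE_UNDEFINED = 255,
};

/* For each local role, the only remote role that is acceptable. */
static const enum bgp_role expected_remote_role[] = {
	[ROLE_PROVIDER] = ROLE_CUSTOMER,
	[ROLE_RS] = ROLE_RS_CLIENT,
	[ROLE_RS_CLIENT] = ROLE_RS,
	[ROLE_CUSTOMER] = ROLE_PROVIDER,
	[ROLE_PEER] = ROLE_PEER,
};

/*
 * Returns true when the session may be established.
 * - No local role configured: nothing to check.
 * - Remote sent no role: accepted unless strict mode is on.
 * - Otherwise the pair must match the table above.
 */
static bool role_pair_ok(enum bgp_role local, enum bgp_role remote,
			 bool strict)
{
	if (local == ROLE_UNDEFINED)
		return true;
	if (remote == ROLE_UNDEFINED)
		return !strict;

	assert(local <= ROLE_PEER);
	return expected_remote_role[local] == remote;
}
```

For example, a speaker configured as Provider accepts only a neighbor announcing Customer, and with strict mode enabled it also rejects a neighbor that announces no role at all.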
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
When using json output for `show bgp statistics json`, gather the
number of prefixes of each prefix length.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Start using mpls_lse_encode/mpls_lse_decode, which are endian-aware. We
always used host byte order, but should use network byte order.
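To illustrate why the byte order matters, here is a minimal sketch of an endian-aware label stack entry codec. The helpers `lse_encode()` and `lse_decode()` are hypothetical and only mirror the idea; they are not the library functions named above and their signatures are assumptions.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/*
 * MPLS label stack entry layout (32 bits, network byte order):
 * label (20 bits) | traffic class (3) | bottom of stack (1) | TTL (8)
 */
static uint32_t lse_encode(uint32_t label, uint8_t exp, uint8_t bos,
			   uint8_t ttl)
{
	uint32_t host = ((label & 0xFFFFFu) << 12)
			| ((uint32_t)(exp & 0x7u) << 9)
			| ((uint32_t)(bos & 0x1u) << 8)
			| (uint32_t)ttl;

	/* Convert once, so the bytes are the same on every host. */
	return htonl(host);
}

static void lse_decode(uint32_t lse, uint32_t *label, uint8_t *exp,
		       uint8_t *bos, uint8_t *ttl)
{
	uint32_t host = ntohl(lse);

	assert(label && exp && bos && ttl);

	*label = host >> 12;
	*exp = (host >> 9) & 0x7u;
	*bos = (host >> 8) & 0x1u;
	*ttl = host & 0xFFu;
}
```

Without the `htonl()`/`ntohl()` pair, a little-endian host would place the label bits in the wrong bytes on the wire.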
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
In bgp_nlri_parse_ip there is a `sanity` check to ensure
that the prefix length as specified by the packet
will fit inside of a `struct prefix` correctly. The problem
here of course is that this is only v4 / v6 unicast/multicast
parsing and the bytes will never be more than 16, but we are copying
into a part of the struct prefix that is only 16 bytes, while with
this check the length may be up to 47 bytes (but not really possible).
Limit the size check to at most 16 bytes (since we are only handling
v4 or v6 addresses here).
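The tightened bound can be sketched as below. This is a simplified illustration under stated assumptions: `ADDR_MAX_BYTES`, `prefix_bytes()` and `nlri_copy_prefix()` are hypothetical names, and the destination is a plain 16-byte buffer rather than a `struct prefix`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* An IPv6 address, the largest thing parsed here, is 16 bytes. */
#define ADDR_MAX_BYTES 16

/* Bytes needed to carry a prefix of the given length in bits. */
static size_t prefix_bytes(unsigned int prefixlen_bits)
{
	return (prefixlen_bits + 7) / 8;
}

/*
 * Copy an NLRI prefix into a 16-byte address buffer. Returns false,
 * without copying, when the encoded length would overflow the
 * buffer or run past the end of the packet.
 */
static bool nlri_copy_prefix(uint8_t dst[ADDR_MAX_BYTES], const uint8_t *src,
			     size_t bytes_left, unsigned int prefixlen_bits)
{
	size_t psize = prefix_bytes(prefixlen_bits);

	assert(dst != NULL && src != NULL);

	if (psize > ADDR_MAX_BYTES)
		return false; /* larger than any v4/v6 address */
	if (psize > bytes_left)
		return false; /* packet is truncated */

	memset(dst, 0, ADDR_MAX_BYTES);
	memcpy(dst, src, psize);
	return true;
}
```

Bounding against the size of the field actually written, rather than the size of the enclosing structure, is what keeps the copy in range.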
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The maxpaths same_clusterlen value was a uint16_t
with a single bit being used. No other values are
being stored. Let's remove the bitfield and simplify
to a bool.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The logic to unlock dest if iteration completed without iterating the
entire table was flawed. Specifically, if iteration terminated due to
`gr_deferred == 0` then the node would not get unlocked.
This change takes into account the fact that dest will be NULL only in
the case when the entire table was iterated and all nodes were already
unlocked. In any other case, it needs to be unlocked.
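The locking rule can be sketched with a toy table whose iterator hands the lock from one node to the next. All names here (`struct node`, `table_top()`, `table_next()`, `walk_with_budget()`) are hypothetical, and `budget` stands in for the `gr_deferred` counter mentioned above.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int lock; /* reference count held by iterators */
};

struct table {
	struct node *nodes;
	size_t count;
};

static void node_unlock(struct node *n)
{
	assert(n->lock > 0);
	n->lock--;
}

/* Return the first node, locked, or NULL for an empty table. */
static struct node *table_top(struct table *t)
{
	if (t->count == 0)
		return NULL;
	t->nodes[0].lock++;
	return &t->nodes[0];
}

/* Unlock cur and return the next node, locked, or NULL at the end. */
static struct node *table_next(struct table *t, struct node *cur)
{
	size_t idx = (size_t)(cur - t->nodes) + 1;

	node_unlock(cur);
	if (idx >= t->count)
		return NULL;
	t->nodes[idx].lock++;
	return &t->nodes[idx];
}

/* Visit at most budget nodes; returns how many were visited. */
static size_t walk_with_budget(struct table *t, size_t budget)
{
	struct node *n;
	size_t visited = 0;

	for (n = table_top(t); n && budget > 0; n = table_next(t, n)) {
		visited++;
		budget--;
	}

	/*
	 * n is NULL only when the whole table was walked and every
	 * node was already unlocked. Any other exit leaves n locked.
	 */
	if (n)
		node_unlock(n);

	return visited;
}
```

Checking the iterator variable itself, instead of the reason the loop stopped, covers every early exit with a single test.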
Signed-off-by: Carl Baldwin <carl@ecbaldwin.net>