This commit eaeba5e868 changed a bit a formatting,
but this part was missed, let's fix it.
An example before the patch:
```
r3# sh ip bgp ipv4 labeled-unicast neighbors 192.168.34.4 advertised-routes
BGP table version is 3, local router ID is 192.168.34.3, vrf id 0
Default local pref 100, local AS 65003
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 10.0.0.1/32 0.0.0.0 0 65001 ?
Total number of prefixes 1
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Use %pI4/%pI6 where possible, otherwise at least atjust stack buffer sizes
for inet_ntop() calls.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
bgp_pcount_adjust() is called only when calling bgp_path_info_set_flag().
Before this patch the pcount is not advanced before checking for overflow.
Additionally, print:
```
[RZMGQ-A03CG] 192.168.255.1(r1) rcvd UPDATE about 172.16.255.254/32 IPv4 unicast -- DENIED due to: maximum-prefix overflow
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
currently the following configuration
dut:
!
interface ntfp2
ip router isis 1
!
router bgp 200
no bgp ebgp-requires-policy
bgp confederation identifier 300
bgp confederation peers 300
neighbor 192.168.1.1 remote-as 100
neighbor 192.168.2.2 remote-as 300
!
address-family ipv4 unicast
neighbor 192.168.2.2 default-originate
exit-address-family
!
router isis 1
is-type level-2-only
net 49.0001.0002.0002.0002.00
redistribute ipv4 connected level-2
!
end
router:
!
interface ntfp2
ip router isis 1
isis circuit-type level-2-only
!
router bgp 300
no bgp ebgp-requires-policy
bgp confederation identifier 300
bgp confederation peers 200
neighbor 192.168.2.1 remote-as 200
neighbor 192.168.3.2 remote-as 400
!
address-family ipv4 unicast
network 3.3.3.0/24
exit-address-family
!
router isis 1
is-type level-2-only
net 49.0001.0003.0003.0003.00
redistribute ipv4 connected level-2
!
end
on dut result of show bgp ipv4 unicast command is:
show bgp ipv4 unicast
BGP table version is 1, local router ID is 192.168.2.1, vrf id 0
Default local pref 100, local AS 200
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 192.168.1.1 0 0 100 i
instead of
sho bgp ipv4 unicast
BGP table version is 3, local router ID is 192.168.2.1, vrf id 0
Default local pref 100, local AS 200
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 192.168.1.1 0 0 100 i
*> 3.3.3.0/24 192.168.2.2 0 100 0 (300) i
*> 4.4.4.0/24 192.168.3.2 0 100 0 (300) 400 i
Displayed 3 routes and 3 total paths
According to RFC 5065:the usage of one of the member AS number as the
confederation identifier is not forbidden.
fixes are the following
in bgp_route.c:
in bgp_update remove the test for presence of confederation id in
as_path since, this case is allowed;
in bgp_vty.c
bgp_confederation_peers, remove the test on peer as value
in bgpd.c
bgp_confederation_peers_add
remove the test on peer as value
invert the order of setting peer->sort value and peer->local_as,
since peer->sort is depending from current peer->local_as value
bgp_confederation_peers_remove
invert the order of setting peer->sort value and peer->local_as,
since peer->sort is depending from current peer->local_as value
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
We already have a global knob for graceful-shutdown, but it's handy having
per neighbor knob as well.
Especially when a single neighbor needs to be restarted/shutdown gracefuly.
We can do this route-maps, but this is a faster/cleaner way doing the same
for an operator.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This commit addresses an issue that happens when using bgp
peering with a rr client, with a received prefix which is the
local ip address of the bgp session.
When using bgp ipv4 unicast session, the local prefix is
received by a peer, and finds out that the proposed prefix
and its next-hop are the same. To avoid a route loop locally,
no nexthop entry is referenced for that prefix, and the route
will not be selected.
When the received peer is a route reflector, the prefix has
to be selected, even if the route can not be installed locally.
Fixes: ("fb8ae704615c") bgpd: prevent routes loop through itself
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Not all places were checking to see if soft reconfiguration
was turned on before calling into it to do all that work.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When deciding whether to apply "neighbor soo" filtering towards a peer,
we were only looking for SoO ecoms that use either AS or AS4 encoding.
This makes sure we also check for IPv4 encoding, since we allow a user
to configure that encoding style against the peer.
Config:
```
router bgp 1
address-family ipv4 unicast
network 100.64.0.2/32 route-map soo-foo
neighbor 192.168.122.12 soo 3.3.3.3:20
exit-address-family
!
route-map soo-foo permit 10
set extcommunity soo 3.3.3.3:20
exit
```
Before:
```
ub20# show ip bgp neighbors 192.168.122.12 advertised-routes
BGP table version is 5, local router ID is 100.64.0.222, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 2.2.2.2/32 0.0.0.0 0 100 32768 i
*> 100.64.0.2/32 0.0.0.0 0 100 32768 i
Total number of prefixes 2
```
After:
```
ub20# show ip bgp neighbors 192.168.122.12 advertised-routes
BGP table version is 5, local router ID is 100.64.0.222, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 2.2.2.2/32 0.0.0.0 0 100 32768 i
Total number of prefixes 1
```
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Rather than running selected source files through the preprocessor and a
bunch of perl regex'ing to get the list of all DEFUNs, use the data
collected in frr.xref.
This not only eliminates issues we've been having with preprocessor
failures due to nonexistent header files, but is also much faster.
Where extract.pl would take 5s, this now finishes in 0.2s. And since
this is a non-parallelizable build step towards the end of the build
(dependent on a lot of other things being done already), the speedup is
actually noticeable.
Also files containing CLI no longer need to be listed in `vtysh_scan`
since the .xref data covers everything. `#ifndef VTYSH_EXTRACT_PL`
checks are equally obsolete.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Column headers in BGP routes table are not aligned with data when
RPKI status is available. This was fixed to insert a space at the
beginning of the header and at the beginning of lines that do not
have RPKI status.
This fix requires that several testing templates be adjusted to
match the new output.
Signed-off-by: Wayne Morrison <wmorrison@netgate.com>
Ensure that un-configuring allowas-in for a peer or group
clears the related flags and integer value. Tighten the use
of the integer counter so that it's only used when the config
flag is set. Add show output if allowas-in is enabled.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Fix vni table output broken by 8304dabfab
The "Imported from" output was not getting delimitted by a newline
after the mentioned commit. This fixes that and retains the output
wanted by the original change.
Before:
'''
Route [2]:[0]:[48]:[b6:3a:cc:d5:a1:cd]:[128]:[fe80::b43a:ccff:fed5:a1cd] VNI 30/10 Imported from 2.2.2.2:4:[2]:[0]:[48]:[b6:3a:cc:d5:a1:cd]:[128]:[fe80::b43a:ccff:fed5:a1cd], VNI 30/10
2
2.2.2.2(alfred) from alfred(veth1) (2.2.2.2)
Origin IGP, valid, external, bestpath-from-AS 2, best (First path received)
Extended Community: RT:2:30 ET:8
Last update: Fri Oct 7 16:04:59 2022
'''
After:
'''
Route [2]:[0]:[48]:[b2:cf:96:70:4f:b6]:[128]:[fe80::b0cf:96ff:fe70:4fb6] VNI 30/10
Imported from 2.2.2.2:4:[2]:[0]:[48]:[b2:cf:96:70:4f:b6]:[128]:[fe80::b0cf:96ff:fe70:4fb6], VNI 30/10
2
2.2.2.2(alfred) from alfred(veth1) (2.2.2.2)
Origin IGP, valid, external, bestpath-from-AS 2, best (First path received)
Extended Community: RT:2:30 ET:8
Last update: Fri Oct 7 17:02:28 2022
'''
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Use the IP addr of type2/macip routes only for the hash/key
of the VNI table and carry the MAC in a path_info_extra attribute.
There is exists situations that can be hit during extended MAC mobility events
where two MACs could be pointing to the same IP in our global table. It
is requires very specific timings.
When that happens, BPG would (because we key'd on both MAC and IP)
install both into it's VNI table as separate entries, but zebra only
knows/needs to know about a single IP -> MAC relationship for it's VNI
table's type2 routes. So it was compleletly undeterministic which one
zebra would end up with in these timing situations.
With these changes, we move BGP's VNI table to key'd the same as Zebra's
and now a single IP will have multiple path_info's with a path_info_extra
that is carrying the MAC info for each path.
BGP will then run best path to deterministically decide which one to send to
zebra during the occasions where there exist's two possible MACs.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
There is code that sets the pi based upon matching it against
the same peer. In this code the type and sub-type are also
compared to the passed in type and sub-type. Let's just use
type and sub-type as that if we have a pi we know type and sub-type
are already correct. This should also make the first iteration
work correctly when the pi has not been created yet when we call
the martian_update function.bgpd: Remove unnecessary check for pi and setting type and sub-type
There is code that sets the pi based upon matching it against
the same peer. In this code the type and sub-type are also
compared to the passed in type and sub-type. Let's just use
type and sub-type as that if we have a pi we know type and sub-type
are already correct. This should also make the first iteration
work correctly when the pi has not been created yet when we call
the martian_update function.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
```
donatas-pc# show ip bgp 100.100.100.0/24 longer-prefixes
BGP table version is 13, local router ID is 10.10.10.10, vrf id 0
Default local pref 100, local AS 65000
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
100.100.100.0/24 0.0.0.0 0 32768 i
Displayed 1 routes and 15 total paths
donatas-pc# show ip bgp 100.100.100.0/24
BGP routing table entry for 100.100.100.0/24, version 0
Paths: (1 available, no best path)
Not advertised to any peer
Local
0.0.0.0 (inaccessible, import-check enabled) from 0.0.0.0 (10.10.10.10)
Origin IGP, metric 0, weight 32768, invalid, sourced, local
Last update: Tue Oct 4 11:31:44 2022
donatas-pc# show ip bgp 100.100.100.0/24 json
{
"prefix":"100.100.100.0\/24",
"version":0,
"paths":[
{
"aspath":{
"string":"Local",
"segments":[
],
"length":0
},
"origin":"IGP",
"metric":0,
"weight":32768,
"valid":false,
"version":0,
"sourced":true,
"local":true,
"lastUpdate":{
"epoch":1664872304,
"string":"Tue Oct 4 11:31:44 2022\n"
},
"nexthops":[
{
"ip":"0.0.0.0",
"hostname":"donatas-pc",
"afi":"ipv4",
"accessible":false,
"importCheckEnabled":true,
"used":true
}
],
"peer":{
"peerId":"0.0.0.0",
"routerId":"10.10.10.10"
}
}
]
}
donatas-pc#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before returning an error, unlock bgp dest which is locked by
bgp_node_lookup()/bgp_node_get().
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
==536197== 400 (160 direct, 240 indirect) bytes in 4 blocks are definitely lost in loss record 19 of 21
==536197== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==536197== by 0x491C753: qcalloc (memory.c:116)
==536197== by 0x303FA9: aspath_dup (bgp_aspath.c:698)
==536197== by 0x304B2A: aspath_replace_specific_asn (bgp_aspath.c:1219)
==536197== by 0x256840: bgp_peer_as_override (bgp_route.c:1781)
==536197== by 0x256840: subgroup_announce_check (bgp_route.c:2216)
==536197== by 0x258345: subgroup_process_announce_selected (bgp_route.c:2804)
==536197== by 0x27F2CA: group_announce_route_walkcb (bgp_updgrp_adv.c:199)
==536197== by 0x4905A51: hash_walk (hash.c:285)
==536197== by 0x27E8D1: update_group_af_walk (bgp_updgrp.c:1866)
==536197== by 0x2809D3: group_announce_route (bgp_updgrp_adv.c:1022)
==536197== by 0x257DC4: bgp_process_main_one (bgp_route.c:3189)
==536197== by 0x257DC4: bgp_process_main_one (bgp_route.c:2975)
==536197== by 0x2581F7: bgp_process_wq (bgp_route.c:3330)
==536197== by 0x4961787: work_queue_run (workqueue.c:285)
==536197== by 0x4957745: thread_call (thread.c:2008)
==536197== by 0x4910B77: frr_run (libfrr.c:1198)
==536197== by 0x1ED6AC: main (bgp_main.c:520)
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before it worked only when configured initially via CLI. Later, when we
receive a new route, that should match a decent MED, we just skip it, because
MED mismatch is not recalculated.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When redistributing connected addresses, the address family has
to be figured out. The calculation was not done, the next-hop
address length was not set, and as consequence, the nexthop
is displayed like if it was an ipv6 address, which is wrong for
ipv4 addresses.
Calculate the family for connected addresses.
Change the topotests accordingly.
Fixes: ("7226bc40d606") bgpd: ignore NEXT_HOP for MP_REACH_NLRI
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RFC4364 describes peerings between multiple AS domains, to ease
the continuity of VPN services across multiple SPs. This commit
implements a sub-set of IETF option b) described in chapter 10 b.
The ASBR to ASBR approach is taken, with an EBGP peering between
the two routers. The EBGP peering must be directly connected to
the outgoing interface used. In those conditions, the next hop
is directly connected, and there is no need to have a transport
label to convey the VPN label. A new vty command is added on a
per interface basis:
This command if enabled, will permit to convey BGP VPN labels
without any transport labels (i.e. with implicit-null label).
restriction:
this command is used only for EBGP directly connected peerings.
Other use cases are not covered.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
```
spine1-debian-11# sh ip bgp 100.100.100.101/32
BGP routing table entry for 100.100.100.101/32, version 21
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:invalid
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11# sh ip bgp 100.100.100.100/32
BGP routing table entry for 100.100.100.100/32, version 17
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:not-found
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before:
```
$ vtysh -c 'show bgp l2vpn evpn route detail json'
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
{
...
"numPrefix":4,
"numPaths":4 <<<<< four paths = four empty lines
}
```
Contain as much "empty lines" before the JSON string as the number
of paths displayed.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP SoO is a tag that is appended on BGP updates to allow a peer to mark
a particular peer as belonging to a particular site. In certain MPLS L3 VPN
configurations, the BGP AS-Path may not provide the granularity needed
prevent a loop in the control-plane. With this in mind, BGP SoO is designed
to fill this gap and prevent a routing loop that may occur.
If we configure for example, `neighbor soo 65000:1` at PEs, routes won't be
announced between CPEs if soo matches. This is especially needed when using
as-override or allowas-in.
Also, this is the automated way of the same behavior as configuring route-maps
for each peer like:
```
bgp extcommunity-list cpe permit soo 65000:1
!
route-map cpe permit 10
set extcommunity soo 65000:1
...
route-map cpe deny 10
match extcommunity cpe
route-map cpe permit 20
...
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before:
```
donatas-laptop# show bgp ipv4 unicast community-list testas
% testas is not a valid community-list name
donatas-laptop# con
donatas-laptop(config)# bgp community-list standard testas permit internet
donatas-laptop(config)# do show bgp ipv4 unicast community-list testas
donatas-laptop(config)#
```
`is not a valid community-list name` is a misleading warning message.
Doing the same for filter-list, access-list, prefix-list, route-map.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we have conditional advertisement enabled, and conditionally withdrew
some prefixes, and then we do a 'clear bgp', those routes were getting
advertised again, and then withdrawn the next time the conditional
advertisement scanner executed.
When we go to advertise check the prefix against the conditional
advertisement status so we don't do that.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.
Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes
2) we check mp_nexthop_len when choosing the nexthop to use for nht
3) we check mp_nexthop_len when choosing the nexthop to send to zebra
4) we check mp_nexthop_len when picking the nexthop to shown by vtysh
Reported-by: Binon Gorbutt <binon@aervivo.com>
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The same as with prefix-list/route-maps/etc.
```
donatas-pc# show ip access-list spine
ZEBRA:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
BGP:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
PIM:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
BABELD:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
donatas-pc# show bgp ipv4 unicast access-list
ACCESSLIST_NAME Access-list name
spine
donatas-pc# show bgp ipv4 unicast access-list spine
BGP table version is 9, local router ID is 172.17.0.3, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 200.200.200.200/32
enp3s0 0 0 65000 3456 ?
Displayed 1 routes and 10 total paths
donatas-pc#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Just convert all uses of thread_cancel to THREAD_OFF. Additionally
use THREAD_ARG instead of t->arg to get the arguement. Individual
files should never be accessing thread private data like this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A new command is available under SAFI_MPLS_VPN:
With this command, the BGP vpnvx prefixes received are
not kept, if there are no VRF interested in importing
those vpn entries.
A soft refresh is performed if there is a change of
configuration: retain cmd, vrf import settings, or
route-map change.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RFC 9234 mandates that role rules apply only to IPv4/IPv6 unicast bgp
sessions. If the OTC attribute appears in other sessions, it will remain
untouched.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
When using json output for `show bgp statistics json` gather the
number of prefixes of each prefix Length.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Start using mpls_lse_encode/mpls_lse_decode, that is endian-aware, because
we always use host-byte order, should use network-byte.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
in bgp_nlri_parse_ip there is a `sanity` check to ensure
that the prefix length as specified by the packet
will fit inside of a `struct prefix` correctly. The problem
here of course is that this is only v4 / v6 unicast/multicast
parsing and the bytes will never be more than 16, but we are copying
into a part of the struct prefix that is only 16 bytes, but with
this check the length may be up to 47 bytes( but not really possible ).
Limit the size check to at most 16 bytes (since we are only handling
v4 or v6 addresses here )
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The maxpaths same_clusterlen value was a uint16_t
with a single bit being used. No other values are
being stored. Let's remove the bitfield and simplify
to a bool.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The logic to unlock dest if iteration completed without iterating the
entire node was flawed. Specifically, if iteration terminated due to
`gr_deferred == 0` then the node would not get unlocked.
This change takes into account the fact that dest will be NULL only in
the case when the entire table was iterated and all nodes were already
unlocked. In any other case, it needs to be unlocked.
Signed-off-by: Carl Baldwin <carl@ecbaldwin.net>
Before:
```
spine1-debian-11(config-route-map)# bgp community alias 65001:65001 test1
spine1-debian-11(config)# route-map rm permit 10
spine1-debian-11(config-route-map)# set community 65001:65001
% Malformed communities attribute
```
After:
```
spine1-debian-11(config)# bgp community alias 65001:65001 test1
spine1-debian-11(config)# route-map rm permit 10
spine1-debian-11(config-route-map)# set community 65001:65001
spine1-debian-11(config-route-map)#
```
Same for large-communities.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When `all` is specified BGP pointer is always NULL, we need to iterate over
all instances separately.
```
Received signal 11 at 1648199394 (si_addr 0x30, PC 0x562e96597090); aborting...
/usr/local/lib/libfrr.so.0(zlog_backtrace_sigsafe+0x5e) [0x7f378a57ff6e]
/usr/local/lib/libfrr.so.0(zlog_signal+0xe6) [0x7f378a580146]
/usr/local/lib/libfrr.so.0(+0xcd4c2) [0x7f378a5aa4c2]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f378a33e140]
/usr/lib/frr/bgpd(bgp_afi_safi_peer_exists+0) [0x562e96597090]
/usr/lib/frr/bgpd(+0x15c3b8) [0x562e9654a3b8]
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Commit: ea47320b1d
Modified the bgp_clear_stale_route function to have
better indentation, but in the process changed some
`continue;` statements to `break;` which modified
the looping and caused stale paths to not always be
removed upon an update.
To reproduce: A ---- B, setup with addpath and GR
One side has a prefix with nhop1 and nhop2, kill one
side and then resend the same prefix with nhop3,
paths nhop1 and 2 become stale and never removed.
Code inspection clearly shows that that `continue`
statements became `break` statements causing the
loop over all paths to stop prematurely.
The fix is to change the break back to continue
statements so the loop can continue instead of
stopping.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This patch adds transpostion_offset and transposition_len to bgp_sid_info,
and transposes SID only at bgp_zebra_announce.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
Add a 15 minute warning to the logging system when
bgp policy is not setup properly. Operators keep asking
about the missing policy( on upgrade typically ). Let's
try to give them a bit more of a hint when something is
going wrong as that they are clearly missing the other
various places FRR tells them about it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp_attr_undup does the same thing as bgp_attr_flush – frees the
temporary data that might be allocated when applying a route-map. There
is no need to have two separate functions for that.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
`struct prefix p` was declared inside an if statement
where we assign the address of to a pointer that is
then passed to a sub function. This will eventually
leave us in a bad state.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In situations where remove-private-AS is configured for eBGP peers
residing in a private ASN, the peer's ASN was not being retained
in the AS-Path which can allow loops to occur. This was addressed
in a prior commit but it only addressed cases where the "replace-AS"
keyword was configured.
This commit ensures we retain the peer's ASN when using
"remove-private-AS" for eBGP peers in a private ASN regardless of other
keywords.
Setup:
=========
router bgp 4200000002
neighbor enp1s0 interface v6only remote-as external
neighbor enp6s0 interface v6only remote-as external
!
address-family ipv4 unicast
neighbor enp6s0 remove-private-AS
exit-address-family
ub18# show ip bgp sum | include 420000
BGP router identifier 100.64.0.111, local AS number 4200000002 vrf-id 0 <<<<< local asn 4200000002
ub20(enp1s0) 4 4200000001 22 22 0 0 0 00:00:57 1 1
ub20(enp6s0) 4 4200000001 21 22 0 0 0 00:00:57 0 1 <<<< peer asn 4200000001
ub18# show ip bgp | include 0.2
Default local pref 100, local AS 4200000002
*> 100.64.0.2/32 enp1s0 0 0 4200000001 4200000004 4200000005 4200000001 i
Before ("remote-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 i <<<<< empty as-path, no way to prevent loop
After ("remote-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 4200000001 4200000001 i <<<< retain peer's asn, breaks loop
Ticket: 2857047
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Abstract:
- The command "neighbor PEER maximum-prefix-out NUMBER" cannot be applied
without clearing the BGP neighbor.
- Apply the maximum-prefix-out value as soon as it is modified without
clearing the neighbor.
subgroup_update_packet() and subgroup_withdraw_packet() respectively
manages the announcement and withdrawal BGP message to the peer.
subgrp->scount counter counts the number of sent prefixes.
Before the patch, the maximum out prefix limitation was applied in
subgroup_update_packet() in order that subgrp->scount never exceeds the
limit. Setting a limit inferior to the effective number of sent prefix
did not result in sending any withdrawal message to reduce the number of
sent prefixes. Without clearing the BGP neighbor, the limitation only
applied to the announcement of new prefixes when the limitation was
over.
With the patch, the limitation is checked in subgroup_announce_check().
The function is intended to say whether a prefix has to be announced in
regards to the prefix-list, route-map... Now when a maximum-prefix-out
value is changed/removed, the neighbor AFI/SAFI table is re-parsed in
the same way as for the application of route-map, prefix-lists...
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
BGP EVPN custom `union gw_addr` is basically the same thing as a common
`struct ipaddr` but it lacks the address family which is needed in some
cases.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This code is populating a temporary variable `add` instead of the attr.
Initially this variable was later copied to the attr but the copying was
erroneously deleted by 0a50c2481. Directly populate the attr to restore
the correct behavior.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Here we try to compare the new attr with the existing one but this call
compares the existing index with zero instead. attrhash_cmp already
compares indexes using overlay_index_same so this call is both wrong and
useless.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
```
exit1-debian-11# sh ip bgp 10.10.10.10/32
BGP routing table entry for 10.10.10.10/32, version 14
Paths: (1 available, best #1, table default)
Not advertised to any peer
65000, (stale)
192.168.0.2 from 192.168.0.2 (0.0.0.0)
Origin incomplete, metric 0, valid, external, best (First path received)
Last update: Wed Jan 19 17:13:51 2022
Time until Graceful Restart stale route deleted: 117
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
'show bgp ... neighbor [routes|received-routes]' both incorrectly
used a json key of 'advertisedRoutes'.
This corrects the key to be 'receivedRoutes' for commands where
the displayed routes were received, not advertised.
before:
unet> r3 show ip bgp neigh 10.2.30.2 received-routes json | include Routes
"advertisedRoutes":{
after:
ub18# show ip bgp neighbors enp1s0 received-routes json | include Routes
"receivedRoutes":{
ub18# show ip bgp neighbors enp1s0 advertised-routes json | include Routes
"advertisedRoutes":{
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The bgp_notify_conditional_adv_scanner function was/is looping
over all peers. And only matching on the passed in peer,
based upon the subgroup. As such we do not need to loop
over everything and just cut-to-the chase and just modify
the peer structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Unsuppress route part of the aggregation when route-map configuration
is removed before the aggregation itself.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Description:
Change is intended for fixing the issue related to
clearing of stale leaked routes:
- Whenever BGP goes down,
after bringing down tcp connection and renegotiating capabilities,
once we reestablish connection,
we are not handling clear of VRF leaked route in the bgp_clear_stale_route.
- While bgp is clearing stale routes,
we need to handle withdraw of routes for VRF route-leaking.
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
rfc7196 recommends:
In addition, BGP implementations have an internal constant, which we
will call the 'maximum penalty', and the current computed penalty may
not exceed it.
Router Maximum Penalty: The internal constant for the maximum
penalty value MUST be raised to at least 50,000.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Currently the Wait for Install code ( bgp_suppress_fib ) does
not properly handle two states from zebra: ROUTE_INSTALL_FAILED
and BETTER_ADMIN_DISTANCE_WON. Pre this change the WFI code
would just never notify our peers about a route install failure
but more is needed. In the ROUTE_INSTALL_FAILED and the
BETTER_ADMIN_DISTANCE_WON we need to notify our peers with
a withdrawal about the route, else we will continue to
draw traffic to us when we cannot legally do so.
Why is this needed? In either case imagine that we've already
received a bgp route, installed it and sent to our peers.
In the Better admin distance won case, say a static route is installed
at this point in time we must stop advertising the route through
us since we are not installed. As such a withdrawal must be sent.
In the ROUTE_INSTALL_FAILED case, the code was not properly handling
the situation where we have Route A, it was successfully installed
and then we received a update to Route A that was attempted to be
installed but failed. In this case we also need to send a withdrawal
Finally update the bgp_suppress_fib topotest to test both of these
situations.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Don't hide the LABELED_UNICAST safi when processing route
updates; map it where necessary (to use the UNICAST table
for instance).
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Move the "longer-prefixes" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "route-map" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "filter-list" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "prefix-list" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "community-list" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
There's no need to have different calls to bgp_show() when the only
difference is one argument that corresponds to a "void *" parameter.
Code duplication should be reduced to a minimum to avoid bugs like
the one fixed in the previous commit.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Like done in the other places (when "all" isn't used), pass the
actual alias name to bgp_show() instead of a null pointer.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Description:
Incorrect behavior during best path selection for the imported routes.
Imported routes are always treated as eBGP routes.
Change is intended for fixing the issues related to
bgp best path selection for leaked routes:
- FRR does ecmp for the imported routes,
even without any ecmp related config.
If the same prefix is imported from two different VRFs,
then we configure the route with ecmp even without
any ecmp related config.
- Locally imported routes are preferred over imported
eBGP routes.
If there is a local route and eBGP learned route
for the same prefix, if we import both the routes,
imported local route is selected as best path.
- Same route is imported from multiple tenant VRFs,
both imported routes point to the same VRF in nexthop.
- When the same route with same nexthop in two different VRFs
is imported from those two VRFs, route is not installed as ecmp,
even though we had ecmp config.
- During best path selection, while comparing the paths for imported routes,
we should correctly refer to the original route i.e. the ultimate path.
- When the same route is imported from multiple VRF,
use the correct VRF while installing in the FIB.
- When same route is imported from two different tenant VRFs,
while comparing bgp path info as part of bgp best path selection,
we should ideally also compare corresponding VRFs.
See-also: https://github.com/FRRouting/frr/files/7169555/FRR.and.Cisco.VRF-Lite.Behaviour.pdf
Co-authored-by: Santosh P K <sapk@vmware.com>
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
We should send only 16bytes next hop, no need for 32bytes, third party
next hops kinda for LLA does not work here.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When debugging issues for routes in multiple vrf's. It would
be extremely useful if the debug output had which vrf we
are acting on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
AFI/SAFI is handled in bgp_vty_find_and_parse_afi_safi_bgp() properly for
IPv4, but not for IPv6. Let's have it enabled for IPv6 by default.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
AFI/SAFI is handled in bgp_vty_find_and_parse_afi_safi_bgp() properly for
IPv4, but not for IPv6. Let's have it enabled for IPv6 by default.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
```
exit1-debian-9# show ip route 172.16.16.1/32
Routing entry for 172.16.16.1/32
Known via "bgp", distance 20, metric 0, best
Last update 00:00:28 ago
* 192.168.0.2, via eth1, weight 1
AS-Path : 65003
Communities : first 65001:2 65001:3
Large-Communities: 65001:1:1 65001:1:2 65001:1:3
Selection reason : First path received
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
`bgp` pointer always exists and is used before this function call.
Calling `free` in `json` in this context will also cause a
use-after-free crash.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
In some cases where bgp is at the mpls edge, where it has a BGP-LU
peer downstream but an IP peer upstream, it can advertise the
IMPLICIT_NULL label instead of a per-prefix label.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
The '... json detail' output is missing some data that's shown
via the 'route_vty_out_detail_header' function. Integrate the
json version of that function in the 'json detail' path.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
following command: show bgp l2vpn evpn rd all tags
does not append rd contexts one after the other
before:
dut-vm# show bgp l2vpn evpn rd all tags
Network Next Hop In tag/Out tag
Route Distinguisher: 65000:999
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1 Route Distinguisher: 65000:1000
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1
Displayed 2 out of 2 total prefixes
after:
dut-vm# show bgp l2vpn evpn rd all tags
Network Next Hop In tag/Out tag
Route Distinguisher: 65000:999
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1
Route Distinguisher: 65000:1000
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>