```
spine1-debian-11# sh ip bgp 100.100.100.101/32
BGP routing table entry for 100.100.100.101/32, version 21
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:invalid
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11# sh ip bgp 100.100.100.100/32
BGP routing table entry for 100.100.100.100/32, version 17
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:not-found
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before:
```
$ vtysh -c 'show bgp l2vpn evpn route detail json'
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
{
...
"numPrefix":4,
"numPaths":4 <<<<< four paths = four empty lines
}
```
Contain as much "empty lines" before the JSON string as the number
of paths displayed.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP SoO is a tag that is appended on BGP updates to allow a peer to mark
a particular peer as belonging to a particular site. In certain MPLS L3 VPN
configurations, the BGP AS-Path may not provide the granularity needed
prevent a loop in the control-plane. With this in mind, BGP SoO is designed
to fill this gap and prevent a routing loop that may occur.
If we configure for example, `neighbor soo 65000:1` at PEs, routes won't be
announced between CPEs if soo matches. This is especially needed when using
as-override or allowas-in.
Also, this is the automated way of the same behavior as configuring route-maps
for each peer like:
```
bgp extcommunity-list cpe permit soo 65000:1
!
route-map cpe permit 10
set extcommunity soo 65000:1
...
route-map cpe deny 10
match extcommunity cpe
route-map cpe permit 20
...
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before:
```
donatas-laptop# show bgp ipv4 unicast community-list testas
% testas is not a valid community-list name
donatas-laptop# con
donatas-laptop(config)# bgp community-list standard testas permit internet
donatas-laptop(config)# do show bgp ipv4 unicast community-list testas
donatas-laptop(config)#
```
`is not a valid community-list name` is a misleading warning message.
Doing the same for filter-list, access-list, prefix-list, route-map.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we have conditional advertisement enabled, and conditionally withdrew
some prefixes, and then we do a 'clear bgp', those routes were getting
advertised again, and then withdrawn the next time the conditional
advertisement scanner executed.
When we go to advertise check the prefix against the conditional
advertisement status so we don't do that.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.
Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes
2) we check mp_nexthop_len when choosing the nexthop to use for nht
3) we check mp_nexthop_len when choosing the nexthop to send to zebra
4) we check mp_nexthop_len when picking the nexthop to shown by vtysh
Reported-by: Binon Gorbutt <binon@aervivo.com>
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The same as with prefix-list/route-maps/etc.
```
donatas-pc# show ip access-list spine
ZEBRA:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
BGP:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
PIM:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
BABELD:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
donatas-pc# show bgp ipv4 unicast access-list
ACCESSLIST_NAME Access-list name
spine
donatas-pc# show bgp ipv4 unicast access-list spine
BGP table version is 9, local router ID is 172.17.0.3, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 200.200.200.200/32
enp3s0 0 0 65000 3456 ?
Displayed 1 routes and 10 total paths
donatas-pc#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Just convert all uses of thread_cancel to THREAD_OFF. Additionally
use THREAD_ARG instead of t->arg to get the arguement. Individual
files should never be accessing thread private data like this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A new command is available under SAFI_MPLS_VPN:
With this command, the BGP vpnvx prefixes received are
not kept, if there are no VRF interested in importing
those vpn entries.
A soft refresh is performed if there is a change of
configuration: retain cmd, vrf import settings, or
route-map change.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RFC 9234 mandates that role rules apply only to IPv4/IPv6 unicast bgp
sessions. If the OTC attribute appears in other sessions, it will remain
untouched.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
When using json output for `show bgp statistics json` gather the
number of prefixes of each prefix Length.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Start using mpls_lse_encode/mpls_lse_decode, that is endian-aware, because
we always use host-byte order, should use network-byte.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
in bgp_nlri_parse_ip there is a `sanity` check to ensure
that the prefix length as specified by the packet
will fit inside of a `struct prefix` correctly. The problem
here of course is that this is only v4 / v6 unicast/multicast
parsing and the bytes will never be more than 16, but we are copying
into a part of the struct prefix that is only 16 bytes, but with
this check the length may be up to 47 bytes( but not really possible ).
Limit the size check to at most 16 bytes (since we are only handling
v4 or v6 addresses here )
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The maxpaths same_clusterlen value was a uint16_t
with a single bit being used. No other values are
being stored. Let's remove the bitfield and simplify
to a bool.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The logic to unlock dest if iteration completed without iterating the
entire node was flawed. Specifically, if iteration terminated due to
`gr_deferred == 0` then the node would not get unlocked.
This change takes into account the fact that dest will be NULL only in
the case when the entire table was iterated and all nodes were already
unlocked. In any other case, it needs to be unlocked.
Signed-off-by: Carl Baldwin <carl@ecbaldwin.net>
Before:
```
spine1-debian-11(config-route-map)# bgp community alias 65001:65001 test1
spine1-debian-11(config)# route-map rm permit 10
spine1-debian-11(config-route-map)# set community 65001:65001
% Malformed communities attribute
```
After:
```
spine1-debian-11(config)# bgp community alias 65001:65001 test1
spine1-debian-11(config)# route-map rm permit 10
spine1-debian-11(config-route-map)# set community 65001:65001
spine1-debian-11(config-route-map)#
```
Same for large-communities.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When `all` is specified BGP pointer is always NULL, we need to iterate over
all instances separately.
```
Received signal 11 at 1648199394 (si_addr 0x30, PC 0x562e96597090); aborting...
/usr/local/lib/libfrr.so.0(zlog_backtrace_sigsafe+0x5e) [0x7f378a57ff6e]
/usr/local/lib/libfrr.so.0(zlog_signal+0xe6) [0x7f378a580146]
/usr/local/lib/libfrr.so.0(+0xcd4c2) [0x7f378a5aa4c2]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f378a33e140]
/usr/lib/frr/bgpd(bgp_afi_safi_peer_exists+0) [0x562e96597090]
/usr/lib/frr/bgpd(+0x15c3b8) [0x562e9654a3b8]
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Commit: ea47320b1d
Modified the bgp_clear_stale_route function to have
better indentation, but in the process changed some
`continue;` statements to `break;` which modified
the looping and caused stale paths to not always be
removed upon an update.
To reproduce: A ---- B, setup with addpath and GR
One side has a prefix with nhop1 and nhop2, kill one
side and then resend the same prefix with nhop3,
paths nhop1 and 2 become stale and never removed.
Code inspection clearly shows that that `continue`
statements became `break` statements causing the
loop over all paths to stop prematurely.
The fix is to change the break back to continue
statements so the loop can continue instead of
stopping.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This patch adds transpostion_offset and transposition_len to bgp_sid_info,
and transposes SID only at bgp_zebra_announce.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
Add a 15 minute warning to the logging system when
bgp policy is not setup properly. Operators keep asking
about the missing policy( on upgrade typically ). Let's
try to give them a bit more of a hint when something is
going wrong as that they are clearly missing the other
various places FRR tells them about it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp_attr_undup does the same thing as bgp_attr_flush – frees the
temporary data that might be allocated when applying a route-map. There
is no need to have two separate functions for that.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
`struct prefix p` was declared inside an if statement
where we assign the address of to a pointer that is
then passed to a sub function. This will eventually
leave us in a bad state.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In situations where remove-private-AS is configured for eBGP peers
residing in a private ASN, the peer's ASN was not being retained
in the AS-Path which can allow loops to occur. This was addressed
in a prior commit but it only addressed cases where the "replace-AS"
keyword was configured.
This commit ensures we retain the peer's ASN when using
"remove-private-AS" for eBGP peers in a private ASN regardless of other
keywords.
Setup:
=========
router bgp 4200000002
neighbor enp1s0 interface v6only remote-as external
neighbor enp6s0 interface v6only remote-as external
!
address-family ipv4 unicast
neighbor enp6s0 remove-private-AS
exit-address-family
ub18# show ip bgp sum | include 420000
BGP router identifier 100.64.0.111, local AS number 4200000002 vrf-id 0 <<<<< local asn 4200000002
ub20(enp1s0) 4 4200000001 22 22 0 0 0 00:00:57 1 1
ub20(enp6s0) 4 4200000001 21 22 0 0 0 00:00:57 0 1 <<<< peer asn 4200000001
ub18# show ip bgp | include 0.2
Default local pref 100, local AS 4200000002
*> 100.64.0.2/32 enp1s0 0 0 4200000001 4200000004 4200000005 4200000001 i
Before ("remote-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 i <<<<< empty as-path, no way to prevent loop
After ("remote-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 4200000001 4200000001 i <<<< retain peer's asn, breaks loop
Ticket: 2857047
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Abstract:
- The command "neighbor PEER maximum-prefix-out NUMBER" cannot be applied
without clearing the BGP neighbor.
- Apply the maximum-prefix-out value as soon as it is modified without
clearing the neighbor.
subgroup_update_packet() and subgroup_withdraw_packet() respectively
manages the announcement and withdrawal BGP message to the peer.
subgrp->scount counter counts the number of sent prefixes.
Before the patch, the maximum out prefix limitation was applied in
subgroup_update_packet() in order that subgrp->scount never exceeds the
limit. Setting a limit inferior to the effective number of sent prefix
did not result in sending any withdrawal message to reduce the number of
sent prefixes. Without clearing the BGP neighbor, the limitation only
applied to the announcement of new prefixes when the limitation was
over.
With the patch, the limitation is checked in subgroup_announce_check().
The function is intended to say whether a prefix has to be announced in
regards to the prefix-list, route-map... Now when a maximum-prefix-out
value is changed/removed, the neighbor AFI/SAFI table is re-parsed in
the same way as for the application of route-map, prefix-lists...
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
BGP EVPN custom `union gw_addr` is basically the same thing as a common
`struct ipaddr` but it lacks the address family which is needed in some
cases.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This code is populating a temporary variable `add` instead of the attr.
Initially this variable was later copied to the attr but the copying was
erroneously deleted by 0a50c2481. Directly populate the attr to restore
the correct behavior.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Here we try to compare the new attr with the existing one but this call
compares the existing index with zero instead. attrhash_cmp already
compares indexes using overlay_index_same so this call is both wrong and
useless.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
```
exit1-debian-11# sh ip bgp 10.10.10.10/32
BGP routing table entry for 10.10.10.10/32, version 14
Paths: (1 available, best #1, table default)
Not advertised to any peer
65000, (stale)
192.168.0.2 from 192.168.0.2 (0.0.0.0)
Origin incomplete, metric 0, valid, external, best (First path received)
Last update: Wed Jan 19 17:13:51 2022
Time until Graceful Restart stale route deleted: 117
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
'show bgp ... neighbor [routes|received-routes]' both incorrectly
used a json key of 'advertisedRoutes'.
This corrects the key to be 'receivedRoutes' for commands where
the displayed routes were received, not advertised.
before:
unet> r3 show ip bgp neigh 10.2.30.2 received-routes json | include Routes
"advertisedRoutes":{
after:
ub18# show ip bgp neighbors enp1s0 received-routes json | include Routes
"receivedRoutes":{
ub18# show ip bgp neighbors enp1s0 advertised-routes json | include Routes
"advertisedRoutes":{
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The bgp_notify_conditional_adv_scanner function was/is looping
over all peers. And only matching on the passed in peer,
based upon the subgroup. As such we do not need to loop
over everything and just cut-to-the chase and just modify
the peer structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Unsuppress route part of the aggregation when route-map configuration
is removed before the aggregation itself.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Description:
Change is intended for fixing the issue related to
clearing of stale leaked routes:
- Whenever BGP goes down,
after bringing down tcp connection and renegotiating capabilities,
once we reestablish connection,
we are not handling clear of VRF leaked route in the bgp_clear_stale_route.
- While bgp is clearing stale routes,
we need to handle withdraw of routes for VRF route-leaking.
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
rfc7196 recommends:
In addition, BGP implementations have an internal constant, which we
will call the 'maximum penalty', and the current computed penalty may
not exceed it.
Router Maximum Penalty: The internal constant for the maximum
penalty value MUST be raised to at least 50,000.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Currently the Wait for Install code ( bgp_suppress_fib ) does
not properly handle two states from zebra: ROUTE_INSTALL_FAILED
and BETTER_ADMIN_DISTANCE_WON. Pre this change the WFI code
would just never notify our peers about a route install failure
but more is needed. In the ROUTE_INSTALL_FAILED and the
BETTER_ADMIN_DISTANCE_WON we need to notify our peers with
a withdrawal about the route, else we will continue to
draw traffic to us when we cannot legally do so.
Why is this needed? In either case imagine that we've already
received a bgp route, installed it and sent to our peers.
In the Better admin distance won case, say a static route is installed
at this point in time we must stop advertising the route through
us since we are not installed. As such a withdrawal must be sent.
In the ROUTE_INSTALL_FAILED case, the code was not properly handling
the situation where we have Route A, it was successfully installed
and then we received a update to Route A that was attempted to be
installed but failed. In this case we also need to send a withdrawal
Finally update the bgp_suppress_fib topotest to test both of these
situations.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Don't hide the LABELED_UNICAST safi when processing route
updates; map it where necessary (to use the UNICAST table
for instance).
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Move the "longer-prefixes" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "route-map" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "filter-list" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "prefix-list" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "community-list" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
There's no need to have different calls to bgp_show() when the only
difference is one argument that corresponds to a "void *" parameter.
Code duplication should be reduced to a minimum to avoid bugs like
the one fixed in the previous commit.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Like done in the other places (when "all" isn't used), pass the
actual alias name to bgp_show() instead of a null pointer.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Description:
Incorrect behavior during best path selection for the imported routes.
Imported routes are always treated as eBGP routes.
Change is intended for fixing the issues related to
bgp best path selection for leaked routes:
- FRR does ecmp for the imported routes,
even without any ecmp related config.
If the same prefix is imported from two different VRFs,
then we configure the route with ecmp even without
any ecmp related config.
- Locally imported routes are preferred over imported
eBGP routes.
If there is a local route and eBGP learned route
for the same prefix, if we import both the routes,
imported local route is selected as best path.
- Same route is imported from multiple tenant VRFs,
both imported routes point to the same VRF in nexthop.
- When the same route with same nexthop in two different VRFs
is imported from those two VRFs, route is not installed as ecmp,
even though we had ecmp config.
- During best path selection, while comparing the paths for imported routes,
we should correctly refer to the original route i.e. the ultimate path.
- When the same route is imported from multiple VRF,
use the correct VRF while installing in the FIB.
- When same route is imported from two different tenant VRFs,
while comparing bgp path info as part of bgp best path selection,
we should ideally also compare corresponding VRFs.
See-also: https://github.com/FRRouting/frr/files/7169555/FRR.and.Cisco.VRF-Lite.Behaviour.pdf
Co-authored-by: Santosh P K <sapk@vmware.com>
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
We should send only 16bytes next hop, no need for 32bytes, third party
next hops kinda for LLA does not work here.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When debugging issues for routes in multiple vrf's. It would
be extremely useful if the debug output had which vrf we
are acting on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
AFI/SAFI is handled in bgp_vty_find_and_parse_afi_safi_bgp() properly for
IPv4, but not for IPv6. Let's have it enabled for IPv6 by default.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
AFI/SAFI is handled in bgp_vty_find_and_parse_afi_safi_bgp() properly for
IPv4, but not for IPv6. Let's have it enabled for IPv6 by default.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
```
exit1-debian-9# show ip route 172.16.16.1/32
Routing entry for 172.16.16.1/32
Known via "bgp", distance 20, metric 0, best
Last update 00:00:28 ago
* 192.168.0.2, via eth1, weight 1
AS-Path : 65003
Communities : first 65001:2 65001:3
Large-Communities: 65001:1:1 65001:1:2 65001:1:3
Selection reason : First path received
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
`bgp` pointer always exists and is used before this function call.
Calling `free` in `json` in this context will also cause a
use-after-free crash.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
In some cases where bgp is at the mpls edge, where it has a BGP-LU
peer downstream but an IP peer upstream, it can advertise the
IMPLICIT_NULL label instead of a per-prefix label.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
The '... json detail' output is missing some data that's shown
via the 'route_vty_out_detail_header' function. Integrate the
json version of that function in the 'json detail' path.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
following command: show bgp l2vpn evpn rd all tags
does not append rd contexts one after the other
before:
dut-vm# show bgp l2vpn evpn rd all tags
Network Next Hop In tag/Out tag
Route Distinguisher: 65000:999
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1 Route Distinguisher: 65000:1000
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1
Displayed 2 out of 2 total prefixes
after:
dut-vm# show bgp l2vpn evpn rd all tags
Network Next Hop In tag/Out tag
Route Distinguisher: 65000:999
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1
Route Distinguisher: 65000:1000
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
There are places where we use route-maps using duplicated attributes and
neither intern nor flush them after the usage. If a route-map has set
rules for aspath/communities, they will be allocated and never freed.
We should always flush unneeded duplicated attributes.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
EVPN paths are maintained in per-ES list for efficient updates
(es→macip_global_path_list, es→macip_evi_path_list). VNI is also maintained
in path_extra for easy lookups. This (path_extra) VNI (which is always 0 for
global paths) was being displayed against the path and was mis-interpreted
as the BD.
To avoid that confusion I have removed the display.
Ticket: #2732605
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Description:
Change is intended for fixing the following issues related to vrf route leaking:
Routes with special nexthops i.e. blackhole/sink routes when imported,
are not programmed into the FIB and corresponding nexthop is set as 'inactive',
nexthop interface as 'unknown'.
While importing/leaking routes between VRFs, in case of special nexthop(ipv4/ipv6)
once bgp announces route(s) to zebra, nexthop type is incorrectly set as
NEXTHOP_TYPE_IPV6_IFINDEX/NEXTHOP_TYPE_IFINDEX
i.e. directly connected even though we are not able to resolve through an interface.
This leads to nexthop_active_check marking nexthop !NEXTHOP_FLAG_ACTIVE.
Unable to find the active nexthop(s), route is not programmed into the FIB.
Whenever BGP leaks routes, set the correct nexthop type, so that route gets resolved
and correctly programmed into the FIB, in the imported vrf.
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
if advertisement with SID structure Sub-Sub-TLV, we need to transpose
SID, so added transpose operation into bgp_update.
Signed-off-by: Ryoga Saito <contact@proelbtn.com>
This is to avoid breaking changes between existing deployments of
extended community for bandwidth encoding. By default FRR uses uint32
to encode bandwidth, which is not as the draft requires (IEEE floating-point).
This switch enables the required encoding per-peer.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
```
exit1-debian-9# show ip bgp large-community-list
(1-500) large-community-list number
LCOMMUNITY_LIST_NAME large-community-list name
large-testas
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
```
exit1-debian-9# show ip bgp community-list ?
(1-500) community-list number
COMMUNITY_LIST_NAME community-list name
testas
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When bgp receives the admin distance from a redistribution statement
let's store that distance for later usage.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When an EVPN prefix flaps too quickly such that the new advertisement
is received prior to the full processing of the prior withdraw, we may
get into a state where the route doesn't get imported properly into
MAC or IP VRFs. Ensure that we do the route import in such cases.
Suggested-by: Sri Mohana Singamsetty <msingamsetty@vmware.com>
Suggested-by: Ameya Dharkar <adharkar@vmware.com>
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
Add a bit of code to allow for auto-completion of the community
alias command when attempting to use it for show commands.
example:
eva(config)# bgp community alias 11:22 FOO
eva(config)# end
eva# show bgp ipv4 uni alias
ALIAS_NAME BGP community alias
FOO
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Some BGP updates received by BGP invite local router to
install a route through itself. The system will not do it, and
the route should be considered as not valid at the earliest.
This case is detected on the zebra, and this detection prevents
from trying to install this route to the local system. However,
the nexthop tracking mechanism is called, and acts as if the route
was valid, which is not the case.
By detecting in BGP that use case, we avoid installing the invalid
routes.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Force the processing of existing network configurations when VRF is
created, otherwise will be skipped in bgp_static_update().
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Add a terse option to show bgp summary to shorten output.
Do not show the following information about the BGP
instances: the number of RIB entries, the table version and the used memory.
The "terse" option can be used in combination with the "remote-as", "neighbor",
"failed" and "established" filters, and with the "wide" option as well.
Before patch:
ubuntu# show bgp summary remote-as 123456
IPv4 Unicast Summary (VRF default):
BGP router identifier X.X.X.X, local AS number XXX vrf-id 0
BGP table version 0
RIB entries 3, using 552 bytes of memory
Peers 5, using 3635 KiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.200.200.2 4 123456 81432 4 0 56092 0 00:00:13 572106 0 N/A
Displayed neighbors 1
Total number of neighbors 4
IPv6 Unicast Summary (VRF default):
BGP router identifier X.X.X.X, local AS number XXX vrf-id 0
BGP table version 0
RIB entries 3, using 552 bytes of memory
Peers 5, using 3635 KiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
% No matching neighbor
Total number of neighbors 5
After patch:
ubuntu# show bgp summary remote-as 123456 terse
IPv4 Unicast Summary (VRF default):
BGP router identifier X.X.X.X, local AS number XXX vrf-id 0
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.200.200.2 4 123456 81432 4 0 56092 0 00:00:13 572106 0 N/A
Displayed neighbors 1
Total number of neighbors 4
IPv6 Unicast Summary (VRF default):
BGP router identifier X.X.X.X, local AS number XXX vrf-id 1
% No matching neighbor
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
While installing this route in the EVPN table, make sure all the conditions
mentioned in the draft
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement-11 are
met.
Draft mentions following conditions:
- ESI and gateway IP cannot be both nonzero at the same time.
- ESI, gateway IP, RMAC and VNI label all cannot be 0 at the same time.
If the received EVPN RT-5 route does not meet these conditions, the route is
treated as withdraw.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Gateway IP overlay index is generated for EVPN RT-5 when following CLI is
configured.
router bgp 100 vrf vrf-blue
address-family l2vpn evpn
advertise ipv4 unicast gateway-ip
advertise ipv6 unicast gateway-ip
BGP nexthop of the VRF IP/IPv6 route is set as the gateway IP of the
corresponding EVPN RT-5
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
We are inconsistently using peer_establiahed(peer) with
sometimes using `peer->status == Established`. Just Convert
over to using the function for consistency.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP configuration changes that imply recomputing the BGP route table
(e.g. modifying route-maps, setting bgp graceful-shutdown) might be a
long time process depending on the size of the BGP table and the
route-map numbers and complexity. For example, setups with full
Internet routes take something like one minute to reprocess all the
prefixes when graceful-shutdown is configured. During this time, a
"show bgp commands" request on vtysh results in blocking the shell until
the soft reconfigure table task is over.
This patch splits bgp_soft_reconfig_table task into thread jobs of 25K
prefixes.
Some tests on a full Internet route setup show that after reconfiguring
route-maps or graceful-shutdown, vtysh is not stucked anymore. We are
now able to request commands like "show bgp summary" after 1 or 2
seconds instead of 30 to 60s.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
For EVPN routes, there is specific logic in place for path selection
surrounding MAC Mobility. For pure type-5 routes, if a route is
advertised with a MED, this is ignored since it ultimately falls inside
of the EVPN specific path selection logic, and ultimately selects the
lower IP address. This change ensures only type-2 routes fall into the
EVPN BGP path selection.
Signed-off-by: Neal Shrader <neal@digitalocean.com>
Show alias name instead of numerical value in `show bgp <prefix>. E.g.:
```
root@exit1-debian-9:~/frr# vtysh -c 'sh run' | grep 'bgp community alias'
bgp community alias 65001:123 community-1
bgp community alias 65001:123:1 lcommunity-1
root@exit1-debian-9:~/frr#
```
```
exit1-debian-9# sh ip bgp 172.16.16.1/32
BGP routing table entry for 172.16.16.1/32, version 21
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
65030
192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (172.16.16.1)
Origin incomplete, metric 0, valid, external, best (Neighbor IP)
Community: 65001:12 65001:13 community-1 65001:65534
Large Community: lcommunity-1 65001:123:2
Last update: Fri Apr 16 12:51:27 2021
exit1-debian-9#
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Absolutetly cosmetic change, but let it be consistent with other checks
for optional attributes.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
This is useful to go back in the past and check when was that prefix appeared,
changed, etc.
```
exit1-debian-9# show ip bgp 172.16.16.1/32
BGP routing table entry for 172.16.16.1/32, version 6
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
home-spine1.donatas.net(192.168.0.2) home-spine1.donatas.net(2a02:bbd::2)
65030
192.168.0.2 from home-spine1.donatas.net(2a02:bbd::2) (172.16.16.1)
Origin incomplete, metric 0, valid, external
Last update: Thu Apr 8 20:15:25 2021
65030
192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (172.16.16.1)
Origin incomplete, metric 0, valid, external, best (Neighbor IP)
Last update: Thu Apr 8 20:15:25 2021
exit1-debian-9#
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Just to be more informant, copying from Cisco.
```
exit1-debian-9# sh ip bgp
BGP table version is 4, local router ID is 192.168.100.1, vrf id 0
Default local pref 100, local AS 65534
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
N*> 10.0.2.0/24 0.0.0.0 0 32768 ?
N*> 192.168.0.0/24 0.0.0.0 0 32768 ?
N*> 192.168.10.0/24 0.0.0.0 0 32768 ?
N*> 192.168.100.1/32 0.0.0.0 0 32768 ?
Displayed 4 routes and 4 total paths
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
set_vpn_nexthop/no_set_vpn_nexthop were failing due to missing
declarations and unused variables.
This adds the missing declaration and removes unused variables.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
New and improved submission for this commit -- updated to accommodate
changes from 4027d19b0.
Adds support for 'rd all' matching for EVPN and L3VPN show commands.
Introduces evpn_show_route_rd_all_macip().
Cleans up some show commands to use SHOW_DISPLAY string constants.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
During Review it was suggested that appending rpki_
to curr_state and target_state would be better
variable names. Instead of going and fixing
3 or so commits up. Just do this one.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add the ability for the end operator to query the state of valid
or invalid or no information rpki prefix information.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When displaying data about the rpki state, use the
string `rpki validation-state` instead of `validation-state:`
to avoid confusion with `(valid)`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Setup a mh_info indirection in the path extra. This has been done to
avoid increasing evpn route's path size to add new (type based) pointers
in path_info_extra.
Ticket: CM-31398
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
There are two changes in this commit -
1. Maintain a list of global MAC-IP routes per-ES. This list is maintained
for quick processing on the following events -
a. When the first VTEP/PE becomes active in the ES-VRF, the L3 NHG is
activated and the route can be sent to zebra.
b. When there are no active PEs in the ES-VRF the L3 NHG is
de-activated and -
- If the ES is present in the VRF -
The route is not installed in zebra as there are no active PEs for
the ES-VRF
- If the ES is not present in the VRF -
The route is installed with a flat multi-path list i.e. without L3NHG.
This is to handle the case where there are no locally attached L2VNIs
on the ES (for that tenant VRF).
2. Reinstall VRF route when an ES is installed or uninstalled in a
tenant VRF (the global MAC-IP list in #1 is used for this purpose also).
If an ES is present in the VRF we use L3NHG to enable fast-failover of
routed traffic.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This new BGP configuration is akin to "bgp bestpath aspath
multipath-relax". When applied, paths learned from different peer types
will be eligible to be considered for multipath (ECMP). Paths from all
of eBGP, iBGP, and confederation peers may be included in multipaths
if they are otherwise equal cost.
This change preserves the existing bestpath behavior of step 10's result
being returned, not the result from steps 8 and 9, in the case where
both 8+9 and 10 determine a winner.
Signed-off-by: Joanne Mikkelson <jmmikkel@arista.com>
Adds support for 'rd all' matching for EVPN and L3VPN show commands.
Introduces evpn_show_route_rd_all_macip().
Cleanup some show commands to use SHOW_DISPLAY string constants.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
When dumping data about prefixes in bgp. Let's dump the
rpki validation state as well:
Output if rpki is turned on:
janelle# show rpki prefix 2003::/19
Prefix Prefix Length Origin-AS
2003:: 19 - 19 3320
janelle# show bgp ipv6 uni 2003::/19
BGP routing table entry for 2003::/19
Paths: (1 available, best #1, table default)
Not advertised to any peer
15096 6939 3320
::ffff:4113:867a from 65.19.134.122 (193.72.216.231)
(fe80::e063:daff:fe79:1dab) (used)
Origin IGP, valid, external, best (First path received), validation-state: valid
Last update: Sat Mar 6 09:20:51 2021
janelle# show rpki prefix 8.8.8.0/24
Prefix Prefix Length Origin-AS
janelle# show bgp ipv4 uni 8.8.8.0/24
BGP routing table entry for 8.8.8.0/24
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
100.99.229.142
15096 6939 15169
65.19.134.122 from 65.19.134.122 (193.72.216.231)
Origin IGP, valid, external, best (First path received), validation-state: not found
Last update: Sat Mar 6 09:21:25 2021
Example output when rpki is not configured:
eva# show bgp ipv4 uni 8.8.8.0/24
BGP routing table entry for 8.8.8.0/24
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
janelle(192.168.161.137)
64539 15096 6939 15169
192.168.161.137(janelle) from janelle(192.168.161.137) (192.168.44.1)
Origin IGP, valid, external, bestpath-from-AS 64539, best (First path received)
Last update: Sat Mar 6 09:33:51 2021
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
'show bgp l2vpn evpn statistics' was returning 0 for all stats
because bgp_table_stats_walker bailed out if afi != AFI_IP or AFI_IP6.
Add case condition to catch AFI_L2VPN.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
If we are filtering a route due to any of the filter reasons
we should not be setting the BGP_NODE_FIB_INSTALL_FIB_PENDING
flag. This is especially evident with say a loopback that
is covered by a network statement. When we receive the route
back from our peer we should not be setting the
BGP_NODE_FIB_INSTALL_PENDING flag on it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
'show bgp ipv[46] vpn neighbors ... advertised-routes' was displaying
empty output due to new command syntax using show_adj_routes() which
assumed each bgp_table was single-tier (not nested). This fixes that
assumption for safis with a two-tier bgp_table (SAFI_MPLS_VPN,
SAFI_ENCAP, and SAFI_EVPN).
Before:
ub18# show bgp ipv6 vpn neighbors 2001:db8:cafe::2 advertised-routes
ub18#
After:
ub20# show bgp ipv6 vpn neighbors 2001:db8:cafe::1 advertised-routes
BGP table version is 2, local router ID is 100.64.0.222, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 30:30
*> 2::2/128 :: 0 100 32768 i
*> 2::22/128 :: 0 100 32768 i
Route Distinguisher: 33:33
*> 2::2/128 :: 0 100 32768 i
*> 2::22/128 :: 0 100 32768 i
Total number of prefixes 4
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
`same_attr` has been computed and `hook_call(bgp_process)` (calling
BMP module) would not change it. We could reuse the value to filter
same attribute updates, avoiding an extra comparison.
Signed-off-by: zyxwvu Shi <i@shiyc.cn>
Already not necessary, because if BGP aggregator AS attribute is with
value of 0, then the attribute is already discarded at early processing.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Description:
clear ip bgp dampening was not triggering the route
calculation for the prefix, Due to this prefix are not install in
RIB(Zebra) and not adv to neighbor
Problem Description/Summary :
clear ip bgp dampening was not triggering the route
calculation for the prefix, Due to this prefix are not install in
RIB(Zebra) and not adv to neighbor
Fix: When clear ip bgp dampening, route are put for route-calculation as
that it is install in the Zebra and adv to neighbor.
Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>