Just convert all uses of thread_cancel to THREAD_OFF. Additionally
use THREAD_ARG instead of t->arg to get the arguement. Individual
files should never be accessing thread private data like this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A new command is available under SAFI_MPLS_VPN:
With this command, the BGP vpnvx prefixes received are
not kept, if there are no VRF interested in importing
those vpn entries.
A soft refresh is performed if there is a change of
configuration: retain cmd, vrf import settings, or
route-map change.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RFC 9234 mandates that role rules apply only to IPv4/IPv6 unicast bgp
sessions. If the OTC attribute appears in other sessions, it will remain
untouched.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
When using json output for `show bgp statistics json` gather the
number of prefixes of each prefix Length.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Start using mpls_lse_encode/mpls_lse_decode, that is endian-aware, because
we always use host-byte order, should use network-byte.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
in bgp_nlri_parse_ip there is a `sanity` check to ensure
that the prefix length as specified by the packet
will fit inside of a `struct prefix` correctly. The problem
here of course is that this is only v4 / v6 unicast/multicast
parsing and the bytes will never be more than 16, but we are copying
into a part of the struct prefix that is only 16 bytes, but with
this check the length may be up to 47 bytes( but not really possible ).
Limit the size check to at most 16 bytes (since we are only handling
v4 or v6 addresses here )
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The maxpaths same_clusterlen value was a uint16_t
with a single bit being used. No other values are
being stored. Let's remove the bitfield and simplify
to a bool.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The logic to unlock dest if iteration completed without iterating the
entire node was flawed. Specifically, if iteration terminated due to
`gr_deferred == 0` then the node would not get unlocked.
This change takes into account the fact that dest will be NULL only in
the case when the entire table was iterated and all nodes were already
unlocked. In any other case, it needs to be unlocked.
Signed-off-by: Carl Baldwin <carl@ecbaldwin.net>
Before:
```
spine1-debian-11(config-route-map)# bgp community alias 65001:65001 test1
spine1-debian-11(config)# route-map rm permit 10
spine1-debian-11(config-route-map)# set community 65001:65001
% Malformed communities attribute
```
After:
```
spine1-debian-11(config)# bgp community alias 65001:65001 test1
spine1-debian-11(config)# route-map rm permit 10
spine1-debian-11(config-route-map)# set community 65001:65001
spine1-debian-11(config-route-map)#
```
Same for large-communities.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When `all` is specified BGP pointer is always NULL, we need to iterate over
all instances separately.
```
Received signal 11 at 1648199394 (si_addr 0x30, PC 0x562e96597090); aborting...
/usr/local/lib/libfrr.so.0(zlog_backtrace_sigsafe+0x5e) [0x7f378a57ff6e]
/usr/local/lib/libfrr.so.0(zlog_signal+0xe6) [0x7f378a580146]
/usr/local/lib/libfrr.so.0(+0xcd4c2) [0x7f378a5aa4c2]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f378a33e140]
/usr/lib/frr/bgpd(bgp_afi_safi_peer_exists+0) [0x562e96597090]
/usr/lib/frr/bgpd(+0x15c3b8) [0x562e9654a3b8]
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Commit: ea47320b1d
Modified the bgp_clear_stale_route function to have
better indentation, but in the process changed some
`continue;` statements to `break;` which modified
the looping and caused stale paths to not always be
removed upon an update.
To reproduce: A ---- B, setup with addpath and GR
One side has a prefix with nhop1 and nhop2, kill one
side and then resend the same prefix with nhop3,
paths nhop1 and 2 become stale and never removed.
Code inspection clearly shows that that `continue`
statements became `break` statements causing the
loop over all paths to stop prematurely.
The fix is to change the break back to continue
statements so the loop can continue instead of
stopping.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This patch adds transpostion_offset and transposition_len to bgp_sid_info,
and transposes SID only at bgp_zebra_announce.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
Add a 15 minute warning to the logging system when
bgp policy is not setup properly. Operators keep asking
about the missing policy( on upgrade typically ). Let's
try to give them a bit more of a hint when something is
going wrong as that they are clearly missing the other
various places FRR tells them about it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp_attr_undup does the same thing as bgp_attr_flush – frees the
temporary data that might be allocated when applying a route-map. There
is no need to have two separate functions for that.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
`struct prefix p` was declared inside an if statement
where we assign the address of to a pointer that is
then passed to a sub function. This will eventually
leave us in a bad state.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In situations where remove-private-AS is configured for eBGP peers
residing in a private ASN, the peer's ASN was not being retained
in the AS-Path which can allow loops to occur. This was addressed
in a prior commit but it only addressed cases where the "replace-AS"
keyword was configured.
This commit ensures we retain the peer's ASN when using
"remove-private-AS" for eBGP peers in a private ASN regardless of other
keywords.
Setup:
=========
router bgp 4200000002
neighbor enp1s0 interface v6only remote-as external
neighbor enp6s0 interface v6only remote-as external
!
address-family ipv4 unicast
neighbor enp6s0 remove-private-AS
exit-address-family
ub18# show ip bgp sum | include 420000
BGP router identifier 100.64.0.111, local AS number 4200000002 vrf-id 0 <<<<< local asn 4200000002
ub20(enp1s0) 4 4200000001 22 22 0 0 0 00:00:57 1 1
ub20(enp6s0) 4 4200000001 21 22 0 0 0 00:00:57 0 1 <<<< peer asn 4200000001
ub18# show ip bgp | include 0.2
Default local pref 100, local AS 4200000002
*> 100.64.0.2/32 enp1s0 0 0 4200000001 4200000004 4200000005 4200000001 i
Before ("remote-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 i <<<<< empty as-path, no way to prevent loop
After ("remote-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 4200000001 4200000001 i <<<< retain peer's asn, breaks loop
Ticket: 2857047
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Abstract:
- The command "neighbor PEER maximum-prefix-out NUMBER" cannot be applied
without clearing the BGP neighbor.
- Apply the maximum-prefix-out value as soon as it is modified without
clearing the neighbor.
subgroup_update_packet() and subgroup_withdraw_packet() respectively
manages the announcement and withdrawal BGP message to the peer.
subgrp->scount counter counts the number of sent prefixes.
Before the patch, the maximum out prefix limitation was applied in
subgroup_update_packet() in order that subgrp->scount never exceeds the
limit. Setting a limit inferior to the effective number of sent prefix
did not result in sending any withdrawal message to reduce the number of
sent prefixes. Without clearing the BGP neighbor, the limitation only
applied to the announcement of new prefixes when the limitation was
over.
With the patch, the limitation is checked in subgroup_announce_check().
The function is intended to say whether a prefix has to be announced in
regards to the prefix-list, route-map... Now when a maximum-prefix-out
value is changed/removed, the neighbor AFI/SAFI table is re-parsed in
the same way as for the application of route-map, prefix-lists...
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
BGP EVPN custom `union gw_addr` is basically the same thing as a common
`struct ipaddr` but it lacks the address family which is needed in some
cases.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This code is populating a temporary variable `add` instead of the attr.
Initially this variable was later copied to the attr but the copying was
erroneously deleted by 0a50c2481. Directly populate the attr to restore
the correct behavior.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Here we try to compare the new attr with the existing one but this call
compares the existing index with zero instead. attrhash_cmp already
compares indexes using overlay_index_same so this call is both wrong and
useless.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
```
exit1-debian-11# sh ip bgp 10.10.10.10/32
BGP routing table entry for 10.10.10.10/32, version 14
Paths: (1 available, best #1, table default)
Not advertised to any peer
65000, (stale)
192.168.0.2 from 192.168.0.2 (0.0.0.0)
Origin incomplete, metric 0, valid, external, best (First path received)
Last update: Wed Jan 19 17:13:51 2022
Time until Graceful Restart stale route deleted: 117
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
'show bgp ... neighbor [routes|received-routes]' both incorrectly
used a json key of 'advertisedRoutes'.
This corrects the key to be 'receivedRoutes' for commands where
the displayed routes were received, not advertised.
before:
unet> r3 show ip bgp neigh 10.2.30.2 received-routes json | include Routes
"advertisedRoutes":{
after:
ub18# show ip bgp neighbors enp1s0 received-routes json | include Routes
"receivedRoutes":{
ub18# show ip bgp neighbors enp1s0 advertised-routes json | include Routes
"advertisedRoutes":{
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The bgp_notify_conditional_adv_scanner function was/is looping
over all peers. And only matching on the passed in peer,
based upon the subgroup. As such we do not need to loop
over everything and just cut-to-the chase and just modify
the peer structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Unsuppress route part of the aggregation when route-map configuration
is removed before the aggregation itself.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Description:
Change is intended for fixing the issue related to
clearing of stale leaked routes:
- Whenever BGP goes down,
after bringing down tcp connection and renegotiating capabilities,
once we reestablish connection,
we are not handling clear of VRF leaked route in the bgp_clear_stale_route.
- While bgp is clearing stale routes,
we need to handle withdraw of routes for VRF route-leaking.
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
rfc7196 recommends:
In addition, BGP implementations have an internal constant, which we
will call the 'maximum penalty', and the current computed penalty may
not exceed it.
Router Maximum Penalty: The internal constant for the maximum
penalty value MUST be raised to at least 50,000.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Currently the Wait for Install code ( bgp_suppress_fib ) does
not properly handle two states from zebra: ROUTE_INSTALL_FAILED
and BETTER_ADMIN_DISTANCE_WON. Pre this change the WFI code
would just never notify our peers about a route install failure
but more is needed. In the ROUTE_INSTALL_FAILED and the
BETTER_ADMIN_DISTANCE_WON we need to notify our peers with
a withdrawal about the route, else we will continue to
draw traffic to us when we cannot legally do so.
Why is this needed? In either case imagine that we've already
received a bgp route, installed it and sent to our peers.
In the Better admin distance won case, say a static route is installed
at this point in time we must stop advertising the route through
us since we are not installed. As such a withdrawal must be sent.
In the ROUTE_INSTALL_FAILED case, the code was not properly handling
the situation where we have Route A, it was successfully installed
and then we received a update to Route A that was attempted to be
installed but failed. In this case we also need to send a withdrawal
Finally update the bgp_suppress_fib topotest to test both of these
situations.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Don't hide the LABELED_UNICAST safi when processing route
updates; map it where necessary (to use the UNICAST table
for instance).
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Move the "longer-prefixes" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "route-map" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "filter-list" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "prefix-list" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Move the "community-list" option from show_ip_bgp_cmd to
show_ip_bgp_json_cmd so that is has access to JSON output.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
There's no need to have different calls to bgp_show() when the only
difference is one argument that corresponds to a "void *" parameter.
Code duplication should be reduced to a minimum to avoid bugs like
the one fixed in the previous commit.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Like done in the other places (when "all" isn't used), pass the
actual alias name to bgp_show() instead of a null pointer.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Description:
Incorrect behavior during best path selection for the imported routes.
Imported routes are always treated as eBGP routes.
Change is intended for fixing the issues related to
bgp best path selection for leaked routes:
- FRR does ecmp for the imported routes,
even without any ecmp related config.
If the same prefix is imported from two different VRFs,
then we configure the route with ecmp even without
any ecmp related config.
- Locally imported routes are preferred over imported
eBGP routes.
If there is a local route and eBGP learned route
for the same prefix, if we import both the routes,
imported local route is selected as best path.
- Same route is imported from multiple tenant VRFs,
both imported routes point to the same VRF in nexthop.
- When the same route with same nexthop in two different VRFs
is imported from those two VRFs, route is not installed as ecmp,
even though we had ecmp config.
- During best path selection, while comparing the paths for imported routes,
we should correctly refer to the original route i.e. the ultimate path.
- When the same route is imported from multiple VRF,
use the correct VRF while installing in the FIB.
- When same route is imported from two different tenant VRFs,
while comparing bgp path info as part of bgp best path selection,
we should ideally also compare corresponding VRFs.
See-also: https://github.com/FRRouting/frr/files/7169555/FRR.and.Cisco.VRF-Lite.Behaviour.pdf
Co-authored-by: Santosh P K <sapk@vmware.com>
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
We should send only 16bytes next hop, no need for 32bytes, third party
next hops kinda for LLA does not work here.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When debugging issues for routes in multiple vrf's. It would
be extremely useful if the debug output had which vrf we
are acting on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
AFI/SAFI is handled in bgp_vty_find_and_parse_afi_safi_bgp() properly for
IPv4, but not for IPv6. Let's have it enabled for IPv6 by default.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
AFI/SAFI is handled in bgp_vty_find_and_parse_afi_safi_bgp() properly for
IPv4, but not for IPv6. Let's have it enabled for IPv6 by default.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
```
exit1-debian-9# show ip route 172.16.16.1/32
Routing entry for 172.16.16.1/32
Known via "bgp", distance 20, metric 0, best
Last update 00:00:28 ago
* 192.168.0.2, via eth1, weight 1
AS-Path : 65003
Communities : first 65001:2 65001:3
Large-Communities: 65001:1:1 65001:1:2 65001:1:3
Selection reason : First path received
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
`bgp` pointer always exists and is used before this function call.
Calling `free` in `json` in this context will also cause a
use-after-free crash.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
In some cases where bgp is at the mpls edge, where it has a BGP-LU
peer downstream but an IP peer upstream, it can advertise the
IMPLICIT_NULL label instead of a per-prefix label.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
The '... json detail' output is missing some data that's shown
via the 'route_vty_out_detail_header' function. Integrate the
json version of that function in the 'json detail' path.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
following command: show bgp l2vpn evpn rd all tags
does not append rd contexts one after the other
before:
dut-vm# show bgp l2vpn evpn rd all tags
Network Next Hop In tag/Out tag
Route Distinguisher: 65000:999
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1 Route Distinguisher: 65000:1000
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1
Displayed 2 out of 2 total prefixes
after:
dut-vm# show bgp l2vpn evpn rd all tags
Network Next Hop In tag/Out tag
Route Distinguisher: 65000:999
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1
Route Distinguisher: 65000:1000
*> [5]:[0]:[24]:[10.40.1.0]
10.209.36.1
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
There are places where we use route-maps using duplicated attributes and
neither intern nor flush them after the usage. If a route-map has set
rules for aspath/communities, they will be allocated and never freed.
We should always flush unneeded duplicated attributes.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
EVPN paths are maintained in per-ES list for efficient updates
(es→macip_global_path_list, es→macip_evi_path_list). VNI is also maintained
in path_extra for easy lookups. This (path_extra) VNI (which is always 0 for
global paths) was being displayed against the path and was mis-interpreted
as the BD.
To avoid that confusion I have removed the display.
Ticket: #2732605
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Description:
Change is intended for fixing the following issues related to vrf route leaking:
Routes with special nexthops i.e. blackhole/sink routes when imported,
are not programmed into the FIB and corresponding nexthop is set as 'inactive',
nexthop interface as 'unknown'.
While importing/leaking routes between VRFs, in case of special nexthop(ipv4/ipv6)
once bgp announces route(s) to zebra, nexthop type is incorrectly set as
NEXTHOP_TYPE_IPV6_IFINDEX/NEXTHOP_TYPE_IFINDEX
i.e. directly connected even though we are not able to resolve through an interface.
This leads to nexthop_active_check marking nexthop !NEXTHOP_FLAG_ACTIVE.
Unable to find the active nexthop(s), route is not programmed into the FIB.
Whenever BGP leaks routes, set the correct nexthop type, so that route gets resolved
and correctly programmed into the FIB, in the imported vrf.
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
if advertisement with SID structure Sub-Sub-TLV, we need to transpose
SID, so added transpose operation into bgp_update.
Signed-off-by: Ryoga Saito <contact@proelbtn.com>
This is to avoid breaking changes between existing deployments of
extended community for bandwidth encoding. By default FRR uses uint32
to encode bandwidth, which is not as the draft requires (IEEE floating-point).
This switch enables the required encoding per-peer.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
```
exit1-debian-9# show ip bgp large-community-list
(1-500) large-community-list number
LCOMMUNITY_LIST_NAME large-community-list name
large-testas
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
```
exit1-debian-9# show ip bgp community-list ?
(1-500) community-list number
COMMUNITY_LIST_NAME community-list name
testas
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>