RFC4364 describes peerings between multiple AS domains, to ease
the continuity of VPN services across multiple SPs. This commit
implements a sub-set of IETF option b) described in chapter 10 b.
The ASBR to ASBR approach is taken, with an EBGP peering between
the two routers. The EBGP peering must be directly connected to
the outgoing interface used. In those conditions, the next hop
is directly connected, and there is no need to have a transport
label to convey the VPN label. A new vty command is added on a
per interface basis:
This command if enabled, will permit to convey BGP VPN labels
without any transport labels (i.e. with implicit-null label).
restriction:
this command is used only for EBGP directly connected peerings.
Other use cases are not covered.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
```
spine1-debian-11# sh ip bgp 100.100.100.101/32
BGP routing table entry for 100.100.100.101/32, version 21
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:invalid
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11# sh ip bgp 100.100.100.100/32
BGP routing table entry for 100.100.100.100/32, version 17
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:not-found
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before:
```
$ vtysh -c 'show bgp l2vpn evpn route detail json'
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
<<<<<<<<<<<<<<<<<<<< empty line
{
...
"numPrefix":4,
"numPaths":4 <<<<< four paths = four empty lines
}
```
Contain as much "empty lines" before the JSON string as the number
of paths displayed.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP SoO is a tag that is appended on BGP updates to allow a peer to mark
a particular peer as belonging to a particular site. In certain MPLS L3 VPN
configurations, the BGP AS-Path may not provide the granularity needed
prevent a loop in the control-plane. With this in mind, BGP SoO is designed
to fill this gap and prevent a routing loop that may occur.
If we configure for example, `neighbor soo 65000:1` at PEs, routes won't be
announced between CPEs if soo matches. This is especially needed when using
as-override or allowas-in.
Also, this is the automated way of the same behavior as configuring route-maps
for each peer like:
```
bgp extcommunity-list cpe permit soo 65000:1
!
route-map cpe permit 10
set extcommunity soo 65000:1
...
route-map cpe deny 10
match extcommunity cpe
route-map cpe permit 20
...
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before:
```
donatas-laptop# show bgp ipv4 unicast community-list testas
% testas is not a valid community-list name
donatas-laptop# con
donatas-laptop(config)# bgp community-list standard testas permit internet
donatas-laptop(config)# do show bgp ipv4 unicast community-list testas
donatas-laptop(config)#
```
`is not a valid community-list name` is a misleading warning message.
Doing the same for filter-list, access-list, prefix-list, route-map.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we have conditional advertisement enabled, and conditionally withdrew
some prefixes, and then we do a 'clear bgp', those routes were getting
advertised again, and then withdrawn the next time the conditional
advertisement scanner executed.
When we go to advertise check the prefix against the conditional
advertisement status so we don't do that.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.
Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes
2) we check mp_nexthop_len when choosing the nexthop to use for nht
3) we check mp_nexthop_len when choosing the nexthop to send to zebra
4) we check mp_nexthop_len when picking the nexthop to shown by vtysh
Reported-by: Binon Gorbutt <binon@aervivo.com>
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The same as with prefix-list/route-maps/etc.
```
donatas-pc# show ip access-list spine
ZEBRA:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
BGP:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
PIM:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
BABELD:
Zebra IP access list spine
seq 5 permit 200.200.200.200/32
donatas-pc# show bgp ipv4 unicast access-list
ACCESSLIST_NAME Access-list name
spine
donatas-pc# show bgp ipv4 unicast access-list spine
BGP table version is 9, local router ID is 172.17.0.3, vrf id 0
Default local pref 100, local AS 1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 200.200.200.200/32
enp3s0 0 0 65000 3456 ?
Displayed 1 routes and 10 total paths
donatas-pc#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Just convert all uses of thread_cancel to THREAD_OFF. Additionally
use THREAD_ARG instead of t->arg to get the arguement. Individual
files should never be accessing thread private data like this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A new command is available under SAFI_MPLS_VPN:
With this command, the BGP vpnvx prefixes received are
not kept, if there are no VRF interested in importing
those vpn entries.
A soft refresh is performed if there is a change of
configuration: retain cmd, vrf import settings, or
route-map change.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RFC 9234 mandates that role rules apply only to IPv4/IPv6 unicast bgp
sessions. If the OTC attribute appears in other sessions, it will remain
untouched.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
When using json output for `show bgp statistics json` gather the
number of prefixes of each prefix Length.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Start using mpls_lse_encode/mpls_lse_decode, that is endian-aware, because
we always use host-byte order, should use network-byte.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
in bgp_nlri_parse_ip there is a `sanity` check to ensure
that the prefix length as specified by the packet
will fit inside of a `struct prefix` correctly. The problem
here of course is that this is only v4 / v6 unicast/multicast
parsing and the bytes will never be more than 16, but we are copying
into a part of the struct prefix that is only 16 bytes, but with
this check the length may be up to 47 bytes( but not really possible ).
Limit the size check to at most 16 bytes (since we are only handling
v4 or v6 addresses here )
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The maxpaths same_clusterlen value was a uint16_t
with a single bit being used. No other values are
being stored. Let's remove the bitfield and simplify
to a bool.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The logic to unlock dest if iteration completed without iterating the
entire node was flawed. Specifically, if iteration terminated due to
`gr_deferred == 0` then the node would not get unlocked.
This change takes into account the fact that dest will be NULL only in
the case when the entire table was iterated and all nodes were already
unlocked. In any other case, it needs to be unlocked.
Signed-off-by: Carl Baldwin <carl@ecbaldwin.net>
Before:
```
spine1-debian-11(config-route-map)# bgp community alias 65001:65001 test1
spine1-debian-11(config)# route-map rm permit 10
spine1-debian-11(config-route-map)# set community 65001:65001
% Malformed communities attribute
```
After:
```
spine1-debian-11(config)# bgp community alias 65001:65001 test1
spine1-debian-11(config)# route-map rm permit 10
spine1-debian-11(config-route-map)# set community 65001:65001
spine1-debian-11(config-route-map)#
```
Same for large-communities.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When `all` is specified BGP pointer is always NULL, we need to iterate over
all instances separately.
```
Received signal 11 at 1648199394 (si_addr 0x30, PC 0x562e96597090); aborting...
/usr/local/lib/libfrr.so.0(zlog_backtrace_sigsafe+0x5e) [0x7f378a57ff6e]
/usr/local/lib/libfrr.so.0(zlog_signal+0xe6) [0x7f378a580146]
/usr/local/lib/libfrr.so.0(+0xcd4c2) [0x7f378a5aa4c2]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f378a33e140]
/usr/lib/frr/bgpd(bgp_afi_safi_peer_exists+0) [0x562e96597090]
/usr/lib/frr/bgpd(+0x15c3b8) [0x562e9654a3b8]
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Commit: ea47320b1d
Modified the bgp_clear_stale_route function to have
better indentation, but in the process changed some
`continue;` statements to `break;` which modified
the looping and caused stale paths to not always be
removed upon an update.
To reproduce: A ---- B, setup with addpath and GR
One side has a prefix with nhop1 and nhop2, kill one
side and then resend the same prefix with nhop3,
paths nhop1 and 2 become stale and never removed.
Code inspection clearly shows that that `continue`
statements became `break` statements causing the
loop over all paths to stop prematurely.
The fix is to change the break back to continue
statements so the loop can continue instead of
stopping.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This patch adds transpostion_offset and transposition_len to bgp_sid_info,
and transposes SID only at bgp_zebra_announce.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
Add a 15 minute warning to the logging system when
bgp policy is not setup properly. Operators keep asking
about the missing policy( on upgrade typically ). Let's
try to give them a bit more of a hint when something is
going wrong as that they are clearly missing the other
various places FRR tells them about it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp_attr_undup does the same thing as bgp_attr_flush – frees the
temporary data that might be allocated when applying a route-map. There
is no need to have two separate functions for that.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
`struct prefix p` was declared inside an if statement
where we assign the address of to a pointer that is
then passed to a sub function. This will eventually
leave us in a bad state.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In situations where remove-private-AS is configured for eBGP peers
residing in a private ASN, the peer's ASN was not being retained
in the AS-Path which can allow loops to occur. This was addressed
in a prior commit but it only addressed cases where the "replace-AS"
keyword was configured.
This commit ensures we retain the peer's ASN when using
"remove-private-AS" for eBGP peers in a private ASN regardless of other
keywords.
Setup:
=========
router bgp 4200000002
neighbor enp1s0 interface v6only remote-as external
neighbor enp6s0 interface v6only remote-as external
!
address-family ipv4 unicast
neighbor enp6s0 remove-private-AS
exit-address-family
ub18# show ip bgp sum | include 420000
BGP router identifier 100.64.0.111, local AS number 4200000002 vrf-id 0 <<<<< local asn 4200000002
ub20(enp1s0) 4 4200000001 22 22 0 0 0 00:00:57 1 1
ub20(enp6s0) 4 4200000001 21 22 0 0 0 00:00:57 0 1 <<<< peer asn 4200000001
ub18# show ip bgp | include 0.2
Default local pref 100, local AS 4200000002
*> 100.64.0.2/32 enp1s0 0 0 4200000001 4200000004 4200000005 4200000001 i
Before ("remote-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 i <<<<< empty as-path, no way to prevent loop
After ("remote-private-AS" only):
=========
ub18# show ip bgp neighbors enp6s0 advertised-routes | include 100.64.0.2
*> 100.64.0.2/32 :: 0 4200000001 4200000001 i <<<< retain peer's asn, breaks loop
Ticket: 2857047
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Abstract:
- The command "neighbor PEER maximum-prefix-out NUMBER" cannot be applied
without clearing the BGP neighbor.
- Apply the maximum-prefix-out value as soon as it is modified without
clearing the neighbor.
subgroup_update_packet() and subgroup_withdraw_packet() respectively
manages the announcement and withdrawal BGP message to the peer.
subgrp->scount counter counts the number of sent prefixes.
Before the patch, the maximum out prefix limitation was applied in
subgroup_update_packet() in order that subgrp->scount never exceeds the
limit. Setting a limit inferior to the effective number of sent prefix
did not result in sending any withdrawal message to reduce the number of
sent prefixes. Without clearing the BGP neighbor, the limitation only
applied to the announcement of new prefixes when the limitation was
over.
With the patch, the limitation is checked in subgroup_announce_check().
The function is intended to say whether a prefix has to be announced in
regards to the prefix-list, route-map... Now when a maximum-prefix-out
value is changed/removed, the neighbor AFI/SAFI table is re-parsed in
the same way as for the application of route-map, prefix-lists...
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
BGP EVPN custom `union gw_addr` is basically the same thing as a common
`struct ipaddr` but it lacks the address family which is needed in some
cases.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This code is populating a temporary variable `add` instead of the attr.
Initially this variable was later copied to the attr but the copying was
erroneously deleted by 0a50c2481. Directly populate the attr to restore
the correct behavior.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Here we try to compare the new attr with the existing one but this call
compares the existing index with zero instead. attrhash_cmp already
compares indexes using overlay_index_same so this call is both wrong and
useless.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
```
exit1-debian-11# sh ip bgp 10.10.10.10/32
BGP routing table entry for 10.10.10.10/32, version 14
Paths: (1 available, best #1, table default)
Not advertised to any peer
65000, (stale)
192.168.0.2 from 192.168.0.2 (0.0.0.0)
Origin incomplete, metric 0, valid, external, best (First path received)
Last update: Wed Jan 19 17:13:51 2022
Time until Graceful Restart stale route deleted: 117
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
'show bgp ... neighbor [routes|received-routes]' both incorrectly
used a json key of 'advertisedRoutes'.
This corrects the key to be 'receivedRoutes' for commands where
the displayed routes were received, not advertised.
before:
unet> r3 show ip bgp neigh 10.2.30.2 received-routes json | include Routes
"advertisedRoutes":{
after:
ub18# show ip bgp neighbors enp1s0 received-routes json | include Routes
"receivedRoutes":{
ub18# show ip bgp neighbors enp1s0 advertised-routes json | include Routes
"advertisedRoutes":{
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>