This commit introduces the implementation for the north-bound
callbacks for the bgp-specific route-map match and set clauses.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
Signed-off-by: Sarita Patra <saritap@vmware.com>
Allowing printfrr extensions to directly write to the output buffer has
a few advantages:
- there is no arbitrary length limit imposed (previously 64)
- the output doesn't need to be copied another time
- the extension can directly use bprintfrr() to put together pieces
The downside is that the theoretical length (regardless of available
buffer space) must be computed correctly.
Extended unit tests to test these paths a bit more thoroughly.
Signed-off-by: David Lamparter <equinox@diac24.net>
During Review it was suggested that appending rpki_
to curr_state and target_state would be better
variable names. Instead of going and fixing
3 or so commits up. Just do this one.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add the ability for the end operator to query the state of valid
or invalid or no information rpki prefix information.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When displaying data about the rpki state, use the
string `rpki validation-state` instead of `validation-state:`
to avoid confusion with `(valid)`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add transactional northbound callbacks for route-map
options for unnumbered neighbor and peer-group under
l2vpn-evpn address-family.
Signed-off-by: Chirag Shah <chirag@nvidia.com>
The MH datastructures were being released before the paths that were
referencing them. Fix is to do the MH cleanup last.
The MH finish function has also been stripped down to only do a
datastructure cleanup i.e. avoid sending route updates etc.
Ticket: 31376
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. When a local ES is deleted or the ES-bond goes into bypass we treat
imported MAC-IP routes with that ES destination as remote routes instead
of sync routes. This requires a re-evaluation of the routes as
"non-local-dest" and an update to zebra.
2. When a ES is attached to an access port or the ES-bond transitions from
bypass to LACP-up we treat imported MAC-IP routes with that ES destination as
sync routes. This requires a re-evaluation of the routes as
"local-dest" and an update to zebra.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
In the case of EVPN type-2 routes that use ES as destination, BGP
consolidates the nh (and nh->rmac mapping) and sends it to zebra as
a nexthop add.
This nexthop is the EVPN remote PE and is created by reference of
VRF IPvx unicast paths imported from EVPN Type-2 routes.
zebra uses this nexthop for setting up a remote neigh enty for the PE
and a remote fdb entry for the PE's RMAC.
Ticket: CM-31398
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Setup a mh_info indirection in the path extra. This has been done to
avoid increasing evpn route's path size to add new (type based) pointers
in path_info_extra.
Ticket: CM-31398
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Theoretically we should just be able to use the L3 NHG in the other-VRF/nh-VRF.
But there is some change list handling (when an ES is added to or
removed from a VRF) that needs to be updated to account for routes in other
VRFs using that ES-VRF as nexthop. Till that is done we will disable L3-NHG
use for routes leaked from a different VRF.
Route in tenant2 with ES/NHG as destination -
===========================================
root@leaf11:mgmt:~# ip route show vrf tenant2 22.1.0.7
22.1.0.7 nhid 75000012 proto bgp metric 20
root@leaf11:mgmt:~# ip nexthop list id 75000012
id 75000012 group 103/107/111 proto bgp
root@leaf11:mgmt:~# ip nexthop |grep "103\|107\|111"
id 103 via 6.0.0.11 dev vlan12 scope link proto bgp onlink
id 107 via 6.0.0.12 dev vlan12 scope link proto bgp onlink
id 111 via 6.0.0.13 dev vlan12 scope link proto bgp onlink
id 75000012 group 103/107/111 proto bgp
root@leaf11:mgmt:~#
Leaked into VRF1 with a flat/exploded mpaths
============================================
root@leaf11:mgmt:~# ip route show vrf tenant1 |grep -A3 22.1.0.7
22.1.0.7 proto bgp metric 20
nexthop via 6.0.0.11 dev vlan12 weight 1 onlink
nexthop via 6.0.0.12 dev vlan12 weight 1 onlink
nexthop via 6.0.0.13 dev vlan12 weight 1 onlink
root@leaf11:mgmt:~#
Ticket: CM-31115
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Force flush all ES-EVI PE entries when a L2-VNI is deleted. This will
implicitly free up the remote ES-EVI and deref the ES entry.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
There are two changes in this commit -
1. Maintain a list of global MAC-IP routes per-ES. This list is maintained
for quick processing on the following events -
a. When the first VTEP/PE becomes active in the ES-VRF, the L3 NHG is
activated and the route can be sent to zebra.
b. When there are no active PEs in the ES-VRF the L3 NHG is
de-activated and -
- If the ES is present in the VRF -
The route is not installed in zebra as there are no active PEs for
the ES-VRF
- If the ES is not present in the VRF -
The route is installed with a flat multi-path list i.e. without L3NHG.
This is to handle the case where there are no locally attached L2VNIs
on the ES (for that tenant VRF).
2. Reinstall VRF route when an ES is installed or uninstalled in a
tenant VRF (the global MAC-IP list in #1 is used for this purpose also).
If an ES is present in the VRF we use L3NHG to enable fast-failover of
routed traffic.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This is done to clearly indicate what routes are being linked to
the list i.e. MAC-IP routes in the VNI table.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
In a sym-IRB setup the remote ES may not be installed if the tenant
VRF is not present locally. To allow that case while retaining the
fast-failover benefits for the case where the tenant VRF is locally
present we use the following approach -
1. If ES is present in the tenant VRF we use the L3NHG for installing
the MAC-IP based tenant route. This allows for efficient failover via
L3NHG updates.
2. If the ES is not present locally in the corresponding tenant VRF we
fall back to a non-NHG multi-path based routing approach. In this
case individual routes are updated when the ES links flap.
PS: #1 can be turned off entirely by disabling use-l3-nhg in BGP.
Ticket: CM-30935
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When an ES goes down the MAC-IP route must be updated to remove it from
the tenant VRF routing table. This is because the fast-failover
(via EAD-per-ES withdraw) procedures described in RFC 7432 are only
applicable to L2 forwarding/MAC-ECMP. For L3/routed traffic (in a
sym-IRB setup) failover, individual paths need to be withdrawn.
To handle this difference in L2/L3 requirements BGP updates the MAC-IP
route to include the L3 ECOM if local destination ES is oper-up and
to exclude the L3 ECOM if local ES is oper-down.
Ticket: CM-30935
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Instead of using bgp_get_default which refers to operational state, we
can check existence of the default node using only candidate config.
The same thing is done in "no router bgp" command.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When "bgp bestpath peer-type multipath-relax" is enabled, multipaths
with both eBGP and iBGP learned routes may exist. It is not desirable
for the iBGP next hops to be discarded from the FIB because they are not
directly connected. When publishing a nexthop group to zebra, the
ZEBRA_FLAG_ALLOW_RECURSION flag is normally not set when the best path
is eBGP; when "bgp bestpath aspath multipath-relax" is configured, the
flag will now be set if any paths are from iBGP peers. This leaves
all-eBGP multipaths still requiring nexthops over connected routes.
Signed-off-by: Joanne Mikkelson <jmmikkel@arista.com>
This new BGP configuration is akin to "bgp bestpath aspath
multipath-relax". When applied, paths learned from different peer types
will be eligible to be considered for multipath (ECMP). Paths from all
of eBGP, iBGP, and confederation peers may be included in multipaths
if they are otherwise equal cost.
This change preserves the existing bestpath behavior of step 10's result
being returned, not the result from steps 8 and 9, in the case where
both 8+9 and 10 determine a winner.
Signed-off-by: Joanne Mikkelson <jmmikkel@arista.com>
1. When VNI export RT changes, for each local es_evi, update local
EAD/ES and EAD/EVI routes and advertise.
2. When VNI import RT changes, uninstall all type-1 routes imported in
the VNI and import routes carrying the updated RT.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Move `bgp_peer_config_apply` outside `bgp_peer_configure_bfd` (and
document it) so we only call the session installation once with one
set of timers. It also makes all calls of that function
equal (e.g. always calls `bgp_peer_config_apply` afterwards).
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Remove old BFD API usage and replace it with the new one.
Highlights:
- More shared code: the daemon gets notified with callbacks instead of
having to roll its own code to find the notified sessions.
- Less code to integrate with BFD.
- Remove hidden commands to configure single / multi hop. Use
protocol data instead.
BGP can determine if a peer is single/multi hop according to the
following criteria:
a. If the IP address is a link-local address (single hop)
b. The network is shared with peer (single hop)
c. BGP is configured for eBGP multi hop / TTL security (multi hop)
- Respect the configuration hierarchy:
a. Peer configuration take precendence over peer-group
configuration.
b. When peer group configuration is removed, reset peer
BFD configurations to defaults (unless peer had specific
configs).
Example:
neighbor foo peer-group
neighbor foo bfd profile X
neighbor 192.168.0.2 peer-group foo
neighbor 192.168.0.2 bfd
! If peer-group is removed the profile configuration gets
! removed from peer 192.168.0.2, but BFD will still enabled
! because of the neighbor specific bfd configuration.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
The BFD function `bgp_bfd_is_peer_multihop` will no longer exist and now
both code paths are equal.
Longer explanation:
Cumulus was previously using the BFD function to help determine whether a
peer is multi hop or not, because there is a configuration to set BFD
to use single or multi hop.
Current BFD code can automatically pick between single/multi hop by
using the protocol information and so it is a good idea to have that
tested/used than relying on yet another duplicated information.
(BFD extracts the TTL information from protocol and selects
single/multi hop based on that)
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
For link-local IPv6 next hops, the next hop tracking is implemented based
on interface status changes. For this purpose, the ifindex is stored in
the NHT. Reset this value if a change in ifindex is noticed, such as for
example after a restart of the networking service.
Also add some additional debug logs.
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
Updates: "bgpd: Switch LL nexthop tracking to be interface based"
Ticket: RM 2575386
Testing Done:
1. Manual verification
2. Precommit (#156), evpn-smoke (#155), bgp-smoke (#157), vrl (#158)
-- Precommit is clean, reported failures in evpn-smoke & vrl are resolved
-- some other tests fail in evpn-smoke, bgp-smoke & vrl, appear to be existing
-- or unrelated failures
Adds support for 'rd all' matching for EVPN and L3VPN show commands.
Introduces evpn_show_route_rd_all_macip().
Cleanup some show commands to use SHOW_DISPLAY string constants.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet. Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition. And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...
With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.
Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.
Signed-off-by: David Lamparter <equinox@diac24.net>
The point of the `-std=gnu99` was to override a `-std=c99` that may be
coming in from net-snmp. However, we want C11, not C99.
Signed-off-by: David Lamparter <equinox@diac24.net>
If we have a SAFI conflict, ie we are trying to activate safi's
UNICAST and LABELED_UNICAST at the same time, we should not
cause bestpath to be rerun and we should not try to put
labels on everything.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Should return an actual useful error message.
Commit: 055679e915 messed this error message
up.
Fixes: #8246
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The v6 LL commit 8761cd6ddb
incorrectly was setting the metric value to 1 for the underlying
connected interface. Modify the code to use a metric value of 0
instead of 1 that now represents the actual metric value that
was originally passed up.
This was noticed when the `show bgp ipv4 uni` command was
inserting a `(metric 1)` into output where before it was not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a handler for socket errors that runs in the main pthread,
rather than the io pthread. When the io pthread encounters a
read error, capture the error and schedule a task for the main
pthread.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When dumping data about prefixes in bgp. Let's dump the
rpki validation state as well:
Output if rpki is turned on:
janelle# show rpki prefix 2003::/19
Prefix Prefix Length Origin-AS
2003:: 19 - 19 3320
janelle# show bgp ipv6 uni 2003::/19
BGP routing table entry for 2003::/19
Paths: (1 available, best #1, table default)
Not advertised to any peer
15096 6939 3320
::ffff:4113:867a from 65.19.134.122 (193.72.216.231)
(fe80::e063:daff:fe79:1dab) (used)
Origin IGP, valid, external, best (First path received), validation-state: valid
Last update: Sat Mar 6 09:20:51 2021
janelle# show rpki prefix 8.8.8.0/24
Prefix Prefix Length Origin-AS
janelle# show bgp ipv4 uni 8.8.8.0/24
BGP routing table entry for 8.8.8.0/24
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
100.99.229.142
15096 6939 15169
65.19.134.122 from 65.19.134.122 (193.72.216.231)
Origin IGP, valid, external, best (First path received), validation-state: not found
Last update: Sat Mar 6 09:21:25 2021
Example output when rpki is not configured:
eva# show bgp ipv4 uni 8.8.8.0/24
BGP routing table entry for 8.8.8.0/24
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
janelle(192.168.161.137)
64539 15096 6939 15169
192.168.161.137(janelle) from janelle(192.168.161.137) (192.168.44.1)
Origin IGP, valid, external, bestpath-from-AS 64539, best (First path received)
Last update: Sat Mar 6 09:33:51 2021
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Use the new ringbuffer API function to read file descriptors directly
to the ringbuffer instead of using intermediary buffers.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
'show bgp l2vpn evpn statistics' was returning 0 for all stats
because bgp_table_stats_walker bailed out if afi != AFI_IP or AFI_IP6.
Add case condition to catch AFI_L2VPN.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
If we are filtering a route due to any of the filter reasons
we should not be setting the BGP_NODE_FIB_INSTALL_FIB_PENDING
flag. This is especially evident with say a loopback that
is covered by a network statement. When we receive the route
back from our peer we should not be setting the
BGP_NODE_FIB_INSTALL_PENDING flag on it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
now that sequence number is configurable, there is no problem in
permitting to configure seq 0 sequence number.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When you use a single BGP session for both IPv4 and IPv6 it's a bit
annoying going into ipv6 address-family and explicitly activating it.
Let's get this automatically if enabled with `bgp default ipv6-unicast`.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When a local ES is in LACP bypass state BGP doesn't advertise
reachability to it i.e. the Type-1/EAD-per-ES routes and Type-4
route for the ES is not advertised. This is the equivalent of
oper-down handling.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Consistency checks are processed in the background using a periodic timer.
Start this timer only if Ethernet Segments are present and consistency
checking is needed.
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
flowspec address family can now use attribute-unchanged attribute.
This parameter is necessary when it comes to play with
route-server-client, as that latter command forces to change
attribute-unchanged nexthop.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this particular callback had not been implemented in the northbound
conversion, so the command had no effect.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Recent changes to allow bgpd to handle v6 LL slightly
differently in the nexthop tracking code has not
interacted well with the blackhole nexthop change
for peers. Modify the code to do the right thing
Signed-off-by: Donald Sharp <sharpd@nvidia.com>