Rename {bgp,zvrf}_def{ault} to {bgp,zvrf}_evpn where it makes sense,
i.e. when they contain the EVPN instance.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
For default RT, this uses the correct ASN to derive the RT (ASN of the
EVPN VRF).
It also stores them in the EVPN VRF's hash tables rather than in the
default's one.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
This change stores the mapping in the hash table of the EVPN VRF rather
than the one of the default VRF.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
This sends local routes in overlay VRFs to the EVPN VRF when
redistribute configurations are present, rather than to the default VRF.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
Refine check on whether a route can be injected into EVPN to allow
EVPN-sourced routes to be injected back into another instance.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
IPv4 or IPv6 unicast routes which are imported from EVPN routes
(type-2 or type-5) and installed in a BGP instance can be leaked
to another instance.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface.
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.
The implementation currently derives this L3 interface for EVPN tenant
routes using special code that looks at route flags. This patch
exchanges the L3 interface between zebra and bgpd as part of the L3-VNI
exchange in order to eliminate some of this special code.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
The withdraw flag alone is not sufficient to choose between bgp_update()
and bgp_withdraw() processing for a given BGP EVPN update message.
When a BGP update needs to be treated as an implicit withdraw
(e.g., due to a malformed attribute), the code wasn't handling
things properly.
Rearrange how the attribute is passed to type-5 route processing,
aligning it with what is done for other routes (type-2/type-3).
Ticket:CM-24003
Reviewed By:CCR-8330
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Currently we are hardcoding it at the time of attr building to
ingress-replication. This is just a code clean-up and has no
functional impact.
Ticket: CM-23790
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Executed some evpn related tests with valgrind and saw some errors
related to uninitialized memory and overlapping memcpy. This commit
fixes those.
Ticket: CM-21218
Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com>
Reviewed-by: CCR-8249
When an inactive-neigh delete is rxed bgp will not have a local path to
remove (and re-run path selection). Instead it simply re-installs the
current best remote path if any.
Ticket: CM-23018
Testing Done: evpn-min
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Move the info filling for zebra mac-ip install (sent by bgpd) to a
common place.
The commit also fixes missing ROUTER flag for one of the cases
added in a code branch that doesn't have the ROUTER changes -
[
6d8c603a
bgpd: use IP address as tie breaker if the MM seq number is the same
]
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
During L3VNI add/delete, a configured non-default
route-target is not replayed correctly.
A non-default route-target should only be deleted
when it is unconfigured under the BGP VRF instance.
On deletion of the L3VNI, only unmap it from the VRF;
on addition of the L3VNI, map it back to the VRF.
Ticket:CM-21482
Testing Done:
Bring up evpn configuration with L3vni up with
non-default route-target.
Perform delete/add of L3vni and validated non-default
route-target is mapped back to vrf.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When an EVPN route's attributes change,
set the attribute-change flag on the imported unicast route.
For example, when the AS_PATH attribute has changed for an EVPN type-5
route, set the attribute-change flag
on the imported route.
Ticket:CM-23008
Reviewed By:
Testing Done:
Validated by marking an EVPN route with AS_PATH prepend.
At the receiving VTEP, ensured the attribute-change flag is set on the
imported unicast route and a BGP update is sent to the VTEP's subsequent
BGP peers with the AS_PATH prepend update.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When "default-originate ipv4" is configured, a type-5 route is installed in
the local node and advertised to the peer with auto-rd.
When the above was followed by configuring an RD in IP VRF, Type-5 are
generated for only the non-default routes.
Fixed this issue by withdrawing the default route with auto-rd and advertising
the route with configured RD.
Signed-off-by: Kishore Aramalla <karamalla@vmware.com>
Duplicate address detection configuration clis
under bgp l2vpn evpn config mode.
- Enable/disable (global knob) for the feature.
- Configure CLI for the duplicate detection actions
freeze and freeze until time (auto-recovery).
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Enable/disable duplicate address detection.
There are 3 actions:
warning-only: Default action, which generates
only an FRR warning (syslog) to the user for any
duplicate detection.
freeze: Permanently freezes the address; manual
intervention is required.
freeze with time: An address will recover once
the time has expired (auto-recovery).
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The bgp_info data is stored as a void pointer in `struct bgp_node`.
Abstract retrieval of this data and setting of this data
into functions so that in the future we can move around
what is stored in bgp_node.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The motivation for this patch is to address a concerning behavior of
tx-addpath-bestpath-per-AS. Prior to this patch, all paths' TX ID was
pre-determined as the path was received from a peer. However, this meant
that any time the path selected as best from an AS changed, bgpd had no
choice but to withdraw the previous best path, and advertise the new
best-path under a new TX ID. This could cause significant network
disruption, especially for the subset of prefixes coming from only one
AS that were also communicated over a bestpath-per-AS session.
The patch's general approach is best illustrated by
txaddpath_update_ids. After a bestpath run (required for best-per-AS to
know what will and will not be sent as addpaths) ID numbers will be
stripped from paths that no longer need to be sent, and held in a pool.
Then, paths that will be sent as addpaths and do not already have ID
numbers will allocate new ID numbers, pulling first from that pool.
Finally, anything left in the pool will be returned to the allocator.
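The ID recycling described above can be sketched roughly as follows.
This is a simplified illustration, not the code in txaddpath_update_ids:
the names id_alloc, path, will_send and update_tx_ids are hypothetical,
the allocator is a plain in-use table, and only one prefix and one
addpath strategy are modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IDS 64 /* hypothetical, small ID space for the sketch */

/* Hypothetical allocator: tracks which IDs are in use; ID 0 means "none". */
struct id_alloc {
	bool in_use[MAX_IDS + 1];
};

uint32_t id_alloc_get(struct id_alloc *a)
{
	for (uint32_t id = 1; id <= MAX_IDS; id++) {
		if (!a->in_use[id]) {
			a->in_use[id] = true;
			return id;
		}
	}
	return 0; /* ID space exhausted */
}

void id_alloc_put(struct id_alloc *a, uint32_t id)
{
	if (id >= 1 && id <= MAX_IDS)
		a->in_use[id] = false;
}

/* Hypothetical per-path state, evaluated after a bestpath run. */
struct path {
	bool will_send; /* path will be advertised as an addpath */
	uint32_t tx_id; /* 0 when no TX ID is assigned */
};

void update_tx_ids(struct id_alloc *a, struct path *paths, size_t n)
{
	uint32_t pool[MAX_IDS];
	size_t pool_len = 0;

	/* 1. Strip IDs from paths that are no longer sent; hold them in a pool. */
	for (size_t i = 0; i < n; i++) {
		if (!paths[i].will_send && paths[i].tx_id != 0
		    && pool_len < MAX_IDS) {
			pool[pool_len++] = paths[i].tx_id;
			paths[i].tx_id = 0;
		}
	}

	/* 2. Paths that will be sent but lack an ID pull from the pool first. */
	for (size_t i = 0; i < n; i++) {
		if (paths[i].will_send && paths[i].tx_id == 0) {
			if (pool_len > 0)
				paths[i].tx_id = pool[--pool_len];
			else
				paths[i].tx_id = id_alloc_get(a);
		}
	}

	/* 3. Anything left in the pool goes back to the allocator. */
	while (pool_len > 0)
		id_alloc_put(a, pool[--pool_len]);
}
```

With two paths from the same AS, when the best path moves from the
first to the second, the second path inherits the TX ID the first one
held, so the peer sees an update for the same ID rather than a withdraw
followed by an add.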
In order for this to work, ID numbers had to be split by strategy. The
tx-addpath-All strategy would keep every ID number "in use" constantly,
preventing IDs from being transferred to different paths. Rather than
create two variables for ID, this patch creates a more generic array that
will easily enable more addpath strategies to be implemented. The
previously described ID manipulations will happen per addpath strategy,
and will only be run for strategies that are enabled on at least one
peer.
Finally, the ID numbers are allocated from an allocator that tracks per
AFI/SAFI/Addpath Strategy which IDs are in use. Though it would be very
improbable, there was the possibility with the free-running counter
approach for rollover to cause two paths on the same prefix to get
assigned the same TX ID. As remote as the possibility is, we prefer to
not leave it to chance.
This ID re-use method is not perfect. In some cases you could still get
withdraw-then-add behaviors where not strictly necessary. In the case of
bestpath-per-AS this requires one AS to advertise a prefix for the first
time, then a second AS withdraws that prefix, all within the space of an
already pending MRAI timer. In those situations a withdraw-then-add is
more forgivable, and fixing it would probably require a much more
significant effort, as IDs would need to be moved to ADVs instead of
paths.
Signed-off-by: Mitchell Skiba <mskiba@amazon.com>
Allow some debug notification when we are unable to talk
to zebra due to the connection not being there yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change is a fixup to -
7b5e18 - bgpd: use IP address as tie breaker if the MM seq number is the
same
And is being done in response to review comments. This commit brings no
functional change; simply moves around code for easier maintenance.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Same sequence number handling is specified by RFC 7432 -
[
If two (or more) PEs advertise the same MAC address with the same
sequence number but different Ethernet segment identifiers, a PE that
receives these routes selects the route advertised by the PE with the
lowest IP address as the best route.
If the PE is the originator of the MAC route and it receives the same
MAC address with the same sequence number that it generated, it will
compare its own IP address with the IP address of the remote PE and
will select the lowest IP. If its own route is not the best one, it
will withdraw the route.
]
To implement that specification this commit uses nexthop IP as a tie
breaker between two paths of equal seq number with lower IP winning.
Now if a local path already exists with the same sequence number but higher
(local-VTEP) IP it is evicted (deleted and withdrawn from the peers) and
the winning new remote path is installed in zebra. This is existing code
and handled implicitly via evpn_route_select_install.
If a local path is rxed from zebra with the same sequence as the
current remote winner it is rejected (not installed in the bgp
routing tables) and zebra is asked to re-install the older/remote winner.
This is a race condition that can only happen if bgp's add and zebra's add
cross paths. Additional handling has been added in this commit via
evpn_cleanup_local_non_best_route to take care of the race condition.
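As an illustration of the tie-break rule only, a minimal comparison
could look like the sketch below. The struct mm_path and the function
mm_path_is_better are hypothetical and much simpler than the real path
selection code; addresses are assumed to be IPv4 in host byte order,
and sequence number wrap-around is not handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a MAC/IP path. */
struct mm_path {
	uint32_t mm_seq;     /* MAC mobility sequence number */
	uint32_t nexthop_ip; /* VTEP IPv4 address, host byte order */
};

/* Return true if path 'a' should be preferred over path 'b'. */
bool mm_path_is_better(const struct mm_path *a, const struct mm_path *b)
{
	/* A higher sequence number reflects the more recent move. */
	if (a->mm_seq != b->mm_seq)
		return a->mm_seq > b->mm_seq;

	/* Same sequence number: the lower next hop IP wins the tie. */
	return a->nexthop_ip < b->nexthop_ip;
}
```

With equal sequence numbers, a path via 10.0.0.1 is preferred over one
via 10.0.0.2, regardless of which of the two is the local path.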
Ticket: CM-22674
Reviewed By: CCR-7937
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This is needed to install the remote dst when a more preferred local
path is removed.
Ticket: CM-22685
Reviewed By: CCR-7936
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The ->hash_cmp and linked list ->cmp functions were sometimes
being used interchangeably and this really is not a good
thing. So let's modify the hash_cmp function pointer to return
a boolean and convert everything to use the new syntax.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The evpn_vtep_ip_cmp function must return positive and negative
numbers for when we are doing sorted linked list inserts.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The purpose of adding an l2vni to a sorted list is
shot in the foot when the l2vni compare function only
returns 0 or 1. This will cause subtle crashes when
we add sorted and we end up with multiple list nodes pointing
to the same thing.
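For reference, a comparator suitable for sorted inserts needs to be
three-way. The sketch below shows the general shape; struct l2vni_entry
and l2vni_entry_cmp are hypothetical names and not the functions touched
by this commit.

```c
#include <stdint.h>

/* Hypothetical list element keyed by VNI. */
struct l2vni_entry {
	uint32_t vni;
};

/*
 * Three-way comparison: negative if a sorts before b, positive if it
 * sorts after, zero only when the keys are equal. A comparator that
 * returns just 0 or 1 cannot express "sorts before", so a sorted
 * insert cannot place the new node correctly.
 */
int l2vni_entry_cmp(const struct l2vni_entry *a, const struct l2vni_entry *b)
{
	if (a->vni < b->vni)
		return -1;
	if (a->vni > b->vni)
		return 1;
	return 0;
}
```

Explicit comparisons are used instead of subtracting the two VNIs, since
the difference of two unsigned 32-bit values does not fit reliably in an
int.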
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the '[no] flood <disable|head-end-replication>' command
to the l2vpn evpn afi/safi sub commands for bgp. This command
when entered as 'flood disable' will turn off type 3 route
generation for the transmittal of the type 3 route necessary
for BUM replication on the remote VTEP. Additionally it will
turn off the BUM handling via the new zebra command,
ZEBRA_VXLAN_FLOOD_CONTROL.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Do a straight conversion of `struct bgp_info` to `struct bgp_path_info`.
This commit will setup the rename of variables as well.
This is being done because `struct bgp_info` is not descriptive
of what this data actually is. It is path information for routes
that we keep to build the actual routes nexthops plus some extra
information.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem encountered where using the aggregate-address command in an
evpn environment did not work properly. Depending on the order of
actions, the aggregate may not be created or removed when either the
commands were issued or routes come and go.
Ticket: CM-20585
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Scan all bgp vrf instances and respective L3VNI against the VNI which is being configured.
Ticket:CM-21859
Testing Done:
Configure l3vni,
try to configure same vni as l2vni under router bgp, address-family
l2vpn evpn.
The configuration is rejected.
show evpn vni
VNI        Type VxLAN IF     # MACs   # ARPs   # Remote VTEPs  Tenant VRF
4001       L3   vx-4001      0        0        n/a             vrf1
TOR(config)# router bgp 5546
TOR(config-router)# address-family l2vpn evpn
TOR(config-router-af)# vni 4001
% Failed to create VNI
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Implement procedures similar to what is specified in
https://tools.ietf.org/html/draft-malhotra-bess-evpn-irb-extended-mobility
in order to support extended mobility scenarios in EVPN. These are scenarios
where a host/VM move results in a different (MAC,IP) binding from earlier.
For example, a host with an address assignment (IP1, MAC1) moves behind a
different PE (VTEP) and has an address assignment of (IP1, MAC2) or a host
with an address assignment (IP5, MAC5) has a different assignment of (IP6,
MAC5) after the move. Note that while these are described as "move" scenarios,
they also cover the situation when a VM is shut down and a new VM is spun up
at a different location that reuses the IP address or MAC address of the
earlier instance, but not both. Yet another scenario is a MAC change for an
attached host/VM i.e., when the MAC of an attached host changes from MAC1 to
MAC2. This is necessary because there may already be a non-zero sequence
number associated with MAC2. Also, even though (IP, MAC1) is withdrawn before
(IP, MAC2) is advertised, they may propagate through the network differently.
The procedures continue to rely on the MAC mobility extended community
specified in RFC 7432 and already supported by the implementation, but
augment it with an inheritance mechanism that understands the relationship
of the host MACIP (ARP/neighbor table entry) to the underlying MAC (MAC
forwarding database entry). In FRR, this relationship is understood by the
zebra component which doubles as the "host mobility manager", so the MAC
mobility sequence numbers are determined through interaction between bgpd
and zebra.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
backet->data must be non-NULL (see hash_get), so
we do not need to check for NULL when
we retrieve data from the backet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ensure that the presence of L3VNI is checked before we generate
Router MAC and L3 Route Target extended communities. Without this
check, the router would send an all-zeros RMAC in some situations,
which may cause problems for receivers.
Ticket: CM-21014
Testing Done:
a) Verification of failed scenario
b) Interop verification by Scott Laffer
c) evpn-smoke
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Add support for the R-bit (Router flag) of the EVPN ND extended community,
to enable proxy ND.
Set the R-bit in the EVPN NA if a given router is the default gateway or
there is a local router attached, which can be determined based on the
local neighbor entry.
Implement the BGP ext community attribute to generate and parse the R-bit
and pass it along to zebra to program the neigh entry in the kernel.
Upon receiving a MAC/IP update with community type 0x06 and sub_type 0x08,
pass the R-bit to zebra to program the neigh entry.
Set NTF_ROUTER in the neigh entry and inform the kernel to do proxy NA for EVPN.
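A rough sketch of generating and parsing such an extended community is
shown below. The type (0x06) and sub-type (0x08) come from the text
above; the macro and function names are hypothetical, and the position
of the R-bit (least-significant bit of the flags octet that follows the
sub-type) is an assumption that should be checked against the draft.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ECOMM_TYPE_EVPN       0x06
#define ECOMM_EVPN_SUBTYPE_ND 0x08
#define ECOMM_ND_ROUTER_FLAG  0x01 /* assumed R-bit position */

/* Build the 8-byte ND extended community, setting the R-bit if asked. */
void nd_ecomm_encode(uint8_t val[8], bool router)
{
	memset(val, 0, 8);
	val[0] = ECOMM_TYPE_EVPN;
	val[1] = ECOMM_EVPN_SUBTYPE_ND;
	if (router)
		val[2] |= ECOMM_ND_ROUTER_FLAG;
}

/*
 * Parse an extended community. Returns false if it is not the ND
 * extended community; otherwise stores the R-bit in *router.
 */
bool nd_ecomm_router_flag(const uint8_t val[8], bool *router)
{
	if (val[0] != ECOMM_TYPE_EVPN || val[1] != ECOMM_EVPN_SUBTYPE_ND)
		return false;
	*router = (val[2] & ECOMM_ND_ROUTER_FLAG) != 0;
	return true;
}
```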
Ref:
https://tools.ietf.org/html/draft-ietf-bess-evpn-na-flags-01
Ticket:CM-21712, CM-21711
Reviewed By:
Testing Done:
Configured a local VNI-enabled L3 gateway, which acts as a router, and
checked 'show evpn arp-cache vni x ip <ip of svi>' on the originating and
remote VTEPs.
The "Router" flag is set.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
This commit removes various parts of the bgpd implementation code which
are unused/useless, e.g. unused functions, unused variable
initializations, unused structs, ...
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
In the symmetric routing case, L3VNI stores EVPN MAC_IP routes
as IP_PREFIX routes in the associated bgp_vrf and FIB.
When the vxlan device for the l3vni goes down, it triggers an l3vni
delete in bgp.
As part of the l3vni delete, the EVPN IP prefix routes associated
with the VNI need to be withdrawn from zebra, and
the bgp_info needs to be freed.
bgp_delete does not free the bgp_info associated
with EVPN IP prefix routes (linked to bgp_vrf).
Calling uninstall_evpn_route_entry_in_vrf() properly
cleans up the bgp_info and triggers the appropriate updates.
Ticket:CM-21443
Testing Done:
On DUT, bringup symmetric routing configuration, learn
EVPN Type-2 and Type-3 Routes.
Type-2 MAC_IP routes will be stored as ip_prefix in vrf table
during l3vni bring up.
Remove L3vni, deletes all ip_prefix routes from the zebra, kernel
vrf route table and bgp_info is freed.
Check the show bgp memory stats for bgp_info post l3vni flap.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
This correction fixes three bugs detected by Clang scan:
Bug Group: Logic error
Bug Type: Dereference of null pointer
File: bgpd/bgp_evpn.c
Function: bgp_evpn_unconfigure_import_rt_for_vrf
Line: 4246
File: isisd/isis_spf.c
Function: isis_print_paths
Line: 69 (two bugs of same type in one line)
Signed-off-by: F. Aragon <paco@voltanet.io>
The bgp data structures:
bgp->vnihash
bgp->vrf_export_rtl
bgp->vrf_import_rtl
bgp->l2vnis
Must always be valid data structures. So remove the tests
that ensure that they are.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move the list_delete_and_null of the irt->vrfs code to
the actual deletion function to ensure proper lifecycle.
This allows us to assume that irt->vrfs is always
valid, so remove the NULL check on it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The irt->vnis list was being freed on going down;
instead, delete it in the deletion function. Then
we know that irt->vnis is a valid list anywhere
we have an irt pointer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There exists code paths where the rn was being used after free.
This eliminates these code paths.
Fixes: CM-21019
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ensure that when EVPN routes are imported into a VRF as IPv4 routes,
the NEXT_HOP attribute is set. In the absence of this, this attribute
is currently not generated when advertising the route to peers in the
VRF. It is to be noted that the source route (the EVPN route) will only
have the MP_REACH_NLRI attribute that contains the next hop in it.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Imported routes in a VRF routing table have a reference to their parent
route entry which resides in the EVPN or IPVPN routing table. Ensure that
this reference uses appropriate locking so that the parent entry doesn't
get freed prematurely.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
(cherry picked from commit 13cb6b22ba9d558b1b4a1e8752f63f13242462a7)
Conflicts:
bgpd/bgp_mplsvpn.c
Ticket: CM-20471
Testing Done:
a) Ran vrf_route_leak tests without fix and hit crash, ran twice with fix
and did not see the crash.
b) Ran evpn-smoke and ensured there were no new failures.
There are situations in which zebra may issue more than one delete
notification, so BGP should not warn when it can't locate the VNI
at delete. This is comparable to the situation when a withdraw is
received but the route isn't present locally.
Signed-off-by: Vivek Venkatraman <vivek@cumulusmetworks.com>
Ticket: CM-17512
Reviewed By: Trivial
Testing Done: Manual
EVPN prefix depends on the EVPN route type.
Currently, in FRR we have a prefix_evpn/evpn_addr which relates to an EVPN prefix.
We need to convert this to encompass a union of various EVPN route-types.
This diff handles the necessary code changes to adopt the new struct evpn_addr.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Setup a per-VRF identifier to use along with the Router Id to build the
RD. Define a function to encode the RD. Code is brought over from EVPN
and EVPN code has been modified to use the generic function.
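As a sketch of what such an encoding helper might look like, the
function below packs a Router Id and a per-VRF identifier into an
8-byte RD. It assumes the type 1 RD layout (2-byte type, 4-byte IPv4
address, 2-byte assigned number); rd_encode_type1 and its parameters
are hypothetical names, not the function added by this commit.

```c
#include <stdint.h>

/*
 * Encode a type 1 Route Distinguisher (assumed layout):
 *   bytes 0-1: type = 1
 *   bytes 2-5: IPv4 address (here: the Router Id, host byte order in)
 *   bytes 6-7: assigned number (here: the per-VRF identifier)
 * All fields are written in network byte order.
 */
void rd_encode_type1(uint8_t rd[8], uint32_t router_id, uint16_t vrf_rd_id)
{
	rd[0] = 0x00;
	rd[1] = 0x01;
	rd[2] = (uint8_t)(router_id >> 24);
	rd[3] = (uint8_t)(router_id >> 16);
	rd[4] = (uint8_t)(router_id >> 8);
	rd[5] = (uint8_t)router_id;
	rd[6] = (uint8_t)(vrf_rd_id >> 8);
	rd[7] = (uint8_t)vrf_rd_id;
}
```

For Router Id 10.0.0.1 and identifier 5 this yields the RD usually
displayed as 10.0.0.1:5.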
Ticket: CM-20256
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
RFC 8365 explains how RT auto-derivation should be done in section
5.1.2.1 [1]. In addition to encoding the VNI in the lowest bytes, a
3-bit field is used to encode a namespace. For VXLAN, we have to put 1
in this field. This is needed for proper interoperability with RT
auto-derivation in JunOS. Since this would break existing setup, an
additional option, "autort rfc8365-compatible" is used.
[1]: https://tools.ietf.org/html/rfc8365#section-5.1.2.1
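A minimal sketch of the derivation of the 4-byte local administrator
field follows. It assumes the layout read from section 5.1.2.1: one
bit for auto-derivation (0), the 3-bit type field, a 4-bit domain id
(left at 0) and the 24-bit VNI. The macro and function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define AUTORT_TYPE_VXLAN 1u /* value of the 3-bit type field for VXLAN */

/*
 * Derive the local administrator part of the auto RT from a VNI.
 * Without the compatibility option only the VNI is encoded; with it,
 * the type field (bits 30..28) is set to 1 for VXLAN.
 */
uint32_t autort_local_admin(uint32_t vni, bool rfc8365_compatible)
{
	uint32_t val = vni & 0x00FFFFFFu; /* VNI occupies the low 24 bits */

	if (rfc8365_compatible)
		val |= AUTORT_TYPE_VXLAN << 28;
	return val;
}
```

For VNI 100 this gives 100 without the option and 0x10000064
(268435556) with it, which is why enabling the option changes the RTs
of an existing setup.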
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Ethernet Tag ID (ETI) is part of the prefix. It cannot just be ignored
as it needs to be used when checking for prefix uniqueness. Moreover,
when using Quagga as a route reflector, we need to keep its
value. Therefore, we correctly parse and encode it. We also parse
ESI. While not part of the prefix, it needs to be reflected correctly
by Quagga.
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Routes that have labels must be sent via a nexthop that also has labels.
This change notes whether any path in a nexthop update from zebra contains
labels. If so, then the nexthop is valid for routes that have labels.
If a nexthop update has no labeled paths, then any labeled routes
referencing the nexthop are marked not valid.
Add a route flag BGP_INFO_ANNC_NH_SELF that means "advertise myself
as nexthop when announcing" so that we can track our notion of the
nexthop without revealing it to peers.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
When an L3 VNI is deleted, cleanup linkage to it from associated
L2 VNIs.
Updates: bgpd: keep a backpointer to vrf instance in struct bgpevpn
[Mitesh Kanjariya]
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The str2prefix_rd function can fail, but for auto-derived
values this should be impossible to happen. So ignore it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This commit does these 2 things:
1) irt->vrfs is never NULL so no need to test for it
2) No need to check for a good irt value returned from
vrf_import_rt_new as the alloc operation will
dump if memory allocation fails.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Set EVPN routes imported into a VRF to (sub)type BGP_ROUTE_IMPORTED and
use this for passing appropriate information to zebra. This is needed
because relying on the Router MAC for this purpose was incorrect and
impacted routing to/from external destinations, particularly for IPv6.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
In bgp_update_receive the first thing we do is establish
that the peer->status is Established. We then do a bunch
of work and call bgp_nlri_parse where we break out for
each address family. Each AFI then checks that
peer->status is Established again. There is no
point in checking this again.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Received PMSI tunnel attributes (in EVPN type-3 route) were not recognized.
Parse them and display the tunnel type when looking at routes. Note that
the only tunnel type currently supported is ingress replication (IR). A
warning message will be logged if the received tunnel type is something
else, but the attribute is otherwise ignored.
Updates: a21bd7a (bgpd: add PMSI_TUNNEL_ATTRIBUTE to EVPN IMET routes)
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Implement support for EVPN symmetric routing for IPv6 routes. The next hop
for EVPN routes is the IP address of the remote VTEP which is only an IPv4
address. This means that for IPv6 symmetric routing, there will be IPv6
destinations with IPv4 next hops. To make this work, the IPv4 next hops are
converted into IPv4-mapped IPv6 addresses.
As part of support, ensure that "L3" route-targets are not announced with
IPv6 link-local addresses so that they won't be installed in the routing
table.
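The conversion to an IPv4-mapped IPv6 address (::ffff:a.b.c.d) can be
sketched as below. The function name and the use of raw byte arrays are
choices made for this illustration, not the helper used in the code.

```c
#include <stdint.h>
#include <string.h>

/*
 * Build the IPv4-mapped IPv6 address ::ffff:a.b.c.d.
 * v4 holds the 4 address bytes in network order; v6 receives 16 bytes:
 * 80 zero bits, 16 one bits, then the IPv4 address.
 */
void ipv4_to_mapped_ipv6(const uint8_t v4[4], uint8_t v6[16])
{
	memset(v6, 0, 10);
	v6[10] = 0xff;
	v6[11] = 0xff;
	memcpy(&v6[12], v4, 4);
}
```

A remote VTEP at 192.0.2.1 thus becomes the next hop ::ffff:192.0.2.1
for IPv6 destinations in the tenant VRF.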
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
The PMSI attribute is only applicable to EVPN type-3 route.
Rmac is applicable to type-2 and type-5 routes.
We should attach these attributes appropriately based on route-type.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Type-5 routes can be useful in multiple scenarios such as advertise-subnet,
default-originate etc. Currently, the code has a restriction that to allow
advertising type-5 routes, user has to first enable advertise ipvX command.
This restriction is not necessary and should be removed.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
FRR/CL provides the means for injecting regular (IPv4) routes
from the BGP RIB into EVPN as type-5 routes.
This needs to be enhanced to allow selective injection.
This can be achieved by adding a route-map option
for the "advertise ipv4/ipv6 unicast" command.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Asymmetric routing is an ideal choice when all VLANs are configured on all leafs.
It simplifies the routing configuration and
eliminates potential need for advertising subnet routes.
However, we need to reach the Internet or global destinations
or to do subnet-based routing between PODs or DCs.
This requires EVPN type-5 routes but those routes require L3 VNI configuration.
This task is to support EVPN type-5 routes for prefix-based routing in
conjunction with asymmetric routing within the POD/DC.
It is done by providing an option to use the L3 VNI only for prefix routes,
so that type-2 routes (host routes) will only use the L2 VNI.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
When an l3-vni is enabled, type-2 routes are sent with 2 labels (l2vni and l3vni).
When it gets deleted, we need to update type-2 routes and send them with only one label (l2vni).
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Ensure that if multiple parameters for a VNI change simultaneously, the
changes are processed correctly. The changes of interest are the local
tunnel IP address and the tenant VRF to which this VNI is attached. The
former is used to originate type-3 routes as well as set the next hop of
all routes, the latter helps to determine the route targets and VNIs to
include in the route.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Ticket: CM-19099
Reviewed By: CCR-7102
Testing Done:
1. Manually reproduced problem and verified fix.
2. Additional trigger events tested with fix.
EVPN type-2 and type-5 routes received with an L3 VNI and corresponding RTs
are installed into the appropriate BGP RIB. Ensure that these routes are not
re-injected back into EVPN as type-5 routes when type-5 advertisement is
enabled; only regular IPv4 routes (and IPv6 routes in future) in the RIB
should be injected into EVPN.
As a benefit of this change, no longer restrict that EVPN type-5 routes
should be non-host routes - i.e., allow /32 IPv4 routes (and /128 IPv6
routes in future).
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Ticket: CM-19456
Reviewed By: CCR-7117
Testing Done:
1. Manual replication of problem and verification of fix
2. evpn-min
While processing the type-2 evpn route nlri,
increment the pointer by BGP_LABEL_BYTES instead of '3'
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Ensure that spurious error messages are not generated in a non-EVPN configuration
when routes in a VRF get deleted or added. Also, check on EVPN advertisement
being enabled before walking VRF routing table.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-19206
Reviewed By: CCR-7073
Testing Done:
1. Recreate errors and validate fix
2. Type-5 route related testing - new routes, neighbor flap etc.
When an EVPN type-5 route is formed by using the source IP route's AS-path,
the AS-path is not locally generated and should not be "uninterned" (i.e.,
have its reference count incorrectly updated). An incorrect update of the
reference count can lead to asserts or crashes at a later stage.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-19121
Reviewed By: CCR-7028
Testing Done:
1. Manual testing by Vivek and Anitha
2. No automation run since this area has no coverage yet
When an IPv4 or IPv6 unicast route is injected into EVPN as a type-5 route
(upon user configuration), ensure that the source route (best path)'s path
attributes are used to build the EVPN type-5 route. This will result in
correct AS_PATH and ORIGIN attributes for the type-5 route so that it doesn't
appear that all type-5 routes are locally sourced. This is necessary to
ensure that external paths (IPv4 or IPv6 from EBGP peer) are preferred over
internal EVPN paths, if both exist.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-19051
Reviewed By: CCR-7009
Testing Done: Verify failed scenario
In EVPN symmetric routing, not all subnets are present everywhere.
We have multiple scenarios where a host might not get learned locally.
1. GARP miss
2. SVI down/up
3. Silent host
We need a mechanism to resolve such hosts. In order to achieve this,
we will be advertising a subnet route from a box and that box will help
in resolving the ARP to such hosts.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
When doing symmetric routing,
EVPN type-2 (MACIP) routes need to be advertised with two labels (VNIs)
the first being the L2 VNI (identifying the VLAN) and
the second being the L3 VNI (identifying the VRF).
The receive processing needs to handle one or two labels too.
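A simplified sketch of receive-side handling of one or two labels is
given below. It assumes each label field is 3 bytes and carries the VNI
in all 24 bits; LABEL_BYTES stands in for BGP_LABEL_BYTES, and
label_to_vni and decode_vnis are hypothetical helpers rather than the
actual NLRI parsing code.

```c
#include <stddef.h>
#include <stdint.h>

#define LABEL_BYTES 3 /* stand-in for BGP_LABEL_BYTES */

/* Read a 24-bit VNI from a 3-byte label field (network byte order). */
uint32_t label_to_vni(const uint8_t *p)
{
	return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | (uint32_t)p[2];
}

/*
 * Decode the label portion of a type-2 route.
 * Returns the number of labels found (1 or 2), or 0 if 'len' is not a
 * valid label length. *l3vni is set to 0 when only one label is present.
 */
int decode_vnis(const uint8_t *p, size_t len, uint32_t *l2vni,
		uint32_t *l3vni)
{
	if (len != LABEL_BYTES && len != 2 * LABEL_BYTES)
		return 0;

	*l2vni = label_to_vni(p); /* first label: L2 VNI (VLAN) */
	*l3vni = 0;
	if (len == 2 * LABEL_BYTES) {
		/* second label: L3 VNI (VRF) */
		*l3vni = label_to_vni(p + LABEL_BYTES);
		return 2;
	}
	return 1;
}
```

For the bytes 00 27 10 00 0f a1 this returns 2 with L2 VNI 10000 and
L3 VNI 4001; with only the first three bytes it returns 1 and leaves
the L3 VNI at 0.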
Ticket: CM-18489
Review: CCR-6949
Testing: manual and bgp/evpn/mpls smoke
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
1. Added default gw extended community
2. code modification to handle sticky-mac/default-gw-mac as they go together
3. show command support for newly added extended community
4. State in zebra to reflect if a mac/neigh is default gateway
5. show command enhancement to reflect the same in zebra commands
Ticket: CM-17428
Review: CCR-6580
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
EVPN is only enabled when user configures advertise-all-vni.
All VNIs (L2 and L3) should be cleared upon removal of this config.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
When we receive an MP_UNREACH,
we try to uninstall routes from the VRF and the VNI.
The route-node in the VRF corresponds
to the ip prefix formed from EVPN prefix.
We should correctly form the prefix based on the EVPN route-type.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
For EVPN routes, the prefixlen field in struct prefix
represents the size of the struct rather than the actual prefix length.
This is later used in looking up the route node in the RIB.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
CLI config for enabling/disabling type-5 routes
router bgp <as> vrf <vrf>
address-family l2vpn evpn
[no] advertise <ipv4|ipv6|both>
loop through all the routes in VRF instance and advertise/withdraw
all ip routes as type-5 routes in default instance.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
For EVPN type-5 route the NH in the NLRI is set to the local tunnel ip.
This information has to be obtained from kernel notification.
We need to pass this info from zebra to bgp in l3vni call flow.
This patch doesn't handle the tunnel-ip change.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
1. VRF RD can be auto-derived (similar to RD for a VNI)
2. VRF RD can be configured manually through a config
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Currently, we have a rd_id bitfield
to assign a unique index for auto RD.
This bitfield currently resides under struct bgp which seems wrong.
We need to shift this to a global space
as this ID space is really global per box.
One more reason to keep it at a global data structure is,
the ID space could be used by both VNIs and VRFs.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
BGP VRF can be created/deleted either via config or via l3vni add/del.
We need to handle various sequences.
1. If user config is present, an l3vni del should not delete the vrf instance
2. do not write bgp config in show running for auto created vrf
3. If l3vni present, disallow the cli for deleting bgp vrf instance
4. If l3vni is added and vrf config is present set the flags properly
5. if bgp vrf is configured unset the AUTO flag
Ticket: CM-18630
Review: CCR-6906
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>