It doesn't make much sense for a hash function to modify its argument,
so const the hash input.
BGP does it in a couple places, those cast away the const. Not great but
not any worse than it was.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
IMET/type-3 routes are used by VTEPs to advertise the flood mode for BUM
traffic via the PMSI tunnel attribute. If a type-3 route is not rxed from
a remote-VTEP we default to "no-head-end-rep" for that remote-VTEP. In such
cases static-config such as PIM is likely used for BUM flooding.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
IMET route is optional if the flood mode is PIM-SM and serves
no functional purpose. So this change limits type-3 route generation
to flood-mode=head-end-replication.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If PIM-SM if used for BUM flooding the multicast group address can be
configured per-vxlan-device. BGP receives this config from zebra via
the L2 VNI add/update.
Sample output -
root@TORS1:~# vtysh -c "show bgp l2vpn evpn vni 1000" |grep Mcast
Mcast group: 239.1.1.100
root@TORS1:~#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
During L3VNI add, non-default RD value is not replayed
correctly. Instead of picking non-default value it picks
up auto RD value which is derived based on router-id.
Indentation issue: Remove additional space from
L3VNI running config output.
Ticket:CM-24320
Reviewed By:CCR-8437
Testing Done:
Bring up evpn configuration with L3vni up with non-default
RD value, perform peerlink flap, l3vni flap which removes
all VNIS and readds with RD and RT values.
The configured RD and RTs are replayed.
Post L3VNI flap
router bgp 5546 vrf vrf2
!
address-family l2vpn evpn
rd 45.0.66.2:6
route-target import 20001:1
route-target export 20001:1
exit-address-family
TORC11# show bgp l2vpn evpn vni 4002
VNI: 4002 (known to the kernel)
Type: L3
Tenant VRF: vrf2
RD: 45.0.66.2:6
Originator IP: 36.0.0.11
Advertise-gw-macip : n/a
Import Route Target:
20001:1
Export Route Target:
20001:1
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When a bgp-peer comes up prior to l3vnis are up in bgpd.
The EVPN routes (type-2/type-5) are learnt via peer.
The routes can have one of interface's MAC in rmac attribute.
The self rmac check would bypass as l3vni is not present.
Once l3vni has come up in bgpd, while installing evpn
routes in vrf table, perform rmac attribute check against self mac.
The routes with rmac of ours will be removed via re-scan
of routes during bgp_mac_rescan_all_evpn_tables when
interface mac is added to bgp.
Ticket:CM-24224
Reviewed By:CCR-8423
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulunetworks.com>
Rename {bgp,zvrf}_def{ault} to {bgp,zvrf}_evpn where it makes sense,
i.e. when they contain the EVPN instance.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
For default RT, this uses the correct ASN to derive the RT (ASN of the
EVPN VRF).
It also stores them in the EVPN VRF's hash tables rather than in the
default's one.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
This change stores the mapping in the hash table of the EVPN VRF rather
than the one of the default VRF.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
This sends local routes in overlay VRFs to the EPVN VRF when
redistribute configurations are present, rather than to the default VRF.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
Refine check on whether a route can be injected into EVPN to allow
EVPN-sourced routes to be injected back into another instance.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
IPv4 or IPv6 unicast routes which are imported from EVPN routes
(type-2 or type-5) and installed in a BGP instance can be leaked
to another instance.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface.
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.
The implementation currently derives this L3 interface for EVPN tenant
routes using special code that looks at route flags. This patch
exchanges the L3 interface between zebra and bgpd as part of the L3-VNI
exchange in order to eliminate some this special code.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Withdraw flag is not sufficient to call bgp_update vs. bgp_withdraw()
processing for a given BGP evpn update message.
When a bgp update needs to be treated as an implicit withdraw
(e.g., due to malformed attribute), the code wasn't handling
things properly.
Rearranging attribute pass field to type-5 route processing and aligning
similar to done for other routes (type2/type-3).
Ticket:CM-24003
Reviewed By:CCR-8330
Testing Done:
Singed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Currently we are hardcoding it at the time of attr building to
ingress-replication. This is just a code clean-up and has no
functional impact.
Ticket: CM-23790
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Executed some evpn related tests with valgrind and saw some errors
related to uninitialized memory and overlapping memcpy. This commit
fixes those.
Ticket: CM-21218
Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com>
Reviewed-by: CCR-8249
When an inactive-neigh delete is rxed bgp will not have a local path to
remove (and re-run path selection). Instead it simply re-installs the
current best remote path if any.
Ticket: CM-23018
Testing Done: evpn-min
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Move the info filling for zebra mac-ip install (sent by bgpd) to a
common place.
The commit also fixes missing ROUTER flag for one of the cases
added in a code branch that doesn't have the ROUTER changes -
[
6d8c603a
bgpd: use IP address as tie breaker if the MM seq number is the same
]
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
During L3VNI add delete, configured non-default
route-target is not replayed correctly.
Non-default route-target should only be deleted
during unconfiguring under bgp vrf instance,
during delete of l3vni only unmap from the VRF.
during addition of l3vni map back to the VRF
Ticket:CM-21482
Testing Done:
Bring up evpn configuration with L3vni up with
non-default route-target.
Perform delete/add of L3vni and validated non-default
route-target is mapped back to vrf.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
EVPN route's attribute changes,
mark attribute change flag to imported unicast route.
A scenario where AS_PATH attribute have changed for an EVPN type-5
route, set attribute change
to imported route.
Ticket:CM-23008
Reviewed By:
Testing Done:
Validated via marking EVPN route with AS_PATH prepand.
At the receiving VTEP, ensure attribute change flag is set to
imported unicast route and bgp update sent to VTEPs subsequent
bgp peers with AS_PATH prepend update.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When "default-originate ipv4" is configured, a type-5 route is installed in
the local node and advertised to the peer with auto-rd.
When the above was followed by configuring an RD in IP VRF, Type-5 are
generated for only the non-default routes.
Fixed this issue by withdrawing the default route with auto-rd and advertising
the route with confiured RD.
Signed-off-by: Kishore Aramalla karamalla@vmware.com
Duplicate address detection configuration clis
under bgp l2vpn evpn config mode.
- Enabled/Disable (global knob) for feature.
- Configure cli for duplicate detection action
freeze and freze until time (auto-recovery).
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Enable/disable duplicate address detection
there are 3 actions
warning-only: Default action which generates
only frr warning (syslog) to user for any
duplicate detecton
freeze: Permanently freezes address, manual
intervene required.
freeze with time: An address will recover once
the time has expired (auto-recovery).
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The bgp_info data is stored as a void pointer in `struct bgp_node`.
Abstract retrieval of this data and setting of this data
into functions so that in the future we can move around
what is stored in bgp_node.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The motivation for this patch is to address a concerning behavior of
tx-addpath-bestpath-per-AS. Prior to this patch, all paths' TX ID was
pre-determined as the path was received from a peer. However, this meant
that any time the path selected as best from an AS changed, bgpd had no
choice but to withdraw the previous best path, and advertise the new
best-path under a new TX ID. This could cause significant network
disruption, especially for the subset of prefixes coming from only one
AS that were also communicated over a bestpath-per-AS session.
The patch's general approach is best illustrated by
txaddpath_update_ids. After a bestpath run (required for best-per-AS to
know what will and will not be sent as addpaths) ID numbers will be
stripped from paths that no longer need to be sent, and held in a pool.
Then, paths that will be sent as addpaths and do not already have ID
numbers will allocate new ID numbers, pulling first from that pool.
Finally, anything left in the pool will be returned to the allocator.
In order for this to work, ID numbers had to be split by strategy. The
tx-addpath-All strategy would keep every ID number "in use" constantly,
preventing IDs from being transferred to different paths. Rather than
create two variables for ID, this patch create a more generic array that
will easily enable more addpath strategies to be implemented. The
previously described ID manipulations will happen per addpath strategy,
and will only be run for strategies that are enabled on at least one
peer.
Finally, the ID numbers are allocated from an allocator that tracks per
AFI/SAFI/Addpath Strategy which IDs are in use. Though it would be very
improbable, there was the possibility with the free-running counter
approach for rollover to cause two paths on the same prefix to get
assigned the same TX ID. As remote as the possibility is, we prefer to
not leave it to chance.
This ID re-use method is not perfect. In some cases you could still get
withdraw-then-add behaviors where not strictly necessary. In the case of
bestpath-per-AS this requires one AS to advertise a prefix for the first
time, then a second AS withdraws that prefix, all within the space of an
already pending MRAI timer. In those situations a withdraw-then-add is
more forgivable, and fixing it would probably require a much more
significant effort, as IDs would need to be moved to ADVs instead of
paths.
Signed-off-by Mitchell Skiba <mskiba@amazon.com>
Allow some debug notification when we are unable to talk
to zebra due to the connection not being there yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change is a fixup to -
7b5e18 - bgpd: use IP address as tie breaker if the MM seq number is the
same
And is being done in response to review comments. This commit brings no
functional change; simply moves around code for easier maintanence.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Same sequence number handling is specified by RFC 7432 -
[
If two (or more) PEs advertise the same MAC address with the same
sequence number but different Ethernet segment identifiers, a PE that
receives these routes selects the route advertised by the PE with the
lowest IP address as the best route.
If the PE is the originator of the MAC route and it receives the same
MAC address with the same sequence number that it generated, it will
compare its own IP address with the IP address of the remote PE and
will select the lowest IP. If its own route is not the best one, it
will withdraw the route.
]
To implement that specification this commit uses nexthop IP as a tie
breaker between two paths of equal seq number with lower IP winning.
Now if a local path already exists with the same sequence number but higher
(local-VTEP) IP it is evicted (deleted and withdrawn from the peers) and
the winning new remote path is installed in zebra. This is existing code
and handled implicitly via evpn_route_select_install.
If a local path is rxed from zebra with the same sequence as the
current remote winner it is rejected (not installed in the bgp
routing tables) and zebra is asked to re-install the older/remote winner.
This is a race condition that can only happen if bgp's add and zebra's add
cross paths. Additional handling has been added in this commit via
evpn_cleanup_local_non_best_route to take care of the race condition.
Ticket: CM-22674
Reviewed By: CCR-7937
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This is needed to install the remote dst when a more preferred local
path is removed.
Ticket: CM-22685
Reviewed By: CCR-7936
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The ->hash_cmp and linked list ->cmp functions were sometimes
being used interchangeably and this really is not a good
thing. So let's modify the hash_cmp function pointer to return
a boolean and convert everything to use the new syntax.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The evpn_vtep_ip_cmp function must return positive and negative
numbers for when we are doing sorted linked list inserts.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The purpose of adding a l2vni as an sorted list is
shot in the foot when the l2vni compare function only
returns 0 or 1. This will cause subtle crashes when
we add sorted and we end up with multiple list node pointing
to the same thing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the '[no] flood <disable|head-end-replication>' command
to the l2vpn evpn afi/safi sub commands for bgp. This command
when entered as 'flood disable' will turn off type 3 route
generation for the transmittal of the type 3 route necessary
for BUM replication on the remote VTEP. Additionally it will
turn off the BUM handling via the new zebra command,
ZEBRA_VXLAN_FLOOD_CONTROL.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Do a straight conversion of `struct bgp_info` to `struct bgp_path_info`.
This commit will setup the rename of variables as well.
This is being done because `struct bgp_info` is not descriptive
of what this data actually is. It is path information for routes
that we keep to build the actual routes nexthops plus some extra
information.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>