EVPN MH ES reduendant VTEPs need to install
sync MAC as notify inactive and generate
ND:Proxy stamped extended community on Type-2
route.
Ticket:#3436621
Issue:3436621
Testing Done:
tor-11 originates type-2 MAC route:
tor-11# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify master bridge static
tor-12 receives sync MAC route:
Before fix:
----------
tor-12:/# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify master bridge static
After fix: inactive is set to MAC entry
----------
tor-12:/#bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify inactive master bridge
static
Notice the difference in `inactive` post notify on tor-12
with the fix.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This change refactors the zebra_vxlan_if related functionality
to a new zebra_vxlan_if.c file. zebra_vxlan_if_up/down,
zebra_vxlan_if_add/update/del is moved zebra_vxlan_if.c
Signed-off-by: Sharath Ramamurthy <sramamurthy@nvidia.com>
dplane_mac_info and dplane_neigh_info is modified to be vni aware.
dplane_rem_mac_add/del dplane_mac_init is modified to be vni aware.
During dplane context update (mac and neigh), we use the vni information
and if set, corresponding netlink attribute NDA_SRC_VNI is set and passed to the
dplane.
Signed-off-by: Sharath Ramamurthy <sramamurthy@nvidia.com>
This changeset introduces the data structure changes needed for
single vxlan device functionality. A new struct zebra_vxlan_vni_info
encodes the iftype and vni information for vxlan device.
The change addresses related access changes of the new data structure
fields from different files
zebra_vty is modified to take care of the vni dump information according
to the new vni data structure for vxlan devices.
Signed-off-by: Sharath Ramamurthy <sramamurthy@nvidia.com>
New show command "show evpn mac vni xx detail [json]"
to display details of all the mac entries for the
requested VNI.
Output of show evpn mac vni xx detail json:
{
"numMacs":2,
"macs":{
"ca:be:63:7c:81:05":{
"type":"local",
"intf":"veth100",
"ifindex":8,
"uptime":"00:06:55",
"localSequence":0,
"remoteSequence":0,
"detectionCount":0,
"isDuplicate":false,
"syncNeighCount":0,
"neighbors":{
"active":[
"fe80::c8be:63ff:fe7c:8105"
],
"inactive":[
]
}
}
}
}
Also added remoteEs field in the JSON output of
"show evpn mac vni xx json".
Output of show evpn mac vni xx json:
"00:02:00:00:00:0d":{
"type":"remote",
"remoteEs":"03:44:38:39:ff:ff:02:00:00:02",
"localSequence":0,
"remoteSequence":0,
"detectionCount":0,
"isDuplicate":false
}
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
Use "get" as the name for checking the status of the bgp
accept lower seq knob. This already has an equivalent "set"
so makes sense to keep it consistent.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Re-work the bgp vni table to use separately keyed tables for type2
routes.
So, with type2 routes, we have the main table keyed off of the IP and a
new MAC table keyed off of MACs.
By separating out the two, we are able to run path selection separately
for the neigh and mac. Keeping the two separate is also more in-line
with what happens in zebra (they are managed comptletely seperate).
With this change type2 routes go into each table like so:
```
Remote MAC-IP -> IP Table & MAC Table
Remote MAC -> MAC Table
Local MAC-IP -> IP Table
Local MAC -> MAC Table
```
The difference for local is necessary because we should not ever allow
multiple paths for a local MAC.
Also cleaned up the commands for querying the vni tables:
```
show bgp vni all type ...
show bgp vni VNI type ...
```
Old commands will be deprecated in a separate commit.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Add some special handling to accept lower seq routes for local
known routes when not ready. This aligns the code back a bit more
to where it was before to fix seen issues with sync routes.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Add a knob to accept lower seq number in evpn updates
from BGP in Zebra.
Note: Knob is enabled by default
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Since the `mac->flags` with `ZEBRA_MAC_ES_PEER_ACTIVE` is about ES Peer,
it should be displayed as `PEER Active`.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
The global vrf in zebra is always non-NULL. In general, it is bound to
default vrf by `zebra_vrf_init()`, at other times bound to some specific
vrf. Anyway, non-NULL.
So remove all redundant checkings for the returned value of
`zebra_vrf_get_evpn()`.
Additionally, remove the unnecessary check for `zvrf` in
`zebra_vxlan_cleanup_tables()`.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.
Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.
Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.
Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
ensure it is a created node, not a found node.
Refer to `isis_vertex_queue_insert()` of isisd, there
are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
is a found node, then free <searching_data>.
Refer to `aspath_intern()` of bgpd, there are many
examples of this case in bgpd.
Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
When `zebra_evpn_mac_svi_add()` adds one found mac by
`zebra_evpn_mac_lookup()` and the found mac is without
svi flag, then call `zebra_evpn_mac_svi_add()` to create
one appropriate mac, but it will call `zebra_evpn_mac_lookup()`
the second time. So lookup twice, the procedure is redundant.
Just an optimization for it, make sure only lookup once.
Modify `zebra_evpn_mac_gw_macip_add()` to check the `macp`
parameter passed by caller, so it can distinguish whether
really need lookup or not.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Two minor changes:
1) Change `zebra_evpn_mac_gw_macip_add()` 's return type to `void`.
2) Since `zebra_evpn_mac_gw_macip_add()` has already `assert` the returned
`mac`, the check of its return value makes no sense. And keep setting
`mac->flags` inside `zebra_evpn_mac_gw_macip_add()` is more reasonable. So
just move the setting `mac->flags` inside `zebra_evpn_mac_gw_macip_add()`.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
zebra crash is seen during shutdown (frr restart).
During shutdown, remote neigh and remote mac clean up
is triggered first, followed by per vni all neigh
(including local) and macs cleanup is triggered.
The crash occurs when a remote mac is cleaned up first
and its reference is remained in local neigh.
When local neigh attempt removes itself from its associated
mac's neigh_list it triggers inaccessible memory crash.
The fix is during mac deletion if its neigh_list is non-empty
then retain the MAC in AUTO state.
This can arise when MAC and neigh duo are in different state
(remote/local). Otherwise, the order of cleanup operation
is neighs followed by macs.
The auto mac will be cleaned up when per vni all neighs and macs
are cleaned up.
Ticket:CM-29826
Reviewed By:CCR-10369
Testing Done:
Configure evpn symmetric config where
MAC is in remote state and neigh is in local state.
Perform frr restart then crash is not seen.
Signed-off-by: Chirag Shah <chirag@nvidia.com>
This function is sure to return correct value by "assert", so the
checking its return value should be removed.
Signed-off-by: anlan_cs <anlan_cs@tom.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When an ES is deleted and re-added bgpd can start sending MAC-IP sync updates
before the dataplane and zebra have setup the VLAN membership for the ES. Such
MAC entries are not installed in the dataplane till the ES-EVI is created.
Ticket: #2668488
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Enqueue incoming vxlan remote macip updates on the main
workqueue, instead of performing the updates immediately,
in-line.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
We are creating 2 hash tables per vni in zebra. Once we start to
scale the number of vni's we start to see some serious memory
usage in zebra. Let's reduce the memory usage at startup
for scale of vni's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This one also needed a bit of shuffling around, but MTYPE_RE is the only
one left used across file boundaries now.
Signed-off-by: David Lamparter <equinox@diac24.net>
This is needed as kernel currently doesn't allow a mac replace if the dst
changes from a L2NHG to a single-VTEP and viceversa.
Ticket: CM-31561
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When an ES-bond comes out of bypass FRR needs to flush the local MACs learnt
while the bond was in bypass. To do that efficiently local MACs are linked
to the dest-access port. This only happens if the access-port is in
LACP-bypass or if it is non-ES.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Feature overview:
=================
A 802.3ad bond can be setup to allow lacp-bypass. This is done to enable
servers to pxe boot without a LACP license i.e. allows the bond to go oper
up (with a single link) without LACP converging.
If an ES-bond is oper-up in an "LACP-bypass" state MH treats it as a non-ES
bond. This involves the following special handling -
1. If the bond is in a bypass-state the associated ES is placed in a
bypass state.
2. If an ES is in a bypass state -
a. DF election is disabled (i.e. assumed DF)
b. SPH filter is not installed.
3. MACs learnt via the host bond are advertised with a zero ESI.
When the ES moves out of "bypass" the MACs are moved from a zero-ESI to
the correct non-zero id. This is treated as a local station move.
Implementation:
===============
When (a) an ES is detached from a hostbond or (b) an ES-bond goes into
LACP bypass zebra deletes all the local macs (with that ES as destination)
in the kernel and its local db. BGP re-sends any imported MAC-IP routes
that may exist with this ES destination as remote routes i.e. zebra can
end up programming a MAC that was perviously local as remote pointing
to a VTEP-ECMP group.
When an ES is attached to a hostbond or an ES-bond goes
LACP-up (out of bypss) zebra again deletes all the local macs in the
kernel and its local db. At this point BGP resends any imported MAC-IP
routes that may exist with this ES destination as sync routes i.e.
zebra can end up programming a MAC that was perviously remote
as local pointing to an access port.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
For MH the SVI MAC is advertised to prevent flooding of ARP replies.
But because of a bug the SVI MAC was being added to the zebra database
but not sent to bgpd for advertising.
Ticket: CM-33329
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Added support for advertising SVI MAC if EVPN-MH is enabled.
In the case of EVPN MH arp replies from an attached server can be sent to
the ES-peer. To prevent flooding of the reply the SVI MAC needs to be
advertised by default.
Note:
advertise-svi-ip could have been used as an alternate way to advertise
SVI MAC. However that config cannot be turned on if SVI IPs are
re-used (which is done to avoid wasting IP addresses in a subnet).
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
the old VXLAN function for local MAC deletion was still in
existence and being called from the VXLAN code whilst the new
generic function was not being called at all. Resolve this so
the generic function matches the old function and is called
exclusively.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Create a function that can dump the mac->flags in human readable
output and convert all debugs to use it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>