- Define a hook "zebra_rmac_update" which can be registered by multiple
data plane components (e.g. FPM, dplane).
DEFINE_HOOK(zebra_rmac_update, (zebra_mac_t *rmac, zebra_l3vni_t *zl3vni, bool
delete, const char *reason), (rmac, zl3vni, delete, reason))
- While performing RMAC add/delete for an L3VNI, call "zebra_rmac_update" hook.
- This hook call triggers "zfpm_trigger_rmac_update". In this function, we do a
lookup for the RMAC in fpm_mac_info_table. If already present, this node is
updated with the latest RMAC info. Else, a new fpm_mac_info_t node is created
and inserted in the queue and hash data structures.
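A minimal Python sketch of the flow described above, not FRR code: a hook that several consumers can register on, and a handler that either updates an existing entry or creates and queues a new one. The `Hook` class, the `(vni, mac)` key and the dict-based node are illustrative assumptions; only the names zfpm_trigger_rmac_update and fpm_mac_info_table come from the text.

```python
from collections import deque


class Hook:
    """Multi-subscriber hook: every registered callback runs on each call."""

    def __init__(self):
        self._callbacks = []

    def register(self, fn):
        self._callbacks.append(fn)

    def call(self, *args):
        for fn in self._callbacks:
            fn(*args)


zebra_rmac_update = Hook()

# Hypothetical stand-ins for the hash table and the queue named in the text.
fpm_mac_info_table = {}   # (vni, mac) -> pending update node
fpm_mac_queue = deque()   # nodes waiting to be sent to the FPM


def zfpm_trigger_rmac_update(rmac, vni, delete, reason):
    key = (vni, rmac)
    node = fpm_mac_info_table.get(key)
    if node is not None:
        # Already pending: refresh it with the latest RMAC info.
        node["delete"] = delete
        node["reason"] = reason
        return
    # Not present: create a node and insert it in both structures.
    node = {"mac": rmac, "vni": vni, "delete": delete, "reason": reason}
    fpm_mac_info_table[key] = node
    fpm_mac_queue.append(node)


zebra_rmac_update.register(zfpm_trigger_rmac_update)
```

With this model, an add followed by a delete for the same RMAC leaves a single queued node that carries the latest state.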
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
The vrf_id field is replaced by a pointer to struct vrf.
All other code referencing (interface)->vrf_id is updated accordingly.
This work should not change the behaviour.
It is a continuation of the work toward an interface API that handles
the vrf pointer only.
Some new generic functions are created in vrf:
vrf_to_id, vrf_to_name,
a zebra function is also created:
zvrf_info_lookup
an ospf function is also created:
ospf_lookup_by_vrf
Note that now that the interface has a vrf pointer, further
optimisations are possible throughout the rest of the code. For
example, many structures store the vrf_id; those structures could get
the exact vrf structure if inherited from an interface vrf context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add the local and remote sequence number to the `show evpn arp-cache vni XX` command.
VNI 1000111 #ARP (IPv4 and IPv6, local and remote) 15
IP                       Type   State  MAC               Remote VTEP Seq #'s
fe80::202:ff:fe00:15     remote active 00:02:00:00:00:15 6.0.0.31    0/0
fe80::202:ff:fe00:8      local  active 00:02:00:00:00:08             0/0
60.1.1.111               local  active 00:02:00:00:00:08             0/0
2060:1:1:1::11           local  active 00:e0:ec:38:49:a1             0/0
fe80::202:ff:fe00:11     remote active 00:02:00:00:00:11 6.0.0.30    0/0
2060:1:1:1::211          remote active 00:02:00:00:00:11 6.0.0.30    0/0
2060:1:1:1::121          remote active 00:02:00:00:00:0c 6.0.0.29    0/0
60.1.1.211               remote active 00:02:00:00:00:11 6.0.0.30    0/0
fe80::202:ff:fe00:c      remote active 00:02:00:00:00:0c 6.0.0.29    0/0
60.1.1.11                local  active 00:e0:ec:38:49:a1             0/0
fe80::2e0:ecff:fe38:49a1 local  active 00:e0:ec:38:49:a1             0/0
60.1.1.221               remote active 00:02:00:00:00:15 6.0.0.31    0/0
2060:1:1:1::111          local  active 00:02:00:00:00:08             0/0
2060:1:1:1::221          remote active 00:02:00:00:00:15 6.0.0.31    0/0
60.1.1.121               remote active 00:02:00:00:00:0c 6.0.0.29    0/0
The seq numbers are at 0/0 because we have had no mobility events.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
- For data plane processing of VxLAN routes, add encap type and L3VNI info to
rtmsg message for FPM.
- Add "RTA_ENCAP_TYPE" attribute for VxLAN encap with value 100.
This value is not currently used for RTA_ENCAP_TYPE for any encap.
- If "RTA_ENCAP_TYPE" is 100, add "RTA_ENCAP" attribute with "RTA_VNI" as a
nested attribute of RTA_ENCAP
Format of RTA_VNI attribute:
Len(2 bytes) type (2 bytes) Value(4 bytes)(VNI)
00 08 : 00 00 : 1000
RTA_VNI attribute is a custom attribute.
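The attribute layout above (2-byte length, 2-byte type, 4-byte VNI) can be sketched in Python with the standard struct module. This is an illustration only: the helper names are hypothetical, the type value 0 is taken from the "00 00" in the example, and native byte order is assumed, as is usual for netlink attributes.

```python
import struct

RTA_VNI = 0          # custom attribute type, value taken from the example above
RTA_VNI_FORMAT = "=HHI"  # length (2 bytes), type (2 bytes), VNI (4 bytes), native order


def pack_rta_vni(vni):
    """Build the 8-byte RTA_VNI attribute for the given VNI."""
    length = struct.calcsize(RTA_VNI_FORMAT)
    return struct.pack(RTA_VNI_FORMAT, length, RTA_VNI, vni)


def unpack_rta_vni(data):
    """Return (length, type, vni) from a packed RTA_VNI attribute."""
    return struct.unpack(RTA_VNI_FORMAT, data)
```

For VNI 1000 the packed attribute is 8 bytes long and unpacks back to length 8, type 0 and value 1000.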
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
It doesn't make much sense for a hash function to modify its argument,
so const the hash input.
BGP does it in a couple places, those cast away the const. Not great but
not any worse than it was.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When a local sticky MAC delete request is received and there are
associated neighbor entries present, only the MAC's local flag is
removed and the MAC is marked as an auto MAC; the sticky flag is left
set. As a result, the next local learning of the MAC automatically
assumes the MAC is sticky.
This can happen when bridge learning is off and the user
configures a sticky MAC via bridge fdb add.
This MAC learns an associated neighbor entry.
Later the user deletes the sticky MAC via bridge fdb del. This triggers
frr to delete the MAC, but since there are neighbors present, frr marks
the MAC as AUTO and does not remove the sticky flag.
The user then turns bridge learning on, which triggers
the MAC to be learnt as a dynamic entry; in the absence of this
fix, the MAC is marked as sticky.
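A small Python model of the flag handling, assuming the fix clears the sticky flag together with the local flag when the MAC is kept as an auto entry. The flag names and values are hypothetical and do not mirror zebra's definitions.

```python
# Hypothetical flag bits for a MAC entry.
MAC_LOCAL = 0x1
MAC_AUTO = 0x2
MAC_STICKY = 0x4


def delete_local_mac(flags, has_neighbors):
    """Return the new flags, or None if the entry is removed outright."""
    if not has_neighbors:
        return None
    # Neighbors still refer to the MAC: keep it as an auto entry, but
    # drop both LOCAL and STICKY so a later dynamic learn is not sticky.
    flags &= ~(MAC_LOCAL | MAC_STICKY)
    return flags | MAC_AUTO
```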
Ticket:CM-24968
Reviewed By:CCR-8683
Testing Done:
Validated the broken condition with an internal reproduction,
with and without the fix.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
With flooding control added recently we were not properly handling
the new flood control parameter in zebra_vxlan.c handler functions.
The error message that was being repeatedly seen:
2019/05/01 00:47:32 ZEBRA: [EC 100663311] stream_get2: Attempt to get out of bounds
2019/05/01 00:47:32 ZEBRA: [EC 100663311] &(struct stream): 0x7f0f04001740, size: 22, getp: 22, endp: 22
The fix was to ensure that both the _add and _del functions kept proper
sizing of amount of data read *and* the _del function was not
reading the flood_control data from the stream.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
These updates act as triggers to pimd to -
1. join the MDT for rxing VxLAN encapsulated BUM traffic
2. register the local-vtep-ip as a source for the MDT
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
An SG entry is added (if one doesn't already exist) when a l2-VNI is
associated with a mcast-grp and local-vtep-ip.
And vice versa: when the last l2-VNI using an MDT is removed, the SG
entry is deleted.
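The add-on-first-use and delete-on-last-use behaviour can be modelled as a reference-counted table. This Python sketch is an assumption about the shape of the logic, not zebra's data structure; the class and method names are hypothetical.

```python
class McastSgTable:
    """Reference-counted (local-vtep-ip, mcast-grp) entries shared by L2-VNIs."""

    def __init__(self):
        self._refs = {}

    def add_vni(self, local_vtep_ip, mcast_grp):
        """Return True if this call created the SG entry."""
        key = (local_vtep_ip, mcast_grp)
        created = key not in self._refs
        self._refs[key] = self._refs.get(key, 0) + 1
        return created

    def del_vni(self, local_vtep_ip, mcast_grp):
        """Return True if this call removed the SG entry."""
        key = (local_vtep_ip, mcast_grp)
        if key not in self._refs:
            return False
        self._refs[key] -= 1
        if self._refs[key] == 0:
            del self._refs[key]
            return True
        return False

    def __len__(self):
        return len(self._refs)
```

Two L2-VNIs sharing one group create a single entry, and it disappears only when the second VNI is removed.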
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Base code for adding (S, G) entries. These entries are created when
a mcast-group and local-VTEP-IP are associated with an L2 VNI.
The parent (*, G) entries are created implicitly on the (S, G) addition
and play the role of termination entries.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Each multicast tunnel is associated with a -
1. Tunnel origination mroute that is used for forwarding the
VxLAN encapsulated flow -
S - local VTEP-IP
G - BUM mcast-group
2. And a tunnel termination entry -
S - * (any remote VTEP)
G - BUM mcast-group
Multiple L2 VNIs can share the same BUM mcast group (and local-VTEP-IP).
Zebra maintains an mcast (SG) hash table to pass this info to pimd for
subsequent MDT setup.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Remote VTEPs advertise the flood mode via IMET and the ingress VTEP
needs to perform head-end-replication of BUM packets to it only if the
PMSI tunnel type is set to ingress-replication. If a type-3 route is not
rxed or rxed with a mode other than ingress-replication we can skip
installation of the flood fdb entry for that L2-VNI. In that case the
remote VTEP is either not interested in BUM traffic or is using a
"static-config" based replication mode like PIM.
Sample output with HER -
=======================
root@TORS1:~# vtysh -c "show evpn vni 1000" |grep "Remote\|flood"
Remote VTEPs for this VNI:
27.0.0.8 flood: HER
root@TORS1:~#
Sample output with PIM-SM -
=========================
root@TORS2:~# vtysh -c "show evpn vni 1000" |grep "Remote\|flood"
Remote VTEPs for this VNI:
27.0.0.7 flood: -
root@TORS2:~#
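The decision described above can be sketched as follows. The tunnel type values are the ones assigned in RFC 6514 (6 for ingress replication, 4 for PIM-SM); the function names are hypothetical, and `None` is used here to mean that no type-3 route was received.

```python
PMSI_TUNNEL_PIM_SM = 4
PMSI_TUNNEL_INGRESS_REPLICATION = 6


def needs_head_end_replication(tunnel_type):
    """True only if the remote VTEP asked for ingress replication."""
    return tunnel_type == PMSI_TUNNEL_INGRESS_REPLICATION


def flood_display(tunnel_type):
    """String shown in the 'flood:' column of the sample output."""
    return "HER" if needs_head_end_replication(tunnel_type) else "-"
```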
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
An L3VNI configured in a specific VRF could be unconfigured from any
VRF, including the default (global) VRF. This resulted in an L3VNI delete
notification to BGP and a subsequent type-5 route uninstall from the VRF
the L3VNI belongs to.
This also resulted in an inconsistent running configuration:
the deleted L3VNI still showed up in its original VRF, while the VRF in which
"no vni <x>" was executed did not display its own L3VNI.
Added a VRF check in zebra to prevent this.
Signed-off-by: Kishore Aramalla <karamalla@vmware.com>
It had no logical reason to be in the default VRF. This moves it to the
zebra_router, which is better suited to store global references.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
A lot of checks relied on the VRF ID and the EVPN VRF ID to be the same.
This patch changes those checks to the EVPN_ENABLED macro, which checks
if the VRF is the EVPN one.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
For a MAC-IP pair, the local/netlink message for the MAC is generally
received first, followed by the one for the neigh. The MAC can be
detected as duplicate during this event.
When a neigh update is received, the neigh inherits the DUP flag from its
MAC and, along with that, is marked as INACTIVE.
Also, in the case of a DUP-detected neigh, do not update its state
to ACTIVE before determining whether to send a notification to bgpd.
Sometimes the neigh update is received prior to the MAC update.
In that case the neigh is marked as inactive since its MAC is
still in REMOTE state. Once the MAC update is received and
it is detected as DUPLICATE, the neigh inherits the DUP flag
but remains in the inactive state.
By fixing the first case, the neigh remains inactive once
detected as DUPLICATE in both scenarios.
The unfreeze action marks all inherited neighs as ACTIVE,
clears the DUP flag and then sends a notification to bgpd (to send type-2).
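A simplified Python model of the inherit-then-unfreeze behaviour. It ignores the local/remote state of the MAC and uses hypothetical class and function names; it only illustrates that a neigh inheriting DUP stays inactive until unfreeze.

```python
from dataclasses import dataclass, field


@dataclass
class Neigh:
    ip: str
    active: bool = False
    dup: bool = False


@dataclass
class Mac:
    dup: bool = False
    neighs: list = field(default_factory=list)


def local_neigh_update(mac, neigh):
    """Return True if the neigh should be advertised to bgpd."""
    if mac.dup:
        # Inherit DUP from the MAC and keep the neigh inactive.
        neigh.dup = True
        neigh.active = False
        return False
    neigh.active = True
    return True


def unfreeze(mac, notify_bgpd):
    """Clear DUP on the MAC and its inherited neighs, then notify bgpd."""
    mac.dup = False
    for neigh in mac.neighs:
        if neigh.dup:
            neigh.dup = False
            neigh.active = True
            notify_bgpd(neigh.ip)
```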
Ticket:CM-24339
Reviewed By:CCR-8451
Testing Done:
Validated dup detection in both environments, where either the neigh
or the mac notification can come first.
With the fix, the neigh remained in "inactive" state
once detected as duplicate.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
This replaces manual checks of the flag with a wrapper macro to convey
the meaning "is evpn enabled on this vrf?"
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
Rename {bgp,zvrf}_def{ault} to {bgp,zvrf}_evpn where it makes sense,
i.e. when they contain the EVPN instance.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
Since the EVPN VRF may not be the default one, compare received
messages' VRF against the EVPN VRF and not the default one.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
This uses the EVPN VRF to store L3VNI hashes, and looks up L2VNIs in
this VRF as they are stored there.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
This sends local VNIs and local MAC addresses to the BGP instance
responsible for EVPN rather than the default one.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
If the EVPN VRF is not the default one (i.e. with advertise-all-vni),
this allows showing its information with `show bgp l2evpn evpn ...`
commands. They do not require adding `vrf VRFNAME` since we only
support a single EVPN VRF. The same is true for zebra-specific commands
(e.g. `show evpn ...`).
Configuration commands are not restricted to the default VRF but to
the EVPN one, that is to the one bearing `advertise-all-vni`.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
The EVPN VRF is defined by bgpd, and is the one vrf where
`advertise-all-vni` is present.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
Duplicate address detection and recovery was relying on the l2-vni backptr
in the neighbor entry which was simply not initialized resulting in
a NULL pointer access in a setup with dup-addressed VMs -
VM1:{IP1,M1} and VM2:{IP1,M2}
Call stack:
(gdb) bt 6
at lib/sigevent.c:249
nbr=nbr@entry=0x559347f901d0, vtep_ip=..., vtep_ip@entry=..., do_dad=do_dad@entry=true,
is_dup_detect=is_dup_detect@entry=0x7ffc7f6be59f, is_local=is_local@entry=true)
at ./lib/ipaddr.h:86
ip=0x7ffc7f6be6f0, ifp=0x559347f901d0, zvni=0x559347f86800) at zebra/zebra_vxlan.c:3152
(More stack frames follow...)
(gdb) p nbr->zvni
$8 = (zebra_vni_t *) 0x0 <<<<<<<<<<<<<<<<<<<<
(gdb)
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When we get a neighbor entry in zebra we start processing it.
Let's add some additional debugs to the processing so that when
it bails out and we don't use the data, we know the reason.
This should help in debugging why bgp does
not appear to have data associated with a neighbor entry
in the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface.
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.
The implementation currently derives this L3 interface for EVPN tenant
routes using special code that looks at route flags. This patch
exchanges the L3 interface between zebra and bgpd as part of the L3-VNI
exchange in order to eliminate some of this special code.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Commit: 6005fe55bc
Introduced a crash with zebra looking up either the
nbr structure or the mac structure. This is because
the zvni used is NULL and we eventually call a hash_lookup
call that would cause a NULL dereference. Partially
revert this commit to original behavior.
Problems found via clang Static Analyzer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In asymmetric and symmetric routing scenarios in EVPN,
each VTEP pair may have a different set of addresses
for the SVIs.
This knob allows reachability (ping connectivity) of
SVI IPs and ARP resolution for them on VTEPs across racks.
This knob should not be used when the same SVI IPs are configured
on VTEPs across racks or when advertise default gateway
is configured.
Ticket:CM-23782
Testing Done:
Brought up an EVPN symmetric routing topology with different
SVI IPs on different VTEPs. With advertise svi ip enabled
at each VTEP, remote VTEPs install an arp entry for
the SVI IPs via EVPN type-2 route exchange.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The master thread handler is really part of the zrouter structure.
So let's move it over to that. Eventually zserv.h will only be
used for zapi messages.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the extended-mobility case ({IP1, MAC} binding),
when a MAC moves from local to remote and the binding
changes to {IP2, MAC}, the local neigh (IP1) is marked
as inactive in frr.
The evpn draft recommends probing the entry once the
binding changes from local to remote.
Once the probe is set for the local neigh entry, the
kernel attempts to refresh the entry by sending a
unicast address resolution message; if the host does not
reply, the kernel marks the entry as FAILED.
For a FAILED entry, the kernel triggers a neigh delete
request, which results in frr removing the inactive entry.
In the absence of probing and aging out of the entry,
if the MAC moves back to local with {IP3, MAC},
frr marks both IP1 and IP3 as active and sends
type-2 updates for both.
IP1 may not be an active host, yet frr still advertises
the route.
Ticket:CM-22864
Testing Done:
Validated MAC mobility in the extended mobility scenario,
where the local inactive entry gets removed once the MAC moves
to remote state.
Once the probe is set on the local entry, the kernel checks
reachability of the neigh/arp entry. Since the MAC moved to remote,
the ARP request goes to the remote VTEP, where the host does not reside,
so the local neigh entry goes to failed state.
Frr receives the neighbor delete sooner and removes the entry.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
For a neigh detected as duplicate during a local update,
set the neigh inactive flag upon receiving the kernel neigh
delete, so that BGPd can install the remote route entry if present.
When the freeze action is enabled, a local duplicate-detected
entry will not be present in BGPd, so marking the neigh
inactive is safe. BGPd will simply attempt to install the
remote entry if present.
Ticket:CM-23438
Testing Done:
Validated with a MAC-IP pair: triggered mobility between two
VTEPs and, upon local freeze, performed a neigh delete, which
triggers BGP to install the remote type-2 route into the kernel.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
After a MACIP is detected as duplicate, the host may continue
to move behind different VTEPs, which results in the local VTEP
receiving remote mobility events.
In remote_macip_add, ensure that dad is triggered if the
MAC is marked as duplicate. If the freeze action is
enabled, is_dup_detect will be set to
avoid installing the frozen MAC into the kernel.
Ticket:CM-23649
Testing Done:
Configured detection action freeze with a detection count
of 7 at the DUT and >7 at the remote VTEP, then
triggered MAC-IP mobility between the VTEPs.
Once the detection count was reached, the MAC was detected as duplicate.
After detection, moved the host to remote. The local VTEP
receives the remote macip add and, with the fix, the entry is not
installed into the kernel.
root@VTEP1:~# net show evpn mac vni 1002 mac aa:aa:aa:aa:aa:aa
MAC: aa:aa:aa:aa:aa:aa
Remote VTEP: 27.0.0.16
Local Seq: 7 Remote Seq: 8
Duplicate, detected at Fri Jan 25 05:03:29 2019
Neighbors:
11.11.11.11 Inactive
Kernel entry still points to LOCAL
root@VTEP1:~# bridge fdb show | grep aa:aa:aa
aa:aa:aa:aa:aa:aa dev hostbond3 vlan 1002 master VxLanA-1
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When a local neigh is added with a MAC that is remote or absent, the
neigh is kept in zebra as local/inactive but is not propagated to bgpd.
Similarly, when an inactive neigh is deleted the del-msg is not propagated
to bgpd.
Without this change bgp and zebra would fall out of sync, as
bgp would not know to rerun bestpath and reinstall a
known remote path for the mac-ip in question. To fix this we
now propagate inactive neigh deletes to bgpd.
Ticket: CM-23018
Testing Done:
1. evpn-min
2. manually triggered the out-of-sync state and verified the fix
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
For neigh check duplicate flag as it can be inherited from
duplicate detected MAC (count could be 0).
Ticket:CM-23316
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Below are cases where EVPN duplicate detection
Freeze and Unfreeze required fixes:
Auto recovery needs to check the neighbor's duplicate flag
to take action, as a neigh could be marked duplicate by
inheriting the flag from its MAC, in which case the IP detection
count could be 0.
MAC duplicate detection needs to set flag to true
if freeze action is configured.
Local MAC add update should not send update to bgp
if MAC is in frozen state.
Remote MAC-IP update should not process neigh update if MAC
is detected as duplicate during remote update.
Ticket:CM-23344
Testing Done:
Trigger duplicate detection via both local and remote update trigger,
Validate clear command and other changes expected behavior.
Auto-recovery takes appropriate action on inherited IPs.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Duplicate address detection should operate
on the default vrf instance.
For the mac and neigh show commands, auto recovery and a few other places
where the tenant vrf_id was used to look up the zvrf, use the default
vrf instance instead. Use the vxlan_if's or VRF_DEFAULT vrf_id to
fetch zebra's default vrf instance.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
While an EVPN type-2 entry is in freeze state due to a remote update,
the remote VTEP can send a type-2 withdraw update.
Upon receiving an entry delete (withdraw), first check whether the
kernel has the entry in local reachable state. Upon
unfreeze, use the local entry to advertise to peers.
The fetch is done for both MAC and IP, since the delete can come for
a MAC-only or a combined MAC-IP route.
Fetching a specific entry only requires the request flag to be set;
the dump flag is not required.
Testing Done:
Simulated two VTEPs performing an M1, IP1 mobility sequence,
froze the MAC during a remote MAC update, then sent a
type-2 route withdraw from the originating VTEP.
This results in the read apis being invoked for the local reachable entry.
Zebra updates its cache and upon unfreeze originates type-2.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The default vrf name was hardcoded to "Default", whereas the default vrf
name could have been configured differently. Fix this
inconsistency.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The l3vni structure is allocated only once, since that structure is only
used for the default netns. For that, the initialisation part is moved
to a proper place, where there is no risk of attempting to initialise it
more than once, even when the vrf backend is netns.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The clear evpn dup-addr vni command needs to return a non-zero value
when the command is not successful.
Ticket:CM-23122
Testing Done:
Ran the clear command and checked that the return code is non-zero upon failure.
root@TORS1:~# vtysh -c "clear evpn dup-addr vni 1000 ip 45.0.1.26"
% Requested IP's associated MAC 00:01:02:03:04:05 is still in duplicate
% state
root@TORS1:~# echo $?
1
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Change helps display detailed output for all possible VNI neighbors
without specifying VNI and ip. It helps in troubleshooting as a single
command can be fired to capture detailed info on all VNIs.
Ticket: CM-22832
Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com>
Reviewed-by: CCR-8034
Change helps display detailed output for all possible VNI MACs without
specifying VNI or mac. It helps in troubleshooting - a single
command can be fired to capture detailed info on all VNIs.
Also fixed an existing json related bug where a json object is created by
a parent function and freed in a child function.
Ticket: CM-22832
Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com>
Reviewed-by: CCR-8028
Change helps display detailed output for all possible VNIs without
specifying VNI. It helps in troubleshooting - a single command can
be fired to capture detailed info on all VNIs.
Ticket: CM-22831
Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com>
Reviewed-by: CCR-8013
Display following Per MAC and Neigh's output:
If duplicate address detection is under process,
display detection start time and detection count.
If duplicate address detection detected an address
as duplicate, display detection time and duplicate
status.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When the remote mac is deleted by bgpd we can end up with an auto mac
entry in zebra if there are neighs referring to the mac. The remote sequence
number in the auto mac entry needs to be reset to 0 as the mac entry may
have been removed on all VTEPs (including the originating one).
Now if the MAC comes back on a remote VTEP it may be added with MM=0 which
will NOT be accepted if the remote seq was not reset in the previous step.
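The effect of the reset can be modelled in a few lines of Python. The sketch assumes that a remote add is rejected when its MM sequence number is lower than the one recorded in the entry; the class and function names are hypothetical, and `reset_seq=False` reproduces the behaviour before this change.

```python
from dataclasses import dataclass


@dataclass
class MacEntry:
    remote_seq: int = 0
    auto: bool = False


def remote_mac_delete(entry, has_neighbors, reset_seq=True):
    """Return the surviving auto entry, or None if the MAC is removed."""
    if not has_neighbors:
        return None
    entry.auto = True
    if reset_seq:
        # The MAC may be gone on all VTEPs, so forget the old sequence.
        entry.remote_seq = 0
    return entry


def remote_mac_add(entry, mm_seq):
    """Accept the add unless it carries an older sequence number."""
    if mm_seq < entry.remote_seq:
        return False
    entry.remote_seq = mm_seq
    entry.auto = False
    return True
```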
Ticket: CM-22707
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
This is a fixup to commit -
f32ea5c07 - zebra: act on kernel notifications for remote neighbors
The original commit handled a race condition between kernel and zebra
that would result in an inconsistent state i.e.
kernel has an offload/remote neigh
zebra has a local neigh
The original commit missed setting the neigh to active when zebra
tried to resolve the inconsistency by modifying the local neigh to
remote neigh on hearing back its own kernel update. Fixed here.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-22700
When events cross paths between bgp and zebra bgpd could end up with a
dangling local MAC entry. Consider the following sequence of events on
rack-1 -
1. MAC1 has MM sequence number 1 and points to rack-3
2. Now a packet is rxed locally on rack-1 and rack-2 (simultaneously) with
source-mac=MAC1.
3. This would cause rack-1 and rack-2 to set the MM seq to 2 and
simultaneously report the MAC as local.
4. Now let's say on rack-1 zebra's MACIP_ADD is in bgpd's queue. bgpd
accepts rack-3's update and sends a remote MACIP add to zebra with MM=2.
5. zebra updates the MAC entry from local=>remote.
6. bgpd now processes zebra's "stale local" making it the best path.
However zebra no longer has a local MAC entry.
At this point bgpd and zebra are effectively out of sync i.e. bgpd has a
local-MAC which is not present in the kernel or in zebra.
To handle this window zebra should send a local MAC delete to bgpd on
modifying its cache to remote.
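The fix can be pictured with this Python sketch, which is a model and not zebra's code: when a remote add overwrites a local cache entry, a local delete is also sent to bgpd so that the stale local path is withdrawn. The cache layout, the message tuple and the function name are assumptions.

```python
def remote_macip_add(cache, mac, vtep_ip, send_to_bgpd):
    """Install a remote MAC; tell bgpd if a local entry was overwritten."""
    entry = cache.get(mac)
    was_local = entry is not None and entry["type"] == "local"
    cache[mac] = {"type": "remote", "vtep": vtep_ip}
    if was_local:
        # bgpd may still hold (or be about to process) our local add.
        send_to_bgpd(("MACIP_DEL", mac))
```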
Ticket: CM-22687
Reviewed By: CCR-7935
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The `struct zebra_ns` data structure is being used
for both router information as well as support for
the vrf backend( as appropriate ). This is a confusing
state. Start the movement of `struct zebra_ns` into
2 things `struct zebra_router` and `struct zebra_ns`.
In this new regime `struct zebra_router` is purely
for handling data about the router. It has no knowledge
of the underlying representation of the Data Plane.
`struct zebra_ns` becomes a linux specific bit of code
that allows us to handle the vrf backend and is allowed
to have knowledge about underlying data plane constructs.
When someone implements a *bsd backend the zebra_vrf data
structure will need to be abstracted to take advantage of this
instead of relying on zebra_ns.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The ->hash_cmp and linked list ->cmp functions were sometimes
being used interchangeably and this really is not a good
thing. So let's modify the hash_cmp function pointer to return
a boolean and convert everything to use the new syntax.
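One way the mix-up can go wrong, shown as a Python sketch with hypothetical names: a list-style compare returns 0 for equal items, so using it where an equality predicate is expected inverts the result.

```python
def list_cmp(a, b):
    """Ordering compare: negative, zero or positive, like a list ->cmp."""
    return (a > b) - (a < b)


def hash_equal(a, b):
    """Equality predicate: True when the keys match, like a bool hash_cmp."""
    return a == b


def bucket_lookup(bucket, key, equal):
    """Return the first item in the bucket that 'equal' reports as a match."""
    for item in bucket:
        if equal(item, key):
            return item
    return None
```

Looking up 2 in `[1, 2, 3]` with `hash_equal` finds 2, while passing `list_cmp` by mistake returns 1, the first item that is not equal.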
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We had a variety of issues with sorted list compare functions.
This commit identifies and fixes these issues.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow the modification of whether or not we will allow
BUM flooding on the vxlan bridge. To do this allow
the upper level protocol to specify via the ZEBRA_VXLAN_FLOOD_CONTROL
zapi message.
If flooding is disabled then BUM traffic will not be forwarded
to other VTEPs.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The block comments from a couple commits were not following
proper style. Fix.
Fix SA warning that had snuck in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ensure that when the is_router condition changes for a locally learnt
neighbor, it is informed to BGP only if it is active i.e., the MAC is
also locally learnt.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Ticket: CM-22288
Reviewed By: CCR-7832
Testing Done:
1. Failed test
2. vxlan_routing_test.py
Use boolean variables instead of unsigned int for certain VxLAN-EVPN
flags which are really used as boolean.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Ticket: CM-22288
Reviewed By: CCR-7832
Testing Done:
Along with a subsequent, related commit
When a remote MAC goes away, but there are neighbors referring to it,
ensure that when the last remote neighbor goes away, the MAC is
uninstalled from the kernel and no longer considered as remote.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Ticket: CM-22130
Reviewed By: CCR-7777
Testing Done:
1. Replicated failed scenario and verified with fix.
2. evpn-min
When a MAC moves from local to remote, a replace is allowed, EVPN
no longer has to delete the local MAC before installing the remote
MAC.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
The RB-Tree used to store rmac information was not properly
handling the v6 address family. Modify the code to allow
this handling.
This cleans up the following error message that was being seen:
zebra[2231]: host_rb_entry_compare: Unexpected family type: 10
It also fixes some connectivity issues that were being seen.
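Family type 10 is AF_INET6 on Linux, so the message means the compare function was handed an IPv6 host. A compare that handles both families might look like the Python sketch below; it illustrates the idea and is not the zebra function, and ordering IPv4 before IPv6 is an arbitrary choice made here.

```python
import functools
import ipaddress


def host_entry_compare(a, b):
    """Three-way compare of two host addresses of either family."""
    ha = ipaddress.ip_address(a)
    hb = ipaddress.ip_address(b)
    if ha.version != hb.version:
        # Different families: order IPv4 before IPv6.
        return -1 if ha.version < hb.version else 1
    # Same family: compare the raw address bytes.
    return (ha.packed > hb.packed) - (ha.packed < hb.packed)


def sort_hosts(hosts):
    """Sort address strings using the compare function above."""
    return sorted(hosts, key=functools.cmp_to_key(host_entry_compare))
```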
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem reported that some bgp and ospf json commands did not return
any json output at all if the bgp/ospf instance did not exist.
Additionally, some bgp and ospf json commands did not return any json
output if the instance existed but no neighbors were defined. This
fix makes these commands more consistent in returning empty braces for
json output and issue a message if not using json output. Additionally,
made the flag "use_json" a bool to make it consistent since previously,
it had been defined as an int, char, u_char, and bool at various places.
Ticket: CM-21040
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Add a header to clean up missing declarations and properly
wrap some variables in the appropriate #ifdef.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Handle Remote Neigh entry state change from Router to Host.
A remote MAC-IP update may not contain the EVPN NA extended community;
zebra needs to accommodate a router_flag change for an existing neigh
and install it with or without the Router Flag (R-bit).
Testing:
Locally learnt a MAC/IP (neigh entry) with the R-bit set and
checked on the remote VTEP that 'show bgp evpn route ...mac ip' and
'show evpn arp-cache ...' contain the router flag.
Changed the host to remove the R-bit, so that the locally learnt entry
drops the Router flag. This results in the remote vtep removing the R-bit
from the neigh entry.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
A neigh update can carry a router_flag change, from unset to set and
vice versa. This is the case where the MAC, IP and VLAN are the same but
the entry's flag moved from R bit to no R bit, or the reverse.
A router flag change needs to trigger bgpd to inform all evpn peers
so that the flag is updated in the evpn route.
Testing Done:
Send GARP with and without R bit from host and validate neigh entry
and evpn neigh and mac-ip route entry in zebra and bgpd.
Check Peer VTEP evpn route entry where router flag is (un)set.
With R-bit
Route [2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
VNI 1001
Imported from
27.0.0.16:5:[2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
4435 5551
27.0.0.16 from MSP1(uplink-1) (27.0.0.9)
Origin IGP, valid, external, bestpath-from-AS 4435, best
Extended Community: RT:5551:1001 ET:8 ND:Router Flag
AddPath ID: RX 0, TX 1261
Last update: Wed Aug 15 20:52:14 2018
Without R-bit
Route [2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
VNI 1001
Imported from
27.0.0.16:5:[2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
4435 5551
27.0.0.16 from MSP2(uplink-2) (27.0.0.10)
Origin IGP, valid, external, bestpath-from-AS 4435, best
Extended Community: RT:5551:1001 ET:8
AddPath ID: RX 0, TX 1263
Last update: Wed Aug 15 20:53:10 2018
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The neigh update can come prior to mac add update.
In this case, the mac will be auto created for the vni.
Set the router flag on a local neigh update for a mac with the auto flag.
The neigh update will be informed to bgpd once local mac is learnt.
Unset router flag if the neigh update comes without the router flag
for an existing neigh entry.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Enhance the EVPN MAC and Neighbor cache display to show additional
information such as the mobility sequence numbers and the state.
Ensure that the neighbor state is set in a couple of places so
that the display is correct.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Implement procedures similar to what is specified in
https://tools.ietf.org/html/draft-malhotra-bess-evpn-irb-extended-mobility
in order to support extended mobility scenarios in EVPN. These are scenarios
where a host/VM move results in a different (MAC,IP) binding from earlier.
For example, a host with an address assignment (IP1, MAC1) moves behind a
different PE (VTEP) and has an address assignment of (IP1, MAC2) or a host
with an address assignment (IP5, MAC5) has a different assignment of (IP6,
MAC5) after the move. Note that while these are described as "move" scenarios,
they also cover the situation when a VM is shut down and a new VM is spun up
at a different location that reuses the IP address or MAC address of the
earlier instance, but not both. Yet another scenario is a MAC change for an
attached host/VM i.e., when the MAC of an attached host changes from MAC1 to
MAC2. This is necessary because there may already be a non-zero sequence
number associated with MAC2. Also, even though (IP, MAC1) is withdrawn before
(IP, MAC2) is advertised, they may propagate through the network differently.
The procedures continue to rely on the MAC mobility extended community
specified in RFC 7432 and already supported by the implementation, but
augment it with an inheritance mechanism that understands the relationship
of the host MACIP (ARP/neighbor table entry) to the underlying MAC (MAC
forwarding database entry). In FRR, this relationship is understood by the
zebra component which doubles as the "host mobility manager", so the MAC
mobility sequence numbers are determined through interaction between bgpd
and zebra.
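To make the inheritance idea concrete, here is a small sketch. It assumes a
rule of the form "the sequence number used for the new binding must exceed the
one last seen on the previous binding of the same IP". That rule, the function
name and its parameters are illustrative assumptions, not a statement of what
the draft or the implementation does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper. cur_mac_seq is the sequence number already
 * associated with the MAC that now owns the IP (possibly non-zero).
 * prev_binding_seq points at the sequence number of the MAC the IP
 * was bound to before, or is NULL if there was no earlier binding.
 * Wrap-around at UINT32_MAX is not handled in this sketch.
 */
static uint32_t inherit_mobility_seq(uint32_t cur_mac_seq,
				     const uint32_t *prev_binding_seq)
{
	uint32_t seq = cur_mac_seq;

	/* Assumed rule: never advertise at or below the old binding. */
	if (prev_binding_seq != NULL && *prev_binding_seq >= seq)
		seq = *prev_binding_seq + 1;

	return seq;
}
```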
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When a host moves and is locally reachable, if the local neighbor event
is received before the local MAC event, flag the neighbor as inactive
just as would happen in the case of a new host. This ensures that the
MACIP route will get originated as soon as the local MAC event is received.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
There is no need to check for failure of an ALLOC call,
as any such failure will result in an assert.
So we can safely remove all of this code.
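The pattern can be illustrated with a stand-in wrapper. checked_calloc() and
struct host_entry below are hypothetical names used only for this sketch; the
point is that a caller of an asserting allocator has nothing left to check.
Note that assert() is compiled out when NDEBUG is defined, so this sketch
relies on assertions being enabled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocation wrapper that never returns NULL. */
static void *checked_calloc(size_t size)
{
	void *p = calloc(1, size);

	assert(p != NULL); /* allocation failure aborts here */
	return p;
}

struct host_entry {
	int id;
};

static struct host_entry *host_entry_new(int id)
{
	struct host_entry *h = checked_calloc(sizeof(*h));

	/* No NULL check: checked_calloc() would have asserted already. */
	h->id = id;
	return h;
}
```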
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
show evpn mac vni all
show evpn mac vni x
do not include the local SVI and anycast MACs in the count.
Ticket:CM-20456
Testing Done:
Before:
TOR1# show evpn mac vni 1008
Number of MACs (local and remote) known for this VNI: 4
MAC Type Intf/Remote VTEP VLAN
44:38:39:00:6b:4c local vlan1008 1008
00:02:00:00:00:04 local hostbond5 1008
00:02:00:00:00:02 local hostbond4 1008
00:00:5e:00:01:01 local vlan1008-v0 1008
00:02:00:00:00:0c remote 27.0.0.15
00:02:00:00:00:0a remote 27.0.0.15
dell-s6000-07#
After:
TOR1# show evpn mac vni 1008
Number of MACs (local and remote) known for this VNI: 6
MAC Type Intf/Remote VTEP VLAN
44:38:39:00:6b:4c local vlan1008 1008
00:02:00:00:00:04 local hostbond5 1008
00:02:00:00:00:02 local hostbond4 1008
00:00:5e:00:01:01 local vlan1008-v0 1008
00:02:00:00:00:0c remote 27.0.0.15
00:02:00:00:00:0a remote 27.0.0.15
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The EVPN ND ext community supports the NA flag R-bit, to enable proxy ND.
Set the R-bit in the EVPN NA if a given router is the default gateway or
there is a local router attached, which can be determined from the local
neighbor entry.
Implement the BGP ext community attribute to generate and parse the R-bit
and pass it along to zebra to program the neigh entry in the kernel.
Upon receiving a MAC/IP update with community type 0x06 and sub_type 0x08,
pass the R-bit to zebra to program the neigh entry.
Set NTF_ROUTER in the neigh entry and inform the kernel to do proxy NA for EVPN.
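A minimal sketch of generating and parsing such a community follows. The type
0x06 and sub_type 0x08 come from the text above; the macro and function names
are hypothetical, and placing the flags in the third octet with the R-bit as
its low-order bit is an assumption that should be checked against the draft
referenced below.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ECOMM_SIZE       8    /* extended communities are 8 octets */
#define ECOMM_TYPE_EVPN  0x06 /* from the commit text */
#define ECOMM_SUBTYPE_ND 0x08 /* from the commit text */
#define ND_FLAG_ROUTER   0x01 /* assumed position of the R-bit */

/* Build the ND extended community, setting the R-bit when 'router'. */
static void nd_ecomm_encode(uint8_t out[ECOMM_SIZE], bool router)
{
	memset(out, 0, ECOMM_SIZE);
	out[0] = ECOMM_TYPE_EVPN;
	out[1] = ECOMM_SUBTYPE_ND;
	if (router)
		out[2] |= ND_FLAG_ROUTER; /* assumed flags octet */
}

/* Return true only for an ND community that carries the R-bit. */
static bool nd_ecomm_router_flag(const uint8_t in[ECOMM_SIZE])
{
	if (in[0] != ECOMM_TYPE_EVPN || in[1] != ECOMM_SUBTYPE_ND)
		return false;
	return (in[2] & ND_FLAG_ROUTER) != 0;
}
```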
Ref:
https://tools.ietf.org/html/draft-ietf-bess-evpn-na-flags-01
Ticket:CM-21712, CM-21711
Reviewed By:
Testing Done:
Configured a local VNI-enabled L3 gateway to act as a router, then checked
show evpn arp-cache vni x ip <ip of svi> on the originating and remote VTEPs.
The "Router" flag is set.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The SVI interface IP/HW address is advertised by the GW VTEP (say TORC11) with
the default-GW community, and the receiving VTEP (say TORC21) installs the GW
MAC as a dynamic FDB entry. The problem is that a rogue packet from a
server with the GW MAC as source can cause a station move, resulting in
TORC21 hijacking the GW MAC address and blackholing all inter-rack traffic.
The fix is to make the GW MAC "sticky", pinning it to the GW VTEP (TORC11).
This commit does so by installing the FDB entry as static if the MACIP route
is received with the default-GW community (this mimics the handling of the
mac-mobility-with-sticky community).
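The decision described above reduces to a small predicate. The enum and
function names here are hypothetical and only illustrate the rule; they are
not the identifiers used in zebra.

```c
#include <stdbool.h>

/* Hypothetical FDB entry kinds for a remote MAC. */
enum fdb_type {
	FDB_DYNAMIC, /* can be overwritten by a station move */
	FDB_STATIC,  /* pinned to the advertising VTEP */
};

/*
 * Pick the FDB entry kind for a remote MAC from the communities
 * received with its MACIP route: sticky and default-GW are treated alike.
 */
static enum fdb_type remote_mac_fdb_type(bool sticky, bool default_gw)
{
	return (sticky || default_gw) ? FDB_STATIC : FDB_DYNAMIC;
}
```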
Sample output from TORC21 with TORC11 set up as gateway -
root@TORC21:~# net show evpn mac vni 1004 mac 00:00:5e:00:01:01
MAC: 00:00:5e:00:01:01
Remote VTEP: 36.0.0.11 Remote-gateway Mac
Neighbors:
45.0.4.1
fe80::200:5eff:fe00:101
2001:fee1:0:4::1
root@TORC21:~# bridge fdb show |grep 00:00:5e:00:01:01|grep 1004
00:00:5e:00:01:01 dev vx-1004 vlan 1004 master bridge static
00:00:5e:00:01:01 dev vx-1004 dst 36.0.0.11 self static
root@TORC21:~#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Ticket: CM-21508
Newer versions of clang detect function parameters that we should
not be casting as such. Fix these issues.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we have a host prefix, actually free the allocated memory
associated with it when we free it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Add centralized thread scheduling dispatchers for client threads and
the main thread
* Rename everything in zserv.c to stop using a combination of:
- zebra_server_*
- zebra_*
- zserv_*
Everything in zserv.c now begins with zserv_*.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The neighbor host_list is expensive as well. Modify
the code to use an rb_tree here too.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We are going to modify more host_list's to host_rb's
so let's rename some functions to take advantage of
what is there.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When used at scale, the host_list ends up spending
a non-trivial amount of time finding and
sorting entries. Convert it to an RB tree.
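The gain from a tree is that insert and lookup no longer walk every entry.
As a rough sketch, the POSIX tsearch()/tfind() functions stand in here for the
RB tree the commit refers to. POSIX does not mandate balancing, though some
implementations (glibc, for one) use a red-black tree. struct host and the
helper names are hypothetical.

```c
#include <search.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical host entry keyed by an IPv4 address in host byte order. */
struct host {
	uint32_t addr;
};

static int host_cmp(const void *a, const void *b)
{
	const struct host *ha = a, *hb = b;

	return (ha->addr > hb->addr) - (ha->addr < hb->addr);
}

/* Insert if absent; return the stored entry (existing or new). */
static struct host *host_add(void **root, uint32_t addr)
{
	struct host *h = malloc(sizeof(*h));
	struct host **slot;

	if (h == NULL)
		return NULL;
	h->addr = addr;

	slot = tsearch(h, root, host_cmp);
	if (slot == NULL) {
		free(h);
		return NULL;
	}
	if (*slot != h)
		free(h); /* key already present, keep the existing entry */
	return *slot;
}

/* Look up an entry without modifying the tree. */
static struct host *host_find(void *const *root, uint32_t addr)
{
	struct host key = { .addr = addr };
	struct host **slot = tfind(&key, root, host_cmp);

	return slot ? *slot : NULL;
}
```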
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We have a command to enable symmetric routing only for type-5 routes.
This command is provided under vrf <> option in zebra as follows:
vrf <VRF>
vni <VNI> [prefix-routes-only]
We need the corresponding no version of the command as well as follows:
vrf <VRF>
no vni <VNI> [prefix-routes-only]
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
For ipv6 hosts, the next hop is converted to an ipv4-mapped ipv6 address.
However, the remote rmac should still be programmed with the ipv4 address.
This is how the entries will look in the kernel for ipv6 hosts routing.
vrf routing table:
ipv6 -> ipv6_mapped remote vtep on l3vni SVI
neigh table:
ipv6_mapped remote vtep -> remote RMAC
bridge fdb:
remote rmac -> ipv4 vtep tunnel
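For reference, the mapping between an ipv4 VTEP address and its ipv4-mapped
ipv6 form (::ffff:a.b.c.d) can be sketched as below. The function name is
hypothetical and this is not the conversion routine used in the code base.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Build the ipv4-mapped ipv6 address ::ffff:a.b.c.d from an ipv4 address. */
static void ipv4_to_mapped_ipv6(const struct in_addr *v4, struct in6_addr *v6)
{
	memset(v6, 0, sizeof(*v6));     /* first 80 bits are zero */
	v6->s6_addr[10] = 0xff;         /* next 16 bits are all ones */
	v6->s6_addr[11] = 0xff;
	memcpy(&v6->s6_addr[12], &v4->s_addr, 4); /* network byte order */
}
```

The remote rmac entry in the bridge fdb would still use the original ipv4
address, so a caller needs to keep both forms around.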
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Ensure that when EVPN routes are installed into zebra, the router MAC
is passed per next hop and appropriately handled. This is required for
proper multipath operation.
Ticket: CM-18999
Reviewed By:
Testing Done: Verified failed scenario, other manual tests
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>