There is no need to check for failure of a ALLOC call
as that any failure to do so will result in a assert
happening. So we can safely remove all of this code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
show evpn mac vni all
show evpn mac vni x
does not display local svi and anycast mac into count.
Ticket:CM-20456
Testing Done:
Before:
TOR1# show evpn mac vni 1008
Number of MACs (local and remote) known for this VNI: 4
MAC Type Intf/Remote VTEP VLAN
44:38:39:00:6b:4c local vlan1008 1008
00:02:00:00:00:04 local hostbond5 1008
00:02:00:00:00:02 local hostbond4 1008
00:00:5e:00:01:01 local vlan1008-v0 1008
00:02:00:00:00:0c remote 27.0.0.15
00:02:00:00:00:0a remote 27.0.0.15
dell-s6000-07#
After:
TOR1# show evpn mac vni 1008
Number of MACs (local and remote) known for this VNI: 6
MAC Type Intf/Remote VTEP VLAN
44:38:39:00:6b:4c local vlan1008 1008
00:02:00:00:00:04 local hostbond5 1008
00:02:00:00:00:02 local hostbond4 1008
00:00:5e:00:01:01 local vlan1008-v0 1008
00:02:00:00:00:0c remote 27.0.0.15
00:02:00:00:00:0a remote 27.0.0.15
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
EVPN ND ext community support NA flag R-bit, to have proxy ND.
Set R-bit in EVPN NA if a given router is default gateway or there is a
local
router attached, which can be determine based on local neighbor entry.
Implement BGP ext community attribute to generate and parse R-bit and
pass along zebra to program neigh entry in kernel.
Upon receiving MAC/IP update with community type 0x06 and sub_type 0x08,
pass the R-bit to zebra to program neigh entry.
Set NTF_ROUTER in neigh entry and inform kernel to do proxy NA for EVPN.
Ref:
https://tools.ietf.org/html/draft-ietf-bess-evpn-na-flags-01
Ticket:CM-21712, CM-21711
Reviewed By:
Testing Done:
Configure Local vni enabled L3 Gateway, which would act as router,
checked
show evpn arp-cache vni x ip <ip of svi> on originated and remote VTEPs.
"Router" flag is set.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
SVI interface ip/hw address is advertised by the GW VTEP (say TORC11) with
the default-GW community. And the rxing VTEP (say TORC21) installs the GW
MAC as a dynamic FDB entry. The problem with this is a rogue packet from a
server with the GW MAC as source can cause a station move resulting in
TORC21 hijacking the GW MAC address and blackholing all inter rack traffic.
Fix is to make the GW MAC "sticky" pinning it to the GW VTEP (TORC11). This
commit does it by installing the FDB entry as static if the MACIP route is
received with the default-GW community (mimics handling of
mac-mobility-with-sticky community)
Sample output with from TORC12 with TORC11 setup as gateway -
root@TORC21:~# net show evpn mac vni 1004 mac 00:00:5e:00:01:01
MAC: 00:00:5e:00:01:01
Remote VTEP: 36.0.0.11 Remote-gateway Mac
Neighbors:
45.0.4.1
fe80::200:5eff:fe00:101
2001:fee1:0:4::1
root@TORC21:~# bridge fdb show |grep 00:00:5e:00:01:01|grep 1004
00:00:5e:00:01:01 dev vx-1004 vlan 1004 master bridge static
00:00:5e:00:01:01 dev vx-1004 dst 36.0.0.11 self static
root@TORC21:~#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Ticket: CM-21508
New version of clang are detecting function parameters that we should
not be casting as such. Fix these issues.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we have a host prefix, actually free the alloced memory
associated with it when we free it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Add centralized thread scheduling dispatchers for client threads and
the main thread
* Rename everything in zserv.c to stop using a combination of:
- zebra_server_*
- zebra_*
- zserv_*
Everything in zserv.c now begins with zserv_*.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The neighbor host_list is expensive as well. Modify
the code to take advantage of a rb_tree as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We are going to modify more host_list's to host_rb's
so let's rename some functions to take advantage of
what is there.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The host_list when we attempt to use it at scale, ends
up spending a non-trivial amount of time finding and
sorting entries for the host list. Convert to a rb tree.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We have a command to enable symmetric routing only for type-5 routes.
This command is provided under vrf <> option in zebra as follows:
vrf <VRF>
vni <VNI> [prefix-routes-only]
We need the corresponding no version of the command as well as follows:
vrf <VRF>
no vni <VNI> [prefix-routes-only]
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
For ipv6 host, the next hop is conevrted to ipv6 mapped address.
However, the remote rmac should still be programmed with the ipv4 address.
This is how the entries will look in the kernel for ipv6 hosts routing.
vrf routing table:
ipv6 -> ipv6_mapped remote vtep on l3vni SVI
neigh table:
ipv6_mapped remote vtep -> remote RMAC
bridge fdb:
remote rmac -> ipv4 vtep tunnel
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Ensure that when EVPN routes are installed into zebra, the router MAC
is passed per next hop and appropriately handled. This is required for
proper multipath operation.
Ticket: CM-18999
Reviewed By:
Testing Done: Verified failed scenario, other manual tests
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
EVPN owns the remote neigh entries which are programed in the kernel.
This entries should not age out and the only way to delete should be
from EVPN. We should program these entries with NUD_NOARP instead of
NUD_REACHABLE to avoid aging of this macs.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
There can be a race condition between kernel and frr as follows.
Frr sends remote neigh notification.
At the (almost) same time kernel might send a notification saying
neigh is local.
After processing this notifications, the state in frr is local while
state in kernel is remote. This causes kernel and frr to be out of sync.
This problem will be avoided if FRR acts on the kernel notifications for
remote neighbors. When FRR sees a remote neighbor notification for a
neighbor which it thinks is local, FRR will change the neigh state to remote.
Ticket: CM-19923/CM-18830
Review: CCR-7222
Testing: Manual
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
[zebra/zebra_vxlan.c:5779] -> [zebra/zebra_vxlan.c:5778]:
(warning) Either the condition 'if(svi_if_zif&&svi_if_link)'
is redundant or there is possible null pointer dereference: svi_if_zif.
Signed-off-by: Ilya Shipitsin <chipitsine@gmail.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Nobody uses it, but it's got the same definition. Move the parser
function into zclient.c and use it.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Group send and receive functions together, change handlers to take a
message instead of looking at ->ibuf and ->obuf, allow zebra to read
multiple packets off the wire at a time.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
A lot of the handler functions that are called directly from the ZAPI
input processing code take different argument sets where they don't need
to. These functions are called from only one place and all have the same
fundamental information available to them to do their work. There is no
need to specialize what information is passed to them; it is cleaner and
easier to understand when they all accept the same base set of
information and extract what they need inline.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
All of the ZAPI message handlers return an integer that means different
things to each of them, but nobody ever reads these integers, so this is
technical debt that we can just eliminate outright.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Asymmetric routing is an ideal choice when all VLANs are cfged on all leafs.
It simplifies the routing configuration and
eliminates potential need for advertising subnet routes.
However, we need to reach the Internet or global destinations
or to do subnet-based routing between PODs or DCs.
This requires EVPN type-5 routes but those routes require L3 VNI configuration.
This task is to support EVPN type-5 routes for prefix-based routing in
conjunction with asymmetric routing within the POD/DC.
It is done by providing an option to use the L3 VNI only for prefix routes,
so that type-2 routes (host routes) will only use the L2 VNI.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
is_vni_l3 was removed as a part of PR1700. However, it seems to be used in master.
Causing the breakage. Made the changes to not use the API anymore.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
removed an additional field 'local-tunnel-ip' from l2vnis o/p
Ticket: CM-19670
Review: CCR-7167
Testing: Verified that the output is proper
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Currently, while processing kernel messages related to VNIs
we first check if VNI is L3 - this is a hash lookup
later, we do the lookup again to find the L3-VNI.
This is non-optimal.
Made changed to make sure we only do the lookup once.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Refine the notion of what FRR considers as "configured" VRF. It is no longer
based on user just typing "vrf FOO" but when something is actually configured
against that VRF. Right now, in zebra, the only configuration against a VRF
are static IP routes and EVPN L3 VNI. Whenever a configuration is removed,
check and clear the "configured" flag if there is no other configuration for
this VRF. When user attempts to configure a static route and the VRF doesn't
exist, a VRF is created; the VRF is only active when also defined in the
kernel.
Updates: 8b73ea7bd479030418ca06eef59d0648d913b620
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-10139, CM-18553
Reviewed By: CCR-7019
Testing Done:
1. Manual testing for L3 VNI and static routes - FRR restart, networking
restart etc.
2. 'vrf' smoke
<DETAILED DESCRIPTION (REPLACE)>
Only check on L3-VNI SVI status when uninstalling remote next hops.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-19036
Reviewed By: None
Testing Done:
1. Networking restart
2. VxLAN interface disable/enable
3. VRF delete and readd
A VRF is active only when the corresponding VRF device is present in the
kernel. However, when the kernel VRF device is removed, the VRF container in
FRR should go away only if there is no user configuration for it. Otherwise,
when the VRF device is created again so that the VRF becomes active, FRR
cannot take the correct actions. Example configuration for the VRF includes
static routes and EVPN L3 VNI.
Note that a VRF is currently considered to be "configured" as soon as the
operator has issued the "vrf <name>" command in FRR. Such a configured VRF
is not deleted upon VRF device removal, it is only made inactive. A VRF that
is "configured" can be deleted only upon operator action and only if the VRF
has been deactivated i.e., the VRF device removed from the kernel. This is
an existing restriction.
To implement this change, the VRF disable and delete actions have been modified.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mkanjariya@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-18553, CM-18918, CM-10139
Reviewed By: CCR-7022
Testing Done:
1. vrf and pim-vrf automation tests
2. Multiple VRF delete and readd (ifdown, ifup-with-depends)
3. FRR stop, start, restart
4. Networking restart
5. Configuration delete and readd
Some of the above tests run in different sequences (manually).
Kernel can delete a frr installed remote RMAC on a L3-VNI.
We should re-add if such a siatuation occurs
as we are the owner of the RMAC.
This behavor is same for remote MACs as well and was missing for RMACs.
Ticket: CM-18762
Review: CCR-6992
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
0. move all global EVPN details to 'show evpn [json]' command
1. change "VRF" to "Tenant VRF" in 'show evpn vni'
2. change 'show vrf vni' command to tabular form
and add l3-vni related params to the output
3. show evpn rmac should show refcount only in detailed output
4. show evpn next-hop should show refcount only in detailed output
5. move VRF in 'show evpn l3vni' to the end
6. add num rmacs and num nexthops to show evpn l3vni
7. remove "info" from 'show bgp vrf <> l3vni info'
8. show evpn vni <vni> should show l2vni details or l3 vni details
9. show evpn vni should show both L2 and L3 VNIs
10. show bgp l2vpn evpn - shows all global bgp l2vpn evpn details
11. show bgp l2vpn evpn vni - will show both l2 and l3 vnis
12. show bgp l2vpn evpn vni - should show both l2 and l3 vnis
13. follow camel notation for all json keys
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
In EVPN symmetric routing, not all subnets are presents everywhere.
We have multiple scenarios where a host might not get learned locally.
1. GARP miss
2. SVI down/up
3. Silent host
We need a mechanism to resolve such hosts. In order to achieve this,
we will be advertising a subnet route from a box and that box will help
in resolving the ARP to such hosts.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
1. Added default gw extended community
2. code modification to handle sticky-mac/default-gw-mac as they go together
3. show command support for newly added extended community
4. State in zebra to reflect if a mac/neigh is default gateway
5. show command enhancement to refelect the same in zebra commands
Ticket: CM-17428
Review: CCR-6580
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
The function zserv_create_header was exactly the same
as zclient_create_header. Let's just have one in the
system.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
EVPN is only enabled when user configures advertise-all-vni.
All VNIs (L2 and L3) should be cleared upon removal of this config.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
For EVPN type-5 route the NH in the NLRI is set to the local tunnel ip.
This information has to be obtained from kernel notification.
We need to pass this info from zebra to bgp in l3vni call flow.
This patch doesn't handle the tunnel-ip change.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
When a remote VTEP next hop entry (for symmetric routing) becomes
stale, reinstall it. This makes the behavior the same as what is
done for remote host next hops (for asymmetric routing and ARP
suppression).
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Upon a l3vni delete (no vni under a vrf) is executed,
we should uninstall all the RMACs and NHs associated with the l3vni.
This is because by the time we get a route delete in zebra
l3vni is already deleted and we dont have refernce to RMACs and NHs
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
zebra_find_client needs to match on instance as well so
protocols like ospfd will work correctly for notification.
Modify the zebra_find_client code to accept the instance
number and to pass it in appropriately.
Signed-off-by: Doanld Sharp <sharpd@cumulusnetworks.com>
This code modifies zebra to use the STREAM_GET functionality.
This will allow zebra to continue functioning in the case of
bad input data from higher level protocols instead of crashing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This improves code readability and also future-proofs our codebase
against new changes in the data structure used to store interfaces.
The FOR_ALL_INTERFACES_ADDRESSES macro was also moved to lib/ but
for now only babeld is using it.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This is an important optimization for users running FRR on systems with
a large number of interfaces (e.g. thousands of tunnels). Red-black
trees scale much better than sorted linked-lists and also store the
elements in an ordered way (contrary to hash tables).
This is a big patch but the interesting bits are all in lib/if.[ch].
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
MAC entries are internally created for purposes such as when a local
neighbor is learnt but the MAC itself is not yet learnt. Such MACs are
not "real", so ensure they are not counted for UI output.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-17991
Reviewed By: None
Testing Done: Manual, evpn-smoke
Fix following flaws that resulted in EVPN with L3 multi-tenancy (i.e.,
EVPN dealing with VxLAN routing in the presence of tenant VRFs) not
working properly:
1. EVPN enable ("advertise-all-vni") is a global command, ensure it is
accordingly processed. The config is maintained against the default VRF.
2. There was an incorrect attempt to derive the L3 VRF for L2 interfaces
- the VRF only applies for L3 interfaces, though the code may initialize
to the default value in other cases.
3. Functions to map (port, VLAN) to SVI or vice versa were incorrect -
particularly, zvni_map_svi() since it was looking in the L3 VRF for
"matching" L2 interface which it would never find. Fix.
In addition, since the 'zebra_vrf *' parameter is not relevant in most
places, it has been removed.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-17840
Reviewed By: CCR-6685
Testing Done: evpn-smoke, various manual tests
Convert the list_delete(struct list *) function to use
struct list **. This is to allow the list pointer to be nulled.
I keep running into uses of this list_delete function where we
forget to set the returned pointer to NULL and attempt to use
it and then experience a crash, usually after the developer
has long since left the building.
Let's make the api explicit in it setting the list pointer
to null.
Cynical Prediction: This code will expose a attempt
to use the NULL'ed list pointer in some obscure bit
of code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1) Various socket close issues
2) Ensure afi passed is usable
3) Fix some reads beyond buffer and reads after free
4) Ensure some failure modes are handled properly
5) Memory Leak(s) fix
6) There is no 6.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Frr has an assumption that when interface A links to B,
we already know about B. But that might be true always.
It is probably purely depends on the configuration
and how the interfaces are hashed in Kernel.
FRR seems to sometimes get "A is linked to B" before it knows about B,
in that case, the linkage between the data structure for A & B won't be proper.
Ticket: CM-17679
Review: ccr-6628
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
NUD_STALE flag is causing a build breakage,
we might have to define it somewhere in frr.
Reverting the fix for now untill we decide how to handle it correctly.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
When the MAC changes for a local neighbor, ensure that the neighbor data
structure as well as the link between the neighbor and MAC data structures
is updated correctly.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-17565
Reviewed By: CCR-6605
Testing Done: Manual, evpn-smoke
If we get an ageout notification from the kernel for EVPN-installed
neighbors, ensure that they are readded. Otherwise, while entries in
STALE state are usable, based on other kernel parameters they can
get deleted and adding them back only at delete can have other
undesirable performance consequences.
Note: This is the current Linux kernel behavior (to ageout EVPN
installed neighbors).
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Ticket: CM-15623, CM-17490
Reviewed By: CCR-6586
Testing Done: Manual, evpn-min
When multiple events are happening, it is possible that remote
MACIP or other requests may be received when an interface is down
or removed from a bridge. Handle this correctly.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Until now, we had to delete the local mac entries when a mac moved from local to remote,
with the new kernel patch that is no longer necessary.
Ticket:CM-16094
Reviewed By:CCR-6470
Testing Done: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Currently, FRR does not do any linking between local MACs and neighbors.
We found this necessary when dealing with centralized GW. A neigh is considered local only when the mac is learnt locally as well.
Ticket: CM-16544
Review: CCR-6388
Unit-test: Manual/Evpn-Smoke
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
The hash key function choosen for mac vni's would tend
to clump the key value to the same number. Use a better
hash key generator to spread the hash values out.
A bad hash key might lead to O(2^n) memory consumption
because the hash size is doubled, each time a backet
exceeds a predefined threshold. This quickly leads
to OOM. Fixing this issue by fixing the hash
key generation to actually spread the keys out.
Ticket: CM-17412
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This reverts commit c14777c6bf.
clang 5 is not widely available enough for people to indent with. This
is particularly problematic when rebasing/adjusting branches.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Implement support for sticky (static) MACs. This includes the following:
- Recognize MAC is static (using NUD_NOARP flag) and inform BGP
- Construct MAC mobility extended community for sticky MACs as per
RFC 7432 section 15.2
- Inform to zebra that remote MAC is sticky, where appropriate
- Install sticky MACs into the kernel with the right flag
- Appropriate handling in route selection
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement handling of MACs and Neighbors (ARP/ND entries) in zebra:
- MAC and Neighbor database handlers
- Read MACs and Neighbors from the kernel, when needed and create
entries in zebra's MAC and Neighbor databases.
- Handle add/update/delete notifications from the kernel for MACs and
Neighbors and update zebra's database appropriately
- Inform locally learnt MACs and Neighbors to client
- Handle MACIP add/delete from client and install appriporiate entries
into the kernel
- Since Neighbor entries will be installed on an SVI, implement the
needed mappings
NOTE: kernel interface is only implemented for Linux/netlink
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement fundamental handling for VNIs and VTEPs:
- Handle EVPN enable/disable by client (advertise-all-vni)
- Create/update/delete VNIs based on VxLAN interface events and inform
client
- Handle VTEP add/delete from client and install into kernel
- New debug command for VxLAN/EVPN
- kernel interface (Linux/netlink only)
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>