Fix following flaws that resulted in EVPN with L3 multi-tenancy (i.e.,
EVPN dealing with VxLAN routing in the presence of tenant VRFs) not
working properly:
1. EVPN enable ("advertise-all-vni") is a global command, ensure it is
accordingly processed. The config is maintained against the default VRF.
2. There was an incorrect attempt to derive the L3 VRF for L2 interfaces
- the VRF only applies for L3 interfaces, though the code may initialize
to the default value in other cases.
3. Functions to map (port, VLAN) to SVI or vice versa were incorrect -
particularly, zvni_map_svi() since it was looking in the L3 VRF for
"matching" L2 interface which it would never find. Fix.
In addition, since the 'zebra_vrf *' parameter is not relevant in most
places, it has been removed.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-17840
Reviewed By: CCR-6685
Testing Done: evpn-smoke, various manual tests
This is a continuation of 915902cb82. Basically the netlink
read of messages up from the kernel is now noticing the proper
owner of the route. As such when rib_delete was being called
as part of the upcall from the kernel we were not noticing that
we were the originator and not diss-allowing the rib_delete
from happening. This restores this behavior that we were getting
pre-915902cb82cfd
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For ZEBRA_ROUTE_KERNEL types:
The metric/priority of the route received from the kernel
is a 32 bit number. We are going to interpret the high
order byte as the Admin Distance and the low order 3 bytes
as the metric.
This will allow us to do two things:
1) Allow the creation of kernel routes that can be
overridden by zebra.
2) Allow the old behavior for 'most' kernel route types
if a user enters 'ip route ...' v4 routes get a metric
of 0 and v6 routes get a metric of 1024. Both of these
values will end up with a admin distance of 0, which
will cause them to win for the purposes of zebra.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This fixes the broken indentation of several foreach loops throughout
the code.
From clang's documentation[1]:
ForEachMacros: A vector of macros that should be interpreted as foreach
loops instead of as function calls.
[1] http://clang.llvm.org/docs/ClangFormatStyleOptions.html
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
There exists situations where it is possible to have duplicate
nexthops passed from a higher level protocol into zebra.
This code notices this duplication of nexthops and marks
the duplicates as DUPLICATE so we don't attempt to install
it into the kernel.
This is important on *BSD as I understand it because passing
duplicate nexthops will cause the route to be rejected.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
So the current code for a blackhole route assumed that you
would never want a recursively resolved blackhole to work.
Suppose you have this setup:
1) ip route 192.0.2.1/32 Null0
2) BGP installed with a route-map that rewrites the
nexthop to 192.0.2.1.
Zebra will end up with a recursive nexthop that resolves
to the blackhole.
The original rib install function assumed that we would never
want the ability to recursively resolve a blackhole route.
Instead just handle the blackhole as part of the nexthop_num = 1
case.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
With the change to make zebra pass routes to the kernel
with the 'correct' proto name, it caused zebra to
not properly recognize them on startup again
the next time such that the route would not
be deleted.
Modify rt_netlink.c to notice that we have a
self originated route and to properly mark
the type of route it was.
Modify rib_table_sweep to mark the nexthops
as active so that when we go to delete the
self originated routes it would properly
delete from the kernel.
Fixes: #1061
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If we've set the bh_type to something besides BLACKHOLE_UNSPEC
due to the received route type being RTN_BLACKHOLE,
RTN_UNREACHABLE or RTN_PROHIBIT then just trust that
the nexthop is just what it is and set accordingly.
Fixes: #1082
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
support processing of RTN_BLACKHOLE et al. from kernel and dump them
into appropriate blackhole rib entries.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
blackhole support was horribly broken. cleanup by removing blackhole
stuff from ZEBRA_FLAG_*
introduces support for "prohibit" routes (Linux/netlink only)
also clean up blackhole options on "ip route" vty commands.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Frr has an assumption that when interface A links to B,
we already know about B. But that might be true always.
It is probably purely depends on the configuration
and how the interfaces are hashed in Kernel.
FRR seems to sometimes get "A is linked to B" before it knows about B,
in that case, the linkage between the data structure for A & B won't be proper.
Ticket: CM-17679
Review: ccr-6628
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
When the linux kernel adds/deletes routes, the
metric is important, but our routing protocols
add/delete in a slightly different manner,
so allow kernel metrics to match so that our
rib matches the kernel's fib.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
RTA_PAYLOAD() return value depends on the platform bits.
make[5]: Nothing to be done for 'all-am'.
Making all in zebra
CC rt_netlink.o
../../zebra/rt_netlink.c: In function 'netlink_macfdb_change':
../../zebra/rt_netlink.c:1695:63: error: format '%ld' expects argument of type 'long int', but argument 7 has type 'unsigned int' [-Werror=format=]
"%s family %s IF %s(%u) brIF %u - LLADDR is not MAC, len %ld",
^
../../zebra/rt_netlink.c: In function 'netlink_ipneigh_change':
../../zebra/rt_netlink.c:2024:57: error: format '%ld' expects argument of type 'long int', but argument 6 has type 'unsigned int' [-Werror=format=]
"%s family %s IF %s(%u) - LLADDR is not MAC, len %ld",
^
Signed-off-by: Jorge Boncompte <jbonor@gmail.com>
The pimregX devices when created by the kernel are put into
the default vrf. When pim gets the callback that the device
exists, check to see if it is a pimregX device and if so
move it into the appropriate vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This current implementation unfortunately must
ask the kernel for all mroutes because vrf's
do not have the ability to request a single
mroute at this time.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement support for sticky (static) MACs. This includes the following:
- Recognize MAC is static (using NUD_NOARP flag) and inform BGP
- Construct MAC mobility extended community for sticky MACs as per
RFC 7432 section 15.2
- Inform to zebra that remote MAC is sticky, where appropriate
- Install sticky MACs into the kernel with the right flag
- Appropriate handling in route selection
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement handling of MACs and Neighbors (ARP/ND entries) in zebra:
- MAC and Neighbor database handlers
- Read MACs and Neighbors from the kernel, when needed and create
entries in zebra's MAC and Neighbor databases.
- Handle add/update/delete notifications from the kernel for MACs and
Neighbors and update zebra's database appropriately
- Inform locally learnt MACs and Neighbors to client
- Handle MACIP add/delete from client and install appriporiate entries
into the kernel
- Since Neighbor entries will be installed on an SVI, implement the
needed mappings
NOTE: kernel interface is only implemented for Linux/netlink
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement fundamental handling for VNIs and VTEPs:
- Handle EVPN enable/disable by client (advertise-all-vni)
- Create/update/delete VNIs based on VxLAN interface events and inform
client
- Handle VTEP add/delete from client and install into kernel
- New debug command for VxLAN/EVPN
- kernel interface (Linux/netlink only)
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
For NHRP, EIGRP and LDP( This is for consistency as opposed to correctness )
assign some new values to routes to be installed into the kernel
so we can know who owns them later.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The 'struct rib' data structure is missnamed. It really
is a 'struct route_entry' as part of the 'struct route_node'.
We have 1 'struct route_entry' per route src. As such
1 route node can have multiple route entries if multiple
protocols attempt to install the same route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Rearrange the _netlink_route_build*() functions so the labels of the
nexthops are always installed, even for IPv4 routes with IPv6 nexthops.
Fixes Labeled Unicast with BGP Unnumbered.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
the ipv4_ll address used for 5549 routes does not need
to be figured out every single time that we attempt
to install/remove a route of that type.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When zebra issues read (GET) requests to the kernel using the netlink
interface, it is incorrect to format all of them in a generic manner
using 'struct ifinfomsg' or 'struct rtgenmsg'. Rather, messages for a
particular entity (e.g., routes) should use the corresponding structure
for encoding (e.g., 'struct rtmsg'). Of course, this has to correlate
with what the kernel expects.
In the absence of this, there is the possibility of sending extraneous
information in the request which the kernel wouldn't like.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
The FSF's address changed, and we had a mixture of comment styles for
the GPL file header. (The style with * at the beginning won out with
580 to 141 in existing files.)
Note: I've intentionally left intact other "variations" of the copyright
header, e.g. whether it says "Zebra", "Quagga", "FRR", or nothing.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Ticket: CM-14313
Reviewed By:
Testing Done: bgpmin, ospfmin, bgp_kitchen_sink_test
'ip route show' displays all routes as belonging to protocol zebra.
The user has to run an additional command (in vtysh) to get the actual
source of a route (bgp/ospf/static etc.). This patch addresses that by
pushing the appropriate protocol string into the protocol field of the
netlink route update message. Now you can see routes with the correct
origin as well as filter on them (ip route show proto ospf).
'ospf' is used for both IPv4 and IPv6 routes, even though the OSPF
version is different in both cases.
Sample output (old):
9.9.12.13 via 69.254.2.38 dev swp3.2 proto zebra metric 20
9.9.13.3 proto zebra metric 20
nexthop via 69.254.2.30 dev swp1.2 weight 1
nexthop via 69.254.2.34 dev swp2.2 weight 1
nexthop via 69.254.2.38 dev swp3.2 weight 1
Sample output (new):
9.9.12.13 via 69.254.2.38 dev swp3.2 proto bgp metric 20
9.9.13.3 proto bgp metric 20
nexthop via 69.254.2.30 dev swp1.2 weight 1
nexthop via 69.254.2.34 dev swp2.2 weight 1
nexthop via 69.254.2.38 dev swp3.2 weight 1
The kernel can send a DELROUTE with a individual
nexthop. Technically this is meant to delete that
individual nexthop from the route but zebra
has no way to do this currently. So we just delete
the route.
V4 -> Never sends a DELROUTE with multiple nexthops
as a way to modify the rib. It sends a a NEWROUTE
with RTM_REPLACE with the new appropriate route.
V6 -> Sends a DELROUTE with multiple nexthops
which is supposed to be interpreted as a
subtraction from the route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the near future it will be possible to recieve v6 multipath netlink
messages. This code change is in prep for it. In the meantime the
v6 code path will continue to work as per normal.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The reading if unicast routes from the kernel acts subtly differently
between reading in the routes from the kernel on startup and
reading a new route or getting a response for a route.
Add startup flag(currently ignored) so that we can start
consolidating the functionality.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When starting up bgp and zebra now, you can specify
-e <number> or --ecmp <number>
and that number will be used as the maximum ecmp
that can be used.
The <number specified must be >= 1 and <= MULTIPATH_NUM
that Quagga is compiled with.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This reverts commit 1a11782c40.
The change is not suitable for stable/2.0, it's not a bugfix and has
quite a visible user impact.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem reported was stale routes left in the kernel in certain cases
when overlapping static routes were used and links were bounced. The
problem was determined to be an issue where the nexthop was changed
due to recursion as the link is going down, and the next-hop at the
time of deletion doesn't match what was previously installed by the
kernel. This caused the kernel to reject the deletion and the route
stuck around.
It was pointed out that the kernel doesn't actually require a next-hop
value on the netlink deletion call. In this fix, we are eliminating
the nexthop for RTM_DELROUTE messages to the kernel in the ipv4 singlepath
case. This approach could also be valid for other cases but the fix
as is resolved the reported failure case. More testing should be
performed before similar changes are made for other cases.
Testing included manual testing for the failure condition as well as
complete bgp-smoke and ospf-smoke tests with no new failures.
Ticket: CM-13328
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: CCR-5562
Ticket: CM-14313
Reviewed By:
Testing Done: bgpmin, ospfmin, bgp_kitchen_sink_test
'ip route show' displays all routes as belonging to protocol zebra.
The user has to run an additional command (in vtysh) to get the actual
source of a route (bgp/ospf/static etc.). This patch addresses that by
pushing the appropriate protocol string into the protocol field of the
netlink route update message. Now you can see routes with the correct
origin as well as filter on them (ip route show proto ospf).
'ospf' is used for both IPv4 and IPv6 routes, even though the OSPF
version is different in both cases.
Sample output (old):
9.9.12.13 via 69.254.2.38 dev swp3.2 proto zebra metric 20
9.9.13.3 proto zebra metric 20
nexthop via 69.254.2.30 dev swp1.2 weight 1
nexthop via 69.254.2.34 dev swp2.2 weight 1
nexthop via 69.254.2.38 dev swp3.2 weight 1
Sample output (new):
9.9.12.13 via 69.254.2.38 dev swp3.2 proto bgp metric 20
9.9.13.3 proto bgp metric 20
nexthop via 69.254.2.30 dev swp1.2 weight 1
nexthop via 69.254.2.34 dev swp2.2 weight 1
nexthop via 69.254.2.38 dev swp3.2 weight 1
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Check and read the IPv6 source prefix on ZAPI messages, and pass it down
to the RIB functions (which do nothing with it yet.) Since the RIB
functions now all have a new extra argument, this also updates the
kernel route read functions to supply NULL.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The code to collect the sg stats was written for linux.
Abstract the call to allow it to work on all platforms.
I have not implemented the call for non-linux systems.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fully decode mcast messages from the kernel. We are not
doing anything with this at the moment, but that will
change.
Additionally convert over to using lookup for
displaying the route type.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The netlink_talk call sends a message to the kernel, which
with netlink_talk_filter only waits for the ACK.
It would be nice to have the ability to specify what the handler
function would be for when we send queries about mcast S,G routes
so that we can gather the data returned from the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There's no need to duplicate the 'vrf_id' and 'name' fields from the 'vrf'
structure into the 'zebra_vrf' structure. Instead of that, add a back
pointer in 'zebra_vrf' that should point to the associated 'vrf' structure.
Additionally, modify the vrf callbacks to pass the whole vrf structure
as a parameter. This allow us to make further simplifications in the code.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
[DL: picked out from: "atomic FIB updates"]
This simplifies the OS-specific route update API into a single entry
point, kernel_route_rib(), which dispatches the various operations
internally.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
In linux, 'scope' is a hint of distance of the IP. And this is
evident from the fact that only lower scope can be used as recursive
via lookup result. This changes all interface routes scope to link
so kernel will allow regular routes to use it as via. Then we do
not need to use the 'onlink' attribute.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>