Re-organize and expose the nhg_connected functions so that
it can be used outside zebra_nhg.c. And then abstract those
into zebra_nhg_depends_* and zebra_nhg_depenents_* functons.
Switch the ifp struct to use an RB tree for its dependents,
making use of the nhg_connected functions.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Create a nhg_depenents tree that will function as a way
to get back pointers for NHE's depending on it.
Abstract the RB nodes into nhg_connected for both depends and
dependents. This same struct is used for both.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Default the afi of the nexthop to the route entry using it.
If it turns out to be a group, update the afi to AFI_UNSPEC.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Update rib_add_multipath to use the reference count
increment function for nexthop group hash entries.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add function to increment the route reference count for nhg_hash_entry's
and to do so recursively if its a group.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a helper function to allow us to check if two
nhg_hash_entry's dependency lists are equal.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add helper function to allow us to lookup an ID inside
of a nhg_hash_entry's dependency list.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a function that allows us to take a single
nexthop struct and look that up or create a group and
nexthop hash entry with it.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Pass a boolean to zebra_nhg_find(), indicating whether the
nhg is being lookedup from the kernel side or not.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Move the id counter further up into zebra_nhg_find() so that
it is still incremented if we receive a duplicate that never
would get allocated. The kernel will still use the dup, so we
have to account for that in our id counter.
Also, if we don't create a new entry, reset the id back to where
it was when zebra_nhg_find() was called.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Make the the kernel debug zlog for nexthop messages from the
kernel more aligned with the route message kernel debug zlog.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Changed our alloc function to just copy the nhg and
nhg_depends. This makes the zebra_nhg_find code a
little bit cleaner, hopefully preventing bugs.
The only issue with this is that it makes us have to loop
over the nexthops in a group an extra time for the copies.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Fix a couple functions that were using depends (plural)
rather than depend(singular) in their wording.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add functionality to allow us to send nexthop groups
to the kernel. It creates a nexthop_grp array based on
the dependency list in the nhg_hash_entry and then shoves
that into the netlink message.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Update the dataplane nexthop ctx to use the nhg_depend_dup_list()
function for copying over the dependencies into its context.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a function to duplicate a nhg dependency linked
list. We will use this for duplicating the dependency
list rather than the linked list dup function in lib/linkedlist.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Nexthop groups can have nexthops in different vrf's. So,
let's make the group vrf_id just be VRF_DEFAULT for hash
lookup purposes.
Set vrf_id to be VRF_DEFAULT for every message. If its a new
nextop, set the vrf to be the appropriate thing, otherwise
its a group and can just be left as default.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Simplify the code for nexthop hash entry creation. I made nexthop
hash entry creation expect the nexthop group and depends to always
be allocated before lookup. Before, it was only allocated if it had
dependencies. I think it makes the code a bit more readable to go
ahead an allocate even for single nexthops as well.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add some functions that can be called to free everything that should
have been allocated in a nexthop group hash entry.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add an option to not specify the afi in the show nexthop-group
command so that it shows all nexthops, including groups. This is
how iproute2 does it. If the afi is given, it will only show single
nexthops since groups are AF_UNSPEC.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add functionality to read in a group from the kernel,
create a hash entry for it, and add its nexthops to
its dependency list.
Further, we create its nhg struct separtely from this,
copying over any nexthops it should reference directly
into it.
Thus, we have two types for representation of the nexthop group:
nhe->nhg_depends->[nhe, nhe, nhe]
nhe->nhg->nexthop->nexthop->nexthop
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We treat "groups" from the kernel here as a dependency list.
Each hash entry, if its a group from the kernel, has
a list of any other nexthop hash entries that are in its
group. A non-group nexthop from the kernel will have its
dependency list set to NULL.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Since nexthops are always going to need to be address family
specific unless they are only a group, we have to address
this when we receive and send them.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The message for an invalid address family on a nexthop gateway did
not specify that is what for the gateway specifically.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The nexthop group hash entries were using the "TMP" memory
type. Declared one for them and updated to use it.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Changed to the wording in the duplicate error message
since its techincally possible we get could try to
create a dupe from somewhere else besides the kernel
in the future.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add an interface pointer for an nexthop group hash entry
when we are getting a rib_add for a new route.
Also, add the interface index to the `show nexthop-group` command.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When we get a new nexthop and find the interface associated
with it, add this nexthop to the interface's zebra interface
info nexthop hash entry list.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a nexthop hash entry list to the local zebra
interface info for each interface. This will allow
us to modify nexthops on link events.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Make our route entry struct's re->ng nexthop group pointer
just point to the nhe->nhg nexthop hash entry nexthop group.
This will allow updates to the nexthop itself to propogate
to our routes immediately.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a parameter to the rib_add function so that it takes
a nexthop ID from the kernel if one is passed along
with the route.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Move the nexthop unicast parsing into its own function
to improve code readability. It was getting a bit too
indented.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add parsing code for nexthop object ID's when we get a
route. When we get a new route with the new kernel, it
will come with a nexthop ID and the nexthop full info.
We should just reference by ID if it exists and point
to the nexthop hash entry that matches it.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add functionality to uninstall nexthops we created on shutdown.
To account for this, I added in a function for zebra_router
cleanup in a shutdown event.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When nexthop entry reference counts hit zero and
we created them, uninstall them from the kernel.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Added in case statements to handle finished dataplane contexts
and then handle them with the nexthop process result function.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Switched the route entries to use ID's instead of pointers.
Perform lookups with the ID and then check if its null.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a function that can handle the results of a dataplane
ctx status, dpending on the operation performed.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add functions for sending a nexthop to be queued on the dataplane
for install/uninstall into the kernel.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Added functionality so that when we receive a RTM_DELNEXTHOP
for a nhg_hash_entry that is still being referenced by
a route, we immediately push it back to the kernel.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a function for installing Nexthop Group hash entires into
the kernel. It sends the entry to the dataplane and does any
post-processing immediately after that.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Added a NEXTHOP_GROUP_QUEUED flag to the nexthop
group hash entry struct. This indicates when we have
sent it to be installed to the kernel and are waiting
for the dataplane provider to process it.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were ignoring the status result interger from
the netlink request and message parsing and just
returning 0. Fixed this to return the result of the last one.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Added a check on startup for determining if the kernel supports
nexthop objects. It sets an appropriate bool on the zebra namespace
struct.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Device only nexthops still need an address family associated
with them. Decided to get this from the destination prefix on it.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The nexthop dataplane context was not getting populated with
namespace info for its netlink messages. Fixed this to do
lookups the same way we do it with route contexts.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We needed a kernel debugging function for netlink nexthop
messages when people are debugging kernel zebra messages.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add all the neccessary code to allow nexthops to be processed
in separate dataplane contexts with the netlink dataplane kernel
provider.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We needed an error code that can be used when we
fail to install a nexthop group into the kernel/fib.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Added the appropriate flags that need to be set when
we receive a nexthop from the kernel. They should be
marked as ACTIVE and that they are in the FIB.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add the functionality to parse new nexthop group messages
from the kernel and insert them into the appropriate hash
tables. Parsing is done at startup between interface and
interface address lookup. Add functionality to parse
changes to nexthops we already have. Add functionality
to parse delete nexthop messages from the kernel and
remove them from our table.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
I do not believe we should be hashing based on AFI
in for our upper level nexthop group entries. These
should be ambiguous with regards to address families since
an ipv4 or ipv6 address can have the same interface
nexthop. This can be seen in NEXTHOP_TYPE_IFINDEX.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add an error code that indicates we received a nexthop
from the kernel that is identical to one it/we already
have other than its ID.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add the basic infrastructure for a nexthop group work queue.
This queue will be used to validate and then install the
new nexthop group.
The result from the kernel when a new nexthop group is installed
will cause the route entries that depend on it to be installed.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a nexthop hash entry to the route_entry so that we can
track the nhe with the route entry.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Since we are using two different tables to hash the next groups with,
lets add an error message in case there is a failure to insert into
one of them. This will help to notify if the tables are not synced.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The messages we get from the kernel come with ids only
for groups, so lets index with those as well. Also adding
a helper function for lookup and get with the two different
tables.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Separate interface lookup into its own function.
We need to know interfaces for reading in nexthop
information, but we need to know nexthops for reading
in the interface addresses. We will read in nexthops
between the two.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The nexthop_active_num data structure is a property of the
nexthop group. Move the keeping of this data to that.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the route_entry we are keeping a non pointer based
nexthop group, switch the code to use a pointer for all
operations here and ensure we create and delete the memory.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add some base functionality so we can verify we are getting messages
about nexthops from the kernel.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add some code to allow us to do lookups and releases of
nexthop groups from zebra. At this point we do not do anything
with it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We need to track if a nexthop group is valid and installed,
so create some basic flags to track this.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This commit does nothing more than just create a hash structure
that we will use to track nexthop groups.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Since we don't have a daemon who's job is to handle kernel
routes and we don't get an explicit route delete anymore if
nexthops become unreachable from the kernel, zebra must
re-process kernel routes itself to make sure they are still valid.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We can assume that system/kernel routes are valid indeed
if this is our first time procesing them. But since we don't
get explicit deletion events for kernel routes anymore, we
have to be prepared to process them if the nexthop becomes
unreachable for instance. Therefore, if the route is not NEW,
then don't assume its valid.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
If we need to batch process the rib (all tables or specific
vrf), do so as a scheduled thread event rather than immediately
handling it. Further, add context to the events so that you
narrow down to certain route types you want to reprocess.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Do not allow an upper level protocol to send a route to
zebra that is a /32 or a /128 that recurses through itself.
Current behavior:
donna.cumulusnetworks.com# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
K>* 0.0.0.0/0 [0/104] via 10.0.2.2, enp0s3, 01:05:28
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:01:50
C>* 192.168.209.0/24 is directly connected, enp0s8, 01:05:28
C>* 192.168.210.0/24 is directly connected, enp0s9, 01:05:28
D>* 192.168.210.43/32 [150/0] via 192.168.210.44, enp0s9, 01:01:57
D 192.168.210.44/32 [150/0] via 192.168.210.44 inactive, 01:05:15
C>* 192.168.212.0/24 is directly connected, enp0s10, 01:05:28
donna.cumulusnetworks.com# sharp install routes 40.0.0.1 nexthop 192.168.210.44
% Command incomplete: sharp install routes 40.0.0.1 nexthop 192.168.210.44
donna.cumulusnetworks.com# sharp install routes 40.0.0.1 nexthop 192.168.210.44 1
donna.cumulusnetworks.com# end
donna.cumulusnetworks.com# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
K>* 0.0.0.0/0 [0/104] via 10.0.2.2, enp0s3, 01:05:51
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:00:12
D>* 40.0.0.1/32 [150/0] via 192.168.210.44, enp0s9, 00:00:03
C>* 192.168.209.0/24 is directly connected, enp0s8, 01:05:51
C>* 192.168.210.0/24 is directly connected, enp0s9, 01:05:51
D>* 192.168.210.43/32 [150/0] via 192.168.210.44, enp0s9, 01:02:20
D 192.168.210.44/32 [150/0] via 192.168.210.44 inactive, 01:05:38
C>* 192.168.212.0/24 is directly connected, enp0s10, 01:05:51
donna.cumulusnetworks.com#
Fixed behavior:
donna.cumulusnetworks.com# sharp install routes 192.168.210.44 nexthop 192.168.210.44 1
donna.cumulusnetworks.com# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
K>* 0.0.0.0/0 [0/104] via 10.0.2.2, enp0s3, 00:00:15
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:00:15
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:15
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:15
D 192.168.210.44/32 [150/0] via 192.168.210.44 inactive, 00:00:03
C>* 192.168.212.0/24 is directly connected, enp0s10, 00:00:15
donna.cumulusnetworks.com# sharp install routes 40.0.0.1 nexthop 192.168.210.44 1
donna.cumulusnetworks.com# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
K>* 0.0.0.0/0 [0/104] via 10.0.2.2, enp0s3, 00:00:24
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:00:24
D>* 40.0.0.1/32 [150/0] via 192.168.210.44, enp0s9, 00:00:02
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:24
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:24
D 192.168.210.44/32 [150/0] via 192.168.210.44 inactive, 00:00:12
C>* 192.168.212.0/24 is directly connected, enp0s10, 00:00:24
donna.cumulusnetworks.com#
This behavior came up from discussion around issue #5159. Where
OSPF was receiving a route through itself as part of the router link
lsa. I currently think that ospf should probably dissallow this in ospf
but we should also do the right thing in zebra. If we do not allow this
change we can have situations where ordering of routes into zebra suddenly
matters.
Fixes: #5159
Signed-off-by: Donald Sharp <sharpd@cumulsunetworks.com>
If we only really use the ifp for the name, then
don't bother referencing the ifp. If that ifp is
freed, we don't expect zebra to handle the rules that
use it (that's pbrd's job), so it is going to be
pointing to unintialized memory when we decide to remove
that rule later. Thus, just keep the name in the data
and dont mess with pointer refs.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Use the ifindex value as a primary hash key/identifier, not
the ifp pointer. It is possible for that data to be freed
and then we would not be able to hash and find the rule entry
anymore. Using the ifindex, we can still find the rule even
if the interface is removed.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were seeing a double free on shutdown if the
hash release fails here due to the interface state
changing. We probably shouldn't free the data if its
still being handled in the table so adding a check there
and a debug message.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
With commit: a9ff90c41b
the vrf_id_t was changed from a uint16_t to a uint32_t
Zebra tracked the last command sent to it's peer via peeking
into the data it was sending to each client ( since we had
lost the idea of what the command was when it was time to track
the data ).
Add a define to track this and add a bit of verbiage
to the code to allow us to notice when we screw with
the header again so that this is just fixed correctly
when it happens again.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
initially, vrf backend if vrf-lite, and a specific table identifier is
associated to a vrf. here, with netns vrf backend, there is no specific
table assigned to except default routing table. use the passed table_id
parameter in zapi api, and apply it to the entry to be pushed in, or to
be removed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Cleanup the interface creation apis to make it more
clear what they are doing.
Make it explicit that the creation via name/ifindex will
only add it to the appropriate list.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
commit ee8a72f315
broke the usage of ZEBRA_ROUTE_ALL as a valid redistribution
command. This commit puts it back in. LDP uses ZEBRA_ROUTE_ALL
as an option to say it is interested in all REDISTRIBUTION events.
Fixes: #5072
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Don't process dataplane results in zebra during shutdown (after
sigint has been seen). The dplane continues to run in order to
clean up, but zebra main just drops results.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
asymmetric routing default vrf vni configuration
is not displayed as part of running-config.
Ticket:CM-26470
Reviewed By:
Testing Done:
T11# config t
T11(config)# vni 4004 prefix-routes-only
T11(config)# end
Before:
T11# show running-config
...
vni 4004
...
After:
T11# show running-config
...
vni 4004 prefix-routes-only
...
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Add the (single) dataplane config value to the output of
config write, 'show run' - missed this during dplane development.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
the if_lookup_by_name_per_ns keeps a lock on the node where the
searched ifp is stored. Then this node can not be freed even if
the ifp is removed from the node. Just add the missing unlock
(as for the if_lookup_by_index_per_ns lookup function)
Fixes: b8af3fbbaf ("zebra: fix detection of interface renames")
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Current autocompletion works only for simple "vrf NAME" case.
This commit expands it also for the following cases:
- "nexthop-vrf NAME" in staticd
- usage of $varname in many daemons
All daemons are updated to use single varname "$vrf_name".
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
1. add the Mlag ProtoBuf Lib to Zebra Compilation
2. Encode the messages with protobuf before writing to MLAG
3. Decode the MLAG Messages using protobuf and write to clients
based on their subscrption.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
This includes:
1. Processing client Registrations for MLAG
2. storing client Interests for MLAG updates
3. Opening communication channel to MLAG with First client reg
4. Closing Communication channel with last client De-reg
5. Spawning a new thread for handling MLAG updates peocessing
6. adding Test code
7. advertising MLAG Updates to clients based on their interests
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
When a VxLAN interface comes up new vni up event is sent
to bgpd, which triggers bgpd to sync advertise-svi-macip
to zebra. At this point, vni is present but the associated
SVI may not be present.
When SVI comes up, vni add event sent to bgpd (with associated
vrf update). Bgpd already has vni present so
advertise-svi-macip is not synced to Zebra.
To fix,
When advertise-svi-macip flag is synced first time, cache it in
zebra context even though vni associated SVI is not present.
when SVI comes up, interface address add event triggers
new MAC-IP route add to bgpd.
Ticket:CM-26038
Reviewed By:CCR-9254
Testing Done:
Validated via running a sequence of steps in symmetric
routing topology.
- Enable advertise-svi-macip at l2vni level under bgp default
instance (afi/safi, l2vpn/evpn)
- Flap l2vni associated SVI interface.
- Check the output of 'show bgp l2vpn evpn route' command for
MAC-IP route of the SVI's (MAC and IP address).
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Update neighbor entries and rule entries to have the RTPROT_ZEBRA
protocol value. So we can tell where things come from.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Start the conversion to allow zapi interface callbacks to be
controlled like vrf creation/destruction/change callbacks.
This will allow us to consolidate control into the interface.c
instead of having each daemon read the stream and react accordingly.
This will hopefully reduce a bunch of cut-n-paste stuff
Create 4 new callback functions that will be controlled by
lib/if.c
create -> A upper level protocol receives an interface creation event
The ifp is brand spanking newly created in the system.
up -> A upper level protocol receives a interface up event
This means the interface is up and ready to go.
down -> A upper level protocol receives a interface down
destroy -> A upper level protocol receives a destroy event
This means to delete the pointers associated with it.
At this point this is just boilerplate setup for future commits.
There is no new functionality.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When zebra gets a callback from the kernel that an interface has
actually been deleted *and* the end users has not configured
the interface, then allow for deletion of the interface from zebra.
This is especially important in a docker environment where containers
and their veth interfaces are treated as ephermeal. FRR can quickly
have an inordinate amount of interfaces sitting around that are
not in the kernel and we have no way to clean them up either.
My expectation is that this will cause a second order crashes
in upper level protocols, but I am not sure how to catch these
and fix them now ( suggestions welcome ). There are too many
use patterns and order based events that I cannot know for certain
that we are going to see any at all, until someone sees this problem
as a crash :( I do not recommend that this be put in the current
stabilization branch and allow this to soak in master for some time
first.
Testing:
sharpd@donna ~/frr4> sudo ip link add vethdj type veth peer name vethjd
sharpd@donna ~/frr4> sudo ip link add vethaa type veth peer name vethab
sharpd@donna ~/frr4> sudo vtysh -c "show int brief"
Interface Status VRF Addresses
--------- ------ --- ---------
dummy1 down default
enp0s3 up default 10.0.2.15/24
enp0s8 up default 192.168.209.2/24
enp0s9 up default 192.168.210.2/24
enp0s10 up default 192.168.212.4/24
lo up default 10.22.89.38/32
vethaa down default
vethab down default
vethdj down default
vethjd down default
virbr0 up default 192.168.122.1/24
virbr0-nic down default
sharpd@donna ~/frr4> sudo ip link set vethaa up
sharpd@donna ~/frr4> sudo ip link set vethab up
sharpd@donna ~/frr4> sudo ip link del vethdj
sharpd@donna ~/frr4> sudo vtysh -c "show int brief"
Interface Status VRF Addresses
--------- ------ --- ---------
dummy1 down default
enp0s3 up default 10.0.2.15/24
enp0s8 up default 192.168.209.2/24
enp0s9 up default 192.168.210.2/24
enp0s10 up default 192.168.212.4/24
lo up default 10.22.89.38/32
vethaa up default
vethab up default
virbr0 up default 192.168.122.1/24
virbr0-nic down default
sharpd@donna ~/frr4> sudo ip link del vethaa
sharpd@donna ~/frr4> sudo vtysh -c "show int brief"
Interface Status VRF Addresses
--------- ------ --- ---------
dummy1 down default
enp0s3 up default 10.0.2.15/24
enp0s8 up default 192.168.209.2/24
enp0s9 up default 192.168.210.2/24
enp0s10 up default 192.168.212.4/24
lo up default 10.22.89.38/32
virbr0 up default 192.168.122.1/24
virbr0-nic down default
sharpd@donna ~/frr4> sudo ip link add vethaa type veth peer name vethab
sharpd@donna ~/frr4> sudo vtysh -c "show int brief"
Interface Status VRF Addresses
--------- ------ --- ---------
dummy1 down default
enp0s3 up default 10.0.2.15/24
enp0s8 up default 192.168.209.2/24
enp0s9 up default 192.168.210.2/24
enp0s10 up default 192.168.212.4/24
lo up default 10.22.89.38/32
vethaa down default
vethab down default
virbr0 up default 192.168.122.1/24
virbr0-nic down default
sharpd@donna ~/frr4> sudo vtysh -c "show run"
Building configuration...
Current configuration:
!
frr version 7.2-dev
frr defaults datacenter
hostname donna.cumulusnetworks.com
log stdout
no ipv6 forwarding
!
ip route 192.168.3.0/24 192.168.209.1
ip route 192.168.4.0/24 blackhole
ip route 192.168.5.0/24 192.168.209.1
ip route 192.168.6.0/24 192.168.209.1
ip route 192.168.7.0/24 99.99.99.99 nexthop-vrf EVA
ip route 192.168.8.0/24 192.168.209.1
ip route 4.5.6.7/32 12.13.14.15
!
interface dummy1
ip address 12.13.14.15/32
!
interface vethaa
description FROO
!
line vty
!
end
sharpd@donna ~/frr4> sudo ip link del vethaa
sharpd@donna ~/frr4> sudo vtysh -c "show int brief"
Interface Status VRF Addresses
--------- ------ --- ---------
dummy1 down default
enp0s3 up default 10.0.2.15/24
enp0s8 up default 192.168.209.2/24
enp0s9 up default 192.168.210.2/24
enp0s10 up default 192.168.212.4/24
lo up default 10.22.89.38/32
vethaa down default
virbr0 up default 192.168.122.1/24
virbr0-nic down default
sharpd@donna ~/frr4> sudo vtysh -c "show run"
Building configuration...
Current configuration:
!
frr version 7.2-dev
frr defaults datacenter
hostname donna.cumulusnetworks.com
log stdout
no ipv6 forwarding
!
ip route 192.168.3.0/24 192.168.209.1
ip route 192.168.4.0/24 blackhole
ip route 192.168.5.0/24 192.168.209.1
ip route 192.168.6.0/24 192.168.209.1
ip route 192.168.7.0/24 99.99.99.99 nexthop-vrf EVA
ip route 192.168.8.0/24 192.168.209.1
ip route 4.5.6.7/32 12.13.14.15
!
interface dummy1
ip address 12.13.14.15/32
!
interface vethaa
description FROO
!
line vty
!
end
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We were storing the interface description irrelevant of whether
or not it was a newlink or dellink. This makes no sense.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change addresses the following :
1. Ensures zlog_debug should be under DEBUG macro check
2. Ensures zlog_err and zlog_warn wherever applicable.
3. Removed few posivite logs from fpm handling, whose frequency is high.
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
when a client disconnects, we iterate over the routing table to
remove any label that originated from that client. However we
were erroneously passing the route type to the function, while
it was expecting the lsp type. As a result, for example, killing
ldpd would not remove the ldp labels from the routes.
Kudos to @rwestphal for pointing me to the source of the issue.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
speed interface is done 15 seconds after interface creation. during that
time, the vrf or the interface may have disappeared. to protect this,
return an error in case it is not possible to create a vrf socket or it
is not possible to get speed of an interface because of a missing
device.
Signed-off-by: Julien Floret <julien.floret@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When processing route updates from the dataplane, we were
terminating the checking of nexthops prematurely, and we could
miss meaningful changes.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
User pass the string match large-community 1 exact-match from CLI.
Now route map lib has got the string as "1 exact-match". It passes the string
to call back for compilation. BGP will parse this string and came to know
that for "1" it has to do exact match. Routemap lib has to save "1" in it’s
dependency table. Here routemap is saving this as a “1 exact-match”
which is wrong. The solution is used the compiled data.
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
When selecting a new best route, zebra sends a redist update
when the route is installed. There are cases where redist
clients may not see that redist add - clients who are not
subscribed to the new route type, e.g. In that case, attempt
to send a redist delete for the old/previous route type.
Revised the redist delete api to accomodate both cases;
also tightened up the const-ness of a few internal redist apis.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Add a bit of extra command `show ip route summary table XXX`
To allow end user to specify a specific table that they want
summary information on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This new message makes it possible to install/reinstall LSPs with
multiple nexthops using a single ZAPI message.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
If the nexthop is of type `GATEWAY_IFINDEX` then the nexthop
should not resolve to a nexthop that has a different ifindex
from the one given.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Use the zserv_client_close hook to cleanup all MPLS labels advertised
by a zclient when it disconnects. We were doing this cleanup for
ldpd only, but now we have other daemons that are MPLS aware,
like ospfd (due to the new Segment Routing feature).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
* Add ability to specify the nexthop type;
* Add ability to install or not a FTN (in addition to an LSP).
These two additions will be useful to install local SR Prefix-SIDs
configured with the no-PHP option.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
SR support for IS-IS is coming so we need to be able to distinguish
OSPF and IS-IS LSPs.
While here, add missing case statement for LDP on
lsp_type_from_re_type().
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Use the route type and instance instead of the route distance
to identify MPLS FTNs. This is a more robust approach since the
routing daemons can modify the distance of their announced routes
via configuration, which can cause inconsistencies.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Do this for the following reasons:
* Improve modularity of the code by separating the decoding of the
ZAPI messages from their processing;
* Create an API that is easier to use by the client daemons.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Some netlink-facing code used for evpn/vxlan programming was
being run in the dataplane pthread, but accessing zebra core
datastructs. Move some additional data into the dataplane
context, and use it in the netlink path instead.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Everywhere else in the code we use GNU_LINUX, that is the symbol we actualy define in the configuration. Don't rely on compiler's built-in symbols.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
frr_with_mutex(...) { ... } locks and automatically unlocks the listed
mutex(es) when the block is exited. This adds a bit of safety against
forgetting the unlock in error paths & co. and makes the code a slight
bit more readable.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Conver these functions:
route_map_add_match
route_map_delete_match
route_map_add_set
route_map_delete_set
To return the `enum rmap_compile_rets` and ensure all functions
that use this code handle all the enumerated possible returns.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We agreed on this several weeks ago on the weekly call, I just forgot to
actually put it in a PR...
A call for any Protobuf FPM users to raise their hand came up empty on
both the mailing list as well as Slack. Let's see if this gets any
response. If not, it'll be time to remove Protobuf FPM.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
even if vty commands were available, the default resolution command was
working only for the first vrf configured. others were ignored. Also,
for nexthop, resolution was working for all vrfs, and not the specific
one.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Move neighbor programming to the dataplane; remove
old apis; remove some ifdef'd use of direct netlink
code points, using neutral values outside of the netlink-
specific files.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
re->nexthop_num and re->nexthop_active_num are calculated while rib
processing. Also It helps in encoding the ZAPI message.
It's good to dump these parameters also, when the system is in
abnormal state.
Signed-off-by: vishaldhingra<vdhingra@vmware.com>
When resolving a nexthop, append its labels to the one its
resolving to along with the labels that may already be present there.
Before we were ignoring labels if the resolving level was greater than
two.
Before:
```
S> 2.2.2.2/32 [1/0] via 7.7.7.7 (recursive), label 2222, 00:00:07
* via 7.7.7.7, dummy1 onlink, label 1111, 00:00:07
S> 3.3.3.3/32 [1/0] via 2.2.2.2 (recursive), label 3333, 00:00:04
* via 7.7.7.7, dummy1 onlink, label 1111, 00:00:04
K>* 7.7.7.7/32 [0/0] is directly connected, dummy1, label 1111, 00:00:17
C>* 192.168.122.0/24 is directly connected, ens3, 00:00:17
K>* 192.168.122.1/32 [0/100] is directly connected, ens3, 00:00:17
ubuntu_nh#
```
This patch:
```
S> 2.2.2.2/32 [1/0] via 7.7.7.7 (recursive), label 2222, 00:00:04
* via 7.7.7.7, dummy1 onlink, label 1111/2222, 00:00:04
S> 3.3.3.3/32 [1/0] via 2.2.2.2 (recursive), label 3333, 00:00:02
* via 7.7.7.7, dummy1 onlink, label 1111/2222/3333, 00:00:02
K>* 7.7.7.7/32 [0/0] is directly connected, dummy1, label 1111, 00:00:11
C>* 192.168.122.0/24 is directly connected, ens3, 00:00:11
K>* 192.168.122.1/32 [0/100] is directly connected, ens3, 00:00:11
ubuntu_nh#
```
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Debian packaging when run finds a bunch of spelling errors:
I: frr: spelling-error-in-binary usr/bin/vtysh occurences occurrences
I: frr: spelling-error-in-binary usr/lib/frr/bfdd Amount of times Number of times
I: frr: spelling-error-in-binary usr/lib/frr/bgpd occurences occurrences
I: frr: spelling-error-in-binary usr/lib/frr/bgpd recieved received
I: frr: spelling-error-in-binary usr/lib/frr/isisd betweeen between
I: frr: spelling-error-in-binary usr/lib/frr/ospf6d Infomation Information
I: frr: spelling-error-in-binary usr/lib/frr/ospfd missmatch mismatch
I: frr: spelling-error-in-binary usr/lib/frr/pimd bootsrap bootstrap
I: frr: spelling-error-in-binary usr/lib/frr/pimd Unknwon Unknown
I: frr: spelling-error-in-binary usr/lib/frr/zebra Requsted Requested
I: frr: spelling-error-in-binary usr/lib/frr/zebra uknown unknown
I: frr: spelling-error-in-binary usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0 overriden overridden
This commit fixes all of them except the bgp `recieved` issue due to
it being part of json output. That one will need to go through
a deprecation cycle.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Since we are now away from the dual use of the destination field, there
is no need to single out /32 addresses as broadcast. This was bugged
anyway, since the same /32 criteria was used for IPv6 addresses as well,
when `connected_check_ptp` is called in `connected_delete_ipv6`.
Fixes: 3053
Signed-off-by: Juergen Werner <juergen@opensourcerouting.org>
The `destination` field of the connection structure was used to store
the broadcast address, if the connection was not p2p. This multipurpose
is not very evident and the benefits over calculating the bcast address
on the fly minimal.
Signed-off-by: Juergen Werner <juergen@opensourcerouting.org>
Create an interface with IP4 link local address 169.254.0.131/25.
In BGP enable the redistribute connected. Now Zebra will not send
the route corresponding to IPV4 link local address. Now made this
interface down and up. Zebra sends the route to BGP.
Zebra should not send this route to BGP.
This Fix would make the behaviour consistent and would not send the
routes corresponding to IPV4 Link local addresses.
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
In if_netlink.c, when an interface structure, ifp, is first created,
its possible for the master to come up after the slave interface does.
This means, the slave interface has no way to display the master's ifname
in show outputs. To fix this, we need to allow creation by ifindex instead
of by ifname so that this issue is handled.
Signed-off-by: Dinesh G Dutt<5016467+ddutt@users.noreply.github.com>
When displaying the master interface's information in "show interface",
the display is currently the ifindex of the master interface. Make it
display the name as well as that is more useful than the name.
Signed-off-by: Dinesh G Dutt<5016467+ddutt@users.noreply.github.com>
If there is a recursive route resolved over blackhole route, then
the resolved blackhole_type is not getting set correctly.
This fix updates the bh_type correctly for resursive routes.
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
We used the vrf_id in the rtm_table field of the netlink rtmsg to fetch L3VNI.
But, now we program table_id to rtm_table field instead of vrf_id.
Thus, L3VNI fetched using rtm_table is incorrect.
Instead, use nexthop->vrf_id to fetch the L3VNI.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
PR #3745 added EVPN feature to advertise individual
SVI-IPs as MAC-IP routes.
Fix a condition in zebra to send MAC and IP pair
to bgpd when the feature is enabled.
Testing Done:
Originator VTEP:
TORC11:~# ip -br addr show VxU-1002
VxU-1002 UP 45.0.2.2/24 2001:fee1:0:2::2/64
show bgp l2vpn evpn vni 1004
VNI: 1004 (known to the kernel)
Type: L2
Tenant-Vrf: default
RD: 27.0.0.11:3
Advertise-svi-macip : Yes
Import Route Target:
10:1004
Export Route Target:
10:1004
Remote vtep evpn route output for 45.0.4.2:
BGP routing table entry for 27.0.0.11:3:[2]:[0]:[48]:[00:02:00:00:00:2f]:[32]:[45.0.4.2]
Paths: (2 available, best #1)
Advertised to non peer-group peers:
MSP1(uplink-1) MSP2(uplink-2)
Route [2]:[0]:[48]:[00:02:00:00:00:2f]:[32]:[45.0.4.2] VNI 1004
64435 65546
36.0.0.11 from MSP1(uplink-1) (27.0.0.9)
Origin IGP, valid, external, bestpath-from-AS 64435, best (First path received)
Extended Community: RT:10:1004 ET:8
Last update: Thu Aug 8 18:09:13 2019
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The `show ip nht vrf EVA ...` command was not allowing you to only
specify the vrf anymore. Fix this:
robot# show ip nht vrf EVA
<cr>
A.B.C.D IPv4 Address
X:X::X:X IPv6 Address
robot# show ip nht vrf EVA 4.5.6.7
robot# show ip nht vrf EVA
robot#
Ticket: CM-25831
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ensure that the route-entry QUEUED flag is cleared in the async
notification path, as it is in the normal results processing
code path.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Update the stats displayed by 'show zebra dplane' - some
counters had been added but not displayed. Also include
the new counters for evpn macs.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When interfaces change while they are up, Zebra sends if_up
notifications with the updated interface info. Change Zebra to send
if_down notifications with interface info when the interface changes
while it is down.
VRRP, at the least, needs these to know about MAC changes while an
interface is down.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When we are sending a redistribute_update, pass the old_re in
so that if we still have it around we can update the calling protocol.
Test:
router ospf
redistribute sharp
!
sharp install route 4.5.6.7 nexthop 192.168.201.1 1
Now add a `ip route 4.5.6.7/32 192.168.201.1`.
This causes zebra to replace the sharp route with the static route.
No update is sent to ospf and debug:
2019/08/01 19:02:38.271998 ZEBRA: 0:4.5.6.7/32: Redist update re 0x12fdbda0 (static), old 0x0 (None)
With fix:
2019/08/01 19:15:09.644499 ZEBRA: 0:4.5.6.7/32: Redist update re 0x1ba5bce0 (static), old 0x1beea4e0 (sharp)
2019/08/01 19:15:09.645462 OSPF: ospf_zebra_read_route: from client sharp: vrf_id 0, p 4.5.6.7/32
Ticket: CM-25847
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Delete an auto MAC with no neighbor associated,
when its VNI is down.
In Following sequence stale MAC entry retained in
FRR (zebra).
- Local MAC-IP pair
- MAC is deleted in bridge fdb table
- VNI is down, triggers IP (neigh) entries removed
from FRR DB.
- MAC retained as AUTO MAC with neigh list count 0.
- When VNI is UP again, stale MAC entry retained in FRR
DB.
When the MAC-IP pair moves behind remote VTEP, local VTEP
fails to add remote entry as its MAC is in auto state.
Ticket:CM-25504
Reviewed By:
Testing Done:
Validated the sequence with fix and auto MAC is deleted
when VNI is down.
When VNI comes up, the remote MAC-IP is added to FRR (Zebra)
and kernel.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Add info info in local mac del debug,
the local sequence and assoicated neigh count.
remote_mac_ip_add modify debug to display
flags value to cover local, remote and auto flags
for the MAC.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The changes came as part of PR #4730, checks
only variable mac, which is never null. Even
for ip version of cli hits "mac" case statement
and failing the clear cli.
Testing Done:
Before Fix:
VTEP-03# show evpn arp-cache vni 1002 duplicate
VNI 1002 #ARP (IPv4 and IPv6, local and remote) 1
IP Type State MAC Remote VTEP
Seq #'s
11.11.11.11 remote active aa:22:aa:aa:aa:aa 27.0.0.16
7/8
VTEP-03# clear evpn dup-addr vni 1002 ip 11.11.11.11
% Requested MAC does not exist in VNI 1002
Post fix:
VTEP-03# clear evpn dup-addr vni 1002 ip 11.11.11.11
VTEP-03#
VTEP-03# show evpn mac vni all duplicat
VNI 1002 #MACs (local and remote) 1
MAC Type Intf/Remote VTEP VLAN Seq #'s
aa:aa:aa:aa:aa:aa remote 27.0.0.16 7/8
Post fix:
VTEP-03# clear evpn dup-addr vni 1002 mac aa:aa:aa:aa:aa:aa
VTEP-03#
VTEP-03# clear evpn dup-addr vni 1002 ip 11.11.11.11
VTEP-03#
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The 'show ip import-check A.B.C.D` code was generating
a /32 prefix for comparison. Except import-check was
being used by bgp to track networks. So they could
have received a /24( or anything the `network A.B.C.D/M`
statement specifies ).
Consequently when we do a `show ip import-check A.B.C.D`
we would never find the network but a `show ip import-check |
grep A.B.C.D` would find it.
Fix the exact comparison to a match.
For the `show ip nht A.B.C.D` case we are comparing
a /32 to a /32 so prefix_match will work still.
While a `show ip import-check A.B.C.D` will now show
the expected behavior as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The flag ROUTE_ENTRY_NEXTHOPS_CHANGED is only ever set or unset.
Since this flag is not used for anything useful, remove from system.
By changing this flag we have re-ordered `internalStatus' of json
output of zebra rib routes. Go through and fix up tetsts to
use the new values.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The code as written before this code change point would enqueue
every system route type to be refigured when we have an
interface event. I believe this was to originally handle bugs
in the way nexthop tracking was handled, mainly that if you keep
asking the question you'll eventually get the right answer.
Modify the code to not do this, we have fixed nexthop tracking
to not be so brain dead and to know when it needs to refigure
a route that it is tracking.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a new system route comes in and we have a pre-existing
non-system route we are not deleting the current system
route from the linux kernel.
Modify the code such that when a route replace is sent
to the kernel with a new route as a system route and
the old route as a non-system route do a delete of
the old route so it is no longer in the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Initial data struct and api changes to support EVPN MAC
updates via the dataplane subsystem (no handlers yet).
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Problem reported where certain routes were not being passed on to
clients if they were operated on while still queued for kernel
installation. Changed it to defer working on entries that were
queued to dplane so we could operate on them after getting an
answer back from kernel installatino.
Ticket: CM-25480
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Introducing a 3rd state for route_map_apply library function: RMAP_NOOP
Traditionally route map MATCH rule apis were designed to return
a binary response, consisting of either RMAP_MATCH or RMAP_NOMATCH.
(Route-map SET rule apis return RMAP_OKAY or RMAP_ERROR).
Depending on this response, the following statemachine decided the
course of action:
State1:
If match cmd returns RMAP_MATCH then, keep existing behaviour.
If routemap type is PERMIT, execute set cmds or call cmds if applicable,
otherwise PERMIT!
Else If routemap type is DENY, we DENYMATCH right away
State2:
If match cmd returns RMAP_NOMATCH, continue on to next route-map. If there
are no other rules or if all the rules return RMAP_NOMATCH, return DENYMATCH
We require a 3rd state because of the following situation:
The issue - what if, the rule api needs to abort or ignore a rule?:
"match evpn vni xx" route-map filter can be applied to incoming routes
regardless of whether the tunnel type is vxlan or mpls.
This rule should be N/A for mpls based evpn route, but applicable to only
vxlan based evpn route.
Also, this rule should be applicable for routes with VNI label only, and
not for routes without labels. For example, type 3 and type 4 EVPN routes
do not have labels, so, this match cmd should let them through.
Today, the filter produces either a match or nomatch response regardless of
whether it is mpls/vxlan, resulting in either permitting or denying the
route.. So an mpls evpn route may get filtered out incorrectly.
Eg: "route-map RM1 permit 10 ; match evpn vni 20" or
"route-map RM2 deny 20 ; match vni 20"
With the introduction of the 3rd state, we can abort this rule check safely.
How? The rules api can now return RMAP_NOOP to indicate
that it encountered an invalid check, and needs to abort just that rule,
but continue with other rules.
As a result we have a 3rd state:
State3:
If match cmd returned RMAP_NOOP
Then, proceed to other route-map, otherwise if there are no more
rules or if all the rules return RMAP_NOOP, then, return RMAP_PERMITMATCH.
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
A client was sending zebra a route with no nexthops! Update the
error message to tell us *Which* daemon is doing this.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Initial commit of understanding interface speed changes
on startup was this commit:
dc7b3caefb
Effectively we had encountered situations on system startup
where the interface speed for a device was not properly setup
when zebra learns about the interface ( Imagine a bond being
brought up and the controlling software creating the bond
is not fast given system load, the bond's speed changes
upwards for each interface added ).
The initial workup on this was to allow a 15 second window
and then just reread the interface speed. We've since noticed
that under heavy system load on startup this is not always sufficient.
So modify the code to wait the 15 seconds and then check the interfaces
speed. If the interfaces speed is still MAX_UINT32T or it has changed
let's wait a bit longer and try again.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
in order to both streamline the code and allow users to
define their own specialized versions of the LM api handlers,
define hooks for the 4 main primitives offered by the label
manager (i.e. connect, disconnect, get_chunk and release_chunk),
and have the existing code be run in response to a hook_call.
Additionally, have the responses to the requesting daemon be
callable from an external API.
Note that the proxy version of the label manager was a source of
issues and hardly used in practice. With the new hooks, users with
more complex requirements can simply plug in their own code to
handle label distribution remotely, so there is no longer a reason
to maintain this code.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
when requesting a specific label chunk (e.g. for the SRGB),
it might happen that we cannot get what we want. In this
event, we must be prepared to receive a response with no
label chunk. Without this fix, if the remote label manager
was not able to alloate the chunk we requested, we would
hang indefinitely trying to read data from the stream which
was not there.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
For SRGB, we need to support chunk requests starting at a
specific point in the label space, rather than just asking
for any sufficiently large chunk. To this purpose, we extend
the label manager api to request a chunk with a base value;
if the base is set to 0, the label manager will behave as it
currently does, i.e. fetching the first free chunk big enough
to satisfy the request.
update all the existing calls to get chunks from the label
manager so that they use MPLS_LABEL_BASE_ANY as the base
for the requested chunk
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Add a conditional to guard against segfaulting on the debug
statement when zvrf lookup fails.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
initially, that command was dumping only tables from default vrfs.
the change here consists in dumping all the tables from all the vrfs.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
the table identifier is made visible. this permits to easily know which
table identifier is dumped, or which table that entry belongs to, when
one calls 'show ip route all' command.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this vty command explores the routing tables available, and dumps the
routing entries. there is no need to pass a table identifier, since all
configured tables are dumped.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
in addition to support for tcpflags, it is possible to filter on any
protocol. the filtering can then be based with iptables.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zvni setup in zebra is controlled via bgpd i.e. advertise_all_vni
from bgpd triggers this setup. As a part of zvni creation we may need
to setup BUM mcast SG entries which are propagated to pimd for MDT setup.
Now pimd may not be present at the time of zvni creation or may restart
post zvni creation so we need a mechanism to replay (on pimd startup) and
to cleanup (on pimd stop). This is addressed via zebra_vxlan_sg_replay and
zebra_evpn_pim_cfg_clean_up.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
As part of PR 4458, when a client (bgpd) goes down,
zebra cleans up any evpn state including remotely learned
neighs, macs and vteps are suppose to be cleaned up,
uninstall from kernel tables.
Neighs (arps), macs and vteps (HREP entries) were not
removed from kernel tables as the uninstall flag as not set.
Clean up l3vni associated remote neighs, macs and vteps.
Ticket:CM-25468
Reviewed By:CCR-8889
Testing Done:
Validated in evpn symmetric routing topology where
remotely learned l2/l3 vnis neigh, macs and remote
vtep (hrep) entries are installed in kernel table,
perform systemctl stop frr.service and validated
all remotely learned entries cleaned up from kernel
tables.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Add a file that exposes functions which modify nexthop groups.
Nexthop groups are techincally immutable but there are a
few special cases where we need direct access to add/remove
nexthops after the group has been made. This file provides a
way to expose those functions in a way that makes it clear
this is a private/hidden api.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When installing a table route into the kernel choose
RTPROT_ZEBRA as the installing/controlling protocol.
This way we can know we installed it as well as stop
the warnings about this special case of `ip import-table XX`
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are importing/removing the table entry from table X into the
default routing table we are not properly setting the table_id
of the route entry. This is causing the route to be pushed
into the wrong internal table and to not be found for deletion.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The import table code assumes that they will only work
in the default vrf. This is ok, but we should push the
vrf_id and zvrf to be passed in instead of just using
VRF_DEFAULT.
This will allow us to fix a couple of things:
1) A bug in import where we are not creating the
route entry with the appropriate table so the imported
entry is showing up in the wrong spot.
2) In the future allow `ip import-table X` to become
vrf aware very easily.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The import-table code when looking up the table to use
for route-import was reversing the order of the table_id
and vrf_id causing us to never ever lookup a table
and we would cause the `ip|ipv6 import-table X` commands
to be just ignored.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>