Add a file that exposes functions which modify nexthop groups.
Nexthop groups are techincally immutable but there are a
few special cases where we need direct access to add/remove
nexthops after the group has been made. This file provides a
way to expose those functions in a way that makes it clear
this is a private/hidden api.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When installing a table route into the kernel choose
RTPROT_ZEBRA as the installing/controlling protocol.
This way we can know we installed it as well as stop
the warnings about this special case of `ip import-table XX`
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are importing/removing the table entry from table X into the
default routing table we are not properly setting the table_id
of the route entry. This is causing the route to be pushed
into the wrong internal table and to not be found for deletion.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The import table code assumes that they will only work
in the default vrf. This is ok, but we should push the
vrf_id and zvrf to be passed in instead of just using
VRF_DEFAULT.
This will allow us to fix a couple of things:
1) A bug in import where we are not creating the
route entry with the appropriate table so the imported
entry is showing up in the wrong spot.
2) In the future allow `ip import-table X` to become
vrf aware very easily.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The import-table code when looking up the table to use
for route-import was reversing the order of the table_id
and vrf_id causing us to never ever lookup a table
and we would cause the `ip|ipv6 import-table X` commands
to be just ignored.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Improve debugging when we cannot find a route to delete
that we have been told to delete.
New output:
2019/06/25 17:43:49 ZEBRA: default[0]:4.5.6.7/32 doesn't exist in rib
2019/06/25 17:43:49 ZEBRA: default[0]:4.5.6.8/32 doesn't exist in rib
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1) If we are moving the nexthop we are tracking to
a new rn in the rib, then we know that the route
to get to that nexthop has changed. As such
we should notify the upper level.
This manifested itself because the code had a trigraph `?`
in the wrong order. Put the comparison in the right order.
2) If we are re-matching to the same rn and we call compare_state
then we need to see if our stored nexthops are the same or different.
If they are the same we should not notify. If they are different
we should notify. compare_state was only comparing the flags
on a route and since those are not necessarily the right flags
to look at( and we are well after the fact that the route has
already changed and been processed ) let's just compare
the nexthops to see if they are the same or different.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
when receiving an EAGAIN while trying to read the header
of a ZAPI message, we were erroneously continuing as if
everything was fine, which could crash zebra. Fix this
by returning and letting the re-armed read task deal with
this
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Also fixes some issues related to -
show evpn arp-cache vni xx vtep yy
Ticket: CM-25380
Signed-off-by: Nitin Soni<nsoni@cumulusnetworks.com>
Reviewed-by: CCR-8858
Testing-Done: Evpn scale test with 30K neighs
The failed neighbor event logging that was recently added in
commit: 3acae086ba
cast a bit too broad of a stroke. We should only inform
the user that we were ignoring the RTM_NEWNEIGH FAIL callback
when we believe it was one of our own 5549 entries.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When dumping rib data about a route for `debug rib detail`
modify the dump command to display the prefix as part
of every line so that we can use a grep on the log
file.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When BGP daemon is down, Clean up its configuration state from zebra.
When the BGP daemon is up again, it will push its configuration to zebra
Delete the MAC and neighbor information received on the BGP session,
while retaining the local MAC and local ARP entries.
Signed-off-by: Kishore Aramalla karamalla@vmware.com
Problem discovered in testing that occasionally when an interface
address was flushed, the corresponding route would be removed from
the kernel and zebra but remain in the bgp table and be advertised
to peers. Discovered that when zebra_rib_evaluate_nexthops spun
thru the tree list of rns, if the timing and circumstances were
right, it would move elements and miss evaluating some. Changed
from frr_each to frr_each_safe and the problem is now gone.
Ticket: CM-25301
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Add a expected count for the route node we will be processing
as part of nexthop resolution and modify the type to display
a useful string of what the type is instead of a number.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a bit of extra code to indicate to the operator why
we intentionally rejected a kernel route from being used.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
- When the connection with the FPM socket is established, iterate through all the
L3VNIs and send all the RMACs for FPM processing zfpm_conn_up_thread_cb"
- We have already handled connection down even in previous commits. When the FPM
connection goes down, empty mac_q and FPM mac info hash table
"zfpm_conn_down_thread_cb"
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
- FPM write thread calls "zfpm_build_updates()" to process mac_q and dest_q and
to write update buffer over the FPM socket.
- "zfpm_build_updates()" processes all the update queues one by one in a while
loop. It will break the while loop and return if Queue processing function
returns "FPM_WRITE_STOP" OR FPM write buffer is full OR all the queues are
empty (no more update to process).
- "zfpm_build_route_updates()" dequeues and processes route nodes from "dest_q".
- "zfpm_build_mac_updates()" dequeues and processes MAC nodes from "mac_q"
- These queue processing functions return with "FPM_WRITE_STOP" if the write
buffer is full. Return value is "FPM_GOTO_NEXT_Q" if enough updates are
processed from this queue and we want to move on to the next queue.
- In each call, a queue processing function will process max
"FPM_QUEUE_PROCESS_LIMIT (10000)" updates to avoid starvation of other queues.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
- Define a hook "zebra_mac_update" which can be registered by multiple
data plane components (e.g. FPM, dplane).
DEFINE_HOOK(zebra_rmac_update, (zebra_mac_t *rmac, zebra_l3vni_t *zl3vni, bool
delete, const char *reason), (rmac, zl3vni, delete, reason))
- While performing RMAC add/delete for an L3VNI, call "zebra_mac_update" hook.
- This hook call triggers "zfpm_trigger_rmac_update". In this function, we do a
lookup for the RMAC in fpm_mac_info_table. If already present, this node is
updated with the latest RMAC info. Else, a new fpm_mac_info_t node is created
and inserted in the queue and hash data structures.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
- FPM MAC structure: This data structure will contain all the information
required for FPM message generation for an RMAC.
struct fpm_mac_info_t {
struct ethaddr macaddr;
uint32_t zebra_flags; /* Could be used to build FPM messages */
vni_t vni;
ifindex_t vxlan_if;
ifindex_t svi_if; /* L2 or L3 Bridge interface */
struct in_addr r_vtep_ip; /* Remote VTEP IP */
/* Linkage to put MAC on the FPM processing queue. */
TAILQ_ENTRY(fpm_mac_info_t) fpm_mac_q_entries;
uint8_t fpm_flags;
};
- Queue structure for FPM processing:
For FPM processing, we build a queue of "fpm_mac_info_t". When RMAC is
added or deleted from zebra, fpm_mac_info_t node is enqueued in this queue
for the corresponding operation. FPM thread will dequeue these nodes one by
one to generate a netlink message.
TAILQ_HEAD(zfpm_mac_q, fpm_mac_info_t) mac_q;
- Hash table for "fpm_mac_info_t"
When zebra tries to enqueue fpm_mac_info_t for a new RMAC add/delete
operation, it is possible that this RMAC is already present in the queue. So,
to avoid multiple messages for duplicate RMAC nodes, insert fpm_mac_info_t
into a hash table.
struct hash *fpm_mac_info_table;
- Before enqueueing any MAC, try to fetch the fpm_mac_info_t from the hash
table first.
- Entry is deleted from the hash table when the node is dequeued.
- For hash table key generation, parameters used are "mac adress" and "vni"
This will provide a fairly unique key for a MAC(fpm_mac_info_hash_keymake).
- Compare function uses "mac address", "RVTEP address" and "VNI" as the key
which is sufficient to distinguish any two RMACs. This compare function is
used for fpm_mac_info_t lookup (zfpm_mac_info_cmp).
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Found that the "show interface brief" command was missing the
ability to specify all vrfs. Added that capability via this
fix.
Ticket: CM-25139
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
If we get a neighbor entry for 5549 failure notice
from the kernel that means that something has probably
gone terribly wrong. Let's notice and not reinstall.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is mostly relevant for Solaris, where config.h sets up some #define
that affect overall header behaviour, so it needs to be before anything
else.
Signed-off-by: David Lamparter <equinox@diac24.net>
Field vrf_id is replaced by the pointer of the struct vrf *.
For that all other code referencing to (interface)->vrf_id is replaced.
This work should not change the behaviour.
It is just a continuation work toward having an interface API handling
vrf pointer only.
some new generic functions are created in vrf:
vrf_to_id, vrf_to_name,
a zebra function is also created:
zvrf_info_lookup
an ospf function is also created:
ospf_lookup_by_vrf
it is to be noted that now that interface has a vrf pointer, some more
optimisations could be thought through all the rest of the code. as
example, many structure store the vrf_id. those structures could get
the exact vrf structure if inherited from an interface vrf context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
vrf_id parameter is replaced with struct vrf * parameter. It is
needed to create vrf structure before entering in the fuction.
an error is generated in case the vrf parameter is missing.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
there may be cases where the vrf is yet allocated from the vty, and the
discovery process did not make the relationship between the vrf_id and
the name of the vrf. For instance, by parsing an interface belonging to
vrf-id X, it is not sure that vrf-id X and vrfname XX are talking about
the same vrf. For that, lets allocate the vrf, and lets try to detect
there is a duplicate case in vrf, so that the merge can be done without
any impact for the user.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
the interface search is based on vrfs. As at startup, some interfaces
may be configured, there is need to have vrfs contexts present. A macro
is being appended with an extra parameter that permits create a vrf and
return the context. This macro is also used by some show routines, but
will not create vrfs, because that extra parameter will be set to false,
on that case.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Upon accessing interface NB API, the interface is created, if the vrf
is available. the commit does not change the behaviour, since at this
commit, this is not yet possible to have vrf contexts, while zebra did
not connect to daemons. However, that commit adds some work, so that it
will be possible to work on a vrf context, without having the vrf_id
completely resolved. for instance, if we suppose a vrf is created by
command 'vrf TOTO' in the starting configuration of a daemon, then 'interface
TITI vrf TOTO' will permit to create interface TITI within vrf TOTO.
the macro VRF_GET_INSTANCE will return the vrf context, if available or
not.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
the vrf_id parameter is replaced by struct vrf * parameter.
this impacts most of the daemons that look for an interface based on the
name and the vrf identifier.
Also, it fixes 2 lookup calls in zebra and sharpd, where the vrf_id was
ignored until now.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
vrf pointer is used as reference when calling if_get_by_name() function.
this will permit to create interfaces with an unknown vrf_id, since it
is only necessary to get the vrf structure to store the interfaces.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Update pbrd to properly handle nexthop tracking.
When we get a notification that a change happened on a nexthop,
re-install it if its still valid.
Before, we were running over all routes and re-queueing them if they
were PBR routes. This commit removes that and puts all the processing
in PBR instead.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
in the case the vrf backend is vrf-lite, there is no need to have
separate sockets. use a socket located in zrouter, so that when needing
the socket, a common API is used. that API will return the appropriate
socket value.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
when network namespace is used as vrf backend, there is need to have
separate contexts for rtadv contexts.
route advertisements have to look for appropriate interface based on
zvrf context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In a variety of places we are using DAEMON_VTY_DIR, convert
to use frr_vtydir. This will allow us in a future commit
to have the -N namespace option be automatically used.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the local and remote sequence number to the `show evpn arp-cache vni XX` command.
VNI 1000111 #ARP (IPv4 and IPv6, local and remote) 15
IP Type State MAC Remote VTEP Seq #'s
fe80::202:ff:fe00:15 remote active 00:02:00:00:00:15 6.0.0.31 0/0
fe80::202:ff:fe00:8 local active 00:02:00:00:00:08 0/0
60.1.1.111 local active 00:02:00:00:00:08 0/0
2060:1:1:1::11 local active 00:e0:ec:38:49:a1 0/0
fe80::202:ff:fe00:11 remote active 00:02:00:00:00:11 6.0.0.30 0/0
2060:1:1:1::211 remote active 00:02:00:00:00:11 6.0.0.30 0/0
2060:1:1:1::121 remote active 00:02:00:00:00:0c 6.0.0.29 0/0
60.1.1.211 remote active 00:02:00:00:00:11 6.0.0.30 0/0
fe80::202:ff:fe00:c remote active 00:02:00:00:00:0c 6.0.0.29 0/0
60.1.1.11 local active 00:e0:ec:38:49:a1 0/0
fe80::2e0:ecff:fe38:49a1 local active 00:e0:ec:38:49:a1 0/0
60.1.1.221 remote active 00:02:00:00:00:15 6.0.0.31 0/0
2060:1:1:1::111 local active 00:02:00:00:00:08 0/0
2060:1:1:1::221 remote active 00:02:00:00:00:15 6.0.0.31 0/0
60.1.1.121 remote active 00:02:00:00:00:0c 6.0.0.29 0/0
The seq numbers are at 0/0 because we have had no mobility events.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
the interface search done was not looking in the appropriate zns. The
display was then wrong. Update the show command with the correct zns.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Say, more than one sequence of a route-map uses the same named entity
in its match clause. After that entity is removed from any one of the
route-map sequences, any further changes made to that entity doesn't
dynamically take effect.
A reference counter, that allows the named entity to keep a count of
the route-maps dependent on it, has been introduced to address this issue.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
When you have compiled FRR with a large multipath number
then encoding large ecmp routes between zebra and the
routing daemons. There exists a theoritical size
of multipath that will cause the encoding to be larger
than the ZEBRA_MAX_PACKET_SIZ. In the cases where
we have allocated streams that will encode routes
then let's ensure that whatever size we have will
auto-fit what we say we can send.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Introducing a 3rd state for route_map_apply library function: RMAP_NOOP
Traditionally route map MATCH rule apis were designed to return
a binary response, consisting of either RMAP_MATCH or RMAP_NOMATCH.
(Route-map SET rule apis return RMAP_OKAY or RMAP_ERROR).
Depending on this response, the following statemachine decided the
course of action:
Action: Apply route-map match and return the result (RMAP_MATCH/RMAP_NOMATCH)
State1: Receveived RMAP_MATCH
THEN: If Routemap type is PERMIT, execute other rules if applicable,
otherwise we PERMIT!
Else: If Routemap type is DENY, we DENYMATCH right away
State2: Received RMAP_NOMATCH, continue on to next route-map, otherwise,
return DENYMATCH by default if nothing matched.
With reference to PR 4078 (https://github.com/FRRouting/frr/pull/4078),
we require a 3rd state because of the following situation:
The issue - what if, the rule api needs to abort or ignore a rule?:
"match evpn vni xx" route-map filter can be applied to incoming routes
regardless of whether the tunnel type is vxlan or mpls.
This rule should be N/A for mpls based evpn route, but applicable to only
vxlan based evpn route.
Today, the filter produces either a match or nomatch response regardless of
whether it is mpls/vxlan, resulting in either permitting or denying the
route.. So an mpls evpn route may get filtered out incorrectly.
Eg: "route-map RM1 permit 10 ; match evpn vni 20" or
"route-map RM2 deny 20 ; match vni 20"
With the introduction of the 3rd state, we can abort this rule check safely.
How? The rules api can now return RMAP_NOOP (or another enum) to indicate
that it encountered an invalid check, and needs to abort just that rule,
but continue with other rules.
Question: Do we repurpose an existing enum RMAP_OKAY or RMAP_ERROR
as the 3rd state (or create a new enum like RMAP_NOOP)?
RMAP_OKAY and RMAP_ERROR are used to return the result of set cmd.
We chose to go with RMAP_NOOP (but open to ideas),
as a way to bypass the rmap filter
As a result we have a 3rd state:
State3: Received RMAP_NOOP
Then, proceed to other route-map, otherwise return RMAP_PERMITMATCH by default.
Signed-off-by:Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
The multicast mode enum was a global static in zebra_rib.c
it does not belong there, it belongs in zebra_router, moving.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
BGP always sends down the correct distance to use. We do
not need rib_add_multipath to double check the code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Since these functions are not really rib processing problems
let's move them to zebra_nhg.c which is meant for processing of
nexthop groups.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow route notifications to trigger route state changes,
such as installed -> not installed.
Clean up the fib-specific nexthop-group in a couple of
un-install paths.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Use some common handling for both route update results
processing and dataplane notification processing. Use the
fib-specific nexthop-group if the update to a route results
in different nexthop status than the default rib-provided
nexthop-group.
Use the fib-specific nexthop-group, if present, to provide
the output of 'show ip fib'.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Some updates may be the result of a plugin's actions - such
as an async notification. Add accessor so that we can
identify that an update was generated by a plugin.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When setting route nexthops' installation state based on a
dataplane context struct, unset the installed state if a
nexthop was not present in the dataplane context.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Create a helper api that locates a zebra route-node from info
in a dplane context struct. Moved code from the results handler
to make a more-general api that could be used in other paths.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Add an api to update the status of a route based on info
from a dplane context object. Use the api when processing
route update results from the dataplane.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Add a callback called at start time, once the dplane pthread
and thread_master are available. The callback is optional.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
This code doees this:
a) Imagine ospf installs a route into zebra. Zebra crashes and
we restart FRR. If we are using the -k option on zebra than
all routes are re-read in, including this OSPF route.
b) Now imagine at the same time that zebra is starting backup
ospf on a different router looses a link to the this route.
c) Since zebra was run with -k this OSPF route is read back
in but never replaced and we now have a route pointing out
an interface to other routers that cannot handle it.
We should never allow users to implement bad options from zebra's
perspective that allow them to put themselves into a clear problem
state and additionally we have *absolutely* no mechanism to ever
fix that broken route without special human interaction.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
<Initial Code from Praveen Chaudhary>
Add the a `--graceful_restart X` flag to zebra start that
now creates a timer that pops in X seconds and will go
through and remove all routes that are older than startup.
If graceful_restart is not specified then we will just pop
a timer that cleans everything up immediately.
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow label ignoring when comparing nexthops. Specifically,
add another functon nexthop_same_no_labels() that shares
a path with nexthop_same() but doesn't check labels.
rib_delete() needs to ignore labels in this case.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The functions nexthop_same() does not check the resolved
nexthops so I don't think this function is even needed
anymore.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
This is necessary to avoid a name collision with std::for_each
from C++.
Fixes the compilation of the gRPC northbound module.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
- Today, rtm_table field takes a vrf_id. It should take table_id
- rtm_table field is a uchar field which can only accomodate table_id less than
256. To support table id greater than 255, if the table_id is greater than 255,
set rtm_table to 0 and add RTA_TABLE attribute with 32 bit value as the
table_id.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
- For data plane processing of VxLAN routes, add encap type and L3VNI info to
rtmsg message for FPM.
- Add "RTA_ENCAP_TYPE" attribute for VxLAN encap with value 100.
This value is not currently used for RTA_ENCAP_TYPE for any encap.
- If "RTA_ENCAP_TYPE" is 100, add "RTA_ENCAP" attribute with "RTA_VNI" as a
nested attribute of RTA_ENCAP
Format of RTA_VNI attribute:
Len(2 bytes) type (2 bytes) Value(4 bytes)(VNI)
00 08 : 00 00 : 1000
RTA_VNI attribute is a custom attribute.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
VRRP doesn't install any routes, but should still have an array entry.
Also add a help string for VRRP to route_types.txt
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
We were running into some problems where VRRP is trying to protodown
interfaces that no longer exist. While this is a minor bug in its own
right, this was crashing Zebra because Zebra was not doing a null check
after its ifindex lookup.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Add a check for after table lookup during route map update.
If the table ID does not exist, continue.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The multipath_num global variable was moved into
zrouter.multipath_num but this particular usage
of it was not updated.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
It doesn't make much sense for a hash function to modify its argument,
so const the hash input.
BGP does it in a couple places, those cast away the const. Not great but
not any worse than it was.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
bfd cbit is a value carried out in bfd messages, that permit to keep or
not, the independence between control plane and dataplane. In other
words, while most of the cases plan to flush entries, when bfd goes
down, there are some cases where that bfd event should be ignored. this
is the case with non stop forwarding mechanisms where entries may be
kept. this is the case for BGP, when graceful restart capability is
used. If BFD event down happens, and bgp is in graceful restart mode, it
is wished to ignore the BFD event while waiting for the remote router to
restart.
The changes take into account the following:
- add a config flag across zebra layer so that daemon can set or not the
cbit capability.
- ability for daemons to read the remote bfd capability associated to a bfd
notification.
- in bfdd, according to the value, the cbit value is set
- in bfdd, the received value is retrived and stored in the bfd session
context.
- by default, the local cbit announced to remote is set to 1 while
preservation of the local path is not set.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Make the RIB_*_ROUTE() macro which is passed a route in rib.h just use
the R*_ROUTE() macros that directly check the type in rt.h.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The CLI code ensures that the clippy code produces
valid input for the zebra_routemap.c functions, but
coverity SA does not understand this fact. So add
some asserts to make the coverity SA happy.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
No need to check for non-null ctx at this point in the
function as that it has already been derefed.
Signed-off-by: donald Sharp ,sahrpd@cumulusnetworks.com>
The re->uptime usage of time(NULL) leaves it open to
timing changes from outside influence. Switching
to monotime allows us to ensure that we have a timestamp
that is always increasing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The route_map_event_hook callback was passing the `route_map_event_t`
to each individual interested party. No-one is ever using this data
so let's cut to the chase a bit and remove the pass through of data.
This is considered ok in that the routemap.c code came this way
originally and after 15+ years no-one is using this functionality.
Nor do I see any `easy` way to do anything useful with this data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1. If prefix not found, print "{}" for json
2. Print "Network not in table" for route option
3. Print "Network not in FIB" for fib option
4. Take care of "show ip route/fib vrf all prefix" command.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
if the local sticky mac delete request is received,
if there are associated neighbor entries present, mac's
only local flag is removed and marked as auto mac.
this results in next local mac learning automatically assumes
mac is sticky.
There is a case when bridge learning off is configured, user
configures sticky mac via bridge fdb add.
This MAC learns associated neighbor entry.
Later user deletes stick mac via bridge fdb del, this triggers
frr to delete mac but if there are neighbors present, frr marks
MAC as AUTO but does not remove sticky flag.
User enables bridge learning on which triggers
The mac to learn as dynamic entry and in absence of this
fix, the mac is marked as sticky.
Ticket:CM-24968
Reviewed By:CCR-8683
Testing Done:
Validated broken condition with internally reproduction
with fix and without.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
According to the review comments, added "Network not in FIB" message when we do
not have a FIB route present for given prefix.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
With the previous commit, the zebra_router_score_proto function
became unnecessary, so let us remove it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For each table created by a vrf, keep track of it and
allow for proper cleanup on shutdown of that particular
table. Cleanup client shutdown to only cleanup data
that the particular vrf owns. Before we were cleaning
the same table 2 times.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Combine the zebra_vrf_other_route_table and zebra_vrf_table_with_table_id
functions into 1 function. Since they are basically the same thing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
"show ip/ipv6 route <prefix> [json]" uses a different parser chain from
"show ip/ipv6 route [json]".
"show ip/ipv6 route <prefix> [json]" CLI does not support "fib" option.
Fix:
Add "fib" option to the above command.
The new command is: "show ip/ipv6 <route/fib> <prefix> [json]"
If "fib" option is specified, we will show only the selected routes
(Similar to "show ip/ipv6 fib")
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
messages from daemons to bfd daemons go through zebra. zebra reuses the
vrf identifier to send messages to bfd.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Try to remove any LSPs associated with a vrf when the vrf is
deleted. The vrf code was calling a helpful zebra_mpls api,
but that api was basically a no-op for vrfs other than
the default.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
This command is broken and has been broken since the introduction
of vrf's. Since no-one has complained it is safe to assume that
there is no call for this specialized linux command. Remove
from the system with extreme prejudice.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The rib_add( and rib_delete( functions are there to allow
kernel interactions with the creation of routes. Fixup the
code to be consistent in the passup of the tableid.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we switched to a pthread per client, we lost the ability to
correlate zapi message debugs with their handlers in zlog, because the
message was logged when it was read off the zapi socket and not right
before it was processed. Move the zapi msg hexdump to happen right
before we call the message handler.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The route_info[X].meta_q_map *must* be less than MQ_SIZE
or we will do some strange stuff, so assert on it at startup.
The distance in route_info is a uint8_t so let's keep the data
structure the same.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The ifp pointer must be pointing at a real location
in memory since right above us in this loop we
return if it is.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The route entry code was using a custom linked list to handle
route entries. Remove and replace with the new lib link list
code. This reduces the size of the route entry by a further
8 bytes.
Observant people will notice that the current linked list
implementation is singly linked, while the Route Entry
is doubly linked. I am not terribly concerned about this
change as that 1) we do not see a large number of route
entries per prefix( say 2 maybe 3 items ) and route entries
do not come and go that often.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node. This was taking up
a bunch of memory. Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:
Old:
Memory statistics for zebra:
System allocator statistics:
Total heap allocated: 675 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 567 MiB
Free small blocks: 39 MiB
Free ordinary blocks: 69 MiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
New:
Memory statistics for zebra:
System allocator statistics:
Total heap allocated: 574 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 536 MiB
Free small blocks: 33 MiB
Free ordinary blocks: 4600 KiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies. This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node. This was taking up
a bunch of memory. Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:
Old:
Memory statistics for zebra:
System allocator statistics:
Total heap allocated: 675 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 567 MiB
Free small blocks: 39 MiB
Free ordinary blocks: 69 MiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
New:
Memory statistics for zebra:
System allocator statistics:
Total heap allocated: 574 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 536 MiB
Free small blocks: 33 MiB
Free ordinary blocks: 4600 KiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies. This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a function to check if the route_info array
has all types specified with data in it. Specifically,
test the 'key' attribute for non-zero data. Ignore
ZEBRA_ROUTE_SYSTEM as it should be zero key anyway.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
With flooding control added recently we were not properly handling
the new flood control parameter in zebra_vxlan.c handler functions.
The error message that was being repeatedly seen:
2019/05/01 00:47:32 ZEBRA: [EC 100663311] stream_get2: Attempt to get out of bounds
2019/05/01 00:47:32 ZEBRA: [EC 100663311] &(struct stream): 0x7f0f04001740, size: 22, getp: 22, endp: 22
The fix was to ensure that both the _add and _del functions kept proper
sizing of amount of data read *and* the _del function was not
reading the flood_control data from the stream.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a comment to indicate that route types added to
Zebra, should also be present in the route_info array.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add OpenFabric to the route_info array for handling processing
of the OpenFabric route type.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Problem reported that route-maps applied to "ip protocol table bgp"
would not be invoked if the ip protocol table command was issued
after the bgp prefixes were installed. Found that a recent change
improving how often nexthop_active_update runs missed causing this
filtering to be applied. This fix resolves that issue as well as
a couple of other places that were problematic with the recent
change.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
in the case the vrf backend is vrf-lite, there is no need to have
separate sockets. use a socket located in zrouter, so that when needing
the socket, a common API is used. that API will return the appropriate
socket value.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
when network namespace is used as vrf backend, there is need to have
separate contexts for rtadv contexts.
route advertisements have to look for appropriate interface based on
zvrf context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The alias/description of an interface in linux was being
used to override the internal description. As such let's
fix the display to keep track of both if we have it.
Config in FRR:
!
interface docker0
description another combination
!
interface enp3s0
description BAMBOOZLE ME WILL YOU
!
Config in linux:
sharpd@robot ~/f/zebra> ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
alias This is the loopback you cabbage
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 74:d0:2b:9c:16:eb brd ff:ff:ff:ff:ff:ff
alias HI HI HI
Now the 'show int descr' command:
robot# show int description
Interface Status Protocol Description
docker0 up down another combination
enp3s0 up up BAMBOOZLE ME WILL YOU
HI HI HI
lo up up This is the loopback you cabbage
Fixes: #4191
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Start using the dataplane for interface-address programming,
on netlink platforms. Other platforms just stubbed at this
point.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
These updates act as triggers to pimd to -
1. join the MDT for rxing VxLAN encapsulated BUM traffic
2. register the local-vtep-ip as a source for the MDT
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
An SG entry is added (if one doesn't already exist) when a l2-VNI is
associated with a mcast-grp and local-vtep-ip.
And viceversa; when the last l2-vni using a MDT is removed the SG
entry is deleted.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Based code for adding (S, G) entries. These entries are created when
a mcast-group and local-VTEP-IP is associated with and L2 VNI.
The parent (*, G) entries are created implicitly on the (S, G) addition
and play the role of termination entries.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Each multicast tunnel is associated with a -
1. Tunnel origination mroute that is used for forwarding the
VxLAN encapsulated flow -
S - local VTEP-IP
G - BUM mcast-group
2. And a tunnel termination entry -
S - * (any remote VTEP)
G - BUM mcast-group
Multiple L2 VNIs can share the same BUM mcast group (and local-VTEP-IP).
Zebra maintains an mcast (SG) hash table to pass this info to pimd for
subsequent MDT setup.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Remote VTEPs advertise the flood mode via IMET and the ingress VTEP
needs to perform head-end-replication of BUM packets to it only if the
PMSI tunnel type is set to ingress-replication. If a type-3 route is not
rxed or rxed with a mode other than ingress-replication we can skip
installation of the flood fdb entry for that L2-VNI. In that case the
remote VTEP is either not interested in BUM traffic or is using a
"static-config" based replication mode like PIM.
Sample output with HER -
=======================
root@TORS1:~# vtysh -c "show evpn vni 1000" |grep "Remote\|flood"
Remote VTEPs for this VNI:
27.0.0.8 flood: HER
root@TORS1:~#
Sample output with PIM-SM -
=========================
root@TORS2:~# vtysh -c "show evpn vni 1000" |grep "Remote\|flood"
Remote VTEPs for this VNI:
27.0.0.7 flood: -
root@TORS2:~#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The multicast group ip address for BUM traffic is configurable per-l2-vni.
One way to configure that is to setup a vxlan device that per-l2-vni and
specify the address against that vxlan device -
root@TORS1:~# vtysh -c "show interface vx-1000" |grep -i vxlan
Interface Type Vxlan
VxLAN Id 1000 VTEP IP: 27.0.0.15 Access VLAN Id 1000 Mcast 239.1.1.100
root@TORS1:~# vtysh -c "show evpn vni 1000" |grep Mcast
Mcast group: 239.1.1.100
root@TORS1:~#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Found that zebra_rnh_apply_nht_rmap would set the
NEXTHOP_FLAG_ACTIVE if not blocked by the route-map, even
if the flag was not active prior to the check. This fix
changes the flag used to denote the nexthop is filtered so
that proper active state can be retained. Additionally,
found two cases where we would send invalid nexthops via
send_client, which would also cause this crash. All three
fixed in this commit.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Update the nexthop flag output for the route entry dump to
include all possible flag states be output.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We currently run nexthop_active_check multiple times. Make the
code run once and figure out state from that.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The nexthop_active_update command looks at each individual
nexthop and decides if it has changed. If any nexthop
has changed we will set the re->status to ROUTE_ENTRY_CHANGED
and ROUTE_ENTRY_NEXTHOPS_CHANGED.
Additionally the test for old_nh_num != curr_active
makes no sense because suppose we have several events
we are processing at the same time and a total ecmp
of 16 but 14 are active at the start and 14 are active
at the end but different interfaces are up or down.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>