Locking around the list of providers/plugins is not
helpful - these only change at init time. Clear some SA
warnings by removing the locking.
Signed-off-by: Mark Stapp <mjs@labn.net>
Read from the fpm dplane a route update that will
include status about whether or not the asic was
successfull in offloading the route.
Have this data passed up to zebra for processing and disseminate
this data as appropriate.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add the initial step of passing in a dplane context
to reading route netlink messages. This code
will be run in two contexts:
a) The normal pthread for reading netlink messages from
the kernel
b) The dplane_fpm_nl pthread.
The goal of this commit is too just allow a) to work
b) will be filled in in the future. Effectively
everything should still be working as it should
pre this change. We will just possibly allow
the passing of the context around( but not used )
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In order for a future commit to abstract the dplane_ctx_route_init
so that the kernel can use it, let's move some stuff around
and add a dplane_ctx_route_init_basic that can be used by multiple
different paths
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
create a dplane_ctx_route_init_basic so it can be used
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Zebra needs the ability to pass this data around.
Add it to the dplanes ability to pass.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
zebra: Add a dplane_ctx_set_flags
The dplane_ctx_set_flags call is missing, we will need it. Add it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If we have this semantics:
int ret = FAILURE;
if (foo)
goto done;
....
done:
return ret;
This pattern does us no favors and makes it harder to figure out what is going
on. Let's remove.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This allows Zebra to manage QDISC, TCLASS, TFILTER in kernel and do cleaning
jobs when it starts up.
Signed-off-by: Siger Yang <siger.yang@outlook.com>
Coverity spotted 3 places where `int ret = XXX` was
being used and FRR was immediately assigning a different
value.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This commit implements necessary netlink encoders for traffic control
including QDISC, TCLASS and TFILTER, and adds basic dplane operations.
Co-authored-by: Stephen Worley <sworley@nvidia.com>
Signed-off-by: Siger Yang <siger.yang@outlook.com>
When reading a on the fly change of an interested netconf netlink
message. The ifindex and ns_id for the context was being set for the sub structure
but not for the main context data structure and zebra_if_dplane_result
was dropping the result on the floor because it was expecting the ns_id and
the interface id to be in a different spot.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Multipath route may have mixed nexthops of EVPN and IP unicast. Move
EVPN flag to nexthop to support such cases.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
The usage of zebra dplane makes the job asyncronous which implies
that a given job will try to add an iptable, while the second job
will not know that its iptable is the same as the former one.
The below exabgp rules stand for two bgp flowspec rules sent to
the bgp device:
flow {
route {match {
source 185.228.172.73/32;
destination 0.0.0.0/0;
source-port >=49156&<=49159;
}then {redirect 213.242.114.113;}}
route {match {
source 185.228.172.73/32;
destination 0.0.0.0/0;
source-port >=49160&<=49163;
}then {redirect 213.242.114.113;}}
}
This rule creates a single iptable, but in fact, the same iptable
name is appended twice. This results in duplicated entries in the
iptables context. This also results in contexts not flushed, when
BGP session or 'flush' operation is performed.
iptables-save:
[..]
-A PREROUTING -m set --match-set match0x55baf4c25cb0 src,src -g match0x55baf4c25cb0
-A PREROUTING -m set --match-set match0x55baf4c25cb0 src,src -g match0x55baf4c25cb0
-A match0x55baf4c25cb0 -j MARK --set-xmark 0x100/0xffffffff
-A match0x55baf4c25cb0 -j ACCEPT
-A match0x55baf4c25cb0 -j MARK --set-xmark 0x100/0xffffffff
-A match0x55baf4c25cb0 -j ACCEPT
[..]
This commit addresses this issue, by checking that an iptable
context is not already being processed. A flag is added in the
original iptable context, and a check is done if the iptable
context is not already being processed for install or uinstall.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Avoid initialization in dplane_ctx_intf_init() so
the compiler can warn us about using unintialized data.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Only clear protodown reason on shutdown/sweep, retain protodown
state.
This is to retain traditional and expected behavior with daemons
like vrrpd setting protodown. They expet it to be set on shutdown
and retained on bring up to prevent traffic from being dropped.
We must cleanup our reason code though to prevent us from blocking
others.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Add enums for set/unset of prodown state to handle the mainthread
knowing an update is already queued without actually marking it
as complete.
This is to make the logic confirm a bit more with other parts of the code
where we queue dplane updates and not update our internal structs until
success callback is received.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Add support for setting the protodown reason code.
829eb208e8
These patches handle all our netlink code for setting the reason.
For protodown reason we only set `frr` as the reason externally
but internally we have more descriptive reasoning available via
`show interface IFNAME`. The kernel only provides a bitwidth of 32
that all userspace programs have to share so this makes the most sense.
Since this is new functionality, it needs to be added to the dplane
pthread instead. So these patches, also move the protodown setting we
were doing before into the dplane pthread. For this, we abstract it a
bit more to make it a general interface LINK update dplane API. This
API can be expanded to support gernal link creation/updating when/if
someone ever adds that code.
We also move a more common entrypoint for evpn-mh and from zapi clients
like vrrpd. They both call common code now to set our internal flags
for protodown and protodown reason.
Also add debugging code for dumping netlink packets with
protodown/protodown_reason.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Use the dataplane to query and read interface NETCONF data;
add netconf-oriented data to the dplane context object, and
add accessors for it. Add handler for incoming update
processing.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Currently when the kernel sends netlink messages to FRR
the buffers to receive this data is of fixed length.
The kernel, with certain configurations, will send
netlink messages that are larger than this fixed length.
This leads to situations where, on startup, zebra gets
really confused about the state of the kernel. Effectively
the current algorithm is this:
read up to buffer in size
while (data to parse)
get netlink message header, look at size
parse if you can
The problem is that there is a 32k buffer we read.
We get the first message that is say 1k in size,
subtract that 1k to 31k left to parse. We then
get the next header and notice that the length
of the message is 33k. Which is obviously larger
than what we read in. FRR has no recover mechanism
nor is there a way to know, a priori, what the maximum
size the kernel will send us.
Modify FRR to look at the kernel message and see if the
buffer is large enough, if not, make it large enough to
read in the message.
This code has to be per netlink socket because of the usage
of pthreads. So add to `struct nlsock` the buffer and current
buffer length. Growing it as necessary.
Fixes: #10404
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Store the fd that corresponds to the appropriate `struct nlsock` and pass
that around in the dplane context instead of the pointer to the nlsock.
Modify the kernel_netlink.c code to store in a hash the `struct nlsock`
with the socket fd as the key.
Why do this? The dataplane context is used to pass around the `struct nlsock`
but the zebra code has a bug where the received buffer for kernel netlink
messages from the kernel is not big enough. So we need to dynamically
grow the receive buffer per socket, instead of having a non-dynamic buffer
that we read into. By passing around the fd we can look up the `struct nlsock`
that will soon have the associated buffer and not have to worry about `const`
issues that will arise.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The ctx->zd_is_update is being set in various
spots based upon the same value that we are
passing into dplane_ctx_ns_init. Let's just
consolidate all this into the dplane_ctx_ns_init
so that the zd_is_udpate value is set at the
same time that we increment the sequence numbers
to use.
As a note for future me's reading this. The sequence
number choosen for the seq number passed to the
kernel is that each context gets a copy of the
appropriate nlsock to use. Since it's a copy
at a point in time, we know we have a unique sequence
number value.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In some cases, zebra may install a nexthop-group id that is
different from the id of the nhe struct attached to a
route-entry. This happens for a singleton recursive nexthop,
for example, where a route is installed with the resolving
nexthop's id.
The installed value is the most useful value - that corresponds
to information in the kernel on linux/netlink platforms that
support nhgs. Display both values if they differ in ascii
output, and include both values in the json form.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The dplane_ctx_get_pbr_ipset_entry function only
failed when the caller did not pass in a valid
usable pointer. Change the code to assert on
a pointer not being passed in and remove the
bool return
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The only time this function ever failed is when
the developer does not pass in a usable pointer
to place the data in. Change it to an assert
to signify to the end developer that is what
we want and then remove all the if checks
for failure
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The function call dplane_ctx_get_pbr_ipset only
returns false when the calling function fails to
pass in a valid ipset pointer. This should
be an assertion issue since it's a programming
issue as opposed to an actual run time issue.
Change the function call parameter to not return
a bool on success/fail for a compile time decision.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
It is needed for the ipset entry to know for which address family
this ipset entry applies to. Actually, the family is in the original
ipset structure and was not passed as attribute in the dataplane
ipset_info structure. Add it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When injecting an ipset entry into the zebra dataplane context, the
ipset name is stored in a separate structure. This will permit the
flowspec plugin to be able to know which ipset has to be appended with
relevant ipset entry.
The problem was that the zebra dataplane objects related to ipset entries
is made up of an union between the ipset structure and the ipset info
structure. This was implying that the two structures were on the same
memory zone, and when extracting the data stored, the data were incomplete.
Fix this by replacing the union structure by a defined struct.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Read incoming interface address change notifications in the
dplane pthread; enqueue the events to the main pthread
for processing. This is netlink-only for now - the bsd
kernel socket path remains unchanged.
Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>