Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
ignore GETVLAN errors at startup like we are doing
for nexthop groups. Older platforms don't support the API.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Read in STP state changes for a Single Vxlan Device
via bridge vlan netlink messages. Map the vlanid to a
VNI in the SVD table and treat it similar to how
we handle proto down of the Vxlan device traditionally
in a non-SVD device scenario.
Forwarding == Interface UP
Blocking == Interface DOWN
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Encode the vni label during route install on linux
systems via lwt encap 64bit LWTUNNEL_IP_ID. The kernel expects
this in network byte order, so we convert it.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
This allows Zebra to manage QDISC, TCLASS, TFILTER in kernel and do cleaning
jobs when it starts up.
Signed-off-by: Siger Yang <siger.yang@outlook.com>
To resolve link dependencies of unordered interfaces, the commit
`520ebf72b27c2462ce8b0dc5a1d4cb83956df69c` has separated assignment of
`zif->link_ifindex` and `zif->link` from `netlink_interface()` during startup.
The fixup stage of `zebra_if_update_all_links()` goes into the last of
`interface_lookup_netlink()`, it can't be executed in the case of error in
above `netlink_parse_info()`s.
`RTM_GETTUNNEL` is not supported in linux kernel until 5.18, so
`netlink_parse_info()` will throw error with the previous versions.
If two conditions are met, (it is a common case)
1. Interfaces are created before frr restart/start
2. Linux kernel version < 5.18
the link dependencies will not be done, then evpn feature will be broken.
IMO we should just ignore this error.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
This commit implements necessary netlink encoders for traffic control
including QDISC, TCLASS and TFILTER, and adds basic dplane operations.
Co-authored-by: Stephen Worley <sworley@nvidia.com>
Signed-off-by: Siger Yang <siger.yang@outlook.com>
There exists a possibility that an end operator has choosen
to compile FRR on an extremely old KERNEL that does not support
the SOL_NETLINK sockopt call. If so let's note it for them
instead of stuff silently not working.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The usage of SOL_NETLINK for adding memberships of interest is
1 group per call. The netink_socket function implied that
the call could be a bitfield of values. This is not correct
at all. This will trip someone else up in the future when
a new value is needed. Let's get it right `now` before
it becomes a problem.
Let's also add a bit of extra code to give operator a better
understanding of what went wrong when a kernel does not
support the option.
Finally as a point of future reference should FRR just switch
over to a loop to add the required loops instead of having
this bastardized approach of some going in one way and some
going in another way?
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
'bridge vni add vni <id> dev <vxlan device>'
generates new RTM_NEWTUNNEL and RTM_DELTUNNEL
to add or remove vni to l3vxlan device.
Register new RTNLGRP_TUNNEL group to receive
new netlink notification.
Callback for the new RTM_xxxTUNNEL.
kernel patches:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?h=v5.18-rc7&id=7b8135f4df98b155b23754b6065c157861e268f1
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?h=v5.18-rc7&id=f9c4bb0b245cee35ef66f75bf409c9573d934cf9
Ticket:#3073812
Testing Done:
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
The kernel supports l3vxlan device to have (l3vni)
vni filter similar to vlan filtering on bridge device.
To receive netlink notification, FRR to register
for new netlink RTNLGRP_TUNNEL message.
This message required to register via additional
socket option as it's beyond bitmap size.
kernel patches:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?h=v5.18-rc7&id=7b8135f4df98b155b23754b6065c157861e268f1
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?h=v5.18-rc7&id=f9c4bb0b245cee35ef66f75bf409c9573d934cf9
Ticket:#3073812
Testing Done:
Signed-off-by: Chirag Shah <chirag@nvidia.com>
There exists code paths in the linux kernel where a dump command
will be interrupted( I am not sure I understand what this really
means ) and the data sent back from the kernel is wrong or incomplete.
At this point in time I am not 100% certain what should be done, but
let's start noticing that this has happened so we can formulate a plan
or allow the end operator to know bad stuff is a foot at the circle K.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add command for use to set protodown via frr.conf in
the case our default conflicts with another application
they are using.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Add support for setting the protodown reason code.
829eb208e8
These patches handle all our netlink code for setting the reason.
For protodown reason we only set `frr` as the reason externally
but internally we have more descriptive reasoning available via
`show interface IFNAME`. The kernel only provides a bitwidth of 32
that all userspace programs have to share so this makes the most sense.
Since this is new functionality, it needs to be added to the dplane
pthread instead. So these patches, also move the protodown setting we
were doing before into the dplane pthread. For this, we abstract it a
bit more to make it a general interface LINK update dplane API. This
API can be expanded to support gernal link creation/updating when/if
someone ever adds that code.
We also move a more common entrypoint for evpn-mh and from zapi clients
like vrrpd. They both call common code now to set our internal flags
for protodown and protodown reason.
Also add debugging code for dumping netlink packets with
protodown/protodown_reason.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
End operator is reporting that they are receiving buffer overruns
when attempting to read from the kernel receive socket. It is
possible to adjust this size to more modern levels especially
for when the system is under load. Modify the code base
so that *BSD operators can use the zebra `-s XXX` option
to specify a read buffer.
Additionally setup the default receive buffer size on *BSD
to be 128k instead of the 8k so that FRR does not run into
this issue again.
Fixes: #10666
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Use the dataplane to query and read interface NETCONF data;
add netconf-oriented data to the dplane context object, and
add accessors for it. Add handler for incoming update
processing.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Allow self-produced xxxNETCONF netlink messages through the BPF
filter we use. Just like address-configuration actions, we'll
process NETCONF changes in one path, whether the changes were
generated by zebra or by something else in the host OS.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
a) We'll need to pass the info up via some dataplane control method
(This way bsd and linux can both be zebra agnostic of each other)
b) We'll need to modify `struct interface *` to track this data
and when it changes to notify upper level protocols about it.
c) Work is needed to dump the entire mpls state at the start
so we can gather interface state. This should be done
after interface data gathering from the kernel.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
The recently-added hashtable of nlsock objects needs to be
thread-safe: it's accessed from the main and dplane pthreads.
Add a mutex for it, use wrapper apis when accessing it. Add
a per-OS init/terminate api so we can do init that's not
per-vrf or per-namespace.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Currently when the kernel sends netlink messages to FRR
the buffers to receive this data is of fixed length.
The kernel, with certain configurations, will send
netlink messages that are larger than this fixed length.
This leads to situations where, on startup, zebra gets
really confused about the state of the kernel. Effectively
the current algorithm is this:
read up to buffer in size
while (data to parse)
get netlink message header, look at size
parse if you can
The problem is that there is a 32k buffer we read.
We get the first message that is say 1k in size,
subtract that 1k to 31k left to parse. We then
get the next header and notice that the length
of the message is 33k. Which is obviously larger
than what we read in. FRR has no recover mechanism
nor is there a way to know, a priori, what the maximum
size the kernel will send us.
Modify FRR to look at the kernel message and see if the
buffer is large enough, if not, make it large enough to
read in the message.
This code has to be per netlink socket because of the usage
of pthreads. So add to `struct nlsock` the buffer and current
buffer length. Growing it as necessary.
Fixes: #10404
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Store the fd that corresponds to the appropriate `struct nlsock` and pass
that around in the dplane context instead of the pointer to the nlsock.
Modify the kernel_netlink.c code to store in a hash the `struct nlsock`
with the socket fd as the key.
Why do this? The dataplane context is used to pass around the `struct nlsock`
but the zebra code has a bug where the received buffer for kernel netlink
messages from the kernel is not big enough. So we need to dynamically
grow the receive buffer per socket, instead of having a non-dynamic buffer
that we read into. By passing around the fd we can look up the `struct nlsock`
that will soon have the associated buffer and not have to worry about `const`
issues that will arise.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Store and use the sequence number instead of using what is in
the `struct nlsock`. Future commits are going away from storing
the `struct nlsock` and the copy of the nlsock was guaranteeing
unique sequence numbers per message. So let's store the
sequence number to use instead.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When nl_batch_read_resp gets a full on failure -1 or an implicit
ack 0 from the kernel for a batch of code. Let's immediately
mark all of those in the batch pass/fail as needed. Instead
of having them marked else where.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The dataplane pthread only processes a limited set of incoming
netlink notifications: only register for that set of events,
reducing duplicate incoming netlink messages.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Read incoming interface address change notifications in the
dplane pthread; enqueue the events to the main pthread
for processing. This is netlink-only for now - the bsd
kernel socket path remains unchanged.
Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>