Add a new netlink socket for events coming in from the host OS
to the dataplane system for processing. Rename the existing
outbound dplane socket.
Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>
When NHRP registers to zebra to receive link layer events related to
gre interfaces, then it is interested in receiving also RTM_GETNEIGH
messages.
Fixes ("b3b751046495") nhrpd: link layer registration to notifications
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In order to parse the netlink message into the
`struct rtattr *tb[size]` it is assumed that the buffer is
memset to 0 before the parsing. As such if you attempt
to read a value that was not returned in the message
you will not crash when you test for it.
The code has places were we memset it and places where we don't.
This *will* lead to crashes when the kernel changes. In
our parsing routines let's have them memset instead of having
to remember to do it pre pass in to the parser.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This action is initiated by nhrp and has been stubbed when
moving to zebra. Now, a netlink request is forged to set
the link interface of a gre interface if that gre interface
does not have already a link interface.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Instead of directly configuring the neighbor table after read from zapi
interface, a zebra dplane context is prepared to host the interface and
the family where the neighbor table is updated. Also, some other fields
are hosted: app_probes, ucast_probes, and mcast_probes. More information
on those fields can be found on ip-ntable configuration.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
EVPN neighbor operations were already done in the zebra dataplane
framework. Now that NHRP is able to use zebra to perform neighbor IP
operations (by programming link IP operations), handle this operation
under dataplane framework:
- assign two new operations NEIGH_IP_INSTALL and NEIGH_IP_DELETE; this
is reserved for GRE like interfaces:
example: ip neigh add A.B.C.D lladdr E.F.G.H
- use 'struct ipaddr' to store and encode the link ip address
- reuse dplane_neigh_info, and create an union with mac address
- reuse the protocol type and use it for neighbor operations; this
permits to store the daemon originating this neighbor operation.
a new route type is created: ZEBRA_ROUTE_NEIGH.
- the netlink level functions will handle a pointer, and a type; the
type indicates the family of the pointer: AF_INET or AF_INET6 if the
link type is an ip address, mac address otherwise.
- to keep backward compatibility with old queries, as no extension was
done, an option NEIGH_NO_EXTENSION has been put in place
- also, 2 new state flags are used: NUD_PERMANENT and NUD_FAILED.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This one also needed a bit of shuffling around, but MTYPE_RE is the only
one left used across file boundaries now.
Signed-off-by: David Lamparter <equinox@diac24.net>
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet. Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition. And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...
With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.
Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.
Signed-off-by: David Lamparter <equinox@diac24.net>
like it has been done for iptable contexts, a zebra dplane context is
created for each ipset/ipset entry event. The zebra_dplane_ctx job is
then enqueued and processed by separate thread. Like it has been done
for zebra_pbr_iptable context, the ipset and ipset entry contexts are
encapsulated into an union of structures in zebra_dplane_ctx.
There is a specificity in that when storing ipset_entry structure, there
was a backpointer pointer to the ipset structure that is necessary
to get some complementary information before calling the hook. The
proposal is to use an ipset_entry_info structure next to the ipset_entry,
in the zebra_dplane context. That information is used for ipset_entry
processing. The ipset name and the ipset type are the only fields
necessary.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The iptable processing was not handled in remote dataplane, and was
directly processed by the thread in charge of zapi calls. Now that call
can be handled in the zebra_dplane separate thread. once a
zebra_dplane_ctx is allocated for iptable handling, the hook call is
performed later. Subsequently, a return code may be triggered to zclient
interface if any problem occurs when calling the hook call.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Neither tabs nor newlines are acceptable in syslog messages. They also
break line-based parsing of file logs.
Signed-off-by: David Lamparter <equinox@diac24.net>
Use the new nested NDA_FDB_EXT_ATTRS attribute to control per-fdb
notifications.
PS: The attributes where updated as a part of the kernel upstreaming
hence the change.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This includes -
1. non-DF block filter
2. List of es-peers that need to be blocked per-access port (for
split horizon filtering)
3. Backup nexthop group to failover local-es via the VxLAN overlay
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Fixes the valgrind error we were seeing on startup due to
initializing the msg header struct:
```
==2534283== Thread 3 zebra_dplane:
==2534283== Syscall param recvmsg(msg) points to uninitialised byte(s)
==2534283== at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283== by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283== by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283== by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283== by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283== by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283== by 0x493F5CC: thread_call (thread.c:1585)
==2534283== by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283== by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283== by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283== Address 0x85cd850 is on thread 3's stack
==2534283== in frame #2, created by nl_batch_read_resp (kernel_netlink.c:1051)
==2534283==
==2534283== Syscall param recvmsg(msg.msg_control) points to unaddressable byte(s)
==2534283== at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283== by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283== by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283== by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283== by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283== by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283== by 0x493F5CC: thread_call (thread.c:1585)
==2534283== by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283== by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283== by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283== Address 0xa0 is not stack'd, malloc'd or (recently) free'd
==2534283==
```
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Replace all lib/thread cancel macros, use thread_cancel()
everywhere. Only the THREAD_OFF macro and thread_cancel() api are
supported. Also adjust thread_cancel_async() to NULL caller's pointer (if
present).
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When attempting to limit the amount of data sent from the kernel
to FRR, some kernels we can run against may not have this ability
in which case the setsockopt will fail. Notice that in the log.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add new compile option to enable human readable netlink dumps with
`debug zebra kernel msgdump`.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
It was wrongly assumed that the kernel is replying in batches when multiple
requests fail. The kernel sends one error message at a time, so we can
simply keep reading data from the socket as long as possible.
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
The fuzzing code that is in the master branch is outdated and unused, so it
is worth to remove it to improve readablity of the code.
All the code related to the fuzzing is in the `fuzz` branch.
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
We can make the Linux kernel send an ARP/NDP request by adding
a neighbour with the 'NUD_INCOMPLETE' state and the 'NTF_USE' flag.
This commit adds new dataplane operation as well as new zapi message
to allow other daemons send ARP/NDP requests.
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
* Rename netlink utility functions like addattr to be less ambiguous
* Replace rta_attr_* functions with nl_attr_* since they introduced
inconsistencies in the code
* Add helper functions for adding rtnexthop struct to the Netlink
message
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
The netlink_request function takes a `struct nlmsghdr *`
pointer from a common pattern that we use:
struct {
struct nlmsghdr n;
struct fib_rule_hdr frh;
char buf[NL_PKT_BUF_SIZE];
} req;
We were calling it `netlink_request(Socket, &req.n)`
The problem here is that coverity, rightly so, sees that
we access the data after the nlmsghdr in netlink_request and
tells us we have an read beyond end of the structure. While
we know we haven't mangled anything up here because of manual
inspection coverity doesn't have this knowledge implicitly.
So let's modify the code call to netlink_request to pass in the
void pointer of the req structure itself, cast to the appropriate
data structure in the function and do the right thing. Hopefully
the coverity SA will be happy and we can move on with our life.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The existing usage of the rta_nest and addattr_nest
functions were not adding the NLA_F_NESTED flag
to the type. As such the new nexthop functionality was
actually looking for this flag, while apparently older
code did not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The linux kernel will occassionally send RTM_GETNEIGH when
it expects user space to help in resolution of an ARP entry.
See linux kernel commit:
commit 3e25c65ed085b361cc91a8f02e028f1158c9f255
Author: Tim Gardner <tim.gardner@canonical.com>
Date: Thu Aug 29 06:38:47 2013 -0600
net: neighbour: Remove CONFIG_ARPD
Since we don't care about this, let's just safely ignore this
message for the moment. I imagine in the future we might
care when we implement neighbor managment in the system.
Reported By: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
On startup when we are requesting all nexthop objects
from the kernel and it doesn't support that, we should not
produce an error message.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add the functionality to parse new nexthop group messages
from the kernel and insert them into the appropriate hash
tables. Parsing is done at startup between interface and
interface address lookup. Add functionality to parse
changes to nexthops we already have. Add functionality
to parse delete nexthop messages from the kernel and
remove them from our table.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add some base functionality so we can verify we are getting messages
about nexthops from the kernel.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a bit of extra code to indicate to the operator why
we intentionally rejected a kernel route from being used.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In a variety of places we are using DAEMON_VTY_DIR, convert
to use frr_vtydir. This will allow us in a future commit
to have the -N namespace option be automatically used.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fix the macros for reading NLA attribute info
from an extended error ack. We were processing the data
using route attributes (rtattr) which is identical in size
to nlattr but probably should not be used.
Further, we were incorrectly calculating the length of the
inner netlink message that cause the error. We have to read
passed that in order to access all the nlattr's.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The master thread handler is really part of the zrouter structure.
So let's move it over to that. Eventually zserv.h will only be
used for zapi messages.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Make netlink_request api generic where it can be used
for dump or querying specific information request.
nelink request nlm flags (NLM_F_ROOT | NLM_F_MATCH) are
used to dump purpose, if client wants to query spcific
MAC or IP using netlink_request does not require to set
them.
nlm struct is passed by the caller of netlink_request,
it can also set the nlm request flags.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Use a separate netlink socket for the dataplane's updates, to
avoid races between the dataplane pthread and the zebra main
pthread. Revise zebra shutdown so that the dataplane netlink
socket is cleaned-up later, after all shutdown-time dataplane
work has been done.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Correct use of netlink_parse_info() in the netlink fuzzing path.
Also clarify a couple of comments about pthreads.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynch eventually.
Slide netlink_talk_info into place to remove dependency on core
zebra structs; add accessors for dplane context block
Start init of route context from zebra core re and rn structs;
start queueing and event handling for incoming route updates.
Expose netlink apis that don't rely on zebra core structs;
add parallel route-update code path using the dplane ctx;
simplest possible event loop to process queued route'
updates.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynch eventually.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are displaying data about a netlink message
in debugs or errors, print out the message type
as a string instead of a number.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We were ignoring mpls labels encapped with static routes.
Added support for single and multipath labels.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
NLMSG_NEXT decrements the buffer length (status) by
the header msg length (nlmsg_len) everytime its called.
If nlmsg_len isn't accurate and set to be larger than
what it should represent, it will cause status to
decrement passed 0. This makes NLMSG_NEXT return a
pointer that references an inaccessible address.
When that is passed to NLMSG_OK, it segfaults.
Add a check to verify that there is still something to read
before we try to.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Change the fuzzing code so that it fakes data from
the listening socket rather than using its own pseudo one.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
This code allows you to fuzz the netlink listening socket
in zebra by --enable-fuzzing and passing the -w [FILE]
option when running zebra.
File collection is stored in /var/run/frr/netlink_*
where each number is just a counter to keep the
files distinct.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Kernel requests via netlink are synchronous.
Therefore we do not need to specify a need for a ACK and
we can make the netlink_cmd NONBLOCKING
1) If the netlink message is going to cause an error
we will still get one. Since results from the kernel
are synchronous we will get the error message on the
netlink_cmd socket and handle it
2) If the netlink message is going to send more than
one packet we will still get them all. Since the results
from the kernel are synchronous we will receive all data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When creating a netlink_socket, listen to error
codes and abandon ship if it crashes and burns.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add 'const' to prefix args to several zebra route update,
redistribution, and route owner notification apis.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
This correction fixes two bugs detected by Clang scan:
Bug Group: Dead store
Bug Type: Dead assignment
File: zebra/kernel_netlink.c
Function: netlink_parse_extended_ack
Line: 548
Bug Type: Dead increment
File: isisd/isis_lsp.c
Function: lsp_bits2string
Line: 625
Signed-off-by: F. Aragon <paco@voltanet.io>
When a filter function fails to work correctly, we get an
error message that something has gone wrong. Unfortunately
we may not have any clues as to where the decode failure
happened. Add a backtrace to give us a clue.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add code to request and read in extended ack information
to provide a bit more context of what went wrong when
a failure is detected in the kernel.
Example of a failed delete:
Jun 20 21:19:25 robot zebra[11878]: Extended Error: Invalid prefix for given prefix length
Jun 20 21:19:25 robot zebra[11878]: netlink-cmd (NS 0) error: Invalid argument, type=RTM_DELROUTE(25), seq=8, pid=4078403400
Jun 20 21:19:25 robot zebra[11878]: 0:4.3.2.0/24: Route Deletion failure
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The re-use of RTPROT_STATIC has caused too many collisions
where other legitimate route sources are causing us to
believe we are the originator of the route. Modify
the code so that if another protocol inserts RTPROT_STATIC
we will assume it's a Kernel Route.
Fixes: #2293
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fix the code so that we would actually start receiving
RULE netlink notifications.
The Kernel expects the long long to be a bit field
value, while the newer netlink message types are
an enum. So we need to convert the message type
number to a bit position and set that value.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move where we check for non-kernel netlink messages to
a slightly earlier spot. This will allow in subsuquent
commits the removal of an extra parameter that needs to
be passed around.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>