Commit Graph

166 Commits

Author SHA1 Message Date
Donald Sharp
fe953d7cde zebra: Correct implication of SOL_NETLINK NETLINK_ADD_MEMBERSHIP usage
The usage of SOL_NETLINK for adding memberships of interest is
1 group per call.  The netink_socket function implied that
the call could be a bitfield of values.  This is not correct
at all.  This will trip someone else up in the future when
a new value is needed.  Let's get it right `now` before
it becomes a problem.

Let's also add a bit of extra code to give operator a better
understanding of what went wrong when a kernel does not
support the option.

Finally as a point of future reference should FRR just switch
over to a loop to add the required loops instead of having
this bastardized approach of some going in one way and some
going in another way?

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-06-30 07:57:55 -04:00
Chirag Shah
acc8e68720 zebra: netlink rtm tunnel msg parsing
'bridge vni add vni <id> dev <vxlan device>'
generates new RTM_NEWTUNNEL and RTM_DELTUNNEL
to add or remove vni to l3vxlan device.

Register new RTNLGRP_TUNNEL group to receive
new netlink notification.
Callback for the new RTM_xxxTUNNEL.

kernel patches:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?h=v5.18-rc7&id=7b8135f4df98b155b23754b6065c157861e268f1

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?h=v5.18-rc7&id=f9c4bb0b245cee35ef66f75bf409c9573d934cf9

Ticket:#3073812
Testing Done:

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-06-24 07:33:34 -04:00
Chirag Shah
e5b1de8a11 zebra: add error check condition to sock option
Adding error checking condition which was missed
in PR-11216.

*** CID 1517953:  Error handling issues  (CHECKED_RETURN)
/zebra/kernel_netlink.c: 313 in netlink_socket()
307                     memset(&snl, 0, sizeof(snl));
308                     snl.nl_family = AF_NETLINK;
309                     snl.nl_groups = groups;
310
311     #if defined SOL_NETLINK
312                     if (ext_groups)
>>>     CID 1517953:  Error handling issues  (CHECKED_RETURN)
>>>     Calling "setsockopt(sock, 270, 1, &ext_groups, 8U)" without checking return value. This library function may fail and return an error code.
313                             setsockopt(sock, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
314                                        &ext_groups, sizeof(ext_groups));
315     #endif
316
317                     /* Bind the socket to the netlink structure for anything. */
318                     ret = bind(sock, (struct sockaddr *)&snl, sizeof(snl));

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-05-31 13:50:48 -07:00
Sri Mohana Singamsetty
bde51e807f
Merge pull request #11216 from chiragshah6/fdev2
zebra: netlink registry of rtm tunnel notification
2022-05-19 10:28:25 -07:00
Donald Sharp
1b3cf91b0c zebra: Fix newline in log message
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-05-18 14:42:22 -04:00
Chirag Shah
47e2eb270d zebra: netlink registry rtm tunnel notif
The kernel supports l3vxlan device to have (l3vni)
vni filter similar to vlan filtering on bridge device.

To receive netlink notification, FRR to register
for new netlink RTNLGRP_TUNNEL message.
This message required to register via additional
socket option as it's beyond bitmap size.

kernel patches:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?h=v5.18-rc7&id=7b8135f4df98b155b23754b6065c157861e268f1

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?h=v5.18-rc7&id=f9c4bb0b245cee35ef66f75bf409c9573d934cf9

Ticket:#3073812
Testing Done:

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-05-18 07:56:35 -07:00
Chirag Shah
f8f3e484d4 zebra: new netlink parse utility for rta
Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-05-16 10:45:14 -07:00
Donald Sharp
2f71996a68 zebra: Note when the netlink DUMP command is interrupted
There exists code paths in the linux kernel where a dump command
will be interrupted( I am not sure I understand what this really
means ) and the data sent back from the kernel is wrong or incomplete.

At this point in time I am not 100% certain what should be done, but
let's start noticing that this has happened so we can formulate a plan
or allow the end operator to know bad stuff is a foot at the circle K.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-25 19:08:14 -04:00
Stephen Worley
c40e1b1cfb zebra: add command for setting protodown bit
Add command for use to set protodown via frr.conf in
the case our default conflicts with another application
they are using.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 17:52:44 -05:00
Stephen Worley
5d41413833 zebra: add support for protodown reason code
Add support for setting the protodown reason code.

829eb208e8

These patches handle all our netlink code for setting the reason.

For protodown reason we only set `frr` as the reason externally
but internally we have more descriptive reasoning available via
`show interface IFNAME`. The kernel only provides a bitwidth of 32
that all userspace programs have to share so this makes the most sense.

Since this is new functionality, it needs to be added to the dplane
pthread instead. So these patches, also move the protodown setting we
were doing before into the dplane pthread. For this, we abstract it a
bit more to make it a general interface LINK update dplane API. This
API can be expanded to support gernal link creation/updating when/if
someone ever adds that code.

We also move a more common entrypoint for evpn-mh and from zapi clients
like vrrpd. They both call common code now to set our internal flags
for protodown and protodown reason.

Also add debugging code for dumping netlink packets with
protodown/protodown_reason.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 17:52:44 -05:00
Donald Sharp
9fb83b5506 zebra: Allow *BSD to specify a receive buffer size
End operator is reporting that they are receiving buffer overruns
when attempting to read from the kernel receive socket.  It is
possible to adjust this size to more modern levels especially
for when the system is under load.  Modify the code base
so that *BSD operators can use the zebra `-s XXX` option
to specify a read buffer.

Additionally setup the default receive buffer size on *BSD
to be 128k instead of the 8k so that FRR does not run into
this issue again.

Fixes: #10666
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-27 07:47:58 -05:00
Mark Stapp
cd787a8a45 zebra: use dataplane to read interface NETCONF info
Use the dataplane to query and read interface NETCONF data;
add netconf-oriented data to the dplane context object, and
add accessors for it. Add handler for incoming update
processing.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 10:18:32 -05:00
Mark Stapp
728f2017ae zebra: add dplane type for NETCONF data
Add a new dplane op for interface NETCONF data; add the new
enum value to several switch statements.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Mark Stapp
9f3f1486c8 zebra: add xxxNETCONF messages to the netlink BPF filter
Allow self-produced xxxNETCONF netlink messages through the BPF
filter we use. Just like address-configuration actions, we'll
process NETCONF changes in one path, whether the changes were
generated by zebra or by something else in the host OS.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Donald Sharp
ebb61fcaf5 zebra: Start of work to get data about mpls from kernel
a) We'll need to pass the info up via some dataplane control method
(This way bsd and linux can both be zebra agnostic of each other)

b) We'll need to modify `struct interface *` to track this data
and when it changes to notify upper level protocols about it.

c) Work is needed to dump the entire mpls state at the start
so we can gather interface state.  This should be done
after interface data gathering from the kernel.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Donald Sharp
cc9f21da22 *: Change thread->func to return void instead of int
The int return value is never used.  Modify the code
base to just return a void instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-23 19:56:04 -05:00
Jafar Al-Gharaibeh
76d8e1a4a7
Merge pull request #10561 from mjstapp/nlsock_hash_lock
zebra: make netlink object hash threadsafe
2022-02-16 13:11:21 -06:00
Donald Sharp
b9d95135a8 zebra: Fix spelling mistake
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-14 12:56:44 -05:00
Mark Stapp
348698095d zebra: make netlink object hash threadsafe
The recently-added hashtable of nlsock objects needs to be
thread-safe: it's accessed from the main and dplane pthreads.
Add a mutex for it, use wrapper apis when accessing it. Add
a per-OS init/terminate api so we can do init that's not
per-vrf or per-namespace.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-11 17:03:26 -05:00
Donald Sharp
2cf7651f0b zebra: Make netlink buffer reads resizeable when needed
Currently when the kernel sends netlink messages to FRR
the buffers to receive this data is of fixed length.
The kernel, with certain configurations, will send
netlink messages that are larger than this fixed length.
This leads to situations where, on startup, zebra gets
really confused about the state of the kernel.  Effectively
the current algorithm is this:

read up to buffer in size
while (data to parse)
     get netlink message header, look at size
        parse if you can

The problem is that there is a 32k buffer we read.
We get the first message that is say 1k in size,
subtract that 1k to 31k left to parse.  We then
get the next header and notice that the length
of the message is 33k.  Which is obviously larger
than what we read in.  FRR has no recover mechanism
nor is there a way to know, a priori, what the maximum
size the kernel will send us.

Modify FRR to look at the kernel message and see if the
buffer is large enough, if not, make it large enough to
read in the message.

This code has to be per netlink socket because of the usage
of pthreads.  So add to `struct nlsock` the buffer and current
buffer length.  Growing it as necessary.

Fixes: #10404
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-08 17:28:19 -05:00
Donald Sharp
d4000d7ba3 zebra: Remove struct nlsock from dataplane information and use int fd
Store the fd that corresponds to the appropriate `struct nlsock` and pass
that around in the dplane context instead of the pointer to the nlsock.
Modify the kernel_netlink.c code to store in a hash the `struct nlsock`
with the socket fd as the key.

Why do this?  The dataplane context is used to pass around the `struct nlsock`
but the zebra code has a bug where the received buffer for kernel netlink
messages from the kernel is not big enough.  So we need to dynamically
grow the receive buffer per socket, instead of having a non-dynamic buffer
that we read into.  By passing around the fd we can look up the `struct nlsock`
that will soon have the associated buffer and not have to worry about `const`
issues that will arise.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-08 17:28:19 -05:00
Donald Sharp
3670f5047c zebra: Store the sequence number to use as part of the dp_info
Store and use the sequence number instead of using what is in
the `struct nlsock`.  Future commits are going away from storing
the `struct nlsock` and the copy of the nlsock was guaranteeing
unique sequence numbers per message.  So let's store the
sequence number to use instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-08 17:28:19 -05:00
Jafar Al-Gharaibeh
4333379fca
Merge pull request #9926 from donaldsharp/update_issues
zebra: Fix v6 route replace failure turned into success
2022-02-04 19:40:55 -06:00
Donald Sharp
c8453cd77e zebra: Fix v6 route replace failure turned into success
Currently when we have a route replace operation for v6 routes
with a new nexthop group the order of kernel installation is this:

a) New nexthop group insertion seq  1
b) Route delete operation seq 3
c) Route insertion operation seq 2

Currently the code in nl_batch_read_resp is attempting
to handle this situation by skipping the delete operation.
*BUT* it is enqueuing the context into the zebra dplane
queue before we read the response.  Since we create the ctx
with an implied success, success is being reported to the
upper level dplane and the zebra rib thinks the route has
been properly handled.

This is showing up in the zebra_seg6_route test code because
the test code is installing a seg6 route w/ sharpd and it
is failing to install because the route's nexthop is rejected:

First installation:

2021/10/29 09:28:10.218 ZEBRA: [JGWSB-SMNVE] dplane: incoming new work counter: 2
2021/10/29 09:28:10.218 ZEBRA: [Q52A7-211QJ] dplane enqueues 2 new work to provider 'Kernel'
2021/10/29 09:28:10.218 ZEBRA: [JVY1P-93VFY] dplane provider 'Kernel': processing
2021/10/29 09:28:10.218 ZEBRA: [TX9N0-9JKDF] ID (9) Dplane nexthop update ctx 0x56125390a820 op NH_INSTALL
2021/10/29 09:28:10.218 ZEBRA: [PM9ZJ-07RCP] 0:1::1/128 Dplane route update ctx 0x56125390add0 op ROUTE_INSTALL
2021/10/29 09:28:10.218 ZEBRA: [TJ327-ET8HE] netlink_send_msg: >> netlink message dump [sent]
2021/10/29 09:28:10.218 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=104 type=(104) NEWNEXTHOP flags=(0x0501) {REQUEST,DUMP,(ROOT|REPLACE|CAPPED),(ATOMIC|CREATE)} seq=9 pid=3539131282]
2021/10/29 09:28:10.218 ZEBRA: [WCX94-SW894]   nhm [family=(10) AF_INET6 scope=(0) UNIVERSE protocol=(11) ZEBRA flags=0x00000000 {}]
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(1) ID]
2021/10/29 09:28:10.218 ZEBRA: [Z4E9C-GD9EP]       9
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(6) GATEWAY]
2021/10/29 09:28:10.218 ZEBRA: [STTSM-27M81]       2001::1
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(5) OIF]
2021/10/29 09:28:10.218 ZEBRA: [JR4EA-BKPTA]       6
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=6 (payload=2) type=(7) ENCAP_TYPE]
2021/10/29 09:28:10.218 ZEBRA: [JR4EA-BKPTA]       5
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=36 (payload=32) type=(32776) UNKNOWN]
2021/10/29 09:28:10.218 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=64 type=(24) NEWROUTE flags=(0x0401) {REQUEST,(ATOMIC|CREATE)} seq=10 pid=3539131282]
2021/10/29 09:28:10.218 ZEBRA: [GCEGC-W8YBF]   rtmsg [family=(10) AF_INET6 dstlen=128 srclen=0 tos=0 table=254 protocol=(194) UNKNOWN scope=(0) UNIVERSE type=(1) UNICAST flags=0x0000 {}]
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(1) DST]
2021/10/29 09:28:10.218 ZEBRA: [STTSM-27M81]       1::1
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(6) PRIORITY]
2021/10/29 09:28:10.218 ZEBRA: [Z4E9C-GD9EP]       20
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(30) NH_ID]
2021/10/29 09:28:10.218 ZEBRA: [Z4E9C-GD9EP]       9
2021/10/29 09:28:10.218 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:28:10.218 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=76 type=(2) ERROR flags=(0x0300) {DUMP,(ROOT|REPLACE|CAPPED),(MATCH|EXCLUDE|ACK_TLVS)} seq=9 pid=3539131282]
2021/10/29 09:28:10.218 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-22) Invalid argument]
2021/10/29 09:28:10.218 ZEBRA: [HSYZM-HV7HF] Extended Error: Gateway can not be a local address
2021/10/29 09:28:10.218 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=9, pid=3539131282
2021/10/29 09:28:10.218 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:28:10.218 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=68 type=(2) ERROR flags=(0x0300) {DUMP,(ROOT|REPLACE|CAPPED),(MATCH|EXCLUDE|ACK_TLVS)} seq=10 pid=3539131282]
2021/10/29 09:28:10.218 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-22) Invalid argument]
2021/10/29 09:28:10.218 ZEBRA: [HSYZM-HV7HF] Extended Error: Nexthop id does not exist
2021/10/29 09:28:10.218 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWROUTE(24), seq=10, pid=3539131282
2021/10/29 09:28:10.218 ZEBRA: [VCDW6-A7ZF1] dplane dequeues 2 completed work from provider Kernel
2021/10/29 09:28:10.218 ZEBRA: [JTWAB-1MH4Y] dplane has 2 completed, 0 errors, for zebra main
2021/10/29 09:28:10.218 ZEBRA: [J7K9Z-9M7DT] Nexthop dplane ctx 0x56125390a820, op NH_INSTALL, nexthop ID (9), result FAILURE
2021/10/29 09:28:10.218 ZEBRA: [P2XBZ-RAFQ5][EC 4043309074] Failed to install Nexthop ID (9) into the kernel
2021/10/29 09:28:10.218 ZEBRA: [RMK34-61HV5] default(0:254):1::1/128 Processing dplane result ctx 0x56125390add0, op ROUTE_INSTALL result FAILURE

Note the last line `op ROUTE_INSTALL result FAILURE` because we are attempting to use a
a gw nexthop that is local.  This is the result.

Then the test code was installing the route again:

2021/10/29 09:30:00.493 ZEBRA: [JGWSB-SMNVE] dplane: incoming new work counter: 2
2021/10/29 09:30:00.493 ZEBRA: [Q52A7-211QJ] dplane enqueues 2 new work to provider 'Kernel'
2021/10/29 09:30:00.493 ZEBRA: [JVY1P-93VFY] dplane provider 'Kernel': processing
2021/10/29 09:30:00.493 ZEBRA: [TX9N0-9JKDF] ID (9) Dplane nexthop update ctx 0x561253916a00 op NH_INSTALL
2021/10/29 09:30:00.493 ZEBRA: [PM9ZJ-07RCP] 0:1::1/128 Dplane route update ctx 0x561253915f40 op ROUTE_UPDATE
2021/10/29 09:30:00.493 ZEBRA: [TJ327-ET8HE] netlink_send_msg: >> netlink message dump [sent]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=104 type=(104) NEWNEXTHOP flags=(0x0501) {REQUEST,DUMP,(ROOT|REPLACE|CAPPED),(ATOMIC|CREATE)} seq=11 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [WCX94-SW894]   nhm [family=(10) AF_INET6 scope=(0) UNIVERSE protocol=(11) ZEBRA flags=0x00000000 {}]
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(1) ID]
2021/10/29 09:30:00.493 ZEBRA: [Z4E9C-GD9EP]       9
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(6) GATEWAY]
2021/10/29 09:30:00.493 ZEBRA: [STTSM-27M81]       2001::1
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(5) OIF]
2021/10/29 09:30:00.493 ZEBRA: [JR4EA-BKPTA]       6
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=6 (payload=2) type=(7) ENCAP_TYPE]
2021/10/29 09:30:00.493 ZEBRA: [JR4EA-BKPTA]       5
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=36 (payload=32) type=(32776) UNKNOWN]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=56 type=(25) DELROUTE flags=(0x0401) {REQUEST,(ATOMIC|CREATE)} seq=13 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [GCEGC-W8YBF]   rtmsg [family=(10) AF_INET6 dstlen=128 srclen=0 tos=0 table=254 protocol=(194) UNKNOWN scope=(0) UNIVERSE type=(0) UNSPEC flags=0x0000 {}]
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(1) DST]
2021/10/29 09:30:00.493 ZEBRA: [STTSM-27M81]       1::1
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(6) PRIORITY]
2021/10/29 09:30:00.493 ZEBRA: [Z4E9C-GD9EP]       20
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=64 type=(24) NEWROUTE flags=(0x0401) {REQUEST,(ATOMIC|CREATE)} seq=12 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [GCEGC-W8YBF]   rtmsg [family=(10) AF_INET6 dstlen=128 srclen=0 tos=0 table=254 protocol=(194) UNKNOWN scope=(0) UNIVERSE type=(1) UNICAST flags=0x0000 {}]
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(1) DST]
2021/10/29 09:30:00.493 ZEBRA: [STTSM-27M81]       1::1
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(6) PRIORITY]
2021/10/29 09:30:00.493 ZEBRA: [Z4E9C-GD9EP]       20
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(30) NH_ID]
2021/10/29 09:30:00.493 ZEBRA: [Z4E9C-GD9EP]       9
2021/10/29 09:30:00.493 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=76 type=(2) ERROR flags=(0x0300) {DUMP,(ROOT|REPLACE|CAPPED),(MATCH|EXCLUDE|ACK_TLVS)} seq=11 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-22) Invalid argument]
2021/10/29 09:30:00.493 ZEBRA: [HSYZM-HV7HF] Extended Error: Gateway can not be a local address
2021/10/29 09:30:00.493 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=11, pid=3539131282
2021/10/29 09:30:00.493 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=36 type=(2) ERROR flags=(0x0100) {DUMP,(ROOT|REPLACE|CAPPED)} seq=13 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-3) No such process]
2021/10/29 09:30:00.493 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=68 type=(2) ERROR flags=(0x0300) {DUMP,(ROOT|REPLACE|CAPPED),(MATCH|EXCLUDE|ACK_TLVS)} seq=12 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-22) Invalid argument]
2021/10/29 09:30:00.493 ZEBRA: [VCDW6-A7ZF1] dplane dequeues 2 completed work from provider Kernel
2021/10/29 09:30:00.493 ZEBRA: [JTWAB-1MH4Y] dplane has 2 completed, 0 errors, for zebra main
2021/10/29 09:30:00.493 ZEBRA: [J7K9Z-9M7DT] Nexthop dplane ctx 0x561253916a00, op NH_INSTALL, nexthop ID (9), result FAILURE
2021/10/29 09:30:00.493 ZEBRA: [P2XBZ-RAFQ5][EC 4043309074] Failed to install Nexthop ID (9) into the kernel
2021/10/29 09:30:00.493 ZEBRA: [RMK34-61HV5] default(0:254):1::1/128 Processing dplane result ctx 0x561253915f40, op ROUTE_UPDATE result SUCCESS

Note that this time we do these three operations

a) nexthop installation seq 11
b) route delete seq 13
c) route add seq 12

Note the last line, we report the install as a success but it clearly failed from the seq=12 decode.
When we look at the v6 rib it thinks it is installed:

unet> r1 show ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

D>* 1::1/128 [150/0] via 2001::1, dum0, seg6local unspec unknown(seg6local_context2str), seg6 a::, weight 1, 00:00:17

So let's modify nl_batch_read_resp to not dequeue/enqueue the context until we are sure we have
the right one.  This fixes the test code to do the right thing on the second installation.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-04 15:33:58 -05:00
Donald Sharp
00249e255e zebra: When we get an implicit or ack or full failure mark status
When nl_batch_read_resp gets a full on failure -1 or an implicit
ack 0 from the kernel for a batch of code.  Let's immediately
mark all of those in the batch pass/fail as needed.  Instead
of having them marked else where.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-04 15:33:58 -05:00
Mark Stapp
ceab66b7f4 zebra: reduce incoming netlink messages for dplane thread
The dataplane pthread only processes a limited set of incoming
netlink notifications: only register for that set of events,
reducing duplicate incoming netlink messages.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-01 13:43:51 -05:00
David Lamparter
17a4c65576 zebra: remove netlink buffer size log message
... really not much point in printing this.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-01-17 09:46:19 +01:00
Donald Sharp
9bfadae860 zebra: Use a bool for startup indications
Let's not pass around an int startup when all we are doing
is true/falsing it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-10-04 20:26:38 -04:00
Mark Stapp
d166308be0 zebra: use the dataplane to read netlink intf addr changes
Read incoming interface address change notifications in the
dplane pthread; enqueue the events to the main pthread
for processing. This is netlink-only for now - the bsd
kernel socket path remains unchanged.

Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>
2021-09-14 11:07:30 -04:00
Mark Stapp
9d59df634c zebra: add new dplane op codes for interface addr events
Add new dplane op values for incoming interface address add
and delete events.

Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>
2021-09-14 11:07:30 -04:00
Mark Stapp
ff45112c07 zebra: use uint32_t instead of __u32
Use more consistent int type.

Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>
2021-09-14 10:31:45 -04:00
Mark Stapp
80dcc38831 zebra: add inbound netlink socket for dataplane
Add a new netlink socket for events coming in from the host OS
to the dataplane system for processing. Rename the existing
outbound dplane socket.

Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>
2021-09-14 10:31:45 -04:00
Philippe Guibert
7a52f27e75 zebra: RTM_GETNEIGH messages may be used by nhrp
When NHRP registers to zebra to receive link layer events related to
gre interfaces, then it is interested in receiving also RTM_GETNEIGH
messages.

Fixes ("b3b751046495") nhrpd: link layer registration to notifications

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-08-17 09:07:31 +02:00
Donald Sharp
94d70a6533 zebra: Add nl_attr_put8 so we can put uint8_t in netlink messages
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-07-08 11:12:46 -04:00
Donald Sharp
269b69d703 zebra: memset the struct rtattr *tb[SIZE] in setting function
In order to parse the netlink message into the
`struct rtattr *tb[size]` it is assumed that the buffer is
memset to 0 before the parsing.  As such if you attempt
to read a value that was not returned in the message
you will not crash when you test for it.

The code has places were we memset it and places where we don't.
This *will* lead to crashes when the kernel changes.  In
our parsing routines let's have them memset instead of having
to remember to do it pre pass in to the parser.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-05-11 20:05:51 -04:00
Philippe Guibert
62b4b7e44a zebra: new dplane action to set gre link interface
This action is initiated by nhrp and has been stubbed when
moving to zebra. Now, a netlink request is forged to set
the link interface of a gre interface if that gre interface
does not have already a link interface.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-04-30 10:33:18 +02:00
Philippe Guibert
e18747a967 zebra: move neighbor table configuration to dplane contexts
Instead of directly configuring the neighbor table after read from zapi
interface, a zebra dplane context is prepared to host the interface and
the family where the neighbor table is updated. Also, some other fields
are hosted: app_probes, ucast_probes, and mcast_probes. More information
on those fields can be found on ip-ntable configuration.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-04-09 18:29:58 +02:00
Philippe Guibert
0a27a2fef5 zebra, lib: handle NEIGH_ADD/DELETE to zebra dataplane framework
EVPN neighbor operations were already done in the zebra dataplane
framework. Now that NHRP is able to use zebra to perform neighbor IP
operations (by programming link IP operations), handle this operation
under dataplane framework:
- assign two new operations NEIGH_IP_INSTALL and NEIGH_IP_DELETE; this
is reserved for GRE like interfaces:
example: ip neigh add A.B.C.D lladdr E.F.G.H
- use 'struct ipaddr' to store and encode the link ip address
- reuse dplane_neigh_info, and create an union with mac address
- reuse the protocol type and use it for neighbor operations; this
permits to store the daemon originating this neighbor operation.
a new route type is created: ZEBRA_ROUTE_NEIGH.
- the netlink level functions will handle a pointer, and a type; the
type indicates the family of the pointer: AF_INET or AF_INET6 if the
link type is an ip address, mac address otherwise.
- to keep backward compatibility with old queries, as no extension was
done, an option NEIGH_NO_EXTENSION has been put in place
- also, 2 new state flags are used: NUD_PERMANENT and NUD_FAILED.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-04-09 18:29:58 +02:00
David Lamparter
224ccf29d9 zebra: kill zebra_memory.h, use MTYPE_STATIC
This one also needed a bit of shuffling around, but MTYPE_RE is the only
one left used across file boundaries now.

Signed-off-by: David Lamparter <equinox@diac24.net>
2021-03-22 20:02:17 +01:00
David Lamparter
bf8d3d6aca *: require semicolon after DEFINE_MTYPE & co
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet.  Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition.  And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...

With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.

Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.

Signed-off-by: David Lamparter <equinox@diac24.net>
2021-03-17 06:18:17 +01:00
Philippe Guibert
ef524230a6 zebra: move ipset and ipset_entry to zebra dplane contexts
like it has been done for iptable contexts, a zebra dplane context is
created for each ipset/ipset entry event. The zebra_dplane_ctx job is
then enqueued and processed by separate thread. Like it has been done
for zebra_pbr_iptable context, the ipset and ipset entry contexts are
encapsulated into an union of structures in zebra_dplane_ctx.

There is a specificity in that when storing ipset_entry structure, there
was a backpointer pointer to the ipset structure that is necessary
to get some complementary information before calling the hook. The
proposal is to use an ipset_entry_info structure next to the ipset_entry,
in the zebra_dplane context. That information is used for ipset_entry
processing. The ipset name and the ipset type are the only fields
 necessary.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-03-10 14:57:32 +01:00
Philippe Guibert
5162e00045 zebra: move iptable handling in zebra_dplane
The iptable processing was not handled in remote dataplane, and was
directly processed by the thread in charge of zapi calls. Now that call
can be handled in the zebra_dplane separate thread. once a
zebra_dplane_ctx is allocated for iptable handling, the hook call is
performed later. Subsequently, a return code may be triggered to zclient
interface if any problem occurs when calling the hook call.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-03-04 11:50:25 +01:00
David Lamparter
1d5453d607 *: remove tabs & newlines from log messages
Neither tabs nor newlines are acceptable in syslog messages.  They also
break line-based parsing of file logs.

Signed-off-by: David Lamparter <equinox@diac24.net>
2021-02-14 15:36:51 +01:00
Mark Stapp
4c99d413e6 zebra: debug messages go under conditionals
Move a couple of unprotected debug calls in the netlink code
under DEBUG_KERNEL.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2021-01-26 12:29:39 -05:00
Patrick Ruddy
f87fe77aeb
Merge pull request #7723 from AnuradhaKaruppiah/fdb-ext-attrs
zebra: move from NDA_NOTIFY to NDA_FDB_EXT_ATTRS
2021-01-19 16:27:54 +00:00
Stephen Worley
3bece1e0e3
Merge pull request #7162 from opensourcerouting/zebra-human-netlink
zebra: human readable netlink dumps
2020-12-14 14:03:35 -05:00
Nikolay Aleksandrov
4bcdb6086c zebra: move from NDA_NOTIFY to NDA_FDB_EXT_ATTRS
Use the new nested NDA_FDB_EXT_ATTRS attribute to control per-fdb
notifications.

PS: The attributes where updated as a part of the kernel upstreaming
hence the change.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-11 12:13:36 -08:00
Anuradha Karuppiah
c60522f702 zebra: dplane APIs for programming evpn-mh access port attributes
This includes -
1. non-DF block filter
2. List of es-peers that need to be blocked per-access port (for
split horizon filtering)
3. Backup nexthop group to failover local-es via the VxLAN overlay

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-10-26 10:32:51 -07:00
Mark Stapp
33fa4b14db
Merge pull request #7382 from sworleys/Fix-Msg-Buff
zebra: fix unitialized msg header reading at startup
2020-10-23 18:05:04 -04:00
Stephen Worley
9d06e1219a zebra: fix unitialized msg header reading at startup
Fixes the valgrind error we were seeing on startup due to
initializing the msg header struct:

```
==2534283== Thread 3 zebra_dplane:
==2534283== Syscall param recvmsg(msg) points to uninitialised byte(s)
==2534283==    at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283==    by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283==    by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283==    by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283==    by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283==    by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283==    by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283==    by 0x493F5CC: thread_call (thread.c:1585)
==2534283==    by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283==    by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283==    by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283==    by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283==  Address 0x85cd850 is on thread 3's stack
==2534283==  in frame #2, created by nl_batch_read_resp (kernel_netlink.c:1051)
==2534283==
==2534283== Syscall param recvmsg(msg.msg_control) points to unaddressable byte(s)
==2534283==    at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283==    by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283==    by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283==    by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283==    by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283==    by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283==    by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283==    by 0x493F5CC: thread_call (thread.c:1585)
==2534283==    by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283==    by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283==    by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283==    by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283==  Address 0xa0 is not stack'd, malloc'd or (recently) free'd
==2534283==
```

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-10-23 14:57:29 -04:00