Fix both memory leaks and use-after-frees in nhgs
on shutdown.
This approach simply iterates through all the
hash items and frees the memory directly instead
of handling ref counts or cross references.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Commit 35729f38fa introduced the idea of
holding a nexthop group for a small amount of time
before removing it from the system. When this code
was introduced the nexthop group entry was saved
and a timer started, except instead of stopping
processing at that point in time, zebra was
continuing on and deleting nexthop group entries
that that entry depended on as well. This
should not be done until the timer pops.
Fixes: #11596
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test case is with `redirect-off` in evpn multi-homing environment:
```
evpn mh redirect-off
```
After the environment is set up, do the following steps:
1) Let one member of ES learn one mac:
```
2e:52:bb:bb:2f:46 dev ae1 vlan 100 master bridge0 static
```
Now everything is ok and the mac can be synced to other ES peers.
2) Shut down bond1. At this time, zebra will get three netlink messages,
not the single one the current code expects:
```
e4:f0:04:89:b6:46 dev vxlan10030 vlan 30 master bridge0 static <-A
e4:f0:04:89:b6:46 dev vxlan10030 nhid 536870913 self extern_learn <-B
e4:f0:04:89:b6:46 dev vxlan10030 vlan 30 self <-C
```
With A), zebra will wrongly remove this mac again:
```
ZEBRA: dpAdd remote MAC e4:f0:04:89:b6:46 VID 30
ZEBRA: Add/update remote MAC e4:f0:04:89:b6:46 intf vxlan10030(26) VNI 10030 flags 0xa01 - del local
ZEBRA: Send MACIP Del f None MAC e4:f0:04:89:b6:46 IP (null) seq 0 L2-VNI 10030 ESI - to bgp
```
With C), zebra will wrongly add this mac again:
```
ZEBRA: Rx RTM_NEWNEIGH AF_BRIDGE IF 26 VLAN 30 st 0x2 fl 0x2 MAC e4:f0:04:89:b6:46 nhg 0
ZEBRA: dpAdd remote MAC e4:f0:04:89:b6:46 VID 30
```
zebra should skip the two messages that carry a `vid`. Otherwise, it will send many
*wrong* messages to bgpd.
The `nhg/dst` is in the second message, which has no `vid`, so that is the one to
pass to `zebra_evpn_add_update_local_mac()`. But without a `vid` the call fails
with a "could not find EVPN" warning:
With B):
```
ZEBRA: Rx RTM_NEWNEIGH AF_BRIDGE IF 26 st 0x2 fl 0x12 MAC e4:f0:04:89:b6:46 nhg 536870913
ZEBRA: dpAdd local-nw-MAC e4:f0:04:89:b6:46 VID 0
ZEBRA: Add/Update MAC e4:f0:04:89:b6:46 intf ae1(18) VID 0, could not find EVPN
```
Here, we can get the `vid` from the vxlan interface instead of from the netlink message.
In summary, `zebra_vxlan_dp_network_mac_add()` processes all three messages
while expecting only one message, so its logic is wrong. Just skip the two
unneeded messages that carry a `vid`.
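As a rough illustration of the filtering described above, the sketch below drops FDB messages that carry a `vid` and takes the VLAN from the vxlan interface for the one that does not. `struct fdb_msg`, `struct vxlan_if` and `fdb_msg_wanted()` are hypothetical names made up for this example; they are not zebra's actual types or functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one bridge FDB netlink message. */
struct fdb_msg {
    bool has_vid;    /* message carried a VLAN id attribute */
    uint16_t vid;    /* only meaningful when has_vid is true */
    uint32_t nhg_id; /* nexthop group id, 0 if none */
};

/* Hypothetical view of the vxlan interface: the VLAN it is mapped to. */
struct vxlan_if {
    uint16_t access_vid;
};

/*
 * Decide whether a message seen on the vxlan device should be processed.
 * Messages that carry a vid are skipped; for the remaining one the vid
 * is taken from the vxlan interface and returned through vid_out.
 */
bool fdb_msg_wanted(const struct fdb_msg *msg, const struct vxlan_if *vxl,
                    uint16_t *vid_out)
{
    if (msg->has_vid)
        return false;

    *vid_out = vxl->access_vid;
    return true;
}
```

With the three messages from the example, A and C (both with `vlan 30`) would be skipped, and B would be processed with the vid taken from the vxlan interface.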
Signed-off-by: anlan_cs <vic.lan@pica8.com>
convert:
frr_with_mutex(..)
to:
frr_with_mutex (..)
To make all our code agree with what clang-format is going to produce
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The creation of the rtadv socket can fail, but there
is very little data associated with this event
to let the operator know something has gone terribly
wrong.
Please note that if this socket fails to create, or the
setsockopt calls fail, rtadv is basically just really
messed up. I am not sure what can be done here.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
==1595641== 280 (80 direct, 200 indirect) bytes in 1 blocks are definitely lost in loss record 30 of 38
==1595641== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1595641== by 0x493C89C: qcalloc (memory.c:116)
==1595641== by 0x1E8426: lsp_alloc (zebra_mpls.c:1116)
==1595641== by 0x49147F1: hash_get (hash.c:162)
==1595641== by 0x1EC880: mpls_lsp_install (zebra_mpls.c:3192)
==1595641== by 0x1C51BB: zread_vrf_label (zapi_msg.c:3197)
==1595641== by 0x1C6F11: zserv_handle_commands (zapi_msg.c:3863)
==1595641== by 0x24D0F4: zserv_process_messages (zserv.c:523)
==1595641== by 0x498F4CC: thread_call (thread.c:2002)
==1595641== by 0x49253A2: frr_run (libfrr.c:1198)
==1595641== by 0x1A28BA: main (main.c:475)
==1595641==
==1595641== 1,400 (400 direct, 1,000 indirect) bytes in 5 blocks are definitely lost in loss record 35 of 38
==1595641== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1595641== by 0x493C89C: qcalloc (memory.c:116)
==1595641== by 0x1E8426: lsp_alloc (zebra_mpls.c:1116)
==1595641== by 0x49147F1: hash_get (hash.c:162)
==1595641== by 0x1EBD7C: mpls_zapi_labels_process (zebra_mpls.c:2915)
==1595641== by 0x1C35D9: zread_mpls_labels_add (zapi_msg.c:2513)
==1595641== by 0x1C6F11: zserv_handle_commands (zapi_msg.c:3863)
==1595641== by 0x24D0F4: zserv_process_messages (zserv.c:523)
==1595641== by 0x498F4CC: thread_call (thread.c:2002)
==1595641== by 0x49253A2: frr_run (libfrr.c:1198)
==1595641== by 0x1A28BA: main (main.c:475)
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Since the `mac->flags` with `ZEBRA_MAC_ES_PEER_ACTIVE` is about ES Peer,
it should be displayed as `PEER Active`.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Just adding two more attributes to decode and show nicely in netlink
msgdump debug output.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The IPv6 version needs rtm_src_len and rtm_dst_len filled in due to
strict validation. IPv4 also has this requirement, but zebra is running
in non-strict mode there so the kernel accepts it...
Also the table ID hack is IPv4 only.
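A minimal sketch of filling the header for such a query is shown below. `mroute_query_fill()` is a hypothetical helper, not the zebra function, and the full host prefix lengths (32 for IPv4, 128 for IPv6) reflect my understanding of what the kernel's strict validation accepts; treat them as an assumption.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>

/*
 * Fill the rtmsg header of an RTM_GETROUTE multicast (S,G) query.
 * Both prefix lengths are set so that strict validation has
 * nothing to complain about.
 * Returns 0 on success, -1 for an unsupported address family.
 */
int mroute_query_fill(struct rtmsg *rtm, int family)
{
    memset(rtm, 0, sizeof(*rtm));

    switch (family) {
    case AF_INET:
        rtm->rtm_family = RTNL_FAMILY_IPMR;
        rtm->rtm_dst_len = 32;
        rtm->rtm_src_len = 32;
        return 0;
    case AF_INET6:
        rtm->rtm_family = RTNL_FAMILY_IP6MR;
        rtm->rtm_dst_len = 128;
        rtm->rtm_src_len = 128;
        return 0;
    default:
        return -1;
    }
}
```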
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The multicast routing RTM_GETROUTE command does not use IIF/OIF
attributes, and the IPv6 version will refuse them with an error because
it is a newer netlink API and thus uses strict validation.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
These two structs happen to be the same size and have the family field
in the same spot, but the correct one to use here is rtmsg not ndmsg.
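The reason the wrong struct went unnoticed can be checked directly against the Linux UAPI headers. The helper name `rtmsg_ndmsg_same_layout()` below is made up for this illustration.

```c
#include <stddef.h>
#include <linux/rtnetlink.h>
#include <linux/neighbour.h>

/*
 * Returns 1 when struct rtmsg and struct ndmsg have the same size and
 * keep the address family at the same offset, which is why using the
 * wrong header still happened to work.
 */
int rtmsg_ndmsg_same_layout(void)
{
    return sizeof(struct rtmsg) == sizeof(struct ndmsg)
           && offsetof(struct rtmsg, rtm_family)
                  == offsetof(struct ndmsg, ndm_family);
}
```

The remaining fields differ in meaning, so relying on this coincidence is fragile even though it compiles and runs.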
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
zebra does not care about _notifications_ from the kernel regarding
multicast routing; we only use the MR netlink API to request stats from
the kernel by actively sending a RTM_GETROUTE.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
New output example:
2022-07-03 09:40:29.310 [DEBG] zebra: [JF0K0-DVHWH] rib_meta_queue_add: (0:254):4.5.6.8/32: queued rn 0x55937f586ee0 into sub-queue Kernel Routes
2022-07-03 09:40:29.321 [DEBG] zebra: [HH6N2-PDCJS] default(0:254):4.5.6.8/32 rn 0x55937f586ee0 dequeued from sub-queue Kernel Routes
Let's make it a bit more human readable.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a slightly modified version of Hiroki Sato's version:
9ca79c941f
Handle the `ENOBUFS` on a per-OS basis since it could have been implemented
differently (OpenBSD, for example, uses `RTM_DESYNC`).
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Just some missing ones. Make zebra stop complaining: I was getting
some messages from proto2zebra when doing testing, so let's stop
that from happening.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Instead of having global allow_delete move it to
where it belongs in the zrouter data structure.
Additionally show this data in `show zebra`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When reading a multipath route and we detect an encoding
error from the kernel( yeah I don't think so either ),
let's tell the operator what happened to that route.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There exists a possibility that an end operator has chosen
to compile FRR on an extremely old KERNEL that does not support
the SOL_NETLINK sockopt call. If so let's note it for them
instead of stuff silently not working.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The usage of SOL_NETLINK for adding memberships of interest is
1 group per call. The netlink_socket function implied that
the call could be a bitfield of values. This is not correct
at all. This will trip someone else up in the future when
a new value is needed. Let's get it right `now` before
it becomes a problem.
Let's also add a bit of extra code to give operator a better
understanding of what went wrong when a kernel does not
support the option.
Finally, as a point of future reference: should FRR just switch
over to a loop to add the required groups instead of having
this bastardized approach of some going in one way and some
going in another way?
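For reference, the loop idea could look something like the sketch below: one `setsockopt()` per group, since `NETLINK_ADD_MEMBERSHIP` takes a single group number and not a bitfield. `netlink_join_groups()` is a hypothetical helper, not existing FRR code, and the fallback define of `SOL_NETLINK` is there only for old libc headers.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif

/*
 * Join each multicast group with its own setsockopt() call.
 * Returns 0 on success, or -1 at the first failure with errno
 * left as set by setsockopt() so the caller can log why.
 */
int netlink_join_groups(int sock, const unsigned int *groups, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        unsigned int group = groups[i];

        if (setsockopt(sock, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
                       &group, sizeof(group)) < 0)
            return -1;
    }

    return 0;
}
```

A caller would pass an array of RTNLGRP_* values and log `errno` on failure, which is where a message about an unsupported kernel option would fit.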
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The rib_process_dplane_results function was having each
sub function handler process the results and then
free the ctx. That is a lot of functionality that needs to remember
to free the context. Let's just free it in the main loop.
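The ownership pattern described above can be sketched roughly like this: handlers only read the context, and the dispatch loop is the single place that frees it. All names here (`struct ctx`, `process_results()`, the handlers) are hypothetical stand-ins, not the zebra dplane API.

```c
#include <stdlib.h>

enum ctx_op { CTX_OP_ROUTE, CTX_OP_LSP };

/* Hypothetical stand-in for a dataplane result context. */
struct ctx {
    enum ctx_op op;
    struct ctx *next;
};

struct result_stats {
    int routes;
    int lsps;
    int freed;
};

/* Handlers take a const pointer: they never own or free the context. */
static void handle_route(const struct ctx *c, struct result_stats *s)
{
    (void)c;
    s->routes++;
}

static void handle_lsp(const struct ctx *c, struct result_stats *s)
{
    (void)c;
    s->lsps++;
}

/* Push a new context on the front of the list; returns the new head. */
struct ctx *ctx_push(struct ctx *head, enum ctx_op op)
{
    struct ctx *c = calloc(1, sizeof(*c));

    if (!c)
        return head;
    c->op = op;
    c->next = head;
    return c;
}

/* Main loop: dispatch each result, then free it in exactly one place. */
void process_results(struct ctx *head, struct result_stats *s)
{
    while (head) {
        struct ctx *c = head;

        head = c->next;
        switch (c->op) {
        case CTX_OP_ROUTE:
            handle_route(c, s);
            break;
        case CTX_OP_LSP:
            handle_lsp(c, s);
            break;
        }
        free(c);
        s->freed++;
    }
}
```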
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add the ability for the netconf dplane code to handle
the global NETCONFA_IFINDEX_DEFAULT and NETCONFA_IFINDEX_ALL
values. Then store the values we are interested in when we get
them from the kernel, and be able to display
them to the end operator.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When Zebra receives the netconf update an afi is passed;
let's separate that out and track the v4/v6 specific data
so it is saved and stored appropriately.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The mc_forwarding status for an interface was being sent but not
properly retrieved on the zebra master side of the dplane.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
-> Moved new capabilities needed to under HAVE_DPDK
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
PBR rules are installed as match, action rules in most dataplanes. This
requires the action to be resolved via a GW, and the GW to be subsequently
resolved to {SMAC, DMAC}.
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Currently specific local neighbors (attached to SVIs) are maintained
in an EVPN specific database. There is a need to maintain L3 neighbors
for other purposes including MAC resolution for PBR nexthops.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Cleanup compile and fix crash
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
'bridge vni add vni <id> dev <vxlan device>'
generates the new RTM_NEWTUNNEL and RTM_DELTUNNEL messages
to add or remove a vni on an l3vxlan device.
Register for the new RTNLGRP_TUNNEL group to receive
the new netlink notifications.
Add a callback for the new RTM_xxxTUNNEL messages.
kernel patches:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc7&id=7b8135f4df98b155b23754b6065c157861e268f1
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc7&id=f9c4bb0b245cee35ef66f75bf409c9573d934cf9
Ticket:#3073812
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
When an interface is configured with this:
int eva
ipv6 nd ra-interval 5
no ipv6 nd suppress-ra
!
And then subsequently the interface is created and brought up, FRR
would both error on joining the RA multicast address and never
properly work in this state.
Delay the startup of the join and start of the Router Advertisements
until after the ifindex has actually been found.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently kernel routes on system bring up would be `auto-accepted`,
then if an interface went down all kernel and system routes would
be re-evaluated. There exist situations where a kernel route can
exist but the interface itself is not yet in a state that is
ready to create a connected route. As such, when any interface
goes down in the system, all kernel/system routes would be re-evaluated,
and since that interface's connected route is not in the table yet,
the route is matched against a default route (or not at all) and
is dropped.
Modify the code such that kernel or system routes just look for interface
being in a good state (up or operative) and accept it.
Broken code:
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:05:08
K>* 1.2.3.5/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.6/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.7/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.8/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.9/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.10/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.12/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.13/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.14/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.16/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.17/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
C>* 4.5.6.99/32 is directly connected, dummy9, 00:05:08
K>* 4.9.10.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:05:08
C>* 192.168.10.0/24 is directly connected, dummy99, 00:05:08
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:05:08
<shutdown a non-related interface>
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:05:28
C>* 4.5.6.99/32 is directly connected, dummy9, 00:05:28
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:05:28
C>* 192.168.10.0/24 is directly connected, dummy99, 00:05:28
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:05:28
Working code:
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:00:04
K>* 1.2.3.5/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.6/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.7/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.8/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.9/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.10/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.12/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.13/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.14/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.16/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.17/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
C>* 4.5.6.99/32 is directly connected, dummy9, 00:00:04
K>* 4.9.10.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:00:04
C>* 192.168.10.0/24 is directly connected, dummy99, 00:00:04
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:00:04
<shutdown a non-related interface>
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:00:15
K>* 1.2.3.5/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.6/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.7/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.8/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.9/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.10/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.12/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.13/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.14/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.16/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.17/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
C>* 4.5.6.99/32 is directly connected, dummy9, 00:00:15
K>* 4.9.10.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:00:15
C>* 192.168.10.0/24 is directly connected, dummy99, 00:00:15
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:00:15
eva#
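The acceptance rule from this change (a kernel or system route only needs its interface to be up or operative) might be sketched as follows. `struct iface_state` and `kernel_route_nexthop_ok()` are hypothetical names for illustration; the real check lives in zebra's nexthop resolution code and looks at more state than this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of an interface's state. */
struct iface_state {
    bool up;        /* administratively up */
    bool operative; /* considered usable by zebra */
};

/*
 * A kernel or system route is accepted as long as its outgoing
 * interface is in a good state. No lookup of a matching connected
 * route is done, so a not-yet-installed connected route cannot
 * cause the kernel route to be dropped.
 */
bool kernel_route_nexthop_ok(const struct iface_state *ifp)
{
    if (ifp == NULL)
        return false;

    return ifp->up || ifp->operative;
}
```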
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a nexthop has RTNH_F_LINKDOWN set, start noticing
that this flag is set. Allow FRR to know about this
flag but at this point do not do anything with it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When reading an on-the-fly change from a netconf netlink message that
zebra is interested in, the ifindex and ns_id for the context were being
set in the sub structure but not in the main context data structure, and
zebra_if_dplane_result was dropping the result on the floor because it
expected the ns_id and the interface id to be in a different spot.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The two checks for l3vni are already done in
`lib_vrf_zebra_l3vni_id_modify()`, as they should be. It is improper for
the two checks to come after `zebra_vxlan_handle_vni_transition()`, which
makes real changes.
My original fix was to remove them. But the NB module can't guarantee many changes,
so we had better keep them in `zebra_vxlan_process_vrf_vni_cmd()` in the APPLY stage
for safety.
Just move them in front of `zebra_vxlan_handle_vni_transition()`.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
When disabling MLAG leaf configuration with EVPN, the logs are
flooded for each VNI as a result of each Type-2
packet. Ideally, this should be a debug log, not a warning.
Testing: UT
Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
Since the hook for the old fpm is already called inside
`rib_uninstall_kernel()`, this call outside it is redundant. Just remove it.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Allow the end operator to set how long a nexthop-group is kept around
in the system after it is no longer being used.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Before deleting nexthop groups, that are installed,
from the system, start a timer and hold the nexthop
group for that time.
Suppose you have this scenario:
a) create a static route with 1 x ecmp
   creates a nhg with 1 x ecmp
b) create a static route with 2 x ecmp
   creates a nhg with 2 x ecmp
   deletes a's nhg
c) create a static route with 3 x ecmp
   creates a nhg with 3 x ecmp
   deletes b's nhg
d) create a different route with 1 x ecmp
   creates another 1 x ecmp ( since a's ecmp was deleted )
e) create a different route with 2 x ecmp
   creates another 2 x ecmp ( since b's ecmp was deleted )
If, instead of deleting the nhg, you start a timer, the nhgs used
in steps a and b can be reused for steps d and e. This reduces
overhead work with zebra <-> kernel interactions and improves
the speed of the system.
So modify the code to note that an installed nexthop group should
be kept around a bit and hopefully reused.
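A small model of the keep-around idea, assuming a caller-supplied clock instead of real timers: when the last reference goes away the group is only marked as held, and it is reused if the same key shows up before the hold time expires. `struct nhg_cache`, `nhg_get()`, `nhg_release()` and `nhg_expire()` are hypothetical names for this sketch and do not mirror zebra's data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NHG_CACHE_SIZE 8

struct nhg_slot {
    bool in_use;      /* slot holds an installed group */
    uint32_t key;     /* stand-in for the hash of the nexthop set */
    int refcnt;       /* routes currently using the group */
    bool held;        /* unused, but kept until the timer pops */
    uint64_t expires; /* time at which the hold timer pops */
};

struct nhg_cache {
    struct nhg_slot slots[NHG_CACHE_SIZE];
    uint64_t keep_time; /* how long an unused group is kept */
    int installs;       /* counts simulated kernel installs */
};

/* Reuse a matching group (even a held one), else install a new one. */
struct nhg_slot *nhg_get(struct nhg_cache *c, uint32_t key)
{
    struct nhg_slot *free_slot = NULL;

    for (size_t i = 0; i < NHG_CACHE_SIZE; i++) {
        struct nhg_slot *s = &c->slots[i];

        if (s->in_use && s->key == key) {
            s->held = false; /* cancels the pending removal */
            s->refcnt++;
            return s;
        }
        if (!s->in_use && !free_slot)
            free_slot = s;
    }
    if (!free_slot)
        return NULL;

    *free_slot = (struct nhg_slot){ .in_use = true, .key = key, .refcnt = 1 };
    c->installs++;
    return free_slot;
}

/* Drop a reference; on the last one start the hold timer, do not delete. */
void nhg_release(struct nhg_cache *c, struct nhg_slot *s, uint64_t now)
{
    if (--s->refcnt > 0)
        return;
    s->held = true;
    s->expires = now + c->keep_time;
}

/* Timer pop: only now are held groups actually removed. */
void nhg_expire(struct nhg_cache *c, uint64_t now)
{
    for (size_t i = 0; i < NHG_CACHE_SIZE; i++) {
        struct nhg_slot *s = &c->slots[i];

        if (s->in_use && s->held && now >= s->expires) {
            s->in_use = false;
            s->held = false;
        }
    }
}
```

In the scenario above, step d would hit the held entry from step a and skip the extra install, provided it arrives before the hold time runs out.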
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently the code is marking the nhg as uninstalled but not
causing that to flood up to the dependent nhgs:
nhg 3 is a group of 1/2
1 -> interface A
2 -> interface B
Suppose A goes down, old code would mark nhg 1 as !VALID and !INSTALLED.
Suppose B then goes down, old code would mark nhg 2 as !VALID and !INSTALLED
But would not mark nhg 3 as !VALID and !INSTALLED (sort of assuming that
it would just be cleaned up by NHG refcounts ). I would prefer that
the code is pedantic about nhg 3 actually being removed from the system.
This code moves the setting of !INSTALLED into zebra_nhg.c where it
really belongs.
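The propagation described above can be modelled in a few lines. This is a sketch under the assumption that a group becomes invalid once none of the groups it depends on is valid; the struct and function names are hypothetical, not the ones in zebra_nhg.c. Here `depends` are the groups a group is built from, and `dependents` are the groups built on top of it.

```c
#include <stdbool.h>
#include <stddef.h>

#define NHG_MAX_LINKS 4

struct nhg {
    bool valid;
    bool installed;
    struct nhg *depends[NHG_MAX_LINKS];    /* groups this one is built from */
    size_t n_depends;
    struct nhg *dependents[NHG_MAX_LINKS]; /* groups built on top of this one */
    size_t n_dependents;
};

/* Record that 'group' contains 'member'. Returns false if a table is full. */
bool nhg_link(struct nhg *group, struct nhg *member)
{
    if (group->n_depends >= NHG_MAX_LINKS
        || member->n_dependents >= NHG_MAX_LINKS)
        return false;

    group->depends[group->n_depends++] = member;
    member->dependents[member->n_dependents++] = group;
    return true;
}

static bool nhg_any_depend_valid(const struct nhg *n)
{
    for (size_t i = 0; i < n->n_depends; i++)
        if (n->depends[i]->valid)
            return true;
    return false;
}

/* Mark a group !VALID and !INSTALLED and push that up to its dependents. */
void nhg_set_invalid(struct nhg *n)
{
    n->valid = false;
    n->installed = false;

    for (size_t i = 0; i < n->n_dependents; i++) {
        struct nhg *up = n->dependents[i];

        if (up->valid && !nhg_any_depend_valid(up))
            nhg_set_invalid(up);
    }
}
```

With nhg 3 linked to nhg 1 and nhg 2, invalidating nhg 1 leaves nhg 3 alone, and invalidating nhg 2 afterwards marks nhg 3 as well.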
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
I keep getting confused about nhg_depends and nhg_dependents.
So take a second and write them down for the next person.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
On linux, consolidate on using the netlink api for gathering all data
about an interface. Leave this interface alone in the meantime for
other OS's.
This also has the side effect of reducing the amount of work
being done on linux in that FRR was handling shut/no shut
events 2 times. Once for the ioctl question asked and
once for the netlink message received.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
While examining the code, it was noticed that there was a chance
to improve the log output in some cases to give a fuller understanding
of what went wrong where.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If stream_dup were unable to actually allocate memory,
FRR would crash rather than return NULL. So let's remove the
check for null since it is not needed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The nexthop group debugs were using %u to just display the id.
I found this very hard to figure out what was going on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add `%pNG` so that a nexthop group can be displayed in debugs/logs
such that it can provide useful information.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A multipath route may have mixed nexthops of EVPN and IP unicast. Move
the EVPN flag to the nexthop to support such cases.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
When the kernel was sending an RTM_NEWLINK updating the MAC of a known
SVI, Type-2 routes created by advertise-svi-ip were not getting updated
with the new address.
This adds removal of any old Type-2 routes (with old MAC) and creation
of new Type-2 routes (with new MAC) into RTM_NEWLINK processing.
Fixes: #11174
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The usage of zebra dplane makes the job asynchronous, which implies
that a given job will try to add an iptable while a second job
does not know that its iptable is the same as the former one.
The below exabgp rules stand for two bgp flowspec rules sent to
the bgp device:
flow {
route {match {
source 185.228.172.73/32;
destination 0.0.0.0/0;
source-port >=49156&<=49159;
}then {redirect 213.242.114.113;}}
route {match {
source 185.228.172.73/32;
destination 0.0.0.0/0;
source-port >=49160&<=49163;
}then {redirect 213.242.114.113;}}
}
These rules create a single iptable, but in fact, the same iptable
name is appended twice. This results in duplicated entries in the
iptables context. This also results in contexts not being flushed when
a BGP session or 'flush' operation is performed.
iptables-save:
[..]
-A PREROUTING -m set --match-set match0x55baf4c25cb0 src,src -g match0x55baf4c25cb0
-A PREROUTING -m set --match-set match0x55baf4c25cb0 src,src -g match0x55baf4c25cb0
-A match0x55baf4c25cb0 -j MARK --set-xmark 0x100/0xffffffff
-A match0x55baf4c25cb0 -j ACCEPT
-A match0x55baf4c25cb0 -j MARK --set-xmark 0x100/0xffffffff
-A match0x55baf4c25cb0 -j ACCEPT
[..]
This commit addresses this issue by checking that an iptable
context is not already being processed. A flag is added in the
original iptable context, and a check is done that the iptable
context is not already being processed for install or uninstall.
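The guard could be sketched roughly as below, assuming a single in-progress flag on the context that is cleared when the dataplane reports back. `struct iptable_ctx`, `iptable_request()` and `iptable_request_done()` are hypothetical names, not the actual zebra PBR code.

```c
#include <stdbool.h>

/* Hypothetical, reduced iptable context. */
struct iptable_ctx {
    bool in_progress;       /* an install/uninstall is already queued */
    int dataplane_requests; /* counts requests handed to the dataplane */
};

/*
 * Queue an install or uninstall for this context.
 * Returns false, and queues nothing, when one is already in flight,
 * so the same iptable is not appended twice.
 */
bool iptable_request(struct iptable_ctx *ipt)
{
    if (ipt->in_progress)
        return false;

    ipt->in_progress = true;
    ipt->dataplane_requests++;
    return true;
}

/* Called when the asynchronous dataplane job has completed. */
void iptable_request_done(struct iptable_ctx *ipt)
{
    ipt->in_progress = false;
}
```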
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Move a few things into places they actually belong, and reduce the
number of places we have `#ifdef HAVE_RTADV`. Just overall code
prettification.
... I had actually done this quite a while ago while doing some other
random hacking and thought it more useful to not be sitting on it on my
disk...
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The parent node of "vrf" MUST be non-NULL, so the check is unnecessary and
misleading. Keeping it implies a branch for a NULL parent node, which makes
no sense. Remove it.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
The kernel supports an l3vxlan device having an (l3vni)
vni filter, similar to vlan filtering on a bridge device.
To receive the netlink notifications, FRR needs to register
for the new netlink RTNLGRP_TUNNEL group.
This group has to be registered via an additional
socket option as it's beyond the bitmap size.
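To illustrate the bitmap limit: the `nl_groups` field used at bind time is a 32-bit mask with one bit per group, so only groups 1 through 32 can be expressed there. The helpers below are hypothetical and only show the arithmetic; any group with a higher number has to be joined with the separate socket option instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a netlink multicast group can be expressed in the 32-bit mask. */
bool nl_group_fits_bitmap(unsigned int group)
{
    return group >= 1 && group <= 32;
}

/*
 * Convert a group number to its bit in the bind-time mask:
 * group N maps to bit (N - 1). Returns 0 when the group does not fit.
 */
uint32_t nl_group_to_mask(unsigned int group)
{
    if (!nl_group_fits_bitmap(group))
        return 0;

    return UINT32_C(1) << (group - 1);
}
```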
kernel patches:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc7&id=7b8135f4df98b155b23754b6065c157861e268f1
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc7&id=f9c4bb0b245cee35ef66f75bf409c9573d934cf9
Ticket:#3073812
Signed-off-by: Chirag Shah <chirag@nvidia.com>