Commit Graph

6004 Commits

Author SHA1 Message Date
anlan_cs
3cb4dcda5c zebra: fix missing kernel routes
`rib_update_handle_kernel_route_down_possibility()` did not consider
kernel routes without an interface (for example, blackhole routes).
When some other interface goes down, these kernel routes are wrongly removed.

Signed-off-by: anlan_cs <anlan_cs@126.com>
(cherry picked from commit 44a82da405)
2024-11-05 15:22:23 +00:00
Donatas Abraitis
644211270f zebra: Add missing new line for help string
```
  -A, --asic-offload        FRR is interacting with an asic underneath the linux kernel
      --v6-with-v4-nexthops Underlying dataplane supports v6 routes with v4 nexthops  -s, --nl-bufsize          Set netlink receive buffer size
```

Fixes: 1f5611c06d ("zebra: Allow zebra cli to accept v6 routes with v4 nexthops")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 25ae643996)
2024-10-31 13:14:40 +00:00
Donald Sharp
7ddbadd7f7 zebra: When installing a mroute, allow it to flow
Currently the mroute code was not allowing the mroute
to be sent to the dataplane.  This leaves us with a
situation where the routes being installed were never
being set as installed and additionally nht against
the mrib would not work if the route came into existence
after the nexthop tracking was asked for.

Turns out all the pieces were there to let this work.
Modify the code to pass it to the dplane and to send
it back up as having worked.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-30 07:47:32 -04:00
Donald Sharp
fb08f08ebb zebra: Add safi to some debugs
Trying to figure out what safi we are talking about is fun when
it is not put into the debugs.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 811168ecc3)
2024-10-30 07:45:08 -04:00
Donatas Abraitis
b65f4ad423 lib, zebra: Keep zebra on-rib-process script in frr.conf
After the change:

```
$ grep on-rib-process /etc/frr/frr.conf
zebra on-rib-process script script4

$ systemctl restart frr

$ vtysh -c 'show run' | grep on-rib-process
zebra on-rib-process script script4
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 1fe1f8d87c)
2024-10-27 23:24:50 +00:00
Donald Sharp
d8fc147d2c
Merge pull request #17143 from FRRouting/mergify/bp/dev/10.2/pr-17020
zebra: fix heap-use-after free on ns shutdown (backport #17020)
2024-10-16 15:23:38 -04:00
Philippe Guibert
bcdc8249b9 zebra: fix heap-use-after free on ns shutdown
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
>     #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
>     #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
>     #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
>     #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
>     #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
>     #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
>     #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
>     #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
>     #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
>     #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
>     #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
>     #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
>     #12 0x7f26f275bae4 in route_table_free lib/table.c:111
>     #13 0x7f26f275b749 in route_table_finish lib/table.c:46
>     #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
>     #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
>     #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
>     #17 0x7f26f2777108 in event_call lib/event.c:2011
>     #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
>     #19 0x55910c4f49cb in main zebra/main.c:531
>     #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

This happens when FRR is using the kernel. During shutdown, zebra
attempts to obtain the namespace identifier in order to prepare
zebra dataplane nexthop messages.

Fix this by accessing the ns structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 7ae70eb5ef)
2024-10-16 14:49:50 +00:00
Enke Chen
5aae058522 zebra: unlock node only after operation in zebra_free_rnh()
Move route_unlock_node() after rnh_list_del().
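
The ordering matters because dropping the last reference can release the node while it is still being touched. The following is a minimal sketch of that idea, not FRR's code: `struct node`, `node_new()`, `node_unlock()` and `free_entry()` are hypothetical stand-ins for a refcounted table node and its rnh list.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted table node with an attached list. */
struct node {
	unsigned int lock; /* reference count */
	int list_len;      /* stand-in for the per-node rnh list */
};

static bool node_freed; /* records that the node's storage was released */

static struct node *node_new(void)
{
	struct node *n = calloc(1, sizeof(*n));

	if (n) {
		n->lock = 1;
		n->list_len = 1;
	}
	node_freed = false;
	return n;
}

/* Drop one reference; the node is freed when the count reaches zero. */
static void node_unlock(struct node *n)
{
	if (--n->lock == 0) {
		free(n);
		node_freed = true;
	}
}

/* Safe ordering: finish every access to the node, then drop the reference. */
static void free_entry(struct node *n)
{
	n->list_len--;  /* list removal still touches the node */
	node_unlock(n); /* last step: this may free the node */
}
```

Swapping the two statements in `free_entry()` would write to `n->list_len` after the node may already have been freed.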

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
(cherry picked from commit 5b6ff51b8a)
2024-10-16 05:13:50 +00:00
Donald Sharp
cf9c02a8b1 zebra: Prevent a kernel route from being there when a connected should
There exists a series of events where a kernel route is learned
first (one that happens to be exactly what a connected route should be)
and FRR ends up with both a kernel route and a connected route,
leaving us in a very strange spot.  This code change mirrors
the existing logic that drops the kernel route when a connected
route already exists.  Here we do the reverse: if we already have
a kernel route and a connected route should be created, remove the
kernel route and keep the connected one.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 74e25198e7)
2024-10-15 19:05:50 +00:00
Donald Sharp
4353b81bbf zebra: Fix crash in pw code
Recent PR #17009 introduced a crash in pw handling
for deletion.  Let's fix that problem.

Fixes: #17041
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 154a89bc31)
2024-10-09 18:49:06 +00:00
Russ White
b8c458622d
Merge pull request #17023 from donaldsharp/dplane_problems
zebra: Allow dplane to pass larger number of nexthops down to dataplane
2024-10-08 11:45:27 -04:00
Donald Sharp
9f8968fc5a *: Allow 16 bit size for nexthops
Currently FRR stores the nexthop count in a uint8_t rather than a
uint16_t.  When the nexthop count reaches 256, the count overflows
to 0, causing problems in the code.
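
The wraparound can be reproduced in isolation. This is a small illustrative sketch; the helper names `count_nexthops_u8()` and `count_nexthops_u16()` are made up for the example and do not appear in FRR.

```c
#include <stdint.h>

/* Count n nexthops in an 8-bit counter. */
static uint8_t count_nexthops_u8(unsigned int n)
{
	uint8_t count = 0;

	for (unsigned int i = 0; i < n; i++)
		count++; /* wraps from 255 back to 0 */
	return count;
}

/* The same count in a 16-bit counter, which holds values up to 65535. */
static uint16_t count_nexthops_u16(unsigned int n)
{
	uint16_t count = 0;

	for (unsigned int i = 0; i < n; i++)
		count++;
	return count;
}
```

With 256 nexthops the 8-bit version reports 0, which is indistinguishable from "no nexthops", while the 16-bit version reports 256.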

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-08 09:26:57 -04:00
Donald Sharp
a8af2b2a9d zebra: Do not retry in 30 seconds on pw reachability failure
Currently the zebra pw code has set up a retry to install the
pw after 30 seconds when it decides that reachability to
the pw is gone.  This causes a failure mode where the
pw code re-installs the pw after 30 seconds even
in the non-reachability case.  Instead it should
reinstall only after reachability is restored.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-07 20:36:45 -04:00
Donald Sharp
f50b1f7c22 zebra: Move pw status setting until after we get results
Currently the pw code sets the status of the pw for install
and uninstall immediately when notifying the dplane.  This
is incorrect in that we do not actually know the status at
this point in time.  The status should be set when we get
the result.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-07 20:36:45 -04:00
Donatas Abraitis
ded59bcc72
Merge pull request #17013 from dksharp5/removal_functions
Removal functions
2024-10-07 11:47:01 +03:00
Donna Sharp
f62dfc5d53 lib,zebra: remove unused ZEBRA_VRF_UNREGISTER
Signed-off-by: Donna Sharp <dksharp5@gmail.com>
2024-10-06 19:40:49 -04:00
Donna Sharp
103f24485c zebra: remove unused function from tc_netlink.c
Signed-off-by: Donna Sharp <dksharp5@gmail.com>
2024-10-06 19:30:56 -04:00
Donna Sharp
7a63799a84 zebra: remove unused function from if_netlink.c
Signed-off-by: Donna Sharp <dksharp5@gmail.com>
2024-10-06 19:25:44 -04:00
Donna Sharp
b6dd4ff8bc zebra: remove unused function from tc_netlink.c
Signed-off-by: Donna Sharp <dksharp5@gmail.com>
2024-10-06 19:08:44 -04:00
Donna Sharp
8eb5f4f506 zebra: remove unused function rib_lookup_ipv4
Signed-off-by: Donna Sharp <dksharp5@gmail.com>
2024-10-06 18:53:11 -04:00
Russ White
15991e1a08
Merge pull request #16800 from donaldsharp/nhg_reuse_intf_down_up
Nhg reuse intf down up
2024-10-04 10:28:58 -04:00
Igor Zhukov
a3877e4444 zebra: Fix crash during reconnect
fpm_enqueue_rmac_table expects an fpm_rmac_arg* as its argument.

The issue can be reproduced by dropping the TCP session using:

ss -K dst 127.0.0.1 dport = 2620

I used Fedora 40 and frr 9.1.2 and I got the gdb backtrace:

(gdb) bt
0  0x00007fdd7d6997ea in fpm_enqueue_rmac_table (bucket=0x2134dd0, arg=0x2132b60) at zebra/dplane_fpm_nl.c:1217
1  0x00007fdd7dd1560d in hash_iterate (hash=0x21335f0, func=0x7fdd7d6997a0 <fpm_enqueue_rmac_table>, arg=0x2132b60) at lib/hash.c:252
2  0x00007fdd7dd1560d in hash_iterate (hash=0x1e5bf10, func=func@entry=0x7fdd7d698900 <fpm_enqueue_l3vni_table>,
    arg=arg@entry=0x7ffed983bef0) at lib/hash.c:252
3  0x00007fdd7d698b5c in fpm_rmac_send (t=<optimized out>) at zebra/dplane_fpm_nl.c:1262
4  0x00007fdd7dd6ce22 in event_call (thread=thread@entry=0x7ffed983c010) at lib/event.c:1970
5  0x00007fdd7dd20758 in frr_run (master=0x1d27f10) at lib/libfrr.c:1213
6  0x0000000000425588 in main (argc=10, argv=0x7ffed983c2e8) at zebra/main.c:492

Signed-off-by: Igor Zhukov <fsb4000@yandex.ru>
2024-10-04 14:59:14 +07:00
Donald Sharp
f53dde0e59 zebra: Add missing proto translations
Add missing isis and eigrp proto translations.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-25 12:14:50 -04:00
Donald Sharp
e41ae0acc1 zebra: Correctly report metrics
Report the route's metric in IPFORWARDMETRIC1 and return
-1 for the other metrics as required by the IP-FORWARD-MIB.

inetCidrRouteMetric2 OBJECT-TYPE
    SYNTAX     Integer32
    MAX-ACCESS read-create
    STATUS     current
    DESCRIPTION
           "An alternate routing metric for this route.  The
            semantics of this metric are determined by the routing-
            protocol specified in the route's inetCidrRouteProto
            value.  If this metric is not used, its value should be
            set to -1."
    DEFVAL { -1 }
    ::= { inetCidrRouteEntry 13 }

I've included metric2 but it's the same for all of them.
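
The rule quoted from the MIB can be sketched as a small selector. This is an assumption-laden illustration, not the zebra SNMP handler: the enum `metric_column` and the function `route_metric_for_column()` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical column identifiers for the five metric objects in the MIB. */
enum metric_column { METRIC1 = 1, METRIC2, METRIC3, METRIC4, METRIC5 };

/* Return the route's metric for the first column and -1 for the rest. */
static int32_t route_metric_for_column(enum metric_column col,
				       int32_t route_metric)
{
	if (col == METRIC1)
		return route_metric;
	return -1; /* "not used", per the DESCRIPTION clause quoted above */
}
```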

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-25 12:09:40 -04:00
Donald Sharp
659cd66427 zebra: Let's use memset instead of walking bytes and setting to 0
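
A generic before-and-after of this kind of change might look as follows. The function names are invented for the example; the actual buffer and call site in zebra are not shown in this log.

```c
#include <stddef.h>
#include <string.h>

/* Byte-by-byte clearing, the style being replaced. */
static void clear_by_walking(unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = 0;
}

/* The same result with a single library call. */
static void clear_with_memset(unsigned char *buf, size_t len)
{
	memset(buf, 0, len);
}
```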
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-25 12:08:03 -04:00
Donald Sharp
ecd9d441b0 zebra: Fix snmp walk of zebra rib
The snmp walk of the zebra rib was skipping entries
because in_addr_cmp was replaced with a prefix_cmp
which worked slightly differently causing parts
of the zebra rib tree to be skipped.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-25 12:06:29 -04:00
Donald Sharp
e54261e20d lib, zebra: TABLE_NODE is not used
No one is using this; remove it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-24 16:05:54 -04:00
Donatas Abraitis
9616304e47
Merge pull request #16882 from mjstapp/fix_if_table_unlock
zebra: unlock if_table route_nodes
2024-09-23 10:24:52 +02:00
Mark Stapp
c40635c5c2 zebra: unlock if_table route_nodes
Must unlock if we break during iteration over any lib/table
tree.
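
The rule follows from how a locking iterator works: the node handed back by the iterator is held locked until the next step releases it, so an early exit has to release it by hand. The sketch below is a self-contained miniature that only imitates that convention; `struct table`, `table_top()`, `table_next()`, `node_unlock()` and `table_contains()` are hypothetical and are not the lib/table API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a table whose iterator locks the node it returns. */
struct node {
	int lock;
	int value;
};

struct table {
	struct node *nodes;
	size_t count;
};

/* Lock and return the first node, or NULL for an empty table. */
static struct node *table_top(struct table *t)
{
	if (t->count == 0)
		return NULL;
	t->nodes[0].lock++;
	return &t->nodes[0];
}

/* Unlock the current node, then lock and return the next one. */
static struct node *table_next(struct table *t, struct node *n)
{
	size_t idx = (size_t)(n - t->nodes) + 1;

	n->lock--;
	if (idx >= t->count)
		return NULL;
	t->nodes[idx].lock++;
	return &t->nodes[idx];
}

static void node_unlock(struct node *n)
{
	n->lock--;
}

/* Search for a value; the early exit must release the held lock itself. */
static bool table_contains(struct table *t, int value)
{
	struct node *n;

	for (n = table_top(t); n; n = table_next(t, n)) {
		if (n->value == value) {
			node_unlock(n); /* without this the lock would leak */
			return true;
		}
	}
	return false;
}
```

When the loop runs to completion, `table_next()` releases the last lock on its own, so only the early-exit path needs the explicit unlock.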

Signed-off-by: Mark Stapp <mjs@cisco.com>
2024-09-20 12:24:01 -04:00
Donald Sharp
58722b9448 zebra: Pass in ZEBRA_ROUTE_MAX instead of true
zebra_nhg_install_kernel takes a route type.  We don't
know it at that particular spot but we should not be passing
in `true`.  Let's use ZEBRA_ROUTE_MAX to indicate we do not
know, so that the correct thing is done.
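
The hazard with `true` is that it converts to the integer 1, which can coincide with a real route type. A minimal sketch, assuming a hypothetical enum whose values are illustrative and not FRR's actual route type numbering:

```c
#include <stdbool.h>

/* Hypothetical route-type enum; the values are for illustration only. */
enum route_type { ROUTE_SYSTEM, ROUTE_KERNEL, ROUTE_CONNECT, ROUTE_MAX };

/* ROUTE_MAX is one past the last real type, so it can stand for "unknown". */
static bool route_type_is_known(int type)
{
	return type >= 0 && type < ROUTE_MAX;
}
```

In this sketch `true` is accepted as a known type because it equals `ROUTE_KERNEL`, whereas `ROUTE_MAX` is unambiguously outside the valid range.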

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-20 11:00:11 -04:00
Donatas Abraitis
73d01a8e40 zebra: Send a correct size of ctx->nh6 for SRv6 SEG6_LOCAL_ACTION_END_DX6
Fixes: f6e58d26f6 ("zebra, sharpd: add srv6 End.DX6 support")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-09-19 23:54:43 +03:00
Donald Sharp
ccbfb46d28 zebra: Remove nl_addraw_l
This function is never used.  So let's remove it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-19 08:28:48 -04:00
Donald Sharp
1af0a67401 zebra: In zebra_evpn_mac.c remove bad comments
Adding comments that tell what a variable is doing in
the middle of a function call makes it extremely hard
to read the formatting.  Remove.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-18 07:35:54 -04:00
Donald Sharp
03a7ab10fe zebra: Reindent some badly formatted functions in zebra_evpn_mac.c
Fix some badly formatted code to fit better on the screen.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-18 07:33:17 -04:00
Donald Sharp
390406973c zebra: Reframe zebra_evpn_mac.c to be properly formatted
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-18 07:33:17 -04:00
Donald Sharp
f02d76f0fd zebra: Attempt to reuse NHG after interface up and route reinstall
The previous commit modified zebra to reinstall the singleton
nexthops for a nexthop group when an interface event comes up.
Now let's modify zebra to attempt to reuse the nexthop group
when this happens and the upper level protocol resends the
route down with that.  Only match if the protocol is the same
as well as the instance and the nexthop groups would match.

Here is the new behavior:
eva(config)# do show ip route 9.9.9.9/32
Routing entry for 9.9.9.9/32
  Known via "static", distance 1, metric 0, best
  Last update 00:00:08 ago
  * 192.168.99.33, via dummy1, weight 1
  * 192.168.100.33, via dummy2, weight 1
  * 192.168.101.33, via dummy3, weight 1
  * 192.168.102.33, via dummy4, weight 1

eva(config)# do show ip route nexthop-group 9.9.9.9/32
% Unknown command: do show ip route nexthop-group 9.9.9.9/32
eva(config)# do show ip route 9.9.9.9/32 nexthop-group
Routing entry for 9.9.9.9/32
  Known via "static", distance 1, metric 0, best
  Last update 00:00:54 ago
  Nexthop Group ID: 57
  * 192.168.99.33, via dummy1, weight 1
  * 192.168.100.33, via dummy2, weight 1
  * 192.168.101.33, via dummy3, weight 1
  * 192.168.102.33, via dummy4, weight 1

eva(config)# exit
eva# conf
eva(config)# int dummy3
eva(config-if)# shut
eva(config-if)# no shut
eva(config-if)# do show ip route 9.9.9.9/32 nexthop-group
Routing entry for 9.9.9.9/32
  Known via "static", distance 1, metric 0, best
  Last update 00:00:08 ago
  Nexthop Group ID: 57
  * 192.168.99.33, via dummy1, weight 1
  * 192.168.100.33, via dummy2, weight 1
  * 192.168.101.33, via dummy3, weight 1
  * 192.168.102.33, via dummy4, weight 1

eva(config-if)# exit
eva(config)# exit
eva# exit
sharpd@eva ~/frr1 (master) [255]> ip nexthop show id 57
id 57 group 37/43/50/58 proto zebra
sharpd@eva ~/frr1 (master)> ip route show 9.9.9.9/32
9.9.9.9 nhid 57 proto 196 metric 20
	nexthop via 192.168.99.33 dev dummy1 weight 1
	nexthop via 192.168.100.33 dev dummy2 weight 1
	nexthop via 192.168.101.33 dev dummy3 weight 1
	nexthop via 192.168.102.33 dev dummy4 weight 1
sharpd@eva ~/frr1 (master)>

Notice that we are no longer creating a bunch of new
nexthop groups.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-16 09:34:05 -04:00
Donald Sharp
3be8b48e6b zebra: Reinstall nexthop when interface comes back up
If an interface down event caused a nexthop group to remove
one of the entries in the kernel, have it be reinstalled
when the interface comes back up.  Mark the nexthop as
usable.

new behavior:
eva# show nexthop-group rib 181818168
ID: 181818168 (sharp)
     RefCnt: 1
     Uptime: 00:00:23
     VRF: default(bad-value)
     Valid, Installed
     Depends: (35) (38) (44) (51)
           via 192.168.99.33, dummy1 (vrf default), weight 1
           via 192.168.100.33, dummy2 (vrf default), weight 1
           via 192.168.101.33, dummy3 (vrf default), weight 1
           via 192.168.102.33, dummy4 (vrf default), weight 1
eva# conf
eva(config)# int dummy3
eva(config-if)# shut
eva(config-if)# do show nexthop-group rib 181818168
ID: 181818168 (sharp)
     RefCnt: 1
     Uptime: 00:00:44
     VRF: default(bad-value)
     Depends: (35) (38) (44) (51)
           via 192.168.99.33, dummy1 (vrf default), weight 1
           via 192.168.100.33, dummy2 (vrf default), weight 1
           via 192.168.101.33, dummy3 (vrf default) inactive, weight 1
           via 192.168.102.33, dummy4 (vrf default), weight 1
eva(config-if)# no shut
eva(config-if)# do show nexthop-group rib 181818168
ID: 181818168 (sharp)
     RefCnt: 1
     Uptime: 00:00:53
     VRF: default(bad-value)
     Valid, Installed
     Depends: (35) (38) (44) (51)
           via 192.168.99.33, dummy1 (vrf default), weight 1
           via 192.168.100.33, dummy2 (vrf default), weight 1
           via 192.168.101.33, dummy3 (vrf default), weight 1
           via 192.168.102.33, dummy4 (vrf default), weight 1
eva(config-if)# exit
eva(config)# exit
eva# exit
sharpd@eva ~/frr1 (master) [255]> ip nexthop show id 181818168
id 181818168 group 35/38/44/51 proto 194
sharpd@eva ~/frr1 (master)>

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-16 09:34:05 -04:00
Donald Sharp
ce166ca789 zebra: Expose _route_entry_dump_nh so it can be used.
Expose this helper function so it can be used in zebra_nhg.c

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-16 09:34:05 -04:00
Donald Sharp
1bbbcf043b zebra: Properly note that a nhg's nexthop has gone down
When a link is set down, the current code just marks the
nexthop group as not properly set up.  This leaves situations
where, when an interface goes down and a show command is
entered, we see incorrect state.  This is true for anything
that would be checking those flags at that point in time.

Modify the interface down nexthop group code to notice the
nexthops appropriately (and I mean set the appropriate flags)
and to allow a `show ip route` command to actually display
what is going on with the nexthops.

eva# show ip route 1.0.0.0
Routing entry for 1.0.0.0/32
  Known via "sharp", distance 150, metric 0, best
  Last update 00:00:06 ago
  * 192.168.44.33, via dummy1, weight 1
  * 192.168.45.33, via dummy2, weight 1

sharpd@eva:~/frr1$ sudo ip link set dummy2 down

eva# show ip route 1.0.0.0
Routing entry for 1.0.0.0/32
  Known via "sharp", distance 150, metric 0, best
  Last update 00:00:12 ago
  * 192.168.44.33, via dummy1, weight 1
    192.168.45.33, via dummy2 inactive, weight 1

Notice that the 1.0.0.0/32 route now correctly
displays the route for the nexthop group entry.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-16 09:34:05 -04:00
Enke Chen
f6e28717ec zebra: include the prefix in nht show command
Include the prefix in "show ip nht" and "show ipv6 nht".

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
2024-09-14 23:47:00 -07:00
Donald Sharp
e0437aba6d zebra: Add more vrf name to debugs
Trying to debug some cross vrf stuff in zebra and frankly
it's hard to grep the file for the routes you are interested
in.  Let's clean this up some and get a bit better
information for us developers.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-11 15:30:43 -04:00
Russ White
add56c61dd
Merge pull request #15259 from dmytroshytyi-6WIND/nexthop_resolution
zebra: add LSP entry to nexthop via recursive (part 2)
2024-09-10 10:04:08 -04:00
Donald Sharp
98b11de9f6 zebra: Modify show zebra dplane providers to give more data
The `show zebra dplane providers` command was omitting
the input and output queues to the dplane itself.
It would be nice to have this insight as well.

New output:
r1# show zebra dplane providers
dataplane Incoming Queue from Zebra: 100
Zebra dataplane providers:
  Kernel (1): in: 6, q: 0, q_max: 3, out: 6, q: 14, q_max: 3
  dplane_fpm_nl (2): in: 6, q: 10, q_max: 3, out: 6, q: 0, q_max: 3
dataplane Outgoing Queue to Zebra: 43
r1#

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-05 15:52:05 -04:00
Donald Sharp
8926ac1984 zebra: Limit queue depth in dplane_fpm_nl
The dplane providers have a concept of input queues
and output queues.  These queues are chained together
during normal operation.  The code in zebra also has
a feedback mechanism where the MetaQ will not run when
the first input queue is backed up.  Having the dplane_fpm_nl
code grab all contexts when it is backed up prevents
this system from behaving appropriately.

Modify the code to not add to the dplane_fpm_nl's internal
queue when it is already full.  This will allow the backpressure
to work appropriately in zebra proper.
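
The idea can be sketched with queues reduced to simple counters. Everything here is hypothetical (`QUEUE_LIMIT`, `struct ctx_queue`, `queue_try_enqueue()`, `provider_process()`); it illustrates refusing work when the internal queue is full and is not the dplane_fpm_nl implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_LIMIT 4 /* hypothetical limit for the internal queue */

/* A queue reduced to its length; contexts themselves are not modelled. */
struct ctx_queue {
	size_t len;
};

/* Accept one context only while the queue is below its limit. */
static bool queue_try_enqueue(struct ctx_queue *q)
{
	if (q->len >= QUEUE_LIMIT)
		return false;
	q->len++;
	return true;
}

/* Move contexts from the input queue while the internal queue has room. */
static size_t provider_process(struct ctx_queue *in, struct ctx_queue *internal)
{
	size_t moved = 0;

	while (in->len > 0 && queue_try_enqueue(internal)) {
		in->len--;
		moved++;
	}
	return moved;
}
```

Contexts that are not accepted stay on the input queue, where the upstream feedback mechanism can see them.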

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-05 15:52:05 -04:00
Donald Sharp
3af381b502 zebra: Modify dplane loop to allow backpressure to filter up
Currently when the dplane_thread_loop is run, it moves contexts
from the dg_update_list and puts the contexts on the input queue
of the first provider.  This provider is given a chance to run
and then the items on the output queue are pulled off and placed
on the input queue of the next provider.  Rinse/Repeat down through
the entire list of providers.

Now imagine that we have a list of multiple providers and the last
provider is getting backed up.  Contexts will end up sticking in
the input queue of the `slow` provider.  This can grow without
bounds.  This is a real problem when you have a situation where an
interface is flapping and an upper level protocol is sending a
continuous stream of route updates to reflect the change in ecmp.
You can end up with a very large backlog of contexts.  This is bad
because zebra can easily grow to a very large memory size and on
restricted systems you can run out of memory.

Fortunately for us, the MetaQ already participates in this process
by not doing more route processing until the dg_update_list goes
below the working limit of dg_updates_per_cycle.  Thus, if FRR
modifies the behavior of this loop to not move more contexts onto
the input queue when either the input queue or output queue of the
next provider has reached this limit, FRR will naturally start
handling backpressure for the dplane context system and memory
will not go out of control.
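
The gating condition described above can be sketched as a single step of the loop. This is a simplified model under stated assumptions: `UPDATES_PER_CYCLE` stands in for dg_updates_per_cycle, and `struct provider` and `feed_provider()` are hypothetical names that track queue lengths only.

```c
#include <stddef.h>

#define UPDATES_PER_CYCLE 3 /* hypothetical stand-in for dg_updates_per_cycle */

/* A provider reduced to the lengths of its input and output queues. */
struct provider {
	size_t in_q;
	size_t out_q;
};

/*
 * Move up to UPDATES_PER_CYCLE contexts from *src onto the provider's
 * input queue, but move nothing if either of its queues is at the limit.
 * Returns the number of contexts moved.
 */
static size_t feed_provider(size_t *src, struct provider *prov)
{
	size_t moved = 0;

	if (prov->in_q >= UPDATES_PER_CYCLE || prov->out_q >= UPDATES_PER_CYCLE)
		return 0;

	while (*src > 0 && moved < UPDATES_PER_CYCLE) {
		(*src)--;
		prov->in_q++;
		moved++;
	}
	return moved;
}
```

Because a backed-up provider accepts nothing, the backlog stays on the upstream list, which is the place the MetaQ already checks before doing more route processing.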

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-05 15:44:34 -04:00
Donald Sharp
34670c476a zebra: Use the ctx queue counters
The ctx queue data structures already have a counter
associated with them.  Let's just use them instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-05 15:44:34 -04:00
Donald Sharp
d97c535c1e *: Create termtable specific temp memory
When trying to track down a MTYPE_TMP memory leak
it's harder to search for it when you happen to
have some usage of ttable_dump.  Let's just give
it its own memory type so that we can avoid
confusion in the future.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-01 13:07:46 -04:00
Donald Sharp
0c72a78930 zebra: Allow for initial deny of installation of nhe's
Currently the FRR code will receive both kernel and
connected routes that do not actually have an underlying
nexthop group at all.  Zebra turns around and creates
a `matching` nexthop hash entry and installs it.
For connected routes, this will create 2 singleton
nexthops in the dplane per interface (v4 and v6).
For kernel routes it would just create 1 singleton
nexthop that might be used or not.

This is bad because the dplane has a limited amount
of space available for nexthop entries and if you
happen to have a large number of interfaces then
all of a sudden you have 2x(# of interfaces) singleton
nexthops.

Let's modify the code to delay creation of these singleton
nexthops until they have been used by something else in the
system.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-30 08:23:48 -04:00
Jafar Al-Gharaibeh
90787a57fd
Merge pull request #16689 from donaldsharp/blackhole_and_afi
Blackhole and afi
2024-08-29 22:13:03 -04:00
Donald Sharp
8ad5643abe zebra: Convince SA that the ng will always be valid
There is a code path that could theoretically get you
to a point where the ng->nexthop is a NULL value.
Let's just make sure the SA system believes that
cannot happen anymore.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-29 18:10:30 -04:00