mirror_frr

mirror of https://git.proxmox.com/git/mirror_frr synced 2025-07-25 11:04:22 +00:00

Author	SHA1	Message	Date
Donatas Abraitis	91e157f3ae	Merge pull request #17162 from louis-6wind/fix-bh-nh-vrf zebra: fix showing nexthop vrf for ipv6 blackhole	2024-10-23 17:34:44 +03:00
Jafar Al-Gharaibeh	0078472e19	Merge pull request #17180 from anlancs/zebra/review-move-dplane zebra: drop NEWLINK event handling in the main thread	2024-10-22 10:29:49 -05:00
anlan_cs	96192f6aee	zebra: drop NEWLINK event handling in the main thread NEWLINK is only registered by the dplane thread, the main thread doesn't care about it. So remove the real process of `netlink_link_change()` for NEWLINK event in main thread. And move NEWLINK/DELLINK event to the block where the dplane messages are kept together. Signed-off-by: anlan_cs <anlan_cs@126.com>	2024-10-22 09:05:00 +08:00
anlan_cs	5829fea1b5	zebra: remove useless code Signed-off-by: anlan_cs <anlan_cs@126.com>	2024-10-19 13:32:53 +08:00
Louis Scalbert	6cdc82b21b	zebra: fix showing nexthop vrf for ipv6 blackhole For some reason the Linux kernel associates the ipv6 blackhole of a non-default table with the lo interface. > root@r1# ip -6 route show table 100 > root@r1# ip -6 route add unreachable default metric 4278198272 table 100 > root@r1# ip -6 route show table 100 > unreachable default dev lo metric 4278198272 pref medium As a consequence, the VRF default that owns the lo interface is shown as the nexthop VRF: > r1# show ipv6 route table 20 > Table 20: > K>* ::/0 [255/8192] unreachable (ICMP unreachable) (vrf default), 00:18:12 Do not display the nexthop VRF of a blackhole. It does not make sense for a blackhole and it was not displayed in the past. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>	2024-10-18 14:45:50 +02:00
Donald Sharp	5a2a9e3b89	zebra: Fix possible null deref discovered by coverity Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-17 07:42:47 -04:00
Donald Sharp	466efab870	Merge pull request #17136 from opensourcerouting/clang-sa-19 *: fix clang-19 SA	2024-10-17 07:38:28 -04:00
Donatas Abraitis	1ce225d7e4	Merge pull request #17076 from donaldsharp/rnh_and_redistribution_nexthop_num_fix *: Fix up improper handling of nexthops for nexthop tracking	2024-10-16 16:34:08 +03:00
Donald Sharp	cc63dbb68f	Merge pull request #17020 from pguibert6WIND/asan_shutdown zebra: fix heap-use-after free on ns shutdown	2024-10-16 09:15:06 -04:00
David Lamparter	e6cb1a90f2	zebra: check `dirfd()` result `dirfd()` can theoretically return an error. Call it once and check the result. clang-SA: technically correct™. Ain't that the best kind of correct? Signed-off-by: David Lamparter <equinox@opensourcerouting.org>	2024-10-16 13:30:25 +02:00
David Lamparter	67b0a457ed	zebra: don't misappropriate `errno` `errno` has its own semantics. Sometimes it is correct to write to it. This is not one of those cases - just use a separate `nl_errno`. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>	2024-10-16 13:30:25 +02:00
David Lamparter	1350f8d1c1	zebra: don't try to read past EOF `FILE *` objects are theoretically in an invalid state if you try to use them past their reporting EOF. Adjust the code to make it correct. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>	2024-10-16 13:30:25 +02:00
David Lamparter	c071b4370d	*: clang-SA switch-enum initializer workarounds In these cases the value assigned by the switch block is used directly rather than returned. Mark the initial/default value as used so clang-SA doesn't complain about it. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>	2024-10-16 13:30:25 +02:00
David Lamparter	49cf311d46	*: clang-SA friendly switch-enum-return-string clang-19's SA complains about unused initializers for this kind of "switch (enum) { return string }" kind of code. Use direct string return values to avoid the issue. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>	2024-10-16 13:00:11 +02:00
Donatas Abraitis	c32bdc2469	Merge pull request #17116 from enkechen-panw/zfix-2 zebra: unlock node only after operation in zebra_free_rnh()	2024-10-16 08:12:28 +03:00
Jafar Al-Gharaibeh	b2eaf86fb5	Merge pull request #15586 from donaldsharp/nht_explain_doc zebra: Attempt to explain the rnh tracking code better	2024-10-15 14:25:35 -05:00
Jafar Al-Gharaibeh	b23bbb885a	Merge pull request #17088 from donaldsharp/connected_kernel_fun zebra: Prevent a kernel route from being there when a connected should	2024-10-15 14:04:51 -05:00
Enke Chen	5b6ff51b8a	zebra: unlock node only after operation in zebra_free_rnh() Move route_unlock_node() after rnh_list_del(). Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>	2024-10-15 10:25:46 -07:00
Donald Sharp	28237d73ad	zebra: Attempt to explain the rnh tracking code better I got asked today what was going on in the rnh code. I had to take time off of what I was doing and rewrap my head around this code, since it's been a long time. As that this question may come up again in the future I am trying to document this better so that someone coming behind us will be able to read this and get a better idea of what the algorithm is attempting to do. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-15 12:42:17 -04:00
Donald Sharp	645a9e4f83	*: Fix up improper handling of nexthops for nexthop tracking Currently FRR needs to send a uint16_t value for the number of nexthops, and it also needs the ability to properly decode that value. Find and handle all the places that this happens. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-15 11:57:23 -04:00
Mark Stapp	6c1bc51bbb	Merge pull request #16737 from raja-rajasekar/rajasekarr/vlan_to_dplane zebra: vlan to dplane	2024-10-15 08:06:34 -04:00
Donald Sharp	74e25198e7	zebra: Prevent a kernel route from being there when a connected should There exists a series of events where a kernel route is learned first (one that happens to be exactly what a connected route should be) and FRR ends up with both a kernel route and a connected route, leaving us in a very strange spot. This code change mirrors the existing behavior, where if there is a connected route the kernel route is dropped. Here we just do the reverse: if we have a kernel route already and a connected should be created, remove the kernel and keep the connected. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-14 11:27:53 -04:00
Donatas Abraitis	d1433ee9a8	Merge pull request #17062 from donaldsharp/dplane_fpm_nl_problems zebra: Only notify dplane work pthread when needed	2024-10-14 08:14:34 +03:00
anlan_cs	05e2472de7	zebra: add back one field for debug The `flags` field was removed recently, so add it back for debug. Signed-off-by: anlan_cs <anlan_cs@126.com>	2024-10-13 21:30:46 +08:00
Donald Sharp	8aa97a439f	zebra: Slow down fpm_process_queue When the fpm_process_queue has run out of space but has written to the fpm output buffer, schedule it to wake up immediately, as the write will go out pretty much immediately, since it was scheduled first. If the fpm_process_queue has not written to the output buffer then delay the processing by 10 milliseconds to allow a possibly backed up write processing to have a chance to complete its work. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-11 09:37:37 -04:00
Donald Sharp	963792e8c5	zebra: Only notify dplane work pthread when needed The fpm_nl_process function was getting the count of the total number of ctx's processed. This leads to after having processed 1 context to always signal the dataplane that there is work to do. Change the code to only notify the dplane worker when a context was actually added to the outgoing context queue. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-11 09:37:37 -04:00
Donald Sharp	154a89bc31	zebra: Fix crash in pw code Recent PR #17009 introduced a crash in pw handling for deletion. Let's fix that problem. Fixes: #17041 Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-09 07:17:29 -04:00
Philippe Guibert	7ae70eb5ef	zebra: fix heap-use-after free on ns shutdown The following ASAN issue has been observed: > ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840 > READ of size 4 at 0x6160000acba4 thread T0 > #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315 > #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331 > #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680 > #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490 > #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717 > #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413 > #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919 > #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454 > #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822 > #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212 > #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968 > #11 0x7f26f275b8a9 in route_node_free lib/table.c:75 > #12 0x7f26f275bae4 in route_table_free lib/table.c:111 > #13 0x7f26f275b749 in route_table_finish lib/table.c:46 > #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191 > #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244 > #16 0x55910c4f40db in zebra_finalize zebra/main.c:249 > #17 0x7f26f2777108 in event_call lib/event.c:2011 > #18 0x7f26f264180e in frr_run lib/libfrr.c:1212 > #19 0x55910c4f49cb in main zebra/main.c:531 > #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 > #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392 > #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114) It happens with FRR using the kernel. During shutdown, the namespace identifier is attempted to be obtained by zebra, in an attempt to prepare zebra dataplane nexthop messages. 
Fix this by accessing the ns structure. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2024-10-08 22:25:55 +02:00
Russ White	b8c458622d	Merge pull request #17023 from donaldsharp/dplane_problems zebra: Allow dplane to pass larger number of nexthops down to dataplane	2024-10-08 11:45:27 -04:00
Donald Sharp	9f8968fc5a	*: Allow 16 bit size for nexthops Currently FRR is limiting the nexthop count to a uint8_t not a uint16_t. This leads to issues when the nexthop count is 256, which causes the count to overflow to 0 and leads to problems in the code. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-08 09:26:57 -04:00
Donald Sharp	a8af2b2a9d	zebra: Do not retry in 30 seconds on pw reachability failure Currently the zebra pw code has setup a retry to install the pw after 30 seconds when it is decided that reachability to the pw is gone. This causes a failure mode where the pw code just goes and re-installs the pw after 30 seconds in the non-reachability case. Instead it should just be reinstalling after reachability is restored. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-07 20:36:45 -04:00
Donald Sharp	f50b1f7c22	zebra: Move pw status setting until after we get results Currently the pw code sets the status of the pw for install and uninstall immediately when notifying the dplane. This is incorrect in that we do not actually know the status at this point in time. When we get the result is when to set the status. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-07 20:36:45 -04:00
Donatas Abraitis	ded59bcc72	Merge pull request #17013 from dksharp5/removal_functions Removal functions	2024-10-07 11:47:01 +03:00
Donna Sharp	f62dfc5d53	lib,zebra: remove unused ZEBRA_VRF_UNREGISTER Signed-off-by: Donna Sharp <dksharp5@gmail.com>	2024-10-06 19:40:49 -04:00
Donna Sharp	103f24485c	zebra: remove unused function from tc_netlink.c Signed-off-by: Donna Sharp <dksharp5@gmail.com>	2024-10-06 19:30:56 -04:00
Donna Sharp	7a63799a84	zebra: remove unused function from if_netlink.c Signed-off-by: Donna Sharp <dksharp5@gmail.com>	2024-10-06 19:25:44 -04:00
Donna Sharp	b6dd4ff8bc	zebra: remove unused function from tc_netlink.c Signed-off-by: Donna Sharp <dksharp5@gmail.com>	2024-10-06 19:08:44 -04:00
Donna Sharp	8eb5f4f506	zebra: remove unused function rib_lookup_ipv4 Signed-off-by: Donna Sharp <dksharp5@gmail.com>	2024-10-06 18:53:11 -04:00
Russ White	15991e1a08	Merge pull request #16800 from donaldsharp/nhg_reuse_intf_down_up Nhg reuse intf down up	2024-10-04 10:28:58 -04:00
Igor Zhukov	a3877e4444	zebra: Fix crash during reconnect fpm_enqueue_rmac_table expects an fpm_rmac_arg* as its argument. The issue can be reproduced by dropping the TCP session using: ss -K dst 127.0.0.1 dport = 2620 I used Fedora 40 and frr 9.1.2 and I got the gdb backtrace: (gdb) bt 0 0x00007fdd7d6997ea in fpm_enqueue_rmac_table (bucket=0x2134dd0, arg=0x2132b60) at zebra/dplane_fpm_nl.c:1217 1 0x00007fdd7dd1560d in hash_iterate (hash=0x21335f0, func=0x7fdd7d6997a0 <fpm_enqueue_rmac_table>, arg=0x2132b60) at lib/hash.c:252 2 0x00007fdd7dd1560d in hash_iterate (hash=0x1e5bf10, func=func@entry=0x7fdd7d698900 <fpm_enqueue_l3vni_table>, arg=arg@entry=0x7ffed983bef0) at lib/hash.c:252 3 0x00007fdd7d698b5c in fpm_rmac_send (t=<optimized out>) at zebra/dplane_fpm_nl.c:1262 4 0x00007fdd7dd6ce22 in event_call (thread=thread@entry=0x7ffed983c010) at lib/event.c:1970 5 0x00007fdd7dd20758 in frr_run (master=0x1d27f10) at lib/libfrr.c:1213 6 0x0000000000425588 in main (argc=10, argv=0x7ffed983c2e8) at zebra/main.c:492 Signed-off-by: Igor Zhukov <fsb4000@yandex.ru>	2024-10-04 14:59:14 +07:00
Rajasekar Raja	aa4786642c	zebra: vlan to dplane Offload from main Trigger: Zebra core seen when we convert l2vni to l3vni and back BackTrace: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(_zlog_assert_failed+0xe9) [0x7f4af96989d9] /usr/lib/frr/zebra(zebra_vxlan_if_vni_up+0x250) [0x5561022ae030] /usr/lib/frr/zebra(netlink_vlan_change+0x2f4) [0x5561021fd354] /usr/lib/frr/zebra(netlink_parse_info+0xff) [0x55610220d37f] /usr/lib/frr/zebra(+0xc264a) [0x55610220d64a] /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(thread_call+0x7d) [0x7f4af967e96d] /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(frr_run+0xe8) [0x7f4af9637588] /usr/lib/frr/zebra(main+0x402) [0x5561021f4d32] /lib/x86_64-linux-gnu/libc.so.6(+0x2724a) [0x7f4af932624a] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f4af9326305] /usr/lib/frr/zebra(_start+0x21) [0x5561021f72f1] Root Cause: In working case, - We get a RTM_NEWLINK whose ctx is enqueued by zebra dplane and dequeued by zebra main and processed i.e. (102000 is deleted from vxlan99) before we handle RTM_NEWVLAN. - So in handling of NEWVLAN (vxlan99) we bail out since find with vlan id 703 does not exist. 
root@leaf2:mgmt:/var/log/frr# cat ~/raja_logs/working/nocras.log \| grep "RTM_NEWLINK\\|QUEUED\\|vxlan99\\|in thread" 2024/07/18 23:09:33.741105 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=616, seq=0, pid=0 2024/07/18 23:09:33.744061 ZEBRA: [K8FXY-V65ZJ] Intf dplane ctx 0x7f2244000cf0, op INTF_INSTALL, ifindex (65), result QUEUED 2024/07/18 23:09:33.767240 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=508, seq=0, pid=0 2024/07/18 23:09:33.767380 ZEBRA: [K8FXY-V65ZJ] Intf dplane ctx 0x7f2244000cf0, op INTF_INSTALL, ifindex (73), result QUEUED 2024/07/18 23:09:33.767389 ZEBRA: [NVFT0-HS1EX] INTF_INSTALL for vxlan99(73) 2024/07/18 23:09:33.767404 ZEBRA: [TQR2A-H2RFY] Vlan-Vni(1186:1186-6000002:6000002) update for VxLAN IF vxlan99(73) 2024/07/18 23:09:33.767422 ZEBRA: [TP4VP-XZ627] Del L2-VNI 102000 intf vxlan99(73) 2024/07/18 23:09:33.767858 ZEBRA: [QYXB9-6RNNK] RTM_NEWVLAN bridge IF vxlan99 NS 0 2024/07/18 23:09:33.767866 ZEBRA: [KKZGZ-8PCDW] Cannot find VNI for VID (703) IF vxlan99 for vlan state update >>>>BAIL OUT In failure case, - The NEWVLAN is received first even before processing RTM_NEWLINK. - Since the vxlan id 102000 is not removed from the vxlan99, the find with vlan id 703 returns the 102000 one and we invoke zebra_vxlan_if_vni_up where the interfaces don't match and assert. 
root@leaf2:mgmt:/var/log/frr# cat ~/raja_logs/noworking/crash.log \| grep "RTM_NEWLINK\\|QUEUED\\|vxlan99\\|in thread" 2024/07/18 22:26:43.829370 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=616, seq=0, pid=0 2024/07/18 22:26:43.829646 ZEBRA: [K8FXY-V65ZJ] Intf dplane ctx 0x7fe07c026d30, op INTF_INSTALL, ifindex (65), result QUEUED 2024/07/18 22:26:43.853930 ZEBRA: [QYXB9-6RNNK] RTM_NEWVLAN bridge IF vxlan99 NS 0 2024/07/18 22:26:43.853949 ZEBRA: [K61WJ-XQQ3X] Intf vxlan99(73) L2-VNI 102000 is UP >>> VLAN PROCESSED BEFORE INTF EVENT 2024/07/18 22:26:43.853951 ZEBRA: [SPV50-BX2RP] RAJA zevpn_vxlanif vxlan48 and ifp vxlan99 2024/07/18 22:26:43.854005 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=508, seq=0, pid=0 2024/07/18 22:26:43.854241 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=516, seq=0, pid=0 2024/07/18 22:26:43.854251 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=544, seq=0, pid=0 ZEBRA: in thread kernel_read scheduled from zebra/kernel_netlink.c:505 kernel_read() Fix: Similar to #13396, where link change handling was offloaded to dplane, same is being done for vlan events. Note: Prior to this change, zebra main thread was interested in the RTNLGRP_BRVLAN. So all the kernel events pertaining to vlan were handled by zebra main. With this change as well the handling of vlan events is still with Zebra main. However we offload it via Dplane thread. Ticket :#3878175 Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>	2024-09-26 20:17:35 -07:00
Rajasekar Raja	1632988acf	zebra: vlan to dplane, Relocating some functions Relocating functions used by vlan in if_netlink into zebra vxlan Note: Static variable to the functions will be added back in the next commit. Ticket :#3878175 Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>	2024-09-25 11:56:06 -07:00
Donald Sharp	f53dde0e59	zebra: Add missing proto translations Add missing isis and eigrp proto translations. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-25 12:14:50 -04:00
Donald Sharp	e41ae0acc1	zebra: Correctly report metrics Report the routes metric in IPFORWARDMETRIC1 and return -1 for the other metrics as required by the IP-FORWARD-MIB. inetCidrRouteMetric2 OBJECT-TYPE SYNTAX Integer32 MAX-ACCESS read-create STATUS current DESCRIPTION "An alternate routing metric for this route. The semantics of this metric are determined by the routing- protocol specified in the route's inetCidrRouteProto value. If this metric is not used, its value should be set to -1." DEFVAL { -1 } ::= { inetCidrRouteEntry 13 } I've included metric2 but it's the same for all of them. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-25 12:09:40 -04:00
Donald Sharp	659cd66427	zebra: Let's use memset instead of walking bytes and setting to 0 Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-25 12:08:03 -04:00
Donald Sharp	ecd9d441b0	zebra: Fix snmp walk of zebra rib The snmp walk of the zebra rib was skipping entries because in_addr_cmp was replaced with a prefix_cmp which worked slightly differently causing parts of the zebra rib tree to be skipped. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-25 12:06:29 -04:00
Donald Sharp	e54261e20d	lib, zebra: TABLE_NODE is not used No-one is using this, remove Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-24 16:05:54 -04:00
Donatas Abraitis	9616304e47	Merge pull request #16882 from mjstapp/fix_if_table_unlock zebra: unlock if_table route_nodes	2024-09-23 10:24:52 +02:00
Mark Stapp	c40635c5c2	zebra: unlock if_table route_nodes Must unlock if we break during iteration over any lib/table tree. Signed-off-by: Mark Stapp <mjs@cisco.com>	2024-09-20 12:24:01 -04:00
Donald Sharp	58722b9448	zebra: Pass in ZEBRA_ROUTE_MAX instead of true zebra_nhg_install_kernel takes a route type. We don't know it at that particular spot but we should not be passing in `true`. Let's use ZEBRA_ROUTE_MAX to indicate we do not know, so that the correct thing is done. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-20 11:00:11 -04:00
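
Commit 9f8968fc5a in the log above reports that a nexthop count of 256 overflows to 0 when it is held in a uint8_t. The following is a minimal sketch of that truncation, not code from FRR: the helper names nexthop_count_u8, nexthop_count_u16 and demo_nexthop_count_width are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; these are not FRR functions. */

/* Store a nexthop count in an 8-bit field, the width the commit says was in use. */
static uint8_t nexthop_count_u8(unsigned int count)
{
	return (uint8_t)count; /* keeps only the low 8 bits */
}

/* Store the same count in a 16-bit field, the width the commit moves to. */
static uint16_t nexthop_count_u16(unsigned int count)
{
	return (uint16_t)count;
}

/* Self-check: 256 wraps to 0 in 8 bits but is preserved in 16 bits. */
static void demo_nexthop_count_width(void)
{
	assert(nexthop_count_u8(255) == 255);
	assert(nexthop_count_u8(256) == 0);
	assert(nexthop_count_u16(256) == 256);
}
```

Calling demo_nexthop_count_width() from a test harness exercises all three cases. The wider field only moves the limit to 65535, so the encoder and decoder still have to agree on the width.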


6024 Commits
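
Commit c40635c5c2 in the log above states that a node must be unlocked when breaking out of an iteration over a lib/table tree. The sketch below shows that rule on a toy reference-counted list. It assumes an iterator that holds a lock on the current node and hands it over when advancing. All toy_* names are hypothetical stand-ins and not FRR's lib/table API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a ref-counted table walk; hypothetical, not FRR's API. */
struct toy_node {
	int lock; /* references currently held by iterators */
	int value;
	struct toy_node *next;
};

/* Return the first node with a reference taken on it. */
static struct toy_node *toy_top(struct toy_node *head)
{
	if (head)
		head->lock++;
	return head;
}

/* Drop the reference on the current node and take one on the next. */
static struct toy_node *toy_next(struct toy_node *node)
{
	struct toy_node *next = node->next;

	node->lock--;
	if (next)
		next->lock++;
	return next;
}

/* Release a reference explicitly. */
static void toy_unlock(struct toy_node *node)
{
	node->lock--;
}

/* Find a value; on early exit the held reference is released by hand. */
static struct toy_node *toy_find(struct toy_node *head, int wanted)
{
	struct toy_node *node;

	for (node = toy_top(head); node; node = toy_next(node)) {
		if (node->value == wanted) {
			toy_unlock(node); /* without this the node stays locked */
			break;
		}
	}
	return node;
}

/* Self-check: no lock is left behind, whether or not the walk ends early. */
static void toy_demo(void)
{
	struct toy_node b = { 0, 2, NULL };
	struct toy_node a = { 0, 1, &b };

	assert(toy_find(&a, 1) == &a);
	assert(a.lock == 0 && b.lock == 0);
	assert(toy_find(&a, 7) == NULL);
	assert(a.lock == 0 && b.lock == 0);
}
```

A walk that runs to the end releases its last reference inside toy_next(), so only the early-exit path needs the explicit unlock.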