mirror_frr

mirror of https://git.proxmox.com/git/mirror_frr synced 2025-06-05 11:26:14 +00:00

Author	SHA1	Message	Date
Mark Stapp	cb7cf73992	zebra: include resolving nexthops in nhg hash Ensure that the nhg hash comparison function includes all nexthops, including recursive-resolving nexthops. Signed-off-by: Mark Stapp <mjs@cisco.com>	2025-01-27 14:17:24 -05:00
Donatas Abraitis	76ed8f61d8	Merge pull request #17814 from donaldsharp/nhg_removal_in_some_situations	2025-01-17 17:31:19 +02:00
Donald Sharp	ec6a000b0b	zebra: On Nexthop install failure don't set Installation failed Currently FRR when installing a nexthop group, the installation can fail. The assumption with the code was that the current nexthop group was not already installed. This leaves a problem state where if the users of the nexthop group are removed, the nexthop group will be removed possibly leaving a orphaned nexthop group in the data plane. FRR on a nexthop group installation does not actually know the status of the nexthop group in the kernel. It's possible that a earlier version of the nexthop group is left in play. It's possible that there is no nexthop group in the kernel at all. Leaving the Installed flag alone allows upon Zebra removing the nexthop group when it is removed from zebra. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-01-14 16:23:40 -05:00
Donald Sharp	b61424a717	zebra: Nexthops need to be ACTIVE in some cases Currently if you have an interface down event, Zebra sets the nexthop(s) as !ACTIVE that use it. On interface up events the singleton nexthops are not being set as ACTIVE. Due to timing events it is sometimes possible to end up with a route that is using a singleton Change singleton nexthops to set the nexthop to ACTIVE. This will allow the nexthop to be reinstalled appropriately as well. I was able to easily reproduce this using sharpd since it does not attempt to reinstall the routes when a interface goes up/down. Before: D>* 10.0.0.0/32 [150/0] via 192.168.102.34, dummy2, weight 1, 00:00:01 sharpd@eva ~/frr5 (master)> sudo ip link set dummy2 down ; sudo ip link set dummy2 up D> 10.0.0.0/32 [150/0] (350) via 192.168.102.34, dummy2 inactive, weight 1, 00:00:10 After code change: D>* 10.0.0.0/32 [150/0] (73) via 192.168.102.34, dummy2, weight 1, 00:00:14 sharpd@eva ~/frr5 (master)> sudo ip link set dummy2 down ; sudo ip link set dummy2 up D>* 10.0.0.0/32 [150/0] (73) via 192.168.102.34, dummy2, weight 1, 00:00:21 Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-01-14 15:12:32 -05:00
Rajasekar Raja	e77954e5d9	zebra: Optimize invoking nhg compare func In some cases, the old_re nhe and the newnhe is same and there is no point in comparing them both since they are the same. Skip comparing in such cases. Ex: 2025/01/09 23:49:27.489020 ZEBRA: [W4Z4R-NTSMD] zebra_nhg_rib_find_nhe: => nhe 0x555f611d30c0 (44[38/39/45]) 2025/01/09 23:49:27.489021 ZEBRA: [ZH3FQ-TE9NV] zebra_nhg_rib_compare_old_nhe: 0.0.0.0/0 new id: 44 old id: 44 2025/01/09 23:49:27.489021 ZEBRA: [YB8HE-Z86GN] zebra_nhg_rib_compare_old_nhe: 0.0.0.0/0 NEW 0x555f611d30c0 (44[38/39/45]) 2025/01/09 23:49:27.489023 ZEBRA: [ZSB1Z-XM2V3] 0.0.0.0/0: NH 20.1.1.9[0] vrf default(0) wgt 1, with flags 2025/01/09 23:49:27.489024 ZEBRA: [ZSB1Z-XM2V3] 0.0.0.0/0: NH 30.1.2.9[0] vrf default(0) wgt 1, with flags 2025/01/09 23:49:27.489025 ZEBRA: [ZSB1Z-XM2V3] 0.0.0.0/0: NH 20.1.1.2[4] vrf default(0) wgt 1, with flags ACTIVE 2025/01/09 23:49:27.489026 ZEBRA: [ZM3BX-HPETZ] zebra_nhg_rib_compare_old_nhe: 0.0.0.0/0 OLD 0x555f611d30c0 (44[38/39/45]) 2025/01/09 23:49:27.489027 ZEBRA: [ZSB1Z-XM2V3] 0.0.0.0/0: NH 20.1.1.9[0] vrf default(0) wgt 1, with flags 2025/01/09 23:49:27.489028 ZEBRA: [ZSB1Z-XM2V3] 0.0.0.0/0: NH 30.1.2.9[0] vrf default(0) wgt 1, with flags 2025/01/09 23:49:27.489028 ZEBRA: [ZSB1Z-XM2V3] 0.0.0.0/0: NH 20.1.1.2[4] vrf default(0) wgt 1, with flags ACTIVE Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>	2025-01-10 13:39:12 -08:00
Donald Sharp	4c166947a8	zebra: Uninstall NHG in some situations If you have this series of events: a) Decision to install a NHG is made in zebra, enqueue to DPLANE b) Changes to NHG are made and we remove it in the master pthread Since this NHG is not marked as installed it is not removed but the NHG data structure is deleted c) DPLANE installs the NHG In the end the NHG stays installed but ZEBRA has lost track of it. Modify the removal code to check to see if the NHG is queued. There are 2 cases: a) NHG is kept around for a bit before being deleted. In this case just see that the NHG is Queued and keep it around too. b) NHG is not kept around and we are just removing it. In this case check to see if it is queued and send another deletion event. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-01-10 09:35:35 -05:00
Donald Sharp	97fa24e70b	zebra: Fix leaked nhe During route processing in zebra, Zebra will create a nexthop group that matches the nexthops passed down from the routing protocol. Then Zebra will look to see if it can re-use a nhe from a previous version of the route entry( say a interface goes down ). If Zebra decides to re-use an nhe it was just dropping the route entry created. Which led to nexthop group's that had a refcount of 0 and in some cases these nexthop groups were installed into the kernel. Add a bit of code to see if the returned entry is not being used and it has no reference count and if so, properly dispose of it. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-01-09 12:34:50 -05:00
Donald Sharp	54ec9f3888	zebra: Fix resetting valid flags for NHG dependents Upon if_down, we don't reset the valid flag for dependents and unset the INSTALLED flag. So when its time for the NHG to be deleted (routes dereferenced), zebra deletes it since refcnt goes to 0, but stale NHG remains in kernel. Ticket :#4200788 Signed-off-by: Donald Sharp <sharpd@nvidia.com> Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>	2024-12-30 08:40:44 -08:00
anlan_cs	b9538fe481	zebra: fix wrong nexthop check The kernel routes are wrongly selected even the nexthop interface is linkdown. Use `ip link set dev <interface> down` on the other box to set the box's nexthop interface linkdown. The kernel routes will be kept as `linkdown`, but are still with active nexthop in `zebra`. Add three changes/commits for kernel routes in this PR: 1) The active nexthop should be the operative interface. 2) Don't uninstall the kernel routes from `zebra` even no active nexthops. (It doesn't affect the kernel routes' deletion from kernel netlink messages.) 3) Update the kernel routes when the nexthop interface becomes up. Before: (during nexthop interface is linkdown) ``` K>* 3.3.3.3/32 [0/0] via 88.88.88.1, enp2s0, weight 1, 00:00:14 ``` After: (during nexthop interface is linkdown, with all three changes) ``` K 3.3.3.3/32 [0/0] via 88.88.88.1, enp2s0 inactive, weight 1, 00:00:07 ``` This commit is 1st change: Improve the judgment for "active" nexthop to be more accurate, the active nexthop should be the operative interface. Signed-off-by: anlan_cs <anlan_cs@126.com>	2024-12-17 16:14:30 +08:00
Donald Sharp	9f8968fc5a	*: Allow 16 bit size for nexthops Currently FRR is limiting the nexthop count to a uint8_t not a uint16_t. This leads to issues when the nexthop count is 256 which results in the count to overflow to 0 causing problems in the code. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-08 09:26:57 -04:00
Russ White	15991e1a08	Merge pull request #16800 from donaldsharp/nhg_reuse_intf_down_up Nhg reuse intf down up	2024-10-04 10:28:58 -04:00
Donald Sharp	58722b9448	zebra: Pass in ZEBRA_ROUTE_MAX instead of true zebra_nhg_install_kernel takes a route type. We don't know it at that particular spot but we should not be passing in `true`. Let's use ZEBRA_ROUTE_MAX to indicate we do not know, so that the correct thing is done. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-20 11:00:11 -04:00
Donald Sharp	f02d76f0fd	zebra: Attempt to reuse NHG after interface up and route reinstall The previous commit modified zebra to reinstall the singleton nexthops for a nexthop group when a interface event comes up. Now let's modify zebra to attempt to reuse the nexthop group when this happens and the upper level protocol resends the route down with that. Only match if the protocol is the same as well as the instance and the nexthop groups would match. Here is the new behavior: eva(config)# do show ip route 9.9.9.9/32 Routing entry for 9.9.9.9/32 Known via "static", distance 1, metric 0, best Last update 00:00:08 ago * 192.168.99.33, via dummy1, weight 1 * 192.168.100.33, via dummy2, weight 1 * 192.168.101.33, via dummy3, weight 1 * 192.168.102.33, via dummy4, weight 1 eva(config)# do show ip route nexthop-group 9.9.9.9/32 % Unknown command: do show ip route nexthop-group 9.9.9.9/32 eva(config)# do show ip route 9.9.9.9/32 nexthop-group Routing entry for 9.9.9.9/32 Known via "static", distance 1, metric 0, best Last update 00:00:54 ago Nexthop Group ID: 57 * 192.168.99.33, via dummy1, weight 1 * 192.168.100.33, via dummy2, weight 1 * 192.168.101.33, via dummy3, weight 1 * 192.168.102.33, via dummy4, weight 1 eva(config)# exit eva# conf eva(config)# int dummy3 eva(config-if)# shut eva(config-if)# no shut eva(config-if)# do show ip route 9.9.9.9/32 nexthop-group Routing entry for 9.9.9.9/32 Known via "static", distance 1, metric 0, best Last update 00:00:08 ago Nexthop Group ID: 57 * 192.168.99.33, via dummy1, weight 1 * 192.168.100.33, via dummy2, weight 1 * 192.168.101.33, via dummy3, weight 1 * 192.168.102.33, via dummy4, weight 1 eva(config-if)# exit eva(config)# exit eva# exit sharpd@eva ~/frr1 (master) [255]> ip nexthop show id 57 id 57 group 37/43/50/58 proto zebra sharpd@eva ~/frr1 (master)> ip route show 9.9.9.9/32 9.9.9.9 nhid 57 proto 196 metric 20 nexthop via 192.168.99.33 dev dummy1 weight 1 nexthop via 192.168.100.33 dev dummy2 weight 1 nexthop via 192.168.101.33 dev dummy3 weight 1 nexthop via 192.168.102.33 dev dummy4 weight 1 sharpd@eva ~/frr1 (master)> Notice that we now no longer are creating a bunch of new nexthop groups. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-16 09:34:05 -04:00
Donald Sharp	3be8b48e6b	zebra: Reinstall nexthop when interface comes back up If a interface down event caused a nexthop group to remove one of the entries in the kernel, have it be reinstalled when the interface comes back up. Mark the nexthop as usable. new behavior: eva# show nexthop-group rib 181818168 ID: 181818168 (sharp) RefCnt: 1 Uptime: 00:00:23 VRF: default(bad-value) Valid, Installed Depends: (35) (38) (44) (51) via 192.168.99.33, dummy1 (vrf default), weight 1 via 192.168.100.33, dummy2 (vrf default), weight 1 via 192.168.101.33, dummy3 (vrf default), weight 1 via 192.168.102.33, dummy4 (vrf default), weight 1 eva# conf eva(config)# int dummy3 eva(config-if)# shut eva(config-if)# do show nexthop-group rib 181818168 ID: 181818168 (sharp) RefCnt: 1 Uptime: 00:00:44 VRF: default(bad-value) Depends: (35) (38) (44) (51) via 192.168.99.33, dummy1 (vrf default), weight 1 via 192.168.100.33, dummy2 (vrf default), weight 1 via 192.168.101.33, dummy3 (vrf default) inactive, weight 1 via 192.168.102.33, dummy4 (vrf default), weight 1 eva(config-if)# no shut eva(config-if)# do show nexthop-group rib 181818168 ID: 181818168 (sharp) RefCnt: 1 Uptime: 00:00:53 VRF: default(bad-value) Valid, Installed Depends: (35) (38) (44) (51) via 192.168.99.33, dummy1 (vrf default), weight 1 via 192.168.100.33, dummy2 (vrf default), weight 1 via 192.168.101.33, dummy3 (vrf default), weight 1 via 192.168.102.33, dummy4 (vrf default), weight 1 eva(config-if)# exit eva(config)# exit eva# exit sharpd@eva ~/frr1 (master) [255]> ip nexthop show id 181818168 id 181818168 group 35/38/44/51 proto 194 sharpd@eva ~/frr1 (master)> Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-16 09:34:05 -04:00
Donald Sharp	1bbbcf043b	zebra: Properly note that a nhg's nexthop has gone down Current code when a link is set down is to just mark the nexthop group as not properly setup. Leaving situations where when an interface goes down and show output is entered we see incorrect state. This is true for anything that would be checking those flags at that point in time. Modify the interface down nexthop group code to notice the nexthops appropriately ( and I mean set the appropriate flags ) and to allow a `show ip route` command to actually display what is going on with the nexthops. eva# show ip route 1.0.0.0 Routing entry for 1.0.0.0/32 Known via "sharp", distance 150, metric 0, best Last update 00:00:06 ago * 192.168.44.33, via dummy1, weight 1 * 192.168.45.33, via dummy2, weight 1 sharpd@eva:~/frr1$ sudo ip link set dummy2 down eva# show ip route 1.0.0.0 Routing entry for 1.0.0.0/32 Known via "sharp", distance 150, metric 0, best Last update 00:00:12 ago * 192.168.44.33, via dummy1, weight 1 192.168.45.33, via dummy2 inactive, weight 1 Notice now that the 1.0.0.0/32 route now correctly displays the route for the nexthop group entry. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-09-16 09:34:05 -04:00
Donald Sharp	0c72a78930	zebra: Allow for initial deny of installation of nhe's Currently the FRR code will receive both kernel and connected routes that do not actually have an underlying nexthop group at all. Zebra turns around and creates a `matching` nexthop hash entry and installs it. For connected routes, this will create 2 singleton nexthops in the dplane per interface (v4 and v6). For kernel routes it would just create 1 singleton nexthop that might be used or not. This is bad because the dplane has a limited amount of space available for nexthop entries and if you happen to have a large number of interfaces then all of a sudden you have 2x(# of interfaces) singleton nexthops. Let's modify the code to delay creation of these singleton nexthops until they have been used by something else in the system. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-08-30 08:23:48 -04:00
Donald Sharp	f90989d52a	zebra: Allow blackhole singleton nexthops to be v6 A blackhole nexthop, according to the linux kernel, can be v4 or v6. A v4 blackhole nexthop cannot be used on a v6 route, but a v6 blackhole nexthop can be used with a v4 route. Convert all blackhole singleton nexthops to v6 and just use that. Possibly reducing the number of active nexthops by 1. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-08-29 15:06:31 -04:00
Russ White	5a6cb0bf75	Merge pull request #16103 from mjstapp/fix_5549_nhg_type zebra: be consistent about v6 nexthops for v4 routes	2024-08-27 09:46:53 -04:00
Donald Sharp	c20fa972d6	zebra: Create Singleton nhg's without weights Currently FRR when it has two nexthop groups: A nexthop 1 weight 5 nexthop 2 weight 6 nexthop 3 weight 7 B nexthop 1 weight 3 nexthop 2 weight 4 nexthop 3 weight 5 We end up with 5 singleton nexthops and two groups: ID: 181818168 (sharp) RefCnt: 1 Uptime: 00:04:52 VRF: default Valid, Installed Depends: (69) (70) (71) via 192.168.119.1, enp13s0 (vrf default), weight 182 via 192.168.119.2, enp13s0 (vrf default), weight 218 via 192.168.119.3, enp13s0 (vrf default), weight 255 ID: 181818169 (sharp) RefCnt: 1 Uptime: 00:02:08 VRF: default Valid, Installed Depends: (71) (127) (128) via 192.168.119.1, enp13s0 (vrf default), weight 127 via 192.168.119.2, enp13s0 (vrf default), weight 170 via 192.168.119.3, enp13s0 (vrf default), weight 255 id 69 via 192.168.119.1 dev enp13s0 scope link proto 194 id 70 via 192.168.119.2 dev enp13s0 scope link proto 194 id 71 via 192.168.119.3 dev enp13s0 scope link proto 194 id 127 via 192.168.119.1 dev enp13s0 scope link proto 194 id 128 via 192.168.119.2 dev enp13s0 scope link proto 194 id 181818168 group 69,182/70,218/71,255 proto 194 id 181818169 group 71,255/127,127/128,170 proto 194 This is not a desirable state to be in. If you have a link flapping in the network and weights are changing rapidly you end up with a large number of singleton nexthops that are being used by the nexthop groups. This fills up asic space and clutters the table. Additionally singleton nexthops cannot have any weight and the fact that you attempt to create a singleton nexthop with different weights means nothing to the linux kernel( or any asic dplane ). Let's modify the code to always create the singleton nexthops without a weight and then just creating the NHG's that use the singletons with the appropriate weight. ID: 181818168 (sharp) RefCnt: 1 Uptime: 00:00:32 VRF: default Valid, Installed Depends: (22) (24) (28) via 192.168.119.1, enp13s0 (vrf default), weight 182 via 192.168.119.2, enp13s0 (vrf default), weight 218 via 192.168.119.3, enp13s0 (vrf default), weight 255 ID: 181818169 (sharp) RefCnt: 1 Uptime: 00:00:14 VRF: default Valid, Installed Depends: (22) (24) (28) via 192.168.119.1, enp13s0 (vrf default), weight 153 via 192.168.119.2, enp13s0 (vrf default), weight 204 via 192.168.119.3, enp13s0 (vrf default), weight 255 id 22 via 192.168.119.1 dev enp13s0 scope link proto 194 id 24 via 192.168.119.2 dev enp13s0 scope link proto 194 id 28 via 192.168.119.3 dev enp13s0 scope link proto 194 id 181818168 group 22,182/24,218/28,255 proto 194 id 181818169 group 22,153/24,204/28,255 proto 194 Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-08-22 13:22:06 -04:00
Donald Sharp	5a1b61aeba	zebra: Ensure non-equal id's are not same nhg's The function zebra_nhg_hash_equal is only used as a hash function for storage of NHG's and retrieval. If you have say two nhg's: 31 (25/26) 32 (25/26) This function would return them as being equal. Which of course leads to the problem when you attempt to hash_release 32 but release 31 from the hash. Then later when you attempt to do hash comparisons 32 has actually been freed leaving to use after free situations and shit goes down hill fast. This hash is only used as part of the hash comparison function for nexthop group storage. Since this is so let's always return the 31/32 nhg's are not equal at all. We possibly have a different problem where we are creating 31 and 32 ( when 31 should have just been used instead of 32 ) but we need to prevent any type of hash release problem at all. This supercedes any other issue( that should be tracked down on it's own ). Since you can have use after free situation that leads to a crash -vs- some possible nexthop group duplication which is very minor in comparison. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-08-10 19:49:35 -04:00
Mark Stapp	0221ed2917	zebra: be consistent about v6 nexthops for v4 routes Treat TYPE_IPV6 and TYPE_IPV6_IFINDEX nexthops the same way when processing v4 (RFC 5549) routes. Signed-off-by: Mark Stapp <mjs@cisco.com>	2024-08-06 08:17:47 -04:00
Donald Sharp	266b061994	zebra: Properly note that a nhg's nexthop has gone down Current code when a link is set down is to just mark the nexthop group as not properly setup. Leaving situations where when an interface goes down and show output is entered we see incorrect state. This is true for anything that would be checking those flags at that point in time. Modify the interface down nexthop group code to notice the nexthops appropriately ( and I mean set the appropriate flags ) and to allow a `show ip route` command to actually display what is going on with the nexthops. eva# show ip route 1.0.0.0 Routing entry for 1.0.0.0/32 Known via "sharp", distance 150, metric 0, best Last update 00:00:06 ago * 192.168.44.33, via dummy1, weight 1 * 192.168.45.33, via dummy2, weight 1 sharpd@eva:~/frr1$ sudo ip link set dummy2 down eva# show ip route 1.0.0.0 Routing entry for 1.0.0.0/32 Known via "sharp", distance 150, metric 0, best Last update 00:00:12 ago * 192.168.44.33, via dummy1, weight 1 192.168.45.33, via dummy2 inactive, weight 1 Notice now that the 1.0.0.0/32 route now correctly displays the route for the nexthop group entry. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-07-03 09:34:55 -04:00
Donatas Abraitis	dbf83cfd36	zebra: Set the weight for non-recursive next-hop If using weighted ECMP, the weight for non-recursive next-hop should be inherited from recursive next-hop. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-06-18 12:15:33 +03:00
Donald Sharp	29c1ff446e	lib, zebra: Check for not being a blackhole route In zebra_interface_nhg_reinstall zebra is checking that the nhg is a singleton and not a blackhole nhg. This was originally done with checking that the nexthop is a NEXTHOP_TYPE_IFINDEX, NEXTHOP_TYPE_IPV4_IFINDEX and NEXTHOP_TYPE_IPV6_IFINDEX. This was excluding NEXTHOP_TYPE_IPV4 and NEXTHOP_TYPE_IPV6. These were both possible to be received and maintained from the upper level protocol for when a route is being recursively resolved. If we have gotten to this point in zebra_interface_nhg_reinstall the nexthop group has already been installed at least once and we know that it is actually a valid nexthop. What the test is really trying to do is ensure that we are not reinstalling a blackhole nexthop group( Which is not possible to even be here by the way, but safety first! ). So let's change to test for that instead. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-04-22 10:44:55 -04:00
sri-mohan1	2df51c7fe3	zebra: changes for code maintainability these changes are for improving the code maintainability and readability Signed-off-by: sri-mohan1 <sri.mohan@samsung.com>	2024-03-26 10:21:45 +05:30
Donald Sharp	b2ade8e3d2	zebra: When reinstalling a NHG, set REINSTALL flag The current code is unsetting the fact that the NHG is installed. It is installed but we are reinstalling it. Let's note this in the code appropriately as REINSTALL and not remove the INSTALLED FLAG. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-02-08 12:32:26 -05:00
Donald Sharp	2934127547	zebra: Remove ifp_nhg_XXX functions completely These functions provided a level of abstraction that forced us to call multiple functions when a simple data structure change was all that is needed. Let's consolidate down and make things a bit simpler. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-02-08 11:56:40 -05:00
Donald Sharp	910b2c5a4a	zebra: Installation success should not set NHG as valid The nexthop group is marked as valid/invalid and then installed. Not installed and then marked valid. This is just a bit of code removed that might be covering up other problems that need to be sorted. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-02-08 11:37:26 -05:00
Donald Sharp	0fa9ee396b	zebra: Use switch when handling return from dplane for nhgs Convert the dplane results function for nhg's over to using a switch for the result enum. Let's specifically call out the unexpected state and also set the nexthop group as not installed when installation fails. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-02-08 11:37:25 -05:00
Donald Sharp	8f76afd044	zebra: Conslidate zebra_nhg_set_valid\|invalid functions Basically the same function two times. Let's consolidate. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-02-08 07:59:11 -05:00
Donald Sharp	65d82a45ec	zebra: dplane_nexthop_add cannot return ZEBRA_DPLANE_REQUEST_SUCCESS When installing a NHG via dplane_nexthop_add, it can only return REQUEST_QUEUED or REQUEST_FAILURE. There is no way SUCCESS can be returned with the way the dplane works at this point in time. Remove the code that attempts to set the NHE state appropriately as it is impossible. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-02-07 20:32:24 -05:00
Donald Sharp	827fb64ecb	zebra: Remove debugs for retrieving a new nhg id This is not complicated code and if zebra is allocating a new one. Zebra does not need to inform the operator about the process during debugs. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-01-24 18:29:39 -05:00
Donald Sharp	7a3b6498fc	zebra: Combine 2 debugs into 1 for NHG Detail When debugging NHG detail there is a whole bunch of lines surrounding the nexthop group. Let's clean these up since they are extremely chatty and spawn several lines. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-01-24 18:29:34 -05:00
Donald Sharp	63816f7579	zebra: use break instead of goto There is a goto statement that would be better served with a break statement. Let's try to minimize this in the code. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-01-20 20:08:05 -05:00
Russ White	0a79e117d6	Merge pull request #12600 from donaldsharp/local_routes *: Introduce Local Host Routes to FRR	2023-12-05 11:00:44 -05:00
Philippe Guibert	427d3f81f4	zebra: clarify error when calling zebra_nhg_rib_find_nhe() Display a specific log message when the rt_nhe parameter is not set at all. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-11-30 14:17:46 +01:00
Mark Stapp	bb3faf1b95	zebra: reduce number of switch statements with dplane opcodes Replace several switch blocks that contain every dplane opcode with simpler sets of if()s. In these cases the code only uses a couple of opcodes. Signed-off-by: Mark Stapp <mjs@labn.net>	2023-11-17 08:40:58 -05:00
Donald Sharp	d4aa24ba7d	: Introduce Local Host Routes to FRR Create Local routes in FRR: S 0.0.0.0/0 [1/0] via 192.168.119.1, enp39s0, weight 1, 00:03:46 K> 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:03:51 O 192.168.119.0/24 [110/100] is directly connected, enp39s0, weight 1, 00:03:46 C>* 192.168.119.0/24 is directly connected, enp39s0, 00:03:51 L>* 192.168.119.224/32 is directly connected, enp39s0, 00:03:51 O 192.168.119.229/32 [110/100] via 0.0.0.0, enp39s0 inactive, weight 1, 00:03:46 C>* 192.168.119.229/32 is directly connected, enp39s0, 00:03:46 Create ability to redistribute local routes. Modify tests to support this change. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-11-01 17:13:06 -04:00
Donald Sharp	a272a2b364	zebra: Allow longer prefix matches for nexthops Zebra currently does a shortest prefix match for resolving nexthops for a prefix. This is typically an ok thing to do but fails in several specific scenarios. If a nexthop matches to a route that is not usable, nexthop resolution just gives up and refuses to use that particular route. For example if zebra currently has a covering prefix say a 10.0.0.0/8. And about the same time it receives a 10.1.0.0/16 ( a more specific than the /8 ) and another route A, who's nexthop is 10.1.1.1. Imagine the 10.1.0.0/16 is processed enough to know we want to install it and the prefix is sent to the dataplane for installation( it is queued ) and then route A is processed, nexthop resolution will fail and the route A will be left in limbo as uninstallable. Let's modify the nexthop resolution code in zebra such that if a nexthop's most specific match is unusable, continue looking up the table till we get to the 0.0.0.0/0 route( if it's even installed ). If we find a usable route for the nexthop accept it and use it. The bgp_default_originate topology test is frequently failing with this exact problem: B>* 0.0.0.0/0 [200/0] via 192.168.1.1, r2-r1-eth0, weight 1, 00:00:21 B 1.0.1.17/32 [200/0] via 192.168.0.1 inactive, weight 1, 00:00:21 B>* 1.0.2.17/32 [200/0] via 192.168.1.1, r2-r1-eth0, weight 1, 00:00:21 C>* 1.0.3.17/32 is directly connected, lo, 00:02:00 B>* 1.0.5.17/32 [20/0] via 192.168.2.2, r2-r3-eth1, weight 1, 00:00:32 B>* 192.168.0.0/24 [200/0] via 192.168.1.1, r2-r1-eth0, weight 1, 00:00:21 B 192.168.1.0/24 [200/0] via 192.168.1.1 inactive, weight 1, 00:00:21 C>* 192.168.1.0/24 is directly connected, r2-r1-eth0, 00:02:00 C>* 192.168.2.0/24 is directly connected, r2-r3-eth1, 00:02:00 B>* 192.168.3.0/24 [20/0] via 192.168.2.2, r2-r3-eth1, weight 1, 00:00:32 B 198.51.1.1/32 [200/0] via 192.168.0.1 inactive, weight 1, 00:00:21 B>* 198.51.1.2/32 [20/0] via 192.168.2.2, r2-r3-eth1, weight 1, 00:00:32 Notice that the 1.0.1.17/32 route is inactive but the nexthop 192.168.0.1 is covered by both the 192.168.0.0/24 prefix( shortest match ) and the 0.0.0.0/0 route ( longest match ). When looking at the logs the 1.0.1.17/32 route was not being installed because the matching route was not in a usable state, which is because the 192.168.0.0/24 route was in the process of being installed. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-10-23 08:15:11 -04:00
Igor Ryzhov	7d67b9ff28	build: add -Wimplicit-fallthrough Also: - replace all /* fallthrough */ comments with portable fallthrough; pseudo keyword to accomodate both gcc and clang - add missing break; statements as required by older versions of gcc - cleanup some code to remove unnecessary fallthrough Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>	2023-10-12 21:23:18 +03:00
Donald Sharp	0dc12c9003	Revert "lib: register bgp link-state afi/safi" This reverts commit `1642a68d60`.	2023-10-10 16:45:57 -04:00
Chirag Shah	0ddda5cd96	Merge pull request #14515 from mjstapp/fix_nhg_intf_uninstall zebra: be more careful removing 'installed' flag from nhgs	2023-10-10 08:30:55 -07:00
Mark Stapp	0da89ac985	zebra: be more careful removing 'installed' flag from nhgs When interface addresses change, we examine nhgs associated with the interface in case they need to be reinstalled. As part of that, we may need to reinstall ecmp nhgs that use the interface being examined - but not always. Signed-off-by: Mark Stapp <mjs@labn.net>	2023-09-29 12:08:17 -04:00
Russ White	8e755a03a3	Merge pull request #12649 from louis-6wind/bgp-link-state bgpd: add basic support of BGP Link-State RFC7752	2023-09-26 10:07:02 -04:00
Dmytro Shytyi	f20cf1457d	bgpd,lib,sharpd,zebra: srv6 introduce multiple segs/SIDs in nexthop Append zebra and lib to use muliple SRv6 segs SIDs, and keep one seg SID for bgpd and sharpd. Note: bgpd and sharpd compilation relies on the lib and zebra files, i.e if we separate this: lib or zebra or bgpd or sharpd in different commits - this will not compile. Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>	2023-09-20 15:07:15 +02:00
Louis Scalbert	1642a68d60	lib: register bgp link-state afi/safi Register BGP Link-State AFI/SAFI values from RFC7752. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>	2023-09-18 14:22:51 +02:00
Rajasekar Raja	27ccfd9aa6	zebra: Fix zebra crash when replacing NHE during shutdown During replace of a NHE from upper proto in zebra_nhg_proto_add(), - rib_handle_nhg_replace() is invoked with old NHE where we walk all RNs/REs & replace the re->nhe whose address points to old NHE. - In this walk, if prev re->nhe refcnt is decremented to 0, we free up the memory which the old NHE is pointing to. Later in zebra_nhg_proto_add(), we end up accessing this freed memory and crash. Logs: 1380766 2023/08/16 22:34:11.994671 ZEBRA: [WDEB1-93HCZ] zebra_nhg_decrement_ref: nhe 0x56091d890840 (70312519[2756/2762/2810]) 2 => 1 1380773 2023/08/16 22:34:11.994678 ZEBRA: [WDEB1-93HCZ] zebra_nhg_decrement_ref: nhe 0x56091d890840 (70312519[2756/2762/2810]) 1 => 0 1380777 2023/08/16 22:34:11.994844 ZEBRA: [JE46R-G2NEE] zebra_nhg_release: nhe 0x56091d890840 (70312519[2756/2762/2810]) 1380778 2023/08/16 22:34:11.994849 ZEBRA: [SCDBM-4H062] zebra_nhg_free: nhe 0x56091d890840 (70312519[2756/2762/2810]), refcnt 0 1380782 2023/08/16 22:34:11.995000 ZEBRA: [SCDBM-4H062] zebra_nhg_free: nhe 0x56091d890840 (0[]), refcnt 0 1380783 2023/08/16 22:34:11.995011 ZEBRA: lib/memory.c:84: mt_count_free(): assertion (mt->n_alloc) failed Backtrace: 0 0x00007f833f5f48eb in raise () from /lib/x86_64-linux-gnu/libc.so.6 1 0x00007f833f5df535 in abort () from /lib/x86_64-linux-gnu/libc.so.6 2 0x00007f833f636648 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 3 0x00007f833f63cd6a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 4 0x00007f833f63cfb4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 5 0x00007f833f63fbc8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 6 0x00007f833f64172a in malloc () from /lib/x86_64-linux-gnu/libc.so.6 7 0x00007f833f6c3fd2 in backtrace_symbols () from /lib/x86_64-linux-gnu/libc.so.6 8 0x00007f833f9013fc in zlog_backtrace_sigsafe (priority=priority@entry=2, program_counter=program_counter@entry=0x7f833f5f48eb <raise+267>) at lib/log.c:222 9 0x00007f833f901593 in zlog_signal (signo=signo@entry=6, action=action@entry=0x7f833f988ee8 "aborting...", siginfo_v=siginfo_v@entry=0x7ffee1ce4a30, program_counter=program_counter@entry=0x7f833f5f48eb <raise+267>) at lib/log.c:154 10 0x00007f833f92dbd1 in core_handler (signo=6, siginfo=0x7ffee1ce4a30, context=<optimized out>) at lib/sigevent.c:254 11 <signal handler called> 12 0x00007f833f5f48eb in raise () from /lib/x86_64-linux-gnu/libc.so.6 13 0x00007f833f5df535 in abort () from /lib/x86_64-linux-gnu/libc.so.6 14 0x00007f833f958f96 in _zlog_assert_failed (xref=xref@entry=0x7f833f9e4080 <_xref.10705>, extra=extra@entry=0x0) at lib/zlog.c:680 15 0x00007f833f905400 in mt_count_free (mt=0x7f833fa02800 <MTYPE_NH_LABEL>, ptr=0x51) at lib/memory.c:84 16 mt_count_free (ptr=0x51, mt=0x7f833fa02800 <MTYPE_NH_LABEL>) at lib/memory.c:80 17 qfree (mt=0x7f833fa02800 <MTYPE_NH_LABEL>, ptr=0x51) at lib/memory.c:140 18 0x00007f833f90799c in nexthop_del_labels (nexthop=nexthop@entry=0x56091d776640) at lib/nexthop.c:563 19 0x00007f833f907b91 in nexthop_free (nexthop=0x56091d776640) at lib/nexthop.c:393 20 0x00007f833f907be8 in nexthops_free (nexthop=<optimized out>) at lib/nexthop.c:408 21 0x000056091c21aa76 in zebra_nhg_free_members (nhe=0x56091d890840) at zebra/zebra_nhg.c:1628 22 zebra_nhg_free (nhe=0x56091d890840) at zebra/zebra_nhg.c:1628 23 0x000056091c21bab2 in zebra_nhg_proto_add (id=<optimized out>, type=9, instance=<optimized out>, session=0, nhg=nhg@entry=0x56091d7da028, afi=afi@entry=AFI_UNSPEC) at zebra/zebra_nhg.c:3532 24 0x000056091c22bc4e in process_subq_nhg (lnode=0x56091d88c540) at zebra/zebra_rib.c:2689 25 process_subq (qindex=META_QUEUE_NHG, subq=0x56091d24cea0) at zebra/zebra_rib.c:3290 26 meta_queue_process (dummy=<optimized out>, data=0x56091d24d4c0) at zebra/zebra_rib.c:3343 27 0x00007f833f9492c8 in work_queue_run (thread=0x7ffee1ce55a0) at lib/workqueue.c:285 28 0x00007f833f93f60d in thread_call (thread=thread@entry=0x7ffee1ce55a0) at lib/thread.c:2008 29 0x00007f833f8f9888 in frr_run (master=0x56091d068660) at lib/libfrr.c:1223 30 0x000056091c1b8366 in main (argc=12, argv=0x7ffee1ce5988) at zebra/main.c:551 Issue: 3492162 Ticket# 3492162 Signed-off-by: Chirag Shah <chirag@nvidia.com> Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>	2023-08-31 05:40:34 +00:00
Donald Sharp	d381190a55	zebra: Remove tag from zebra_rmap_obj The tag value in all cases was being set to the re->tag. re is already stored, so let's just use that. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-11 11:21:03 -04:00
Donald Sharp	b7542d5af8	zebra: Remove instance from zebra_rmap_obj data structure In all cases the instance is derived from the re pointer and since the re pointer is already stored, let's just remove it from the game and cut to the chase. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-11 11:15:06 -04:00
Donald Sharp	cad4d0c332	zebra: Replace source_protocol with just using re in route map object Replace the source_protocol with just saving a pointer to the re in the `struct zebra_rmap_obj` data structure. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-11 11:11:40 -04:00

1 2 3 4 5 ...

350 Commits