mirror_frr

mirror of https://git.proxmox.com/git/mirror_frr synced 2025-12-03 20:03:23 +00:00

Author	SHA1	Message	Date
Mark Stapp	d26e2d9be4	Merge pull request #5600 from sworleys/NHG-Depend-Crash zebra: can't improve efficiency for recursive depends	2020-01-15 16:31:55 -05:00
Mark Stapp	a67b69c024	Merge pull request #5616 from sworleys/NHG-Fix-Recurse-to-Group zebra: just set nexthop member in handle_recursive_depend()	2020-01-15 16:26:06 -05:00
Stephen Worley	1d049aba72	zebra: just set nexthop member in handle_recursive_depend() With recent changes to the lib nexthop_group APIs (`e1f3a8eb19`), we are making new assumptions that this should be adding a single nexthop to a group, not a list of nexthops. This broke the case of a recursive nexthop resolving to a group: ``` D> 2.2.2.1/32 [150/0] via 1.1.1.1 (recursive), 00:00:09 * via 1.1.1.1, dummy1 onlink, 00:00:09 via 1.1.1.2 (recursive), 00:00:09 * via 1.1.1.2, dummy2 onlink, 00:00:09 D> 3.3.3.1/32 [150/0] via 2.2.2.1 (recursive), 00:00:04 * via 1.1.1.1, dummy1 onlink, 00:00:04 K * 10.0.0.0/8 [0/1] via 172.27.227.148, tun0, 00:00:21 ``` This group can instead just directly point to the nh that was passed. It's only being used for a lookup (the memory gets copied and used elsewhere if the nexthop is not found). Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2020-01-15 13:35:29 -05:00
Stephen Worley	77bf9504bf	lib,zebra: tighten up the nexthop_copy/nexthop_dup APIs Make the nexthop_copy/nexthop_dup APIs more consistent by adding a secondary, non-recursive, version of them. Before, it was inconsistent whether the APIs were expected to copy recursive info or not. Make it clear now that the default is recursive info is copied unless the _no_recurse() version is called. These APIs are not heavily used so it is fine to change them for now. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2020-01-15 13:35:04 -05:00
Stephen Worley	0fff714efa	zebra: can't improve efficiency for recursive depends `cb86eba3ab` was causing zebra to crash when handling a nexthop group that had a nexthop which was recursively resolved. Steps to recreate: ! nexthop-group red nexthop 1.1.1.1 nexthop 1.1.1.2 ! sharp install routes 8.8.8.1 nexthop-group red 1 ========================================= ==11898== Invalid write of size 8 ==11898== at 0x48E53B4: _nexthop_add_sorted (nexthop_group.c:254) ==11898== by 0x48E5336: nexthop_group_add_sorted (nexthop_group.c:296) ==11898== by 0x453593: handle_recursive_depend (zebra_nhg.c:481) ==11898== by 0x451CA8: zebra_nhg_find (zebra_nhg.c:572) ==11898== by 0x4530FB: zebra_nhg_find_nexthop (zebra_nhg.c:597) ==11898== by 0x4536B4: depends_find (zebra_nhg.c:1065) ==11898== by 0x453526: depends_find_add (zebra_nhg.c:1087) ==11898== by 0x451C4D: zebra_nhg_find (zebra_nhg.c:567) ==11898== by 0x4519DE: zebra_nhg_rib_find (zebra_nhg.c:1126) ==11898== by 0x452268: nexthop_active_update (zebra_nhg.c:1729) ==11898== by 0x461517: rib_process (zebra_rib.c:1049) ==11898== by 0x4610C8: process_subq_route (zebra_rib.c:1967) ==11898== Address 0x0 is not stack'd, malloc'd or (recently) free'd Zebra crashes because we weren't handling the case of the depend nexthop being recursive. For this case, we cannot make the function more efficient. A nexthop could resolve to a group of any size, thus we need allocs/frees. To solve this and retain the goal of the original patch, we separate out the two cases so it will still be more efficient if the nexthop is not recursive. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2020-01-15 13:35:04 -05:00
Donald Sharp	946de1b95a	bgpd, ospfd, zebra: Do not use 0 as VRF_DEFAULT Explicitly spell out what we are trying to do. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>	2020-01-15 08:29:36 -05:00
Mark Stapp	cb86eba3ab	zebra: improve efficiency of depends_find() Do less malloc and free in depends_find(), when looking for a singleton nexthop in the nhg hash. Signed-off-by: Mark Stapp <mjs@voltanet.io>	2019-12-18 15:34:37 -05:00
Stephen Worley	b10d6b0744	zebra: pass type when finding individual nexthop When we are doing a lookup on an individual nexthop, we should still be passing along the type that gets passed via the arguments. Otherwise, we will always think we own that NHE when in reality anyone could have put that into the kernel. Before this patch, nexthops in the kernel will get swept out even if we didn't create them. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-12-16 16:46:30 -05:00
Donald Sharp	df7fb5800b	lib, zebra: Allow for installation of a weighted nexthop Linux has the idea of allowing a weight to be sent down as part of a nexthop group to allow the kernel to weight particular nexthop paths a bit more or less than others. See: http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.rpdb.multiple-links.html Allow for installation into the kernel using the weight attribute associated with the nexthop. This code is foundational in that it just sets up the ability to do this, we do not use it yet. Further commits will allow for the pass through of this data from upper level protocols. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>	2019-12-09 13:37:37 -05:00
Donald Sharp	e302caaa81	Merge pull request #5416 from mjstapp/re_nhe_pointer lib,zebra: use shared nexthop-group in route_entry	2019-12-04 14:11:04 -05:00
Mark Stapp	0eb97b860d	lib,zebra: use nhg_hash_entry pointer in route_entry Replace the existing list of nexthops (via a nexthop_group struct) in the route_entry with a direct pointer to zebra's new shared group (from zebra_nhg.h). This allows more direct access to that shared group and the info it carries. Signed-off-by: Mark Stapp <mjs@voltanet.io>	2019-12-04 08:13:52 -05:00
Donatas Abraitis	d79368d3a5	Merge pull request #5192 from donaldsharp/zebra_rejection zebra: Dissallow a /32 or /128 through itself	2019-12-03 09:29:50 +02:00
Stephen Worley	4c55b5ff6b	zebra: Set resolved inactive when > multipath_num Apparently the multipath_num functionality has been broken for a while because we were ignoring the recursive nexthops when marking them inactive based on it. This sets them as inactive as well if the parent breaks it. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-11-21 16:28:31 -05:00
Stephen Worley	08de78b876	zebra: Use curr_active to check multipath_num We were re-counting the entire group's active number on every iteration of this nexthop_active_update() loop. This is not great from a performance perspective but also it was failing to properly mark things according to the specified multipath_num. Since a nexthop is set as active before this check, if it's == to the set ecmp, it gets marked inactive even though it's under the max ecmp wanted! ex) set ecmp to 1. `/usr/lib/frr/zebra -e 1` All kernel routes will be marked inactive even with just one nexthop! K 1.1.1.1/32 [0/0] is directly connected, dummy1 inactive, 00:00:10 K 1.1.1.2/32 [0/0] is directly connected, dummy2 inactive, 00:00:10 K 1.1.1.3/32 [0/0] is directly connected, dummy3 inactive, 00:00:10 K 1.1.1.4/32 [0/0] is directly connected, dummy4 inactive, 00:00:10 K 1.1.1.5/32 [0/0] is directly connected, dummy5 inactive, 00:00:10 K 1.1.1.6/32 [0/0] is directly connected, dummy6 inactive, 00:00:10 K 1.1.1.7/32 [0/0] is directly connected, dummy7 inactive, 00:00:10 K 1.1.1.8/32 [0/0] is directly connected, dummy8 inactive, 00:00:10 K 1.1.1.9/32 [0/0] is directly connected, dummy9 inactive, 00:00:10 Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-11-21 15:27:12 -05:00
Mark Stapp	5463ce26c3	zebra: clean up rib and nhg headers Clean up the relationships between zebra's rib and nexthop-group headers as prep for adding a nexthop-group pointer to the route_entry. Signed-off-by: Mark Stapp <mjs@voltanet.io>	2019-11-21 15:05:52 -05:00
Russ White	943de56af6	Merge pull request #5241 from sworleys/SA-NHG One More Zebra NHG SA Fix and nhg_ctx API Adjustment	2019-11-19 11:44:15 -05:00
Stephen Worley	7c6d5f255e	zebra: Put freeing code in nhg_ctx_free() Put the code to free the data held by a nhg_ctx in nhg_ctx_free() as well. We do it similarly for the dplane_ctx. Let nhg_ctx_fini() be any other routines that need to be handled before freeing. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-11-12 10:29:16 -05:00
Stephen Worley	606fa9e58d	zebra: handle depends_find() NULL nexthop SA warned us lookup could be NULL dereferenced in some paths. Handle the case where we are passed a NULL nexthop before we try to copy it. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-11-12 10:28:46 -05:00
Stephen Worley	148813c22a	zebra: zebra_nhg check each nexthop for active, not just number We were only checking that two nhg_hash_entry's were equal based on the active nexthop NUMBER. This is not sufficient in special cases where what's active with one route using it, might not be active with the other. We can see this with routes trying to resolve to themselves. Ex) 1.1.1.0/24 -> 1.1.1.1 dummy1 (inactive) -> 1.1.1.2 dummy2 1.1.2.0/24 -> 1.1.1.1 dummy1 -> 1.1.1.2 dummy1 (inactive) Without checking each nexthop individually, they will hash to the same group since they have the same number of active nexthops. Fix this by looping over every nexthop for each nhe (they should be sorted) and checking if the NEXTHOP_FLAG_ACTIVE flags match. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-11-12 01:24:39 -05:00
Donald Sharp	7134ba7060	zebra: Fix some nhg SA issues found in latest Coverity Fix 2 Coverity issues: 1) zebra_nhg.c -> all paths in nhg_ctx_process_finish have already deref'ed the ctx pointer no need for a test of it 2) the **ifp pointer passed in may be NULL. Prevent an accidental deref if calling function does not pass in a ifp pointer. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>	2019-10-28 20:30:06 -04:00
Stephen Worley	5948f013ba	zebra: Cleanup zebra_nhg APIs Add a private header file for functions that are internal/special case like how we do it for `lib/nexthop_group_private.h`. Remove a bunch of functions from the header file only being used statically and add some comments for those remaining to indicate better what their use is. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:44 -04:00
Stephen Worley	80286aa564	zebra: Re-work zebra_nhg_*_valid APIs Re-work the validity setting and checking APIs for nhg_hash_entry's to make them clearer. Further, they were originally only being set on ifdown and install. Extended their use into releasing entries and to account for setting the validity of a recursive dependent. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:44 -04:00
Stephen Worley	e1292378e2	zebra: Improve commenting for group requeue case The commenting for why we would need to requeue a group from the kernel to be later processed was not sufficient. Add a better explanation for the flow and state of the system. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:44 -04:00
Stephen Worley	c1da832a94	zebra: Change wording of duplicate kernel nhg flag Change the wording of the flag indicating we have received a nexthop group from the kernel with a different ID but is fundamentally identical to one we already have. It was colliding with a flag of similar name in the nexthop struct. Change it from NEXTHOP_GROUP_DUPLICATE -> NEXTHOP_GROUP_UNHASHABLE since it is in fact unhashable. Also change the wording of functions and comments referencing the same problem. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:44 -04:00
Stephen Worley	0b4dadb385	zebra: Check depends for validity, not dependents When determining whether to set the nhg_hash_entry as invalid, we should have been checking the depends, not the dependents. If it's a group and at least one of its depends is valid, the group is still valid. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:44 -04:00
Stephen Worley	177e711dfc	zebra: Adjust nhg handling for dataplane result off on shutdown Now with this patch we can't use shutdown for cleanup: ``` commit `2fc69f03d2` (pr_5079) Author: Mark Stapp <mjs@voltanet.io> Date: Fri Sep 27 12:15:34 2019 -0400 zebra: during shutdown processing, drop dplane results Don't process dataplane results in zebra during shutdown (after sigint has been seen). The dplane continues to run in order to clean up, but zebra main just drops results. Signed-off-by: Mark Stapp <mjs@voltanet.io> ``` Adjusted nhg uninstall handling to clear data and other cleanup before sending to the dataplane. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:44 -04:00
Stephen Worley	724583edad	zebra: Set the nhe type in the appropriate place We were setting the nhe type on uninstall when it should be on the install. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:44 -04:00
Stephen Worley	fefa080e3c	zebra: Remove cleanup and nhg workqueue boilerplate This code was from a strategy we elected not to use and can safely be removed. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:43 -04:00
Stephen Worley	d3a3513811	lib,pbrd,zebra: Use one api to delete nexthops/group Reduce the api for deleting nexthops and the containing group to just one call rather than having a special case and handling it separately. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:43 -04:00
Stephen Worley	40a2a6cdd3	zebra: Add DPLANE_NEIGH and DPLANE_VTEP to nhg cases Add DPLANE_OP_NEIGH and DPLANE_OP_VTEP to nhg dplane handler's switch statements. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:43 -04:00
Stephen Worley	fec211ad95	zebra: Zebra nexthop group re-work checkpatch fixes Checkpatch fixes for the zebra nexthop group re-work. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:43 -04:00
Stephen Worley	e9f6516243	zebra: Fix NULL check in zebra_nhg_rib_find() Check both the nhg and nexthop are not NULL before passing them to be hashed. Clang SA caught this. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:43 -04:00
Stephen Worley	4d21c7c086	zebra: Only use passed afi for blackhole/ifindex nexthops Only use the afi passed into `zebra_nhg_find()` for nexthops that are blackhole/ifindex. Others should use the type actually declared in the nexthop struct itself. Basically, nexthop objects of type blackhole/ifindex in the kernel must have an address family, they cannot be ambiguous and be shared. This is some requirement in the linux ip core code. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:43 -04:00
Stephen Worley	1b366e63be	zebra: Handle out of order kernel nexthop groups Add a mechanism to requeue groups we receive from the kernel if the IDs are in a weird order (Group ID is lower than individual nexthop IDs for example). Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:42 -04:00
Stephen Worley	3e347f4181	zebra: Free labels on nhg_ctx from kernel If we get a nexthop group from the kernel with labels and queue it as a context to process later, we have to free the label stack we allocated. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:42 -04:00
Stephen Worley	10200d4054	zebra: Add some getters for nhg_ctx Add some getters for the nhg_ctx struct. Probably unnecessary at this point since they are all static but if they ever become public it will be nice to have them. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:42 -04:00
Stephen Worley	62991a1167	zebra: NHE hash reduce calls to jhash Reduce the two calls to jhash to one jhash_3words() call to save some more hashing time. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:42 -04:00
Stephen Worley	2001be6cc0	zebra: NHE use nexthop_group_equal_no_recurse() Update nhg_hash_entry to use the non-recursive version of nexthop_group_equal() since it doesn't really need to compare all of those. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:42 -04:00
Stephen Worley	9ef49038d5	lib,zebra: Move nexthop dup marking into creation We were waiting until install time to mark nexthops as duplicate. Since they are immutable now and re-used, move this marking into when they are actually created to save a bunch of cycles. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:42 -04:00
Stephen Worley	e4ac313b12	zebra: Check active count first in nhg_hash_equal Before checking the equivalence of the whole group itself, check to see if they contain the same number of non-recursive active nexthops. This should shorten lookup time for the case of non-resolved nexthop group creation. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:42 -04:00
Stephen Worley	6384cbcb0e	zebra: Create depends after initial lookup Create any depends only after the initial hash lookup fails. Should reduce hashing cpu cycles significantly. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:42 -04:00
Stephen Worley	815059466c	zebra: Move the supports_nh bool to a better place Move the supports_nh bool indicating whether the kernel we are using supports nexthop objects into the netlink kernel interface itself. Since only linux and netlink support nexthop object APIs for now this is fine. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:41 -04:00
Stephen Worley	9a1588c4ce	zebra: Add handling for kernel del/update nexthop Add handling for delete/update nexthop object messages from the kernel. If someone deletes a nexthop object we are still using, send it back down. If someone updates a nexthop we are using, replace that nexthop with ours. Routes are referencing this nexthop object ID and we resolved it ourselves, so we should force the other `someone` to submit to our will. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:41 -04:00
Stephen Worley	38e40db1c9	zebra: Sweep our nexthop objects out on restart On restart, if we failed to remove any nexthop objects due to a kill -9 or such event, sweep them if we aren't using them. Add a proto field to handle this and remove the is_kernel bool. Add a duplicate flag that indicates this nexthop group is only present in our ID hashtable. It is a duplicate nexthop we received from the kernel, therefore we cannot hash on it. Make the idcounter globally accessible so that kernel updates increment it as soon as we receive them, not when we handle them. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:41 -04:00
Stephen Worley	428b4c0a5d	zebra: Give installed nhe's the zebra proto Give all nhg_hash_entrys we install into the kernel as nexthop objects a defined proto matching the zebra rib table one. This makes sense since nhe's are proto-independent and determined exclusively in zebra. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:41 -04:00
Stephen Worley	8dbc800f42	zebra: Prevent duplication and overflow in nhe2grp The kernel does not allow duplicate IDs in the same group, but we are perfectly fine with it internally if two different nexthops resolve to the same nexthop (default route for instance). So, we have to handle this when we get ready to install. Further, pass the max group size in the arguments to ensure we don't overflow. Don't actually think this is possible due to multipath checking in nexthop_active_update() but better to be safe. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:41 -04:00
Stephen Worley	32e757f4ae	zebra: Mark nhe valid if installed If the nhe was successfully installed, make sure it's marked as valid. Not fully sure how/where the valid flag is going to be used yet. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:41 -04:00
Stephen Worley	904ba1c8ee	zebra: A group isn't recursive if one depend is We were setting a group to be recursive if its first depend was. This is not the case; individual depends of the group might be recursive but the group itself is not. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:41 -04:00
Stephen Worley	f429bd1b24	zebra: Move resolve/add depend install into api Move the resolving and installing of a single nhg_hash_entry into the install function itself, rather than letting zebra_rib handle it. Further, ensure depends are installed/queued before installing a group. The ordering should be fine here since only one thread will call this API. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:41 -04:00
Stephen Worley	8dfbc65724	zebra: Install the nhe along with the route Move the installation of an nhe out of nexthop_active_update() and into the rib install path. So, only install the nhe when a route using it is being installed. Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>	2019-10-25 11:13:41 -04:00
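
Commit `148813c22a` above describes replacing an active-count comparison with a per-nexthop check of NEXTHOP_FLAG_ACTIVE. A minimal sketch of that idea follows. It is not the zebra code: `struct sketch_nexthop`, both comparison functions, and the sample arrays are hypothetical names made up for illustration, and a plain `bool` stands in for the flag. It assumes both groups are already sorted the same way, as the commit message says they should be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a nexthop; 'active' models NEXTHOP_FLAG_ACTIVE. */
struct sketch_nexthop {
	const char *gate; /* gateway address, text form, illustration only */
	bool active;
};

/* Two groups with the same members but different active nexthops. */
static const struct sketch_nexthop route_a[] = {
	{"1.1.1.1", false},
	{"1.1.1.2", true},
};
static const struct sketch_nexthop route_b[] = {
	{"1.1.1.1", true},
	{"1.1.1.2", false},
};

static size_t active_count(const struct sketch_nexthop *g, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (g[i].active)
			count++;
	return count;
}

/* Count-only comparison: cannot tell route_a and route_b apart. */
static bool equal_by_active_count(const struct sketch_nexthop *a, size_t na,
				  const struct sketch_nexthop *b, size_t nb)
{
	assert(a && b);
	return active_count(a, na) == active_count(b, nb);
}

/* Per-nexthop comparison: walk both sorted groups and compare each flag. */
static bool equal_by_active_flags(const struct sketch_nexthop *a, size_t na,
				  const struct sketch_nexthop *b, size_t nb)
{
	assert(a && b);
	if (na != nb)
		return false;
	for (size_t i = 0; i < na; i++)
		if (a[i].active != b[i].active)
			return false;
	return true;
}
```

With the sample data, the count-only comparison treats `route_a` and `route_b` as equal because each has one active nexthop, while the per-flag comparison distinguishes them.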

134 Commits