From ec6a000b0b19ac7888a9793676741af32ec38cde Mon Sep 17 00:00:00 2001
From: Donald Sharp
Date: Tue, 14 Jan 2025 16:23:40 -0500
Subject: [PATCH] zebra: On Nexthop install failure don't set Installation failed

Currently, when FRR installs a nexthop group, the installation can
fail. The code assumed that the nexthop group was not already
installed. This leaves a problem state: if the users of the nexthop
group are removed, the nexthop group is removed from zebra, possibly
leaving an orphaned nexthop group in the data plane.

On a nexthop group installation failure, FRR does not actually know
the status of the nexthop group in the kernel. It's possible that an
earlier version of the nexthop group is left in play. It's possible
that there is no nexthop group in the kernel at all.

Leaving the Installed flag alone allows zebra to remove the nexthop
group from the kernel when it is removed from zebra.

Signed-off-by: Donald Sharp
---
 zebra/zebra_nhg.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/zebra/zebra_nhg.c b/zebra/zebra_nhg.c
index 095673399a..712b2534cc 100644
--- a/zebra/zebra_nhg.c
+++ b/zebra/zebra_nhg.c
@@ -3454,7 +3454,13 @@ void zebra_nhg_dplane_result(struct zebra_dplane_ctx *ctx)
 						 ZAPI_NHG_INSTALLED);
 			break;
 		case ZEBRA_DPLANE_REQUEST_FAILURE:
-			UNSET_FLAG(nhe->flags, NEXTHOP_GROUP_INSTALLED);
+			/*
+			 * With a request failure it is unknown what we now know
+			 * this is because Zebra has lost track of whether or not
+			 * any previous versions of this NHG are in the kernel
+			 * or even what those versions were.  So at this point
+			 * we cannot unset the INSTALLED flag.
+			 */
 			/* If daemon nhg, send it an update */
 			if (PROTO_OWNED(nhe))
 				zsend_nhg_notify(nhe->type, nhe->zapi_instance,