Commit Graph

38033 Commits

Author SHA1 Message Date
Donald Sharp
d9712eb382 bgpd: remove dmed check not required in bestpath selection
As part of the upstream master commit (f3575f61c7 bgpd: Sort the
bgp_path_inf) the snippet of the code for dmed check condition
left out, which leads to an issue of selecting incorrect bestpath.

As an example:

During the bestpath selection local route looses to another path due
to dmed condition being hit.

The snippet of the logs:

2025/02/20 03:06:20.131441 BGP: [JW7VP-K1YVV]
[2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path
27.0.0.7 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131445 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.7 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131452 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.8 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131456 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.8 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131458 BGP: [WEWEC-8SE72] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): path Static announcement is the bestpath from AS 0   <<<< static is best
2025/02/20 03:06:20.131463 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.7 dmed
2025/02/20 03:06:20.131467 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.8 dmed
2025/02/20 03:06:20.131471 BGP: [N6CTF-2RSKS] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): After path selection, newbest is path 27.0.0.7 oldbest was Static announce

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 83ad94694b)
2025-02-21 23:12:20 +00:00
Donald Sharp
24dbcbb31e
Merge pull request #18203 from FRRouting/mergify/bp/dev/10.3/pr-14227
pimd: Fix for data packet loss when FHR is LHR and RP (backport #14227)
2025-02-20 16:20:09 -05:00
Rajesh Varatharaj
02de49a3b2 pimd: Fix for data packet loss when FHR is LHR and RP
Topology:
A single router is acting as the First Hop Router (FHR), Last Hop Router (LHR), and RP.

RC and Issue:
When an upstream S,G is in join state, it sends a register message to the RP.
If the RP has the receiver, it sends a register stop message and switches to the shortest path.
When the register stop message is processed, it removes pimreg, moves to prune,
and starts the reg stop timer.

When the reg stop timer expires, PIM changes S,G state to Join Pending and sends out a NULL
register message to RP. RP receives it and fails to send Reg stop because SPT is not set at that point.

The problem is when the register stop timer pops and state is in Join Pending.
According to https://www.rfc-editor.org/rfc/rfc4601#section-4.4.1,
we need to put back the pimreg reg tunnel into the S,G mroute.
This causes data to be sent to the control plane and subsequently interrupts the line rate.

Fix:
If the router is FHR and RP to the group,
ignore SPT status and send out a register stop message back to the DR (in this context, the same router).

Ticket: #3506780

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit 8280257cc9)
2025-02-20 16:57:15 +00:00
Donald Sharp
211df1f649
Merge pull request #18162 from louis-6wind/bgp-hidden-10.3
bgpd: fix default instance when leaving the hidden state (backport 10.3)
2025-02-19 08:08:58 -05:00
Jafar Al-Gharaibeh
9a729a2fa3
Merge pull request #18191 from FRRouting/mergify/bp/dev/10.3/pr-18082
lib: nb: call child destroy CBs when YANG container is deleted (backport #18082)
2025-02-17 23:14:16 -06:00
Christian Hopps
3c08c6fe28 lib: nb: call child destroy CBs when YANG container is deleted
Previously the code was only calling the child destroy callbacks if the target
deleted node was a non-presence container. We now add a flag to the callback
structure to instruct northbound to perform the rescursive delete for code that
wishes for this to happen.

- Fix wrong relative path lookup in keychain destroy callback

Signed-off-by: Christian Hopps <chopps@labn.net>
(cherry picked from commit d03ecf4562)
2025-02-18 02:37:45 +00:00
Jafar Al-Gharaibeh
33ed0a3315
Merge pull request #18143 from FRRouting/mergify/bp/dev/10.3/pr-18079
bgpd: Fix crash in bgp_labelpool (backport #18079)
2025-02-17 13:05:21 -06:00
Donald Sharp
bb22e8b24b
Merge pull request #18166 from FRRouting/mergify/bp/dev/10.3/pr-18160
bgpd: When removing the prefix list drop the pointer (backport #18160)
2025-02-16 15:50:34 -05:00
Donald Sharp
f432c7b5b5
Merge pull request #18179 from FRRouting/mergify/bp/dev/10.3/pr-18178
isisd: Request SRv6 locator after zebra connection (backport #18178)
2025-02-16 15:50:08 -05:00
Donald Sharp
ea9fc3e7f8
Merge pull request #18183 from FRRouting/mergify/bp/dev/10.3/pr-18109
bgpd: fix vty output of evpn route-target AS4 (backport #18109)
2025-02-16 08:10:09 -05:00
Mark Stapp
ec023e084f bgpd: fix vty output of evpn route-target AS4
evpn route-targets are decoded in  ... multiple places; at least
two have a bug where the AS4 form doesn't have its AS decoded.

Signed-off-by: Mark Stapp <mjs@cisco.com>
(cherry picked from commit 9943a08720)
2025-02-15 20:11:56 +00:00
Carmine Scarpitta
c4ef0e37ef isisd: Request SRv6 locator after zebra connection
When SRv6 is enabled and an SRv6 locator is specified in the IS-IS
configuration, IS-IS may attempt to request SRv6 locator information from
zebra before the connection is fully established. If this occurs, the
request fails with the following error:

```
2025/02/14 21:41:20 ISIS: [HR66R-TWQYD][EC 100663302] srv6_manager_get_locator: invalid zclient socket
````

As a result, IS-IS is unable to obtain the locator information,
preventing SRv6 from working.

This commit fixes the issue by ensuring IS-IS requests SRv6 locator
information once the connection with zebra is successfully established.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit f02dba19d2)
2025-02-15 14:40:39 +00:00
Carmine Scarpitta
5de0b8dc9d isisd: Add helper function to request SRv6 locator information
This commit adds a function that iterates over all IS-IS areas and asks
the SRv6 Manager for information about the configured locators.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit 0b76fb3c13)
2025-02-15 14:40:39 +00:00
Donald Sharp
72594acc31 bgpd: When removing the prefix list drop the pointer
We are very very rarely seeing this crash:

    0 0x7f36ba48e389 in prefix_list_apply_ext lib/plist.c:789
    1 0x55eff3fa4126 in subgroup_announce_check bgpd/bgp_route.c:2334
    2 0x55eff3fa858e in subgroup_process_announce_selected bgpd/bgp_route.c:3440
    3 0x55eff4016488 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:808
    4 0x55eff401664e in subgroup_announce_route bgpd/bgp_updgrp_adv.c:861
    5 0x55eff40111df in peer_af_announce_route bgpd/bgp_updgrp.c:2223
    6 0x55eff3f884cb in bgp_announce_route_timer_expired bgpd/bgp_route.c:5892
    7 0x7f36ba4ec239 in event_call lib/event.c:2019
    8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    11 0x7f36b9e2d304 in __libc_start_main_impl ../csu/libc-start.c:360
    12 0x55eff3e64a30 in _start (/home/ci/cibuild.1407/frr-source/bgpd/.libs/bgpd+0x2fda30)
0x608000037038 is located 24 bytes inside of 88-byte region [0x608000037020,0x608000037078)
freed by thread T0 here:
    0 0x7f36ba8b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
    1 0x7f36ba439bd7 in qfree lib/memory.c:131
    2 0x7f36ba48d3a3 in prefix_list_free lib/plist.c:156
    3 0x7f36ba48d3a3 in prefix_list_delete lib/plist.c:247
    4 0x7f36ba48fbef in prefix_bgp_orf_remove_all lib/plist.c:1516
    5 0x55eff3f679c4 in bgp_route_refresh_receive bgpd/bgp_packet.c:2841
    6 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
    7 0x7f36ba4ec239 in event_call lib/event.c:2019
    8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
    0 0x7f36ba8b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    1 0x7f36ba4392e4 in qcalloc lib/memory.c:106
    2 0x7f36ba48d0de in prefix_list_new lib/plist.c:150
    3 0x7f36ba48d0de in prefix_list_insert lib/plist.c:186
    4 0x7f36ba48d0de in prefix_list_get lib/plist.c:204
    5 0x7f36ba48f9df in prefix_bgp_orf_set lib/plist.c:1479
    6 0x55eff3f67ba6 in bgp_route_refresh_receive bgpd/bgp_packet.c:2920
    7 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
    8 0x7f36ba4ec239 in event_call lib/event.c:2019
    9 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    10 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    11 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Let's just stop trying to save the pointer around in the peer->orf_plist
data structure.  There are other design problems but at least lets
stop the crash from possibly happening.

Fixes: #18138
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 3d43d7b789)
2025-02-14 21:29:01 +00:00
Louis Scalbert
85c5598bb9 tests: check as number in show run
Creates the default VRF instance after the other VRF instances. The
default VRF instance is created in hidden state. Check that AS number
in show run is correctly written.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-02-14 18:08:53 +01:00
Louis Scalbert
bb79a6562f tests: add bgp_l3vpn_hidden topotest
Test that leaving the hidden BGP instance state is working.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-02-14 18:08:53 +01:00
Alexander Skorichenko
8e04277fff bgpd: update AS value of a hidden bgp instance
'import vrf VRF' could define a hidden bgp instance with
the default AS_UNSPECIFIED (i.e. = 1) value.
When a
	router bgp AS vrf VRF
gets configured later on, replace this AS_UNSPECIFIED setting
with a requested value.

Fixes: 9680831518 ("bgpd: fix as_pretty mem leaks when un-hiding")
Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-02-14 18:08:53 +01:00
Louis Scalbert
d9d74d33bc Revert "bgpd: fix bgp vrf instance creation from implicit"
This reverts commit 2ff08af78e.

The fix is obviously wrong.

Link: 2ff08af78e
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-02-14 18:08:53 +01:00
Louis Scalbert
a2c1956efa bgpd: fix process_queue when un-hiding
bgp_process_queue_init() is not called in bgp_create() when leaving the
BGP instance hidden state because of the following goto:

>	if (hidden) {
>		bgp = bgp_old;
>		goto peer_init;
>	}

Upon reconfiguration of the default instance, the prefixes are never set
into a meta queue by mq_add_handler(). They are never processed for
zebra RIB installation and announcements of update/withdraw.

Do not delete the BGP process_queue when hiding.

Fixes: 4d0e7a49cf ("bgpd: VRF-Lite fix default bgp delete")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-02-14 18:08:53 +01:00
Louis Scalbert
20664936af bgpd: fix default instance name when un-hiding
When unconfiguring a default BGP instance with VPN SAFI configurations,
the default BGP structure remains but enters a hidden state. Upon
reconfiguration, the instance name incorrectly appears as "VIEW ?"
instead of "VRF default". And the name_pretty pointer

The name_pretty pointer is replaced by another one with the incorrect
name. This also leads to a memory leak as the previous pointer is not
properly freed.

Do not rewrite the instance name.

Fixes: 4d0e7a49cf ("bgpd: VRF-Lite fix default bgp delete")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-02-14 18:08:53 +01:00
Jafar Al-Gharaibeh
6cfea1aa87
Merge pull request #18146 from FRRouting/mergify/bp/dev/10.3/pr-18023
lib: fix false context information for SRv6 route (backport #18023)
2025-02-13 19:04:18 -06:00
Jafar Al-Gharaibeh
478864436d
Merge pull request #18151 from FRRouting/mergify/bp/dev/10.3/pr-18064
staticd: Fix SRv6 SID installation and deletion (backport #18064)
2025-02-13 19:03:50 -06:00
Donald Sharp
961a9528ec
Merge pull request #18154 from FRRouting/mergify/bp/dev/10.3/pr-18121
bgpd: release manual vpn label on instance deletion (backport #18121)
2025-02-13 17:44:20 -05:00
Louis Scalbert
6a72facbdc bgpd: release manual vpn label on instance deletion
When a BGP instance with a manually assigned VPN label is deleted, the
label is not released from the Zebra label registry. As a result,
reapplying a configuration with the same manual label leads to VPN
prefix export failures.

For example, with the following configuration:

> router bgp 65000 vrf BLUE
>  address-family ipv4 unicast
>   label vpn export <int>

Release zebra label registry on unconfiguration.

Fixes: d162d5f6f5 ("bgpd: fix hardset l3vpn label available in mpls pool")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit d6363625c3)
2025-02-13 19:09:09 +00:00
Carmine Scarpitta
92fa579d04 tests: Extend SRv6 static SIDs topotest to verify SID structure
The `static_srv6_sids` topotest verifies that staticd correctly
programs the SIDs in the zebra RIB. Currently, the topotest only
validates the programmed behavior and SID attributes.

This commit extends the topotest to also validate the SID structure.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit a6d02fe2fb)
2025-02-13 18:38:17 +00:00
Carmine Scarpitta
2a23344123 lib: Add sidStructure in SRv6 SIDs JSON output
The `show ipv6 route json` command displays the IPv6 routing table in
JSON format, including SRv6 SIDs. For each SRv6 SID, it provides
behavior and SID attributes. However, it does not include the SID
structure.

This commit adds the SID structure to the SRv6 SID JSON output.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit 312f7b3f8c)
2025-02-13 18:38:17 +00:00
Carmine Scarpitta
fce911649b staticd: Fix SRv6 SID installation and deletion
The SRv6 support in staticd (PR #16894) does not set the correct SID
parameters (block length, node length, function length).

This commit fixes the issue and computes the correct parameters.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit e1654ba554)
2025-02-13 18:38:16 +00:00
Philippe Guibert
1f0845ad29 lib: fix false context information for SRv6 route
The seg6local route dumped by 'show ipv6 route' makes think that the USP
flavor is supported, whereas it is not the case. This information is a
context information, and for End, the context information should be
empty.

> # show ipv6 route
> [..]
> I>* fc00:0:4::/128 [115/0] is directly connected, sr0, seg6local End USP, weight 1, 00:49:01

Fix this by suppressing the USP information from the output.

Fixes: e496b42030 ("bgpd: prefix-sid srv6 l3vpn service tlv")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 658bf0281d)
2025-02-13 17:58:27 +00:00
Donald Sharp
ba721ac7f7 bgpd: Fix crash in bgp_labelpool
The bgp labelpool code is grabbing the vpn policy data structure.
This vpn_policy has a pointer to the bgp data structure.  If
a item placed on the bgp label pool workqueue happens to sit
there for the microsecond or so and the operator issues a
`no router bgp...` command that corresponds to the vpn_policy
bgp pointer, when the workqueue is run it will crash because
the bgp pointer is now freed and something else owns it.

Modify the labelpool code to store the vrf id associated
with the request on the workqueue.  When you wake up
if the vrf id still has a bgp pointer allow the request
to continue, else drop it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 14eac319e8)
2025-02-13 17:45:10 +00:00
Donald Sharp
dcebe85aa6
Merge pull request #18129 from FRRouting/mergify/bp/dev/10.3/pr-18026
Bfd fixups (backport #18026)
2025-02-13 11:19:28 -05:00
Jafar Al-Gharaibeh
e9e93e91fc
Merge pull request #18132 from opensourcerouting/fix/backport_82d28f137aed2e60380807a302e2b312408eff6e_10.3
Cid 1636504 (backport)
2025-02-12 22:26:21 -06:00
Jafar Al-Gharaibeh
98b774ada1
Merge pull request #18133 from FRRouting/mergify/bp/dev/10.3/pr-18120
bgpd: fix incorrect JSON in bgp_show_table_rd (backport #18120)
2025-02-12 22:25:55 -06:00
Louis Scalbert
83b859bf46 bgpd: fix incorrect json in bgp_show_table_rd
In bgp_show_table_rd(), the is_last argument is determined using the
expression "next == NULL" to check if the RD table is the last one. This
helps ensure proper JSON formatting.

However, if next is not NULL but is no longer associated with a BGP
table, the JSON output becomes malformed.

Updates the condition to also verify the existence of the next bgp_dest
table.

Fixes: 1ae44dfcba ("bgpd: unify 'show bgp' with RD with normal unicast bgp show")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit cf0269649c)
2025-02-12 20:10:58 +00:00
Philippe Guibert
867d79f5ac bgpd: fix bgp label evpn CID 1636504
The following static analysis can be seen :

> *** CID 1636504:    (ARRAY_VS_SINGLETON)
> /bgpd/bgp_evpn_mh.c: 1241 in bgp_evpn_type1_route_process()
> 1235            build_evpn_type1_prefix(&p, eth_tag, &esi, vtep_ip);
> 1236            /* Process the route. */
> 1237            if (attr) {
> 1238                    bgp_update(peer, (struct prefix *)&p, addpath_id, attr, afi, safi, ZEBRA_ROUTE_BGP,
> 1239                               BGP_ROUTE_NORMAL, &prd, &label, num_labels, 0, NULL);
> 1240            } else {
> >>>     CID 1636504:    (ARRAY_VS_SINGLETON)
> >>>     Passing "&label" to function "bgp_withdraw" which uses it as an array. This might corrupt or misinterpret adjacent memory locations.
> 1241                    bgp_withdraw(peer, (struct prefix *)&p, addpath_id, afi, safi, ZEBRA_ROUTE_BGP,
> 1242                                 BGP_ROUTE_NORMAL, &prd, &label, num_labels);
> 1243            }
> 1244            return 0;
> 1245     }
> 1246
> /bgpd/bgp_evpn_mh.c: 1238 in bgp_evpn_type1_route_process()
> 1232             * table
> 1233             */
> 1234            vtep_ip.s_addr = INADDR_ANY;
> 1235            build_evpn_type1_prefix(&p, eth_tag, &esi, vtep_ip);
> 1236            /* Process the route. */
> 1237            if (attr) {
> >>>     CID 1636504:    (ARRAY_VS_SINGLETON)
> >>>     Passing "&label" to function "bgp_update" which uses it as an array. This might corrupt or misinterpret adjacent memory locations.
> 1238                    bgp_update(peer, (struct prefix *)&p, addpath_id, attr, afi, safi, ZEBRA_ROUTE_BGP,
> 1239                               BGP_ROUTE_NORMAL, &prd, &label, num_labels, 0, NULL);
> 1240            } else {
> 1241                    bgp_withdraw(peer, (struct prefix *)&p, addpath_id, afi, safi, ZEBRA_ROUTE_BGP,
> 1242                                 BGP_ROUTE_NORMAL, &prd, &label, num_labels);
> 1243            }

Fix this by declaring a label array instead of a single array.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-02-12 21:54:38 +02:00
Philippe Guibert
50fe9e6e29 bgpd: simplify bgp_evpn_process_rt1 with label
Remove the num_labels variable, the received bgp_update() and
bgp_withdraw() function will read the message as including one
label or vni value.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-02-12 21:54:24 +02:00
Donald Sharp
461f2c327d bfdd: Use pass by reference for bfd_key_delete
Coverity is pointing out that bfd_key_delete is
passing by value instead of reference for a very
large structure.  Double plus not good.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 8119e167b0)
2025-02-12 18:01:22 +00:00
Donald Sharp
c7eb30d5b5 bfdd: Use pass by reference instead of pass by value for a struct
The function bfd_key_lookup is currently sending by value for
a now very large structure.  Let's convert this over to pass
by reference.  This is noticed by coverity.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 6d80d0c595)
2025-02-12 18:01:22 +00:00
Donald Sharp
45e7fc0bed
Merge pull request #18056 from FRRouting/mergify/bp/dev/10.3/pr-18048
pimd: fix DR election race on startup (backport #18048)
2025-02-12 12:39:27 -05:00
Donald Sharp
6d933fcbc3
Merge pull request #18083 from FRRouting/mergify/bp/dev/10.3/pr-17901
lib: actually hash all 16 bytes of IPv6 addresses, not just 4 (backport #17901)
2025-02-12 09:30:11 -05:00
Donald Sharp
8edcdf0149
Merge pull request #18101 from FRRouting/mergify/bp/dev/10.3/pr-18060
lib: crash handlers must be allowed on threads (backport #18060)
2025-02-12 08:18:58 -05:00
Donald Sharp
57c152832b
Merge pull request #18112 from FRRouting/mergify/bp/dev/10.3/pr-18078
nhrpd: fix dont consider incomplete L2 entry (backport #18078)
2025-02-12 08:17:12 -05:00
Donald Sharp
60dbf7eeba
Merge pull request #18115 from FRRouting/mergify/bp/dev/10.3/pr-18069
bgpd: Request SRv6 locator after zebra connection (backport #18069)
2025-02-12 08:15:05 -05:00
Carmine Scarpitta
8e4f5716c5 bgpd: Request SRv6 locator after zebra connection
When SRv6 is enabled and an SRv6 locator is specified in the BGP
configuration, BGP may attempt to request SRv6 locator information from
zebra before the connection is fully established. If this occurs, the
request fails with the following error:

```
2025/02/06 16:37:32 BGP: [HR66R-TWQYD][EC 100663302] srv6_manager_get_locator: invalid zclient socket
````

As a result, BGP is unable to obtain the locator information,
preventing SRv6 VPN from working.

This commit fixes the issue by ensuring BGP requests SRv6 locator
information once the connection with zebra is successfully established.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit 16640b615d)
2025-02-12 03:00:15 +00:00
Philippe Guibert
ffcaa8ebb5 nhrpd: fix dont consider incomplete L2 entry
Sometimes, NHRP receives L2 information on a cache entry with the
0.0.0.0 IP address. NHRP considers it as valid and updates the binding
with the new IP address.

> Feb 09 20:09:54 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x2 cache used 0 type 4
> Feb 09 20:10:35 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:48 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: del-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: who-has 10.2.114.238 dev dmvpn1 lladdr (unspec) nud 0x1 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QVXNM-NVHEQ] Netlink: update binding for 10.2.114.238 dev dmvpn1 from c 162.251.180.10 peer.vc.nbma 162.251.180.10 to lladdr (unspec)
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x2 cache used 1 type 4
> Feb 09 20:11:30 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x4 cache used 1 type 4

Actually, the 0.0.0.0 IP addressed mentiones in the 'who-has' message is
wrong because the nud state value means that value is incomplete and
should not be handled as a valid entry. Instead of considering it, fix
this by by invalidating the current binding. This step is necessary in
order to permit NHRP to trigger resolution requests again.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 3202323052)
2025-02-12 02:58:06 +00:00
Russ White
624682cfe9
Merge pull request #18099 from FRRouting/mergify/bp/dev/10.3/pr-18081
bgpd: fix bgp vrf instance creation from implicit (backport #18081)
2025-02-11 12:28:57 -05:00
David Lamparter
cbcc66c5d6 lib: crash handlers must be allowed on threads
Blocking all signals on non-main threads is not the way to go, at least
the handlers for SIGSEGV, SIGBUS, SIGILL, SIGABRT and SIGFPE need to run
so we get backtraces.  Otherwise the process just exits.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 13a6ac5b4c)
2025-02-11 17:26:42 +00:00
Chirag Shah
e2d2797f7a bgpd: fix bgp vrf instance creation from implicit
In bgp route leak, when import vrf x is executed,
it creates bgp instance as hidden with asn value as unspecified.

When router bgp x is configured ensure the correct as,
asnotation is applied otherwise running config shows asn value as 0.

This can lead to frr-reload failure when any FRR config change.

Fix:
Move asn and asnotiation, as_pretty value in common done section,
so when bgp_create gets existing instance but before returning
update asn and required fields in common section.

In bgp_create(): when returning for hidden at least update asn
and required when bgp instance created implicitly due to vrf leak.

if (hidden) {
    bgp = bgp_old;
    goto peer_init; <<<
}

Before fix:
show running:

router bgp 0 vrf purple
 bgp router-id 10.10.3.11
 !
 address-family ipv4 unicast
  redistribute static
  import vrf blue
 exit-address-family
 !
 address-family ipv6 unicast
  import vrf blue
 exit-address-family
 !
 address-family l2vpn evpn
  advertise ipv4 unicast
  advertise ipv6 unicast
 exit-address-family
exit

Testing:

1) following snippet config:
router bgp 63420 vrf blue
 import vrf purple
router bgp 63420 vrf purple
 import vrf blue
2) restart frr leads to the running config with 0 asn value.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
(cherry picked from commit 2ff08af78e)
2025-02-11 17:20:28 +00:00
David Lamparter
a04474c061 lib: clean up nexthop hashing mess
We were hashing 4 bytes of the address.  Even for IPv6 addresses.

Oops.

The reason this was done was to try to make it faster, but made a
complex maze out of everything.  Time for a refactor.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 001fcfa1dd)
2025-02-11 08:43:29 +00:00
David Lamparter
e9f1d637ee lib: guard against padding garbage in ZAPI read
When reading in a nexthop from ZAPI, only set the fields that actually
have meaning.  While it shouldn't happen to begin with, we can otherwise
carry padding garbage into the unused leftover union bytes.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 4a0e1419a6)
2025-02-11 08:43:29 +00:00
David Lamparter
e74b0ca0fc zebra: guard against junk in nexthop->rmap_src
rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would
contain whatever junk is on the stack on function entry.  Also move
the IPv4 parse before the IPv6 parse so if it's successful we can be
sure the other bytes haven't been touched.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit b666ee510e)
2025-02-11 08:43:29 +00:00