Commit Graph

38036 Commits

Author SHA1 Message Date
David Lamparter
a04474c061 lib: clean up nexthop hashing mess
We were hashing 4 bytes of the address.  Even for IPv6 addresses.

Oops.

The reason this was done was to try to make it faster, but made a
complex maze out of everything.  Time for a refactor.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 001fcfa1dd)
2025-02-11 08:43:29 +00:00
David Lamparter
e9f1d637ee lib: guard against padding garbage in ZAPI read
When reading in a nexthop from ZAPI, only set the fields that actually
have meaning.  While it shouldn't happen to begin with, we can otherwise
carry padding garbage into the unused leftover union bytes.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 4a0e1419a6)
2025-02-11 08:43:29 +00:00
David Lamparter
e74b0ca0fc zebra: guard against junk in nexthop->rmap_src
rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would
contain whatever junk is on the stack on function entry.  Also move
the IPv4 parse before the IPv6 parse so if it's successful we can be
sure the other bytes haven't been touched.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit b666ee510e)
2025-02-11 08:43:29 +00:00
David Lamparter
9a8ad94ce9 pbrd: initialize structs used in hash_lookup
Doesn't seem to break anything but really poor style to pass potentially
uninitialized data to hash_lookup.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit c88589f5e9)
2025-02-11 08:43:28 +00:00
David Lamparter
a8775530a0 fpm: guard against garbage in unused address bytes
Zero out the 12 unused bytes (for the IPv6 address) when reading in an
IPv4 address.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 95cf0b2279)
2025-02-11 08:43:28 +00:00
David Lamparter
36534e15c0 bgpd: don't reuse nexthop variable in loop/switch
While the loop is currently exited in all cases after using nexthop, it
is a footgun to have "nh" around to be reused in another iteration of
the loop.  This would leave nexthop with partial data from the previous
use.  Make it local where needed instead.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit ce7f5b2122)
2025-02-11 08:43:28 +00:00
Donatas Abraitis
bf16e53186
Merge pull request #18053 from FRRouting/mergify/bp/dev/10.3/pr-14105
pimd: Fix for FHR mroute taking longer to age out (backport #14105)
2025-02-07 16:10:46 +02:00
Rafael Zalamena
b61fedd029 pimd: fix DR election race on startup
In case interface address is learnt during configuration, make sure to
run DR election when configuring PIM/PIM passive on interface.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
(cherry picked from commit 8644524606)
2025-02-07 03:42:34 +00:00
Rajesh Varatharaj
ccbe9f985b pimd: Fix for FHR mroute taking longer to age out
Issue:
When there is no traffic for a group, the LHR and RP take the default KAT+Join timer expiry of
a maximum of 480 seconds to clear the S,G . However, in the FHR, we update the state from JOINED
to NOT Joined, downstream state from PPto NOINFO.  This restarts the ET timer, causing S,G on FHR to
take more than 10 minutes to age out.

In other words,
Consider a case where (S,G) is in Join state. When the traffic stops and the KAT (210) expires,
 the Join expiry timer restarts. At this time, if we receive a prune, the expectation is to set
 PPT to 0 (RFC 4601 sec 4.5.2).
 When the PPT expires, we move to the noinfo state and restart the expiry timer one more time. We remove the
 (S,G) entry only after ~10 minutes when there is no active traffic.

Summary:
KAT Join ET 210 + PP ET 210 + NOINFO ET 210.

Solution:
Delete the ifchannel when in noinfo state, and KAT is not running.

Ticket: #13703

Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit afed39ea2b)
2025-02-07 03:40:41 +00:00
Jafar Al-Gharaibeh
3d61bbe0d9
Merge pull request #18042 from FRRouting/mergify/bp/dev/10.3/pr-17865
Coverity 2024 new hotness (backport #17865)
2025-02-06 17:20:22 -06:00
Jafar Al-Gharaibeh
8b5df22906
Merge pull request #18043 from FRRouting/mergify/bp/dev/10.3/pr-18038
pimd: fix memory leak and assign allocation type (backport #18038)
2025-02-06 17:20:03 -06:00
Rafael Zalamena
92792cb2ac pimd: fix memory leak and assign allocation type
Use a memory allocation specific type for filter names (to help detect memory
leaks) and fix a memory leak when releasing peer memory.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
(cherry picked from commit d1440dadff)
2025-02-06 16:32:34 +00:00
Donald Sharp
ae8ee154ef zebra: Ensure that changes to dg_update_list are protected by mutex
The dg_update_list access is controlled by the dg_mutex in all
other locations.  Let's just add a mutex usage around the initialization
of the dg_update_list even if it's part of the startup, just to keep
things consistent.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 19af3f3d7a)
2025-02-06 16:16:49 +00:00
Donald Sharp
edfcf0662b bgpd: Ensure ibuf count is protected by mutex
Grab the count of streams in ibuf when it is protected
by a mutex.  Since this data is written to it in another
pthread.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit f94ad538cf)
2025-02-06 16:16:48 +00:00
Donald Sharp
791ec8ebdd zebra: Add some documentation on when zserv_open should be used
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 4b96752737)
2025-02-06 16:16:48 +00:00
Donald Sharp
01af91e5ff ospfd: Fix Coverity SA #1617470, 76 and 78
msg_new takes a uint16_t, the length passed
down variable is a unsigned int, thus 32 bit.
It's possible, but highly unlikely, that the
msglen could be greater than 16 bit.
Let's just add some checks to ensure that
this could not happen.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 283cc51178)
2025-02-06 16:16:48 +00:00
Donald Sharp
62e01a8982
Merge pull request #18019 from FRRouting/mergify/bp/dev/10.3/pr-18000
bgpd: Fix up memory leak in processing eoiu marker (backport #18000)
2025-02-05 08:17:07 -05:00
Donald Sharp
9b99397859 bgpd: Fix up memory leak in processing eoiu marker
Memory is being leaked when processing the eoiu marker.
BGP is creating a dummy dest to contain the data but
it was never freed.  As well as the eoiu info was
not being freed either.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit c6b7a993fb)
2025-02-05 05:22:23 +00:00
Russ White
b2f9962550
Merge pull request #18006 from FRRouting/mergify/bp/dev/10.3/pr-17959
bgpd: Do not start BGP session if BGP identifier is not set (backport #17959)
2025-02-04 11:46:15 -05:00
Donatas Abraitis
3f788da60b tests: Check if the peer stays Idle if router-id is not set
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 48560b5c9b)
2025-02-04 16:37:56 +00:00
Donatas Abraitis
46d210ce80 bgpd: Do not start BGP session if BGP identifier is not set
If we have IPv6-only network and no IPv4 addresses at all, then by default
0.0.0.0 is created which is treated as malformed according to RFC 6286.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 739f2b566a)
2025-02-04 16:37:55 +00:00
Donatas Abraitis
388f9ef0cb doc: Say that 0.0.0.0 (0) BGP identifier is invalid
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit c9a2928954)
2025-02-04 16:37:55 +00:00
Mark Stapp
704372bf4b
Merge pull request #17969 from donaldsharp/fpm_lost_sends
zebra: Ensure dplane does not send work back to master at wrong time
2025-02-04 11:18:07 -05:00
Russ White
f74fa9543b
Merge pull request #17992 from chiragshah6/fdev5
bgpd: fix route-distinguisher in vrf leak json cmd
2025-02-04 07:40:36 -05:00
Donatas Abraitis
0ee2773149
Merge pull request #17991 from chiragshah6/bgp_dev4
zebra: fix evpn svd hash avoid double free
2025-02-04 14:34:21 +02:00
Russ White
f9e11d6974
Merge pull request #17943 from opensourcerouting/clear-event-cpu-uaf
lib: fix use after free in `clear event cpu`
2025-02-04 06:57:52 -05:00
Russ White
adeb30d8f3
Merge pull request #17336 from forrestchu/sbfd
implement SBFD
2025-02-04 06:36:43 -05:00
Donatas Abraitis
817c2c9823
Merge pull request #17990 from enkechen-panw/aigp-cfg-default
bgpd: add config default for "bgp bestpath aigp"
2025-02-04 10:51:52 +02:00
Donatas Abraitis
cb7d1cbf53
Merge pull request #17989 from cscarpitta/fix/fix_staticd_no_sid
staticd: Fix wrong xpath in `no sid X:X::X:X/M`
2025-02-04 10:47:20 +02:00
Chirag Shah
892704d07f bgpd: fix route-distinguisher in vrf leak json cmd
For auto configured value RD value comes as NULL,
switching back to original change will ensure to cover
for both auto and user configured RD value in JSON.

tor-11# show bgp vrf blue ipv4 unicast route-leak json
{
  "vrf":"blue",
  "afiSafi":"ipv4Unicast",
  "importFromVrfs":[
    "purple"
  ],
  "importRts":"10.10.3.11:6",
  "exportToVrfs":[
    "purple"
  ],
  "routeDistinguisher":"(null)", <<<<<
  "exportRts":"10.10.3.11:10"
}

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2025-02-03 20:58:45 -08:00
Chirag Shah
1d4f5b9b19 zebra: evpn svd hash avoid double free
Upon zebra shutdown hash_clean_and_free is called
where user free function is passed,
The free function should not call hash_release
which lead to double free of hash bucket.

Fix:
The fix is to avoid calling hash_release from
free function if its called from hash_clean_and_free
path.

10 0x00007f0422b7df1f in free () from /lib/x86_64-linux-gnu/libc.so.6
11 0x00007f0422edd779 in qfree (mt=0x7f0423047ca0 <MTYPE_HASH_BUCKET>,
    ptr=0x55fc8bc81980) at ../lib/memory.c:130
12 0x00007f0422eb97e2 in hash_clean (hash=0x55fc8b979a60,
    free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
    ../lib/hash.c:290
13 0x00007f0422eb98a1 in hash_clean_and_free (hash=0x55fc8a675920
    <svd_nh_table>, free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
    ../lib/hash.c:305
14 0x000055fc8a5323a5 in zebra_vxlan_terminate () at
    ../zebra/zebra_vxlan.c:6099
15 0x000055fc8a4c9227 in zebra_router_terminate () at
    ../zebra/zebra_router.c:276
16 0x000055fc8a4413b3 in zebra_finalize (dummy=0x7fffb881c1d0) at
    ../zebra/main.c:269
17 0x00007f0422f44387 in event_call (thread=0x7fffb881c1d0) at
    ../lib/event.c:2011
18 0x00007f0422ecb6fa in frr_run (master=0x55fc8b733cb0) at
    ../lib/libfrr.c:1243
19 0x000055fc8a441987 in main (argc=14, argv=0x7fffb881c4a8) at
    ../zebra/main.c:584

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2025-02-03 16:09:20 -08:00
Carmine Scarpitta
210a7d8981 tests: Add test case to verify SID re-add
Add a new test case that re-add the deleted SIDs and verifies that all
SIDs are added back to the RIB.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
2025-02-03 23:02:30 +01:00
Carmine Scarpitta
4eed9ee0a7 tests: Add test case to verify SID delete
Add a new test case that deletes a SID and verifies that only this
SID has been removed from the RIB.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
2025-02-03 23:02:11 +01:00
Carmine Scarpitta
c809035cc4 staticd: Fix wrong xpath in no sid X:X::X:X/M
When a user wants to delete a specific SRv6 SID, he executes the
`no sid X:X::X:X/M` command.
However, by mistake, in addition to deleting the SID requested by the
user, this command also removes all other SIDs.

This happens because `no sid X:X::X:X/M` triggers a destroy operation
on the wrong xpath `frr-staticd:staticd/segment-routing/srv6`.

This commit fixes the issue by replacing the wrong xpath
`frr-staticd:staticd/segment-routing/srv6` with the correct xpath
`frr-staticd:staticd/segment-routing/srv6/static-sids/sid[sid='%s']`.

This ensures that the `no sid X:X::X:X/M` command only deletes the SID
that was requested by the user.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
2025-02-03 22:33:00 +01:00
Donald Sharp
f54241a346
Merge pull request #17970 from mjstapp/fix_privs_no_caps
libs: return from change_caps if no caps
2025-02-03 12:57:44 -05:00
Carmine Scarpitta
0768c620e0
Merge pull request #17913 from Sokolmish/bgp-sid-release
bgpd: Release SID on router deletion
2025-02-03 14:52:00 +01:00
Enke Chen
6204db214e bgpd: add config default for "bgp bestpath aigp"
Just to make it simpler for compiling with a different default value.
No change to its default value.

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
2025-02-02 20:35:44 -08:00
Donatas Abraitis
4f43a33d42
Merge pull request #17979 from cscarpitta/fix/fix_staticd_sid_notify
staticd: Fix NULL pointer dereference when receiving `ZAPI_SRV6_SID_RELEASED` notification
2025-02-02 21:17:33 +02:00
Russ White
593e3e199a
Merge pull request #17947 from opensourcerouting/fix/bgp_disable_vrf
bgpd: Do not ignore auto generated VRF instances when deleting
2025-02-02 12:41:12 -05:00
Donatas Abraitis
91ebab35f2
Merge pull request #17964 from cscarpitta/fix/fix-srv6-sid-manager
Fix SRv6 SID Manager
2025-02-02 13:32:36 +02:00
Carmine Scarpitta
7a46623348 staticd: Fix NULL pointer dereference
When staticd receives a `ZAPI_SRV6_SID_RELEASED` notification from SRv6
SID Manager, it tries to unset the validity flag of `sid`. But since
the `sid` variable is NULL, we get a NULL pointer dereference.

```
=================================================================
==13815==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000060 (pc 0xc14b813d9eac bp 0xffffcb135a40 sp 0xffffcb135a40 T0)
==13815==The signal is caused by a READ memory access.
==13815==Hint: address points to the zero page.
    #0 0xc14b813d9eac in static_zebra_srv6_sid_notify staticd/static_zebra.c:1172
    #1 0xe44e7aa2c194 in zclient_read lib/zclient.c:4746
    #2 0xe44e7a9b69d8 in event_call lib/event.c:1984
    #3 0xe44e7a85ac28 in frr_run lib/libfrr.c:1246
    #4 0xc14b813ccf98 in main staticd/static_main.c:193
    #5 0xe44e7a4773f8 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #6 0xe44e7a4774c8 in __libc_start_main_impl ../csu/libc-start.c:392
    #7 0xc14b813cc92c in _start (/usr/lib/frr/staticd+0x1c92c)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV staticd/static_zebra.c:1172 in static_zebra_srv6_sid_notify
==13815==ABORTING
```

This commit fixes the problem by doing a SID lookup first. If the SID
can't be found, we log an error and return. If the SID is found, we go
ahead and unset the validity flag.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
2025-02-02 10:06:22 +01:00
Donatas Abraitis
43a04450e0
Merge pull request #17972 from enkechen-panw/rr-policy
bgpd: add config default for "route-reflector allow-outbound-policy"
2025-02-02 09:53:16 +02:00
Enke Chen
a2018b3ee9 bgpd: add config default for "route-reflector allow-outbound-policy"
Just to make it simpler for compiling with a different default value.
No change to its default value.

Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
2025-02-01 10:24:19 -08:00
Donatas Abraitis
69b763f730
Merge pull request #17971 from donaldsharp/suppress_fib_giving_us_the_business
bgpd: With suppress-fib-pending ensure withdrawal is sent
2025-02-01 13:25:37 +02:00
Donald Sharp
4e8eda74ec bgpd: With suppress-fib-pending ensure withdrawal is sent
When you have suppress-fib-pending turned on it is possible
to end up in a situation where the prefix is not withdrawn
from downstream peers.

Here is the timing that I believe is happening:

a) have 2 paths to a peer.
b) receive a withdrawal from 1 path, set BGP_NODE_FIB_INSTALL_PENDING
   and send the route install to zebra.
c) receive a withdrawal from the other path.
d) At this point we have a dest->flags set BGP_NODE_FIB_INSTALL_PENDING
   old_select the path_info going away, new_select is NULL
e) A bit further down we call group_announce_route() which calls
   the code to see if we should advertise the path.  It sees the
   BGP_NODE_FIB_INSTALL_PENDING flag and says, nope.
f) the route is sent to zebra to withdraw, which unsets the
   BGP_NODE_FIB_INSTALL_PENDING.
g) This function winds up and deletes the path_info.  Dest now
   has no path infos.
h) BGP receives the route install(from step b) and unsets the
   BGP_NODE_FIB_INSTALL_PENDING flag
i) BGP receives the route removed from zebra (from step f) and
   unsets the flag again.

We know if there is no new_select, let's go ahead and just
unset the PENDING flag to allow the withdrawal to go out
at the time when the second withdrawal is received.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 18:53:30 -05:00
Mark Stapp
e13a4485bf libs: return from change_caps if no caps
When called without caps/privs, just return from "change_caps"
instead of exiting - it's possible that a process may not need
privs, but a lib (for example) may use the api.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-01-31 13:13:48 -05:00
Donald Sharp
c41155221e zebra: Ensure dplane does not send work back to master at wrong time
When looping through the dplane providers, the worklist was
being populated with items from the last provider and then
the event system was checked to see if we should stop processing.
If the event system says `yes` then the dplane code would stop
and send the worklist to the master zebra pthread for collection.
This obviously skipped the next dplane provider on the list
which is double plus not good.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 12:38:20 -05:00
Donatas Abraitis
ce20b8cc0d
Merge pull request #17956 from pguibert6WIND/isis_srv6_codepoint_erroneous
isisd: fix erroneous srv6 information in database
2025-01-31 13:56:09 +02:00
Carmine Scarpitta
339c49bcff tests: Add testcase for static End/uN validation
This commit adds a testcase to validate static End/uN allocation.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
2025-01-30 19:28:34 +01:00
Carmine Scarpitta
a879aebf69 zebra: Fix SRv6 SID Manager
The SRv6 SID Manager does not allow allocating an SRv6 End/uN function
even though it is already supported by staticd.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
2025-01-30 19:28:34 +01:00