6160 Commits

Author SHA1 Message Date
Donald Sharp
594f65d888
Merge pull request #18242 from kaffarell/master
fabricd: add option to treat dummy interfaces as loopback interfaces
2025-02-26 11:18:22 -05:00
Carmine Scarpitta
ec5ff367b1 staticd: Extend static_zebra_request_srv6_sid to request SRv6 uA SIDs
In order to configure an SRv6 uA SID in staticd, staticd should request
SRv6 SID Manager to allocate a SID bound to the uA behavior.
Currently, `static_zebra_request_srv6_sid` does not support requesting
SIDs bound to the uA behavior.

This commit extends the `static_zebra_request_srv6_sid` function to
enable staticd to request SIDs bound to the uA behavior.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
2025-02-26 07:19:51 +01:00
Gabriel Goller
80e96712e4 zebra: add ZEBRA_IF_DUMMY flag for dummy interfaces
Introduce ZEBRA_IF_DUMMY interface flag to identify Linux dummy interfaces [0].
These interfaces behave similarly to loopback interfaces and can be
specially handled by daemons.

[0]: https://github.com/torvalds/linux/blob/master/drivers/net/dummy.c

Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
2025-02-25 10:13:34 +01:00
Donna Sharp
e0387cd17f zebra: use provider function to receive data directly
Signed-off-by: Donna Sharp <dksharp5@gmail.com>
2025-02-24 14:51:12 -05:00
Donald Sharp
03ebdc3c4a zebra: Add operational retrieval of Multipath Number
The multipath number specified is not available through
the yang data and is not retrievable.  Make it so.
At this point in time do not allow this to be set from
yang.  Perhaps in the future.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-23 11:14:47 -05:00
Donald Sharp
f7fd861fda *: Remove unneeded IPV6_JOIN|LEAVE_GROUP
Headers include this stuff now.  No need for it
in our code base.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-20 16:16:35 -05:00
Donald Sharp
66434fc2ee
Merge pull request #18108 from opensourcerouting/fix/zebra_no_vni_validation
zebra: Do not flush an existing vni configuration trying to remove wrong vni
2025-02-19 07:22:03 -05:00
David Schweizer
1eef3a77e3
lib,zebra: Allow class E prefixes in RIB
Changes allow ipv4 class E addresses and prefixes in the 240.0.0.0/4
range to be configured on interfaces, imported from the kernel routing
table and redistributed as connected routes in zebra by default.

Changes also fix routes with class E prefixes in kernel routing table
getting rejected by zebra during early daemon startup.

Driving this change in default behavior are cloud providers (with
customers still using the obsolete IPv4 protocol, e.g. Azure, AWS) running
out of IP space and abusing class E for addressing instances (announced
via BGP) over tunneling connections back to customers' on-premises
infrastructure.

Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
2025-02-14 15:05:08 +01:00
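
The class E range named in the commit above is 240.0.0.0/4, which means the top four bits of the address are all set. A minimal sketch of such a membership test follows. The helper name `ipv4_is_class_e` is hypothetical and is not taken from the FRR sources.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not FRR's API): return true if addr, given in
 * network byte order, falls inside 240.0.0.0/4.
 *
 * Note that the limited broadcast address 255.255.255.255 also matches
 * this mask; a real implementation may want to treat it separately.
 */
static bool ipv4_is_class_e(struct in_addr addr)
{
	uint32_t host = ntohl(addr.s_addr);

	/* A /4 prefix compares only the top four bits. */
	return (host & 0xf0000000u) == 0xf0000000u;
}
```

A caller would typically fill the `struct in_addr` with `inet_pton(AF_INET, ...)` before passing it in.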
Donald Sharp
40744f4f3d zebra: Use tableid when displaying prefix
Found some more instances of tableid not being
displayed when trying to debug something.  Fix.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-13 10:40:52 -05:00
Donatas Abraitis
44fe3981ee zebra: Do not flush an existing vni configuration trying to remove wrong vni
Before:

```
pc.donatas.net(config)# do sh run | include vni
vni 1
pc.donatas.net(config)# no vni 2
pc.donatas.net(config)# do sh run | include vni
pc.donatas.net(config)#
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-02-12 23:37:20 +02:00
Donald Sharp
54dc8382eb zebra: Allow fpm_listener to continue to try to read
Currently, when the fpm_listener attempts to read X bytes, it may
only get Y bytes (where Y is less than X).  In this case we should
assume that the dplane_fpm_nl code is just being slow, since we know
it is possible for it to send a partial fpm message.  Let's loosen
the constraints a bit and allow data to flow.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-11 12:42:02 -05:00
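
The short-read situation described in the commit above is commonly handled with a loop that keeps reading until the requested byte count has arrived. The sketch below assumes a blocking file descriptor. The helper name `read_full` is hypothetical and this is not the fpm_listener code itself.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep reading from a blocking fd until len bytes
 * have arrived.  Returns the number of bytes read (less than len only
 * if the peer closed the connection early) or -1 on a read error.
 */
static ssize_t read_full(int fd, void *buf, size_t len)
{
	size_t got = 0;

	while (got < len) {
		ssize_t n = read(fd, (char *)buf + got, len - got);

		if (n > 0) {
			/* Partial data is fine: account for it and retry. */
			got += (size_t)n;
			continue;
		}
		if (n == 0)
			break; /* peer closed before sending everything */
		if (errno == EINTR)
			continue; /* interrupted by a signal, try again */
		return -1;
	}
	return (ssize_t)got;
}
```

A return value smaller than `len` tells the caller that the sender went away mid-message, which is distinct from the sender merely being slow.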
Jafar Al-Gharaibeh
92288c9069
Merge pull request #17865 from donaldsharp/coverity_2024_new_hotness
Coverity 2024 new hotness
2025-02-06 10:15:55 -06:00
Donald Sharp
0b42b4ce6d
Merge pull request #17901 from opensourcerouting/nexthop_hashing
lib: actually hash all 16 bytes of IPv6 addresses, not just 4
2025-02-05 09:14:58 -05:00
Russ White
3fabd4f4f9
Merge pull request #18014 from donaldsharp/nexthop_leak
Nexthop leak
2025-02-05 08:32:13 -05:00
Donald Sharp
abbfcc49f9 zebra: Fix srv6 segment nexthop memory leak.
The srv6 segment was being set but never freed
on the statically allocated nexthop.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-04 15:13:48 -05:00
Russ White
1cbb4b9e3d
Merge pull request #17962 from donaldsharp/fpm_problems
Fpm problems
2025-02-04 15:09:05 -05:00
Donald Sharp
29dcfd415f zebra: Stop leaking labels when receiving nexthops from kernel
This leak is happening:
Direct leak of 96 byte(s) in 2 object(s) allocated from:
    0 0x7f6922eb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    1 0x7f6922a38ebb in qcalloc lib/memory.c:106
    2 0x7f6922a553d6 in nexthop_add_srv6_seg6 lib/nexthop.c:652
    3 0x562825e56b38 in parse_nexthop_unicast zebra/rt_netlink.c:589
    4 0x562825e58c4a in netlink_route_change_read_unicast_internal zebra/rt_netlink.c:1291
    5 0x562825e58eef in netlink_route_change_read_unicast zebra/rt_netlink.c:1321
    6 0x562825e64921 in netlink_route_change zebra/rt_netlink.c:1494
    7 0x562825e43407 in netlink_information_fetch zebra/kernel_netlink.c:407
    8 0x562825e439b5 in netlink_parse_info zebra/kernel_netlink.c:1148
    9 0x562825e44060 in kernel_read zebra/kernel_netlink.c:510
    10 0x7f6922aeca72 in event_call lib/event.c:1984
    11 0x7f6922a19e01 in frr_run lib/libfrr.c:1246
    12 0x562825e4b0b9 in main zebra/main.c:543
    13 0x7f692250c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Just check to see if it has been allocated.  The nexthop is a stack
variable so it's a bit odd.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-04 15:00:12 -05:00
Russ White
4349cab51b
Merge pull request #17953 from donaldsharp/limit_ip_protocol
lib: Remove System routes from ip protocol route map choices
2025-02-04 11:43:10 -05:00
Mark Stapp
704372bf4b
Merge pull request #17969 from donaldsharp/fpm_lost_sends
zebra: Ensure dplane does not send work back to master at wrong time
2025-02-04 11:18:07 -05:00
Chirag Shah
1d4f5b9b19 zebra: evpn svd hash avoid double free
Upon zebra shutdown, hash_clean_and_free is called with a
user-supplied free function.  That free function should not call
hash_release, which leads to a double free of the hash bucket.

Fix:
Avoid calling hash_release from the free function when it is
called from the hash_clean_and_free path.

10 0x00007f0422b7df1f in free () from /lib/x86_64-linux-gnu/libc.so.6
11 0x00007f0422edd779 in qfree (mt=0x7f0423047ca0 <MTYPE_HASH_BUCKET>,
    ptr=0x55fc8bc81980) at ../lib/memory.c:130
12 0x00007f0422eb97e2 in hash_clean (hash=0x55fc8b979a60,
    free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
    ../lib/hash.c:290
13 0x00007f0422eb98a1 in hash_clean_and_free (hash=0x55fc8a675920
    <svd_nh_table>, free_func=0x55fc8a529478 <svd_nh_del_terminate>) at
    ../lib/hash.c:305
14 0x000055fc8a5323a5 in zebra_vxlan_terminate () at
    ../zebra/zebra_vxlan.c:6099
15 0x000055fc8a4c9227 in zebra_router_terminate () at
    ../zebra/zebra_router.c:276
16 0x000055fc8a4413b3 in zebra_finalize (dummy=0x7fffb881c1d0) at
    ../zebra/main.c:269
17 0x00007f0422f44387 in event_call (thread=0x7fffb881c1d0) at
    ../lib/event.c:2011
18 0x00007f0422ecb6fa in frr_run (master=0x55fc8b733cb0) at
    ../lib/libfrr.c:1243
19 0x000055fc8a441987 in main (argc=14, argv=0x7fffb881c4a8) at
    ../zebra/main.c:584

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2025-02-03 16:09:20 -08:00
Donald Sharp
64709ec2a9 zebra: Ensure dplane does not send work back to master at wrong time
When looping through the dplane providers, the worklist was
being populated with items from the last provider and then
the event system was checked to see if we should stop processing.
If the event system says `yes` then the dplane code would stop
and send the worklist to the master zebra pthread for collection.
This obviously skipped the next dplane provider on the list
which is double plus not good.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 15:05:40 -05:00
Donald Sharp
07a803a7b3 zebra: Stop buffering output from fpm_listener
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 15:05:40 -05:00
Donald Sharp
c58da10d2a zebra: Limit mutex for obuf to when we access obuf
The mutex that wraps access to the output buffer
is being held for the entire time the data is
being generated to send down the pipe.  Since
the generation has absolutely nothing to do
with the obuf, let's limit the mutex holding some.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 15:05:40 -05:00
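
The idea in the commit above, holding the lock only while the shared buffer is touched, can be sketched as follows. The `struct out_buf` type and the `queue_message` function are hypothetical stand-ins and do not reflect the actual dplane_fpm_nl data structures.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical output buffer shared between threads. */
struct out_buf {
	pthread_mutex_t lock;
	char data[256];
	size_t len;
};

/*
 * Build the message without the lock held, then take the lock only
 * for the copy into the shared buffer.  Returns 0 on success and -1
 * if the message could not be formatted or does not fit.
 */
static int queue_message(struct out_buf *ob, int route_id)
{
	char msg[64];
	int rc = -1;
	int n;

	/* Message generation does not touch ob, so no lock is needed. */
	n = snprintf(msg, sizeof(msg), "route %d\n", route_id);
	if (n < 0 || (size_t)n >= sizeof(msg))
		return -1;

	pthread_mutex_lock(&ob->lock);
	if (ob->len + (size_t)n <= sizeof(ob->data)) {
		memcpy(ob->data + ob->len, msg, (size_t)n);
		ob->len += (size_t)n;
		rc = 0;
	}
	pthread_mutex_unlock(&ob->lock);

	return rc;
}
```

Keeping the critical section this small means a thread draining the buffer is blocked only for the duration of one `memcpy`, not for the time it takes to encode a message.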
Donald Sharp
e71d29983a zebra: fpm_listener allow continued operation
In fpm_listener, when an error was detected it would
stop listening and not recover.  Modify the code
to close the socket and allow the connection to
recover.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 15:05:40 -05:00
Donald Sharp
b2fc167978 zebra: Fix pass back of data from dplane through fpm pipe
A recent code change, 29122bc9b8, changed how data is passed up the
fpm: instead of passing the tableid and vrf, it passes what Sonic
expects, a tableid that contains the vrfid.  This violates the
assumption in the code that the netlink message passes up the
tableid as the tableid.  Additionally, that change did not modify
rib_find_rn_from_ctx to properly decode what could be passed up.
Let's just fix this and let Sonic carry the patch as appropriate
for themselves, since they are not the only users of dplane_fpm_nl.c.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 15:05:40 -05:00
Donald Sharp
c41155221e zebra: Ensure dplane does not send work back to master at wrong time
When looping through the dplane providers, the worklist was
being populated with items from the last provider and then
the event system was checked to see if we should stop processing.
If the event system says `yes` then the dplane code would stop
and send the worklist to the master zebra pthread for collection.
This obviously skipped the next dplane provider on the list
which is double plus not good.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 12:38:20 -05:00
Carmine Scarpitta
a879aebf69 zebra: Fix SRv6 SID Manager
The SRv6 SID Manager does not allow allocating an SRv6 End/uN function
even though it is already supported by staticd.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
2025-01-30 19:28:34 +01:00
David Lamparter
b666ee510e zebra: guard against junk in nexthop->rmap_src
rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would
contain whatever junk is on the stack on function entry.  Also move
the IPv4 parse before the IPv6 parse so if it's successful we can be
sure the other bytes haven't been touched.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2025-01-29 16:48:37 +01:00
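
The two measures described in the commit above, zeroing the storage up front and trying the IPv4 parse first, can be sketched with plain `inet_pton`. The `union addr_any` type and the `parse_src` function are hypothetical and are not the zebra route-map code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical storage that can hold either address family. */
union addr_any {
	struct in_addr ipv4;
	struct in6_addr ipv6;
};

/*
 * Parse str as IPv4 first, then as IPv6.  Returns AF_INET, AF_INET6,
 * or 0 if str is neither.  The union is zeroed before parsing, so for
 * an IPv4 result the 12 bytes not covered by the IPv4 address are
 * guaranteed to be zero rather than leftover stack contents.
 */
static int parse_src(const char *str, union addr_any *out)
{
	memset(out, 0, sizeof(*out));

	/* IPv4 first: on success only the first 4 bytes were written. */
	if (inet_pton(AF_INET, str, &out->ipv4) == 1)
		return AF_INET;

	if (inet_pton(AF_INET6, str, &out->ipv6) == 1)
		return AF_INET6;

	/* Neither parse succeeded: leave the caller with clean storage. */
	memset(out, 0, sizeof(*out));
	return 0;
}
```

Because the result is fully initialized in every path, it can be compared or hashed as a whole without reading indeterminate bytes.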
Donald Sharp
f849511c47
Merge pull request #17935 from mjstapp/fix_nhg_hash_equal
zebra: include resolving nexthops in nhg hash
2025-01-29 10:14:37 -05:00
Donald Sharp
fb8e399e4f lib: Remove System routes from ip protocol route map choices
Do not allow system routes to be selected for ip protocol

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-29 09:31:53 -05:00
Russ White
bd82864d03
Merge pull request #17941 from opensourcerouting/fix-dst-src
static: fix botched staticd YANG conversion for dst-src
2025-01-28 12:23:06 -05:00
David Lamparter
2af780650f lib, zebra: carry source prefix in route_notify
When a daemon wants to know about its routes, make it possible to have
that work for dst-src routes.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2025-01-28 15:40:17 +01:00
David Lamparter
1d341d461e zebra: install dst-src routes without NHG
The Linux kernel doesn't support dst-src routes with NHGs as nexthop,
for some (rather dubious) caching reasons.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2025-01-28 11:10:31 +01:00
Mark Stapp
cb7cf73992 zebra: include resolving nexthops in nhg hash
Ensure that the nhg hash comparison function includes all
nexthops, including recursive-resolving nexthops.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-01-27 14:17:24 -05:00
Rafael Zalamena
28a9ca3405 lib,zebra: VRF table-direct support
Implement the necessary data structures and code changes to support sending
table-direct routes to protocols running in different VRFs.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2025-01-23 14:37:09 -03:00
Pooja Jagadeesh Doijode
8c6489bc56 zebra: Return error if v6 prefix is passed to show ip route
Return error if IPv6 address or prefix is passed as an argument
to "show ip route" command.

UT:
r1# show ip route 2::3/128
% Cannot specify IPv6 address/prefix for IPv4 table
r1#
r1# show ip route 2::3
% Cannot specify IPv6 address/prefix for IPv4 table
r1#

Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
2025-01-22 10:09:03 -08:00
Donatas Abraitis
76ed8f61d8
Merge pull request #17814 from donaldsharp/nhg_removal_in_some_situations
2025-01-17 17:31:19 +02:00
Donald Sharp
19af3f3d7a zebra: Ensure that changes to dg_update_list are protected by mutex
The dg_update_list access is controlled by the dg_mutex in all
other locations.  Let's just add a mutex usage around the initialization
of the dg_update_list even if it's part of the startup, just to keep
things consistent.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-17 10:16:48 -05:00
Donald Sharp
4b96752737 zebra: Add some documentation on when zserv_open should be used
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-17 10:16:48 -05:00
Igor Ryzhov
300f8dbda4 lib: introduce global -w option for VRF netns backend
The current -n option is only for zebra and mgmtd. All other daemons receive
the VRF backend configuration from zebra upon connecting to it. This
leads to a potential race condition: daemons need to know the backend
before they start reading their config, but they may not be connected to
zebra yet at that point. As the VRF backend cannot change at runtime,
let's introduce a new global -w option for setting the netns backend, to
make sure that all daemons know their VRF backend immediately after
start.

The reason for introducing a new option instead of making -n global is
that ospfd already uses -n for another purpose.

Signed-off-by: Igor Ryzhov <idryzhov@gmail.com>
2025-01-15 23:38:27 +02:00
Igor Ryzhov
6f214d97d1 lib, zebra: move ns context initialization to zebra
vrf->ns_ctxt is only ever used in zebra, so move its initialization to
zebra's callback. Ideally this pointer shouldn't even be part of the
library's vrf struct and should be moved to a zebra-specific struct, but
this is the first step.

Signed-off-by: Igor Ryzhov <idryzhov@gmail.com>
2025-01-15 23:38:27 +02:00
Igor Ryzhov
4877f2f685 lib: remove VRF_BACKEND_UNKNOWN
The backend type cannot be unknown. It is configured to VRF_LITE by
default in zebra anyway, so just init to VRF_LITE in the lib and remove
the UNKNOWN type.

Signed-off-by: Igor Ryzhov <idryzhov@gmail.com>
2025-01-15 23:38:27 +02:00
Donald Sharp
953d5fd526
Merge pull request #17799 from LabNConsulting/chopps/backend-yang-model
mgmtd backend yang model (depends on #17796)
2025-01-15 10:22:11 -05:00
Donatas Abraitis
93ea9748cf
Merge pull request #17859 from donaldsharp/active_routes_are_active
Active routes are active
2025-01-15 15:01:59 +02:00
Donald Sharp
ec6a000b0b zebra: On Nexthop install failure don't set Installation failed
Currently, when FRR installs a nexthop group, the installation can fail.
The code assumed that the current nexthop group was not already
installed.  This leaves a problem state where, if the users of the
nexthop group are removed, the nexthop group will be removed,
possibly leaving an orphaned nexthop group in the data plane.

When a nexthop group installation fails, FRR does not actually know
the status of the nexthop group in the kernel.  It's possible that an
earlier version of the nexthop group is left in play.  It's possible
that there is no nexthop group in the kernel at all.  Leaving the
Installed flag alone allows Zebra to remove the nexthop group from
the data plane when it is removed from zebra.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-14 16:23:40 -05:00
Donald Sharp
b61424a717 zebra: Nexthops need to be ACTIVE in some cases
Currently if you have an interface down event, Zebra
sets the nexthop(s) as !ACTIVE that use it.  On
interface up events the singleton nexthops are not being
set as ACTIVE.  Due to timing events it is sometimes
possible to end up with a route that is using a singleton

Change singleton nexthops to set the nexthop to ACTIVE.
This will allow the nexthop to be reinstalled appropriately
as well.

I was able to easily reproduce this using sharpd since
it does not attempt to reinstall the routes when an interface
goes up/down.

Before:

D>* 10.0.0.0/32 [150/0] via 192.168.102.34, dummy2, weight 1, 00:00:01

sharpd@eva ~/frr5 (master)> sudo ip link set dummy2 down ; sudo ip link set dummy2 up

D>  10.0.0.0/32 [150/0] (350) via 192.168.102.34, dummy2 inactive, weight 1, 00:00:10

After code change:

D>* 10.0.0.0/32 [150/0] (73) via 192.168.102.34, dummy2, weight 1, 00:00:14

sharpd@eva ~/frr5 (master)> sudo ip link set dummy2 down ; sudo ip link set dummy2 up

D>* 10.0.0.0/32 [150/0] (73) via 192.168.102.34, dummy2, weight 1, 00:00:21

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-14 15:12:32 -05:00
Christian Hopps
5f2a927d7b lib: northbound/mgmtd: add backend model support
Signed-off-by: Christian Hopps <chopps@labn.net>
2025-01-14 18:48:59 +00:00
Donald Sharp
5f35096123
Merge pull request #17796 from LabNConsulting/chopps/datastore-notifications
operational-state (datastore) change notifications
2025-01-14 13:47:28 -05:00
Donald Sharp
67da971218
Merge pull request #17581 from mjstapp/fix_fpm_netlink
zebra: avoid race between FPM pthread and zebra main pthread in netlink encode/decode
2025-01-14 13:42:29 -05:00
Christian Hopps
80c6f98ea7 lib: if: track oper-state inline
Signed-off-by: Christian Hopps <chopps@labn.net>
2025-01-13 23:40:52 -05:00