Current code when a link is set down is to just mark the
nexthop group as not properly setup. Leaving situations
where when an interface goes down and show output is
entered we see incorrect state. This is true for anything
that would be checking those flags at that point in time.
Modify the interface down nexthop group code to notice the
nexthops appropriately ( and I mean set the appropriate flags )
and to allow a `show ip route` command to actually display
what is going on with the nexthops.
eva# show ip route 1.0.0.0
Routing entry for 1.0.0.0/32
Known via "sharp", distance 150, metric 0, best
Last update 00:00:06 ago
* 192.168.44.33, via dummy1, weight 1
* 192.168.45.33, via dummy2, weight 1
sharpd@eva:~/frr1$ sudo ip link set dummy2 down
eva# show ip route 1.0.0.0
Routing entry for 1.0.0.0/32
Known via "sharp", distance 150, metric 0, best
Last update 00:00:12 ago
* 192.168.44.33, via dummy1, weight 1
192.168.45.33, via dummy2 inactive, weight 1
Notice now that the 1.0.0.0/32 route now correctly
displays the route for the nexthop group entry.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a new start option "-K" to libfrr to denote a graceful start,
and use it in zebra and bgpd.
zebra will use this option to denote a planned FRR graceful restart
(supporting only bgpd currently) to wait for a route sync completion
from bgpd before cleaning up old stale routes from the FIB. An optional
timer provides an upper-bounds for this cleanup.
bgpd will use this option to denote either a planned FRR graceful
restart or a bgpd-only graceful restart, and this will drive the BGP
GR restarting router procedures.
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
The `locator` pointer is dereferenced before ensuring it is not NULL.
Fix the issue by checking that the pointer is not NULL before
dereferencing it.
Fixes 1594013
** CID 1594013: Null pointer dereferences (REVERSE_INULL)
/zebra/zebra_srv6.c: 961 in zebra_srv6_sid_compose()
________________________________________________________________________________________________________
*** CID 1594013: Null pointer dereferences (REVERSE_INULL)
/zebra/zebra_srv6.c: 961 in zebra_srv6_sid_compose()
955 struct srv6_locator *locator,
956 uint32_t sid_func)
957 {
958 uint8_t offset, func_len;
959 struct srv6_sid_format *format = locator->sid_format;
960
CID 1594013: Null pointer dereferences (REVERSE_INULL)
Null-checking "locator" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
961 if (!sid_value || !locator)
962 return false;
963
964 if (format) {
965 offset = format->block_len + format->node_len;
966 func_len = format->function_len;
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
The `for` loop starting at line 1848 searches the `func_allocated` array
for a pointer that points to a specific `sid_wide_func` element.
The loop should iterate over all the elements of the `func_allocated`
array and dereference each element to see if it is the one we are
looking for.
Currently, the loop is using the wrong variable to iterate over the
array.
Let's fix this issue by using the correct variable in the loop.
Fixes CID 1594014
Fixes CID 1594016
** CID 1594014: Null pointer dereferences (FORWARD_NULL)
/zebra/zebra_srv6.c: 1860 in release_srv6_sid_func_explicit()
________________________________________________________________________________________________________
*** CID 1594014: Null pointer dereferences (FORWARD_NULL)
/zebra/zebra_srv6.c: 1860 in release_srv6_sid_func_explicit()
1854
1855 /* Lookup SID function in the functions allocated list of EWLIB range */
1856 for (ALL_LIST_ELEMENTS_RO(block->u.usid
1857 .wide_lib[sid_func]
1858 .func_allocated,
1859 node, sid_func_ptr))
CID 1594014: Null pointer dereferences (FORWARD_NULL)
Dereferencing null pointer "sid_wide_func_ptr".
1860 if (*sid_wide_func_ptr == sid_wide_func)
1861 break;
1862
1863 /* Ensure that the SID function is allocated */
1864 if (!sid_wide_func_ptr) {
1865 zlog_warn("%s: failed to release wide SID function %u, function is not allocated",
** CID 1594016: Possible Control flow issues (DEADCODE)
/zebra/zebra_srv6.c: 1871 in release_srv6_sid_func_explicit()
________________________________________________________________________________________________________
*** CID 1594016: Possible Control flow issues (DEADCODE)
/zebra/zebra_srv6.c: 1871 in release_srv6_sid_func_explicit()
1865 zlog_warn("%s: failed to release wide SID function %u, function is not allocated",
1866 __func__, sid_wide_func);
1867 return -1;
1868 }
1869
1870 /* Release the SID function from the EWLIB range */
CID 1594016: Possible Control flow issues (DEADCODE)
Execution cannot reach this statement: "listnode_delete(block->u.us...".
1871 listnode_delete(block->u.usid.wide_lib[sid_func]
1872 .func_allocated,
1873 sid_wide_func_ptr);
1874 zebra_srv6_sid_func_free(sid_wide_func_ptr);
1875 } else {
1876 zlog_warn("%s: function %u is outside ELIB [%u/%u] and EWLIB alloc ranges [%u/%u]",
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
At line 1736, `alloc_mode` is set to `SRV6_SID_ALLOC_MODE_EXPLICIT` or
`SRV6_SID_ALLOC_MODE_DYNAMIC` depending on the `sid_value` variable.
There will never be a case where alloc_mode will be `SRV6_SID_ALLOC_MODE_MAX`
or `SRV6_SID_ALLOC_MODE_UNSPEC`.
Let's replace the `switch(alloc_mode) {...}` with an if-else.
Fixes CID 1594015.
** CID 1594015: (DEADCODE)
/zebra/zebra_srv6.c: 1782 in get_srv6_sid()
/zebra/zebra_srv6.c: 1781 in get_srv6_sid()
________________________________________________________________________________________________________
*** CID 1594015: (DEADCODE)
/zebra/zebra_srv6.c: 1782 in get_srv6_sid()
1776 }
1777
1778 ret = get_srv6_sid_dynamic(sid, ctx, locator);
1779
1780 break;
1781 case SRV6_SID_ALLOC_MODE_MAX:
CID 1594015: (DEADCODE)
Execution cannot reach this statement: "case SRV6_SID_ALLOC_MODE_UN...".
1782 case SRV6_SID_ALLOC_MODE_UNSPEC:
1783 default:
1784 flog_err(EC_ZEBRA_SM_CANNOT_ASSIGN_SID,
1785 "%s: SRv6 Manager: Unrecognized alloc mode %u",
1786 __func__, alloc_mode);
1787 /* We should never arrive here */
/zebra/zebra_srv6.c: 1781 in get_srv6_sid()
1775 return -1;
1776 }
1777
1778 ret = get_srv6_sid_dynamic(sid, ctx, locator);
1779
1780 break;
CID 1594015: (DEADCODE)
Execution cannot reach this statement: "case SRV6_SID_ALLOC_MODE_MAX:".
1781 case SRV6_SID_ALLOC_MODE_MAX:
1782 case SRV6_SID_ALLOC_MODE_UNSPEC:
1783 default:
1784 flog_err(EC_ZEBRA_SM_CANNOT_ASSIGN_SID,
1785 "%s: SRv6 Manager: Unrecognized alloc mode %u",
1786 __func__, alloc_mode);
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
In case of EVPN MH bond, a member port going in
protodown state due to external reason (one case being linkflap),
frr updates the state correctly but upon manually
clearing external reason trigger FRR to reinstate
protodown without any reason code.
Fix is to ensure if the protodown reason was external
and new state is to have protodown 'off' then do no reinstate
protodown.
Ticket: #3947432
Testing:
switch:#ip link show swp1
4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9216 qdisc
pfifo_fast master bond1 state DOWN mode DEFAULT group default qlen
1000
link/ether 1c:34:da:2c:aa:68 brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <linkflap>
switch:#ip link set swp1 protodown off protodown_reason linkflap off
switch:#ip link show swp1
4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9216 qdisc
pfifo_fast master bond1 state DOWN mode DEFAULT group default qlen
1000
link/ether 1c:34:da:2c:aa:68 brd ff:ff:ff:ff:ff:ff
Signed-off-by: Chirag Shah <chirag@nvidia.com>
If using weighted ECMP, the weight for non-recursive next-hop should be
inherited from recursive next-hop.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
In the near future, some daemons may only register SIDs. This may be
the case for the pathd daemon when creating SRv6 binding SIDs.
When a locator is getting deleted at ZEBRA level, the daemon may have
an easy way to find out the SIds to unregister to.
This commit proposes to add the locator name to the SID_SRV6_NOTIFY
message whenever possible. Only case when an allocation failure happens,
the locator will not be present. In all other places, the notify API
at procol levels has the locator name extra-parameter.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
When removing a large number of routes, the linux kernel can take the
cpu for an extended amount of time, leaving a situation where FRR
detects a starvation event.
r1# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [NTFY] sharpd: [M7Q4P-46WDR] vty[5]@# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:55:57.256 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.890085
2024-06-14 12:55:57.256 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:07.802 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 7078ms (cpu time 220ms)
2024-06-14 12:56:25.039 [DEBG] sharpd: [WTN53-GK9Y5] Removed all Items 27.783668
2024-06-14 12:56:25.039 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:56:32.783 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.743524
2024-06-14 12:56:32.783 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:41.447 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 5175ms (cpu time 179ms)
Let's modify the loop in dplane_thread_loop such that after a provider
has been run, check to see if the event should yield, if so, stop
and reschedule this for the future.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Instead of keeping a counter that is independent
of the queue's data structure. Just use the queue's
built-in counter. Ensure that it's pthread safe by
keeping it wrapped inside the mutex for adding/deleting
to the queue.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, when a locator is deleted in zebra, zebra notifies only the
zclient that owns the locator.
With the introduction of SID Manager, the locator is no longer owned by
any client. Instead, the locator is owned by Zebra, and clients can
allocate and release SIDs from the locator using the ZAPI
ZEBRA_SRV6_MANAGER_GET_SID and ZEBRA_SRV6_MANAGER_RELEASE_SID.
Therefore, when a locator is removed in Zebra, we need to notify all
daemons so that they can release/uninstall the SIDs allocated by that
locator.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Send asynchronous notifications to zclients when an SRv6 SID is
allocated/released and when a SID alloc/release operation fails.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Previous commits introduced two new ZAPI operations,
`ZEBRA_SRV6_MANAGER_GET_SRV6_SID` and
`ZEBRA_SRV6_MANAGER_RELEASE_SRV6_SID`. These operations allow a daemon
to interact with the SRv6 SID Manager to get and release an SRv6 SID,
respectively.
This commit extends the SID Manager by adding logic to process the
requests `ZEBRA_SRV6_MANAGER_GET_SRV6_SID` and
`ZEBRA_SRV6_MANAGER_RELEASE_SRV6_SID`, and allocate/release SIDs to
requesting daemons.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add functions to allocate/release SRv6 SIDs. SIDs can be allocated
either explicitly (allocate a specific SID) or dynamically (allocate any
available SID).
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
The previous commits introduced a new operation,
`ZEBRA_SRV6_MANAGER_GET_LOCATOR`, allowing a daemon to request
information about a specific SRv6 locator from the SRv6 SID Manager.
This commit extends the SID Manager to respond to a
`ZEBRA_SRV6_MANAGER_GET_LOCATOR` request and provide the requested
locator information.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add a data structure to represent an SRv6 SID context and the related
management functions (allocate/free).
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add the CLI to choose the SID format of a locator. When the SID format
of a locator is changed, the SIDs allocated from that locator might no
longer be valid (for example, because the new format might involve a
different SID allocation schema). In such a case, it is necessary to
notify all the zclients so that they can withdraw/uninstall the old SIDs
that use the previous format and allocate/install/advertise the new SIDs
based on the new format.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
An SRv6 block is an IPv6 prefix from which SIDs are allocated. This
commit adds support for SRv6 SID blocks. Specifically, it adds a data
structure to store information about an SRv6 block (e.g., its occupancy
status, which SIDs have been allocated and which are available, which
SID format is used for that block, etc.). It also adds some functions to
manage the block (allocate / free / lookup).
These functions will be used in the next commits to support the
allocation of SIDs from a block in the SID Manager.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add functionalities to manage SRv6 SID formats (register / unregister /
lookup) and create two SID formats upon SRv6 Manager initialization:
`uncompressed-f4024` and `usid-f3216`.
In future commits, we will add the CLI to allow the user to choose
between the two formats.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Because the nhlfe label stack may contain more than one
label, ensure to copy all labels.
Co-developed-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Change the comment in the code that refers to ZEBRA_FLAG_XXX
defines.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Until now, when a FEC entry is added in zebra, driven by the
reception of a BGP labeled unicast update, an LSP entry is
created. That LSP entry is resolved by using the route entry
which is also installed by BGP labeled unicast.
This route entry is not available when we face with i-bgp
peering session. I-BGP labeled sessions are used to establish
IP connectivity across separate IGPs.
The below dumps illustrate a 3 IGP topology. An attempt to create
connectivity between the north and the south machines is done.
The 3 separate IGPs are configured in Segment routing:
- north-east
- east-west
- west-south
We create BGP peerings between each endpoint of each IGP:
- iBGP between (north) and (east)
- iBGP between (east) and (west)
- iBGP between (west) and (south)
Before that patch, the FEC entries could not be resolved on the
east machine:
Before:
east-vm# show mpls fec
192.0.2.1/32
Label: 18
Client list: bgp(fd 48)
192.0.2.5/32
Label: 17
Client list: bgp(fd 48)
192.0.2.7/32
Label: 19
Client list: bgp(fd 48)
east-vm# show mpls table
Inbound Label Type Nexthop Outbound Label
--------------------------------------------------------
1011 SR (OSPF) 192.168.2.2 1011
1022 SR (OSPF) 192.168.2.2 implicit-null
11044 SR (IS-IS) 192.168.3.4 implicit-null
11055 SR (IS-IS) 192.168.3.4 11055
30000 SR (OSPF) 192.168.2.2 implicit-null
30001 SR (OSPF) 192.168.2.2 implicit-null
36000 SR (IS-IS) 192.168.3.4 implicit-null
east-vm# show ip route
[..]
B 192.0.2.1/32 [200/0] via 192.0.2.1 inactive, label implicit-null, weight 1, 00:17:45
O>* 192.0.2.1/32 [110/20] via 192.168.2.2, r3-eth0, label 1011, weight 1, 00:17:47
O>* 192.0.2.2/32 [110/10] via 192.168.2.2, r3-eth0, label implicit-null, weight 1, 00:17:47
O 192.0.2.3/32 [110/0] is directly connected, lo, weight 1, 00:17:57
C>* 192.0.2.3/32 is directly connected, lo, 00:18:03
I>* 192.0.2.4/32 [115/20] via 192.168.3.4, r3-eth1, label implicit-null, weight 1, 00:17:59
B 192.0.2.5/32 [200/0] via 192.0.2.5 inactive, label implicit-null, weight 1, 00:17:56
I>* 192.0.2.5/32 [115/30] via 192.168.3.4, r3-eth1, label 11055, weight 1, 00:17:58
B> 192.0.2.7/32 [200/0] via 192.0.2.5 (recursive), label 19, weight 1, 00:17:45
* via 192.168.3.4, r3-eth1, label 11055/19, weight 1, 00:17:45
[..]
After command "mpls fec nexthop-resolution" was applied, the FEC
entries will resolve over any non BGP route that has a labeled path
selected.
east-vm# show mpls table
Inbound Label Type Nexthop Outbound Label
--------------------------------------------------------
17 SR (IS-IS) 192.168.3.4 11055
18 SR (OSPF) 192.168.2.2 1011
19 BGP 192.168.3.4 11055/19
1011 SR (OSPF) 192.168.2.2 1011
1022 SR (OSPF) 192.168.2.2 implicit-null
11044 SR (IS-IS) 192.168.3.4 implicit-null
11055 SR (IS-IS) 192.168.3.4 11055
30000 SR (OSPF) 192.168.2.2 implicit-null
30001 SR (OSPF) 192.168.2.2 implicit-null
36000 SR (IS-IS) 192.168.3.4 implicit-null
Co-developed-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When an LSP entry is created from a FEC entry, multiple labels
may now be appended to the LSP entry, instead of one single.
Upon lsp creation, the LSP trace will display all the labels
appended.
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
There are two ways of iterating over nexthops of a given
route entry.
- Either only the main nexthop are taken into account
(which is the case today when attempting to install an
LSP entry on a BGP connected labeled route.
- Or by taking into account nexthops that are resolved
and linked in nexthop->resolved of the previous nexthop
which has RECURSIVE flag set. This second case has to be
taken into account in the case where recursive routes may
be used to install an LSP entry.
Introduce a new API in nexthop that will parse over the
appropriate nexthop, if the nexthop-resolution flag is turned
on or not on the given VRF.
Use that API in the lsp_install() function so as to walk
over the appropriate nexthops.
Co-developed-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Upon reconfiguring nexthop-resolution updates, update the
lsp entries accordingly.
If fec nexthop-resolution becomes true, then call again
fec_change_update_lsp() for each fec entry available.
If fec nexthop-resolution becomes false, then call again
fec_change_update_lsp() for each fec entry available, and
if the update fails, uninstall any lsp related with the fec
entry. In the case lsp_install() and no lsp entry could be
created or updated, then consider this call as a failure, and
return -1.
Co-developed-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
When displaying a route table in JSON, a table JSON object is storing
all the prefix JSON objects containing the prefix information. This
results in excessive memory allocation for JSON objects, potentially
leading to an out-of-memory error on the machine with large routing
tables.
To Fix the memory consumption issue for the "show ip[v6] route [vrf XX]
json" command, display the prefixes one by one and free the memory of
each JSON object after it has been displayed.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
0e2fc3d67f ("vtysh, zebra: Fix malformed json output for multiple vrfs
in command 'show ip route vrf all json'") has been reverted in the
previous commit. Although the fix was correct, it was consuming too muca
memory when displaying large route tables.
A root JSON object was storing all the JSON objects containing the route
tables, each containing their respective prefixes in JSON objects. This
resulted in excessive memory allocation for JSON objects, potentially
leading to an out-of-memory error on the machine.
To Fix the memory consumption issue for the "show ip[v6] route vrf all
json" command, display the tables one by one and free the memory of each
JSON object after it has been displayed.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
This reverts commit 0e2fc3d67f.
This fix was correct but not optimal for memory consumption at scale.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Fixes:
zebra/zebra_netns_notify.c: In function 'zebra_ns_ready_read':
zebra/zebra_netns_notify.c:266:40: error: implicit declaration of function 'basename' [-Wimplicit-function-declaration]
266 | if (strmatch(VRF_DEFAULT_NAME, basename(netnspath))) {
| ^~~~~~~~
Fixed by including libgen.h, then since basename may modify its
parameter, allocate a copy on the stack, using strdupa, and pass the
temporary string to basename.
According to the man page for basename:
With glibc, one gets the POSIX version of basename() when
<libgen.h> is included, and the GNU version otherwise.
The POSIX version of basename may modify the contents of path,
so we should to pass a copy when calling this function.
[1] https://man7.org/linux/man-pages/man3/basename.3.html
Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
The 'show running-config' does not display the ipv6 source address
when a locator is not configured. Fix this by systematically displaying
the ipv6 source address.
Fixes: 6a0956169b ("zebra: Add encap source address to SRv6 config write function")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Currently zebra does not deny the routes if `ip protocol <proto> route-map
FOO`
commmand is configured with reference to an undefined route-map (FOO in
this case).
However, on FRR restart, in zebra_route_map_check() routes get denied
if route-map name is available but the route-map is not defined. This
change was introduced in fd303a4ba1.
Fix:
When `ip protocol <proto> route-map FOO` CLI is configured with reference to an
undefined route-map FOO, let the processing in ip_protocol_rm_add() and
ip_protocol_rm_del() go through so that zebra can deny the routes instead
of simply returning. This will result in consistent behavior.
Testing Done:
Before fix:
```
spine-1# configure
spine-1(config)# ip protocol bgp route-map rmap7
root@spine-1:mgmt:/var/home/cumulus# vtysh -c "show run" | grep rmap7
ip protocol bgp route-map rmap7
root@spine-1:mgmt:/var/home/cumulus#
spine-1(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, A - Babel, D - SHARP, F - PBR, f - OpenFabric,
Z - FRR,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
C>* 27.0.0.1/32 is directly connected, lo, 02:27:45
B>* 27.0.0.3/32 [20/0] via fe80::202:ff:fe00:21, downlink_1, weight 1, 02:27:35
B>* 27.0.0.4/32 [20/0] via fe80::202:ff:fe00:29, downlink_2, weight 1, 02:27:40
B>* 27.0.0.5/32 [20/0] via fe80::202:ff:fe00:31, downlink_3, weight 1, 02:27:40
B>* 27.0.0.6/32 [20/0] via fe80::202:ff:fe00:39, downlink_4, weight 1, 02:27:40
```
After fix:
```
spine-1(config)# ip protocol bgp route-map route-map67
spine-1(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, A - Babel, D - SHARP, F - PBR, f - OpenFabric,
Z - FRR,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
C>* 27.0.0.1/32 is directly connected, lo, 00:35:03
B 27.0.0.3/32 [20/0] via fe80::202:ff:fe00:21, downlink_1 inactive, weight 1, 00:34:58
B 27.0.0.4/32 [20/0] via fe80::202:ff:fe00:29, downlink_2 inactive, weight 1, 00:34:57
B 27.0.0.5/32 [20/0] via fe80::202:ff:fe00:31, downlink_3 inactive, weight 1, 00:34:57
B 27.0.0.6/32 [20/0] via fe80::202:ff:fe00:39, downlink_4 inactive, weight 1, 00:34:58
spine-1(config)#
root@spine-1:mgmt:/var/home/cumulus# ip route show
root@spine-1:mgmt:/var/home/cumulus#
```
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
Configured with "mpls label bind 1.1.1.1/32 explicit-null", the running
configuration is:
```
!
mpls label bind 1.1.1.1/32 IPv4 Explicit Null
!
```
After this commit, the running configuration is:
```
!
mpls label bind 1.1.1.1/32 explicit-null
!
```
And add the support for the "no" form:
```
anlan(config)# mpls label bind 1.1.1.1/32 explicit-null
anlan(config)# no mpls label bind 1.1.1.1/32 explicit-null
```
Signed-off-by: anlan_cs <anlan_cs@tom.com>
* Starting from version DPDK 22.11 we have API changes:
The rte_driver and rte_device objects are now opaque and must be manipulated through added accessors.
We need to update Zebra DPDK sources to DPDK version >=22.11
* Fix clang-format
Signed-off-by: EasyNet <devel@easynet.dev>
The zebra_nexthop_vty_helper() and zebra_nexthop_json_helper()
functions could be very helpful to display nexthop information
from whatever daemon.
Move the core function in the nexthop_vty_helper() and the
nexthop_json_helper() function. The zebra API call remains
unchanged.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Include the prefix source address when set by a route-map in
show output for routes, in various formats.
Add some debugs when encoding netlink route messages with
a source address.
Signed-off-by: Mark Stapp <mjs@cisco.com>
If you had a situation where an operator turned on
ospfd with snmp but not ospf6d and agentx was configured
then you get into a situation where ospf6d would complain
that the config for agentx did not exist. Let's modify
the code to allow this situation to happen.
Fixes: #15896
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The fpm code path for the dplane_fpm_nl module was improperly
encoding the multipath nexthop data for vxlan type routes.
Move this into the embedded nexthop encoding where it belongs.
This change makes it so that the usage of `-M dplane_fpm_nl`
is now producing the same netlink messages that `-M fpm`
produces when using vxlan based nexthops.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
1) Add ability to hex-dump the received packet for debugging
2) Receive encap type and vxlan vni and display them.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In the context of SVD (Single VxLAN Device) for L3VNI,
the remote VTEP's nexthop is programmed neighbor entry against
SVD along with neighbor entry against SVI.
However, when L3VNI is removed or the VRF is disabled, all SVI
based remote nexthop neighbors are uninstalled and deleted.
The SVD based neigh entries remains in Zebra and the Kernel.
Subsequently, when reconfiguring L3VNI and relearning the same nexthop,
the neighbor entry is not programmed is because it is not removed
from Zebra SVD neighbor hash table, leading to the failure to
reprogram the entry.
With this fix, the SVD nexthop neigh entry is uninstalled
and deleted from Zebra and Kernel.
Ticket: #3729045
Testing:
borderleaf:# ip neigh show 2.2.2.2
2.2.2.2 dev vlan2560_l3 lladdr 00:01:00:00:1d:09 extern_learn NOARP proto zebra
2.2.2.2 dev vxlan99 lladdr 00:01:00:00:1d:09 extern_learn NOARP proto zebra
With the fix:
Zebra log shows both enties SVD (vxlan99) and SVI (vlan2560_l3)
neighbor entries are deleted.
2024/05/03 18:41:33.527125 ZEBRA: [NH6N7-54CD1] Tx RTM_DELNEIGH family
ipv4 IF vxlan99(16) Neigh 2.2.2.2 MAC null flags 0x10 state 0x0
ext_flags 0x0
2024/05/03 18:41:33.527128 ZEBRA: [NH6N7-54CD1] Tx RTM_DELNEIGH family
ipv4 IF vlan2560_l3(18) Neigh 2.2.2.2 MAC null flags 0x10 state 0x0
ext_flags 0x0
borderleaf:# ip neigh show 2.2.2.2
borderleaf:#
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Upon bridge flap, the associated SVD case,
VLAN membership is not updated correctly.
When SVI comes up, the VNI could not associate
with it as bridge VLAN membership was not updated.
Ticket: #3821632
Testing:
Before fix:
-----------
tor-1:#ifdown br_l3vni ; sleep 1 ; ifup br_l3vni
tor-1:# vtysh -c 'show evpn vni 8888'
VNI: 8888
Type: L3
Tenant VRF: sym_1
Vlan: 490
Bridge: br_l3vni
Local Vtep Ip: 27.0.0.9
Vxlan-Intf: vxlan99
SVI-If: None <<<<<< SVI not found
State: Down <<<<<< status remained in down BGP is not informed
VNI Filter: none
System MAC: None
Router MAC: None
L2 VNIs: 1800 1801 1900 1901
After fix:
----------
tor-1:# ifdown br_l3vni; sleep 1; ifup br_l3vni
tor-1:# vtysh
Hello, this is FRRouting (version 8.4.3).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
tor-1# show evpn vni 8888
VNI: 8888
Type: L3
Tenant VRF: sym_1
Vlan: 490
Bridge: br_l3vni
Local Vtep Ip: 27.0.0.9
Vxlan-Intf: vxlan99
SVI-If: vlan490_l3 <<<<<<
State: Up <<<<<<
VNI Filter: none
System MAC: 44:38:39:ff:ff:29
Router MAC: 44:38:39:ff:ff:29
L2 VNIs: 1800 1801 1900 1901
Signed-off-by: Chirag Shah <chirag@nvidia.com>
This is theoretically not needed if neither DEFUNs nor zlog_* calls are
used, except I'm about to turn it into a build error to catch the cases
where it _is_ necessary. Which is libmgmt_be_nb.la in this case, where
it causes build failures on hppa.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
With the commit `605df8d4`, all real things are moved into dplane.
So the operations mentioned in this comment have nothing to do with
this function `netlink_link_change()`.
Just remove that confusing and useless comment.
Signed-off-by: anlan_cs <anlan_cs@tom.com>
As part of the kernel netlink functionality, it is
possible that a bit of nested attributes can be
passed up. This attribute has a type value which
is stored in the lower 8 bits and in the upper 8
bits are a couple control flags that can be used.
FRR can parse this data and then just throw away
the value unless we mask off the upper 8 bits.
Let's ensure that it can be properly parsed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
I don't know why it appears only now.
> ../sdist/zebra/fpm_listener.c:420:8: error: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
> if (!RTNH_OK(rtnh, len)) {
> ^~~~~~~~~~~~~~~~~~
> ../sdist/include/linux/rtnetlink.h:437:31: note: expanded from macro 'RTNH_OK'
> ((int)(rtnh)->rtnh_len) <= (len))
len is set with RTA_PAYLOAD and should be an integer.
> len = RTA_PAYLOAD(mpath_rtattr);
> #define RTA_PAYLOAD(rta) ((int)((rta)->rta_len) - RTA_LENGTH(0))
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add the support for adding DX6 behavior into netlink layer of zebra.
Add the necessary test in sharpd.
> ubuntu2204# sharp install seg6local-routes 1:1::1:2 nexthop-seg6local loop1 End_DX6 4:4::4:6 1
> ubuntu2204# do show ipv6 route
> [..]
> D>* 1:1::1:2/128 [150/0] is directly connected, loop1, seg6local End.DX6 nh6 4:4::4:6, weight 1, 00:00:03
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In zebra_interface_nhg_reinstall zebra is checking that the
nhg is a singleton and not a blackhole nhg. This was originally
done with checking that the nexthop is a NEXTHOP_TYPE_IFINDEX,
NEXTHOP_TYPE_IPV4_IFINDEX and NEXTHOP_TYPE_IPV6_IFINDEX. This
was excluding NEXTHOP_TYPE_IPV4 and NEXTHOP_TYPE_IPV6. These
were both possible to be received and maintained from the upper
level protocol for when a route is being recursively resolved.
If we have gotten to this point in zebra_interface_nhg_reinstall
the nexthop group has already been installed at least once
and we *know* that it is actually a valid nexthop. What the
test is really trying to do is ensure that we are not reinstalling
a blackhole nexthop group( Which is not possible to even be
here by the way, but safety first! ). So let's change
to test for that instead.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Change input/output arguments of the RPC callback from lists of
(xpath/value) tuples to YANG data trees.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This was caught by the grpc_basic test which was receiving an invalid error
result, which was returned b/c inside zebra the libyang code was flagging the
value as invalid for a derived zebra interface type.
Signed-off-by: Christian Hopps <chopps@labn.net>
If a command is not marked as `YANG`-converted, the current command
batching buffer is flushed before executing the command. We shouldn't
flush the buffer when executing an `exit` command. It should only be
flushed if the next command is not `YANG`-converted, which is checked by
the command itself, not the previous `exit`.
Fixes#15706.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
An operator found a situation where zebra was
backing up in a significant way towards BGP
with EVPN changes taking up some serious amounts
of memory. The key lines that would have clued
us in on it were behind a dev build. Let's change
this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When configuring a SID list by vtysh, the segment list
obtained in iproute2 is the exact opposite:
>
>vtysh:
>ipv6 route 2005::/64 eth0 segments 2001:db8:aaaa::7/2002::2/2003::3/2004::4
>
>root@r1:/# ip -6 route
>2005::/64 nhid 6 encap seg6 mode encap segs 4 [ 2004::4 2003::3 2002::2 2001:db8:aaaa::7 ] dev dummy0 proto 196 metric 20 pref medium
>
Fix this by keeping the same vtysh config and swap the
segment's order of the list in the rt_netlink.c
>
>root@r1:/# ip -6 route
>2005::/64 nhid 6 encap seg6 mode encap segs 4 [ 2001:db8:aaaa::7 2002::2 2003::3 2004::4 ] dev dummy0 proto 196 metric 20 pref medium
>
Fixes: f20cf14 ("bgpd,lib,sharpd,zebra: srv6 introduce multiple segs/SIDs in nexthop")
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Command 'show ip route vrf <vrf_name> json' returns a valid json object,
however if instead of <vrf_name> we specify 'all', we get an invalid json
object, like:
{//vrf1 routes}{//vrf2 routes}{vrf3 routes}
After the fix:
{"vrf1":{//vrf1 routes},"vrf2:{//vrf2 routes},"vrf3":{//vrf3 routes}}
Which is a valid json object, that can be parsed effectively using built-in
modules. The rest of the commands remains unaffected and behave the same.
Signed-off-by: Piotr Suchy <psuchy@akamai.com>
A specific sequence of actions involving the addition and removal of IP
routes and network interfaces can lead to a route installation failure.
The issue occurs under the following conditions:
- Initially, there is no route present via the ens3 interface.
- Adds a route: ip route 10.0.0.0/24 192.168.0.100 ens3
- Removes the same route: no ip route 10.0.0.0/24 192.168.0.100 ens3
- Removes the ens3 interface.
- Re-adds the ens3 interface.
- Again adds the same route: ip route 10.0.0.0/24 192.168.0.100 ens3
- And again removes it: no ip route 10.0.0.0/24 192.168.0.100 ens3
- Shuts down the ens3 interface
- Reactivates the interface
- Adds the route once more: ip route 10.0.0.0/24 192.168.0.100 ens3
The route appears to be rejected.
> # show ip route nexthop
> S>r 10.0.0.0/24 [1/0] (6) via 192.168.0.100, ens3, weight 1, 00:00:01
The commit 35729f38fa ("zebra: Add a timer to nexthop group deletion")
introduced a feature to keep a nexthop-group in Zebra for a certain
period even when it is no longer in use. But if a nexthop-group
interface is removed during this period, the association between the
nexthop-group and the interface is lost in zebra memory. If the
interface is later added back and a route is re-established, the
nexthop-group interface dependency is not correctly reestablished.
As a consequence, the nexthop-group flags remain unset when the
interface is down. Upon the interface's reactivation, zebra does not
reinstall the nexthop-group in the kernel because it is marked as valid
and installed, but in reality, it does not exist in the kernel (it was
removed when the interface was down). Thus, attempts to install a route
via this nexthop-group ID fail.
Stop maintaining a nexthop-group when its associated interface is no
longer present.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Split zebra's vrf_terminate() into disable() and delete() stages.
The former enqueues all events for the dplane thread.
Memory freeing is performed in the second stage.
Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
Recent commit: 6b2554b94a
Exposed, via Address Sanitation, that memory was being
leaked. Unfortunately the CI system did not catch this.
Two pieces of memory were being lost: The zserv client
data structure as well as anything on the client->gr_info_queue.
Clean these up.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, the way zebra works is it creates pthread per client (BGP is
of interest in this case) and this thread loops itself in zserv_read()
to check for any incoming data. If there is one, then it reads,
validates and adds it in the ibuf_fifo signalling the main thread to
process the message. The main thread when it gets a change, processes
the message, and invokes the function pointer registered in the header
command. (Ex: zserv_handlers).
Finally, if all of this was successful, this task reschedules itself and
loops in zserv_read() again
However, if there are already items on ibuf FIFO, that means zebra is
slow in processing. And with the current mechanism if Zebra main is
busy, the ibuf FIFO keeps growing holding up the memory.
Show memory zebra:(Example: 15k streams hoarding ~160 MB of data)
--- qmem libfrr ---
Stream : 44 variable 3432352 15042 161243800
Fix:
Client IO Thread: (zserv_read)
- Stop doing the read events when we know there are X number of items
on the FIFO already.(X - zebra zapi-packets <1-10000> (Default-1000)
- Determine the number of items on the zserv->ibuf_fifo. Subtract this
from the work items and only pull the number of items off that would
take us to X items on the ibuf_fifo again.
- If the number of items in the ibuf_fifo has reached to the maximum
* Either initially when zserv_read() is called (or)
* when processing the remainders of the incoming buffer
the client IO thread is woken by the the zebra main.
Main thread: (zserv_process_message)
If the client ibuf always schedules a wakeup to the client IO to read
more items from the socked buffer. This way we ensure
- Client IO thread always tries to read the socket buffer and add more
items to the ibuf_fifo (until max limit)
- hidden config change (zebra zapi-packets <>) is taken into account
Ticket: #3390099
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Found by coverity. Let's just lock the writeable
amount to see if it is possible. It's ok because
we want to know if we have room *now*. If another
pthread runs it will only remove data from fnc->obuf
and make more room. So this is ok.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Containers inside a choice's case must be treated as presence containers
as they can be explicitly created and deleted. They must have `create`
and `destroy` callbacks, otherwise the internal data they represent may
never be deleted.
The issue can be reproduced with the following steps:
- create an access-list with destination-network params
```
# access-list test seq 1 permit ip any 10.10.10.0 0.0.0.255
```
- delete the `destination-network` container
```
# mgmt delete-config /frr-filter:lib/access-list[name='test'][type='ipv4']/entry[sequence='1']/destination-network
# mgmt commit apply
MGMTD: No changes found to be committed!
```
As the `destination-network` container is non-presence, and all its
leafs are mandatory, mgmtd doesn't see any changes to be commited and
simply updates its YANG data tree without passing any updates to backend
daemons.
This commit fixes the issue by requiring `create` and `destroy`
callbacks for containers inside choice's cases.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
A macvlan interface can have its underlying link-interface in another
namespace (aka. netns). However, by default, zebra does not know the
interface from the other namespaces. It results in a crash the pointer
to the link interface is NULL.
> 6 0x0000559d77a329d3 in zebra_vxlan_macvlan_up (ifp=0x559d798b8e00) at /root/frr/zebra/zebra_vxlan.c:4676
> 4676 link_zif = link_ifp->info;
> (gdb) list
> 4671 struct interface *link_ifp, *link_if;
> 4672
> 4673 zif = ifp->info;
> 4674 assert(zif);
> 4675 link_ifp = zif->link;
> 4676 link_zif = link_ifp->info;
> 4677 assert(link_zif);
> 4678
> (gdb) p zif->link
> $2 = (struct interface *) 0x0
> (gdb) p zif->link_ifindex
> $3 = 15
Fix the crash by returning when the macvlan link-interface is in another
namespace. No need to go further because any vxlan under the macvlan
interface would not be accessible by zebra.
Link: https://github.com/FRRouting/frr/issues/15370
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The current code is unsetting the fact that the
NHG is installed. It is installed but we are
reinstalling it. Let's note this in the code
appropriately as REINSTALL and not remove the
INSTALLED FLAG.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The function call in to zebra_interface_nhg_reinstall
is an action that takes place on interface up events
*not* when the connected addresses are added to
a system. this will prevent this function being
called when new connected interfaces come alive
that is an independent operation of the interface
coming up.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
zebra_if_nhg_dependents_XXX were just simple wrapper
functions that inited/deleted data structures.
These were already function calls themselves. Let's
remove the abstraction and make the code simpler.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
These functions provided a level of abstraction that forced
us to call multiple functions when a simple data structure
change was all that is needed. Let's consolidate down
and make things a bit simpler.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The nexthop group is marked as valid/invalid and then
installed. Not installed and then marked valid.
This is just a bit of code removed that might be covering
up other problems that need to be sorted.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Convert the dplane results function for nhg's over to
using a switch for the result enum. Let's specifically
call out the unexpected state and also set the nexthop
group as not installed when installation fails.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When installing a NHG via dplane_nexthop_add, it can only return
REQUEST_QUEUED or REQUEST_FAILURE. There is no way SUCCESS can
be returned with the way the dplane works at this point in time.
Remove the code that attempts to set the NHE state appropriately
as it is impossible.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Create a single registry of default port values that daemons
are using. Most of these are vty ports, but there are some
others for features like ospfapi and zebra FPM.
Signed-off-by: Mark Stapp <mjs@labn.net>
Allow bandwidth up to 1000000 Mb/s (ie. 1 Tb/s) and document it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
get_iflink_speed() returns UINT32_MAX when the speeds is unknown.
Routing daemons (at least ospfd) interprets it as the high value.
Return errors in get_iflink_speed() to avoid the confusion.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
- use `apply_finish` callback when possible to avoid multiple applies per commit
- move table range working to the CLI handler
- remove unnecessary conditional compilation
- remove unnecessary boolean conversion
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
These commands don't really provide any functionality. VRF is associated
with netns automatically based on its name, and it's not possible to
associate VRF and netns with different names with these commands:
- When trying to assosiate a VRF with an already existing netns with a
different name:
`NS /run/netns/test is already configured with VRF 1(test)`
- When trying to assiciate a VRF with a non-existing netns, so they
become linked once the netns is created:
`Invalid pathname for /run/netns/test: No such file or directory`
- When doing "no netns" to unlink the netns and link it back to the same
VRF:
`VRF 1 is already configured with VRF test`
- When doing "no netns" to unlink the netns and link it to another VRF:
`Can not associate NS 4294967295 with NETNS /run/netns/test`
As shown above, not a single usecase is working. We can't remove them
completely to preserve backwards-compatibility, so just make them empty.
The main reason for this change is not to spend a lot of time trying to
figure out how to convert them to northbound.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
- unnecessary command duplication
- usage of oper data during validation
- unnecessary checks for things that can't happen
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Make link-params a presence container and activate it when entering the
node. The "enable" command is not necessary anymore but kept hidden for
backwards compatibility.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Replace "shutdown" leaf with "enabled" leaf in frr-zebra YANG module
to make it in line with standard YANG models.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Introduce new "[no] multicast <enable|disable>" command to be able to
remove the configuration. Current "[no] multicast" command cannot be
removed. Current command is hidden but still works for backwards
compatibility.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
clang-format doesn't understand FRR_DAEMON_INFO is a long macro where
laying out items semantically makes sense.
(Also use only one `FRR_DAEMON_INFO(` in isisd so editors don't get
confused with the mismatching `( ( )`.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This is not complicated code and if zebra is allocating
a new one. Zebra does not need to inform the operator
about the process during debugs.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When debugging NHG detail there is a whole bunch
of lines surrounding the nexthop group. Let's
clean these up since they are extremely chatty and
spawn several lines.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
dest was shadowing dest inside of an if statement additionally
both legs needed dest to be assigned. Let's clean this up a
slight bit and use it appropriately
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Fix this:
***********************************************************************************
Address Sanitizer Error detected in zebra_opaque.test_zebra_opaque/r3.asan.zebra.11099
=================================================================
==11099==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 66 byte(s) in 1 object(s) allocated from:
#0 0x7f527fc06b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
#1 0x7f527f5e852b in qmalloc lib/memory.c:100
#2 0x56418d20832d in zread_route_add zebra/zapi_msg.c:2125
#3 0x56418d215d08 in zserv_handle_commands zebra/zapi_msg.c:4011
#4 0x56418d32ab5b in zserv_process_messages zebra/zserv.c:520
#5 0x7f527f6938d3 in event_call lib/event.c:2003
#6 0x7f527f5cb692 in frr_run lib/libfrr.c:1218
#7 0x56418d1c3336 in main zebra/main.c:508
#8 0x7f527e656c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
SUMMARY: AddressSanitizer: 66 byte(s) leaked in 1 allocation(s).
***********************************************************************************
Code inspection leads to some code paths where the opaque data was not
freed up.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, when editing a leaf-list, `nb_candidate_edit` expects to
receive it's xpath without a predicate and the value in a separate
argument, and then creates the full xpath. This hack is complicated,
because it depends on the operation and on the caller being a backend or
not. Instead, let's require to always include the predicate in a
leaf-list xpath. Update all the usages in the code accordingly.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There is a goto statement that would be better served
with a break statement. Let's try to minimize this
in the code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
- initialize the necessary bit when creating if_link_params
- fix CLI description to mark extended as the default mode
- correctly set mode to extended when using the "no" form of the command
- handle the "show_defaults" parameter correctly in cli_show callback
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
First, any data tree validation in CLI handler is not correct, because
this code won't be called when the change is done through any other
frontend. Second, these checks are not necessary at all, because NB
layer handles the change between admin-grp/affinity automatically.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
- it was not printed at all because of the incorrect `yang_dnode_exist`
check
- the intended output was "admin-group" instead of "admin-grp" used in
the actual CLI command
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Don't use config tree when updating internal daemon state. Everything
needed is already stored in internal structures.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When affinity mode is "standard", bit position cannot be greater than
31. Add a "must" statement to the YANG model to validate this, and
remove our custom validation code that does the same.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Change the type of affinity leaf-list in frr-zebra to a leafref with
"require-instance" property set to true. This change tells libyang to
automatically check that affinity-map exists before usage and doesn't
allow it to be deleted if it's referenced. It allows us to remove all
the manual code that is doing the same thing.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
SA has decided that old_re could be a NULL pointer
even though the zebra_redistribute_check function
checks for NULL and returns false that would
not allow a NULL pointer deref.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Practically no-one uses this and ioctls are pretty much
wrappered. Further wrappering could make this even better.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Fix the following coverity issue:
*** CID 1575079: Null pointer dereferences (REVERSE_INULL)
/zebra/zebra_dplane.c: 5950 in dplane_srv6_encap_srcaddr_set()
5944 if (ret == AOK)
5945 result = ZEBRA_DPLANE_REQUEST_QUEUED;
5946 else {
5947 atomic_fetch_add_explicit(&zdplane_info
5948 .dg_srv6_encap_srcaddr_set_errors,
5949 1, memory_order_relaxed);
CID 1575079: Null pointer dereferences (REVERSE_INULL)
Null-checking "ctx" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
5950 if (ctx)
5951 dplane_ctx_free(&ctx);
5952 }
5953 return result;
5954 }
5955
Remove the pointer check for `ctx`. At this point in the
function it has to be non null since we deref'ed it.
Additionally the alloc function that creates it cannot
fail.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Despite if it's managed by FRR or the kernel, show it. If the system has only
link-local addresses, we should show it unless it's a secondary one.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's say we this:
```
$ ip link set down dev r1-eth0
$ ip link set up dev r1-eth0
```
But at the same time we have this interface configured by the FRR too:
```
interface r1-eth0
ipv6 address fe80:1::1/64
exit
```
We never re-add fe80:1::1/64, when the interface comes up, and we have a
strange situation where NHT stops working and other stuff depending on NHT
stops too (BGP peering, etc.).
Closes: https://github.com/FRRouting/frr/issues/15050
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The t_dequeue was being enqueued with a timer of 0
this is really an event instead of a timer. Let's
use that instead.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
An operator is reporting that the dplane_fpm_nl connection has
started to accumulate contexts. One such path that could cause
this is that the obuf used is full and stays full. This would
imply that what ever is on the receiving end has gotten wedged
and is not reading from the stream of data being sent it's way.
If after 15 seconds of no response, let's declare the connection
dead and reset it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When zebra is started, someone may have configured an SRv6 encap source
address different from the default address ( :: ) in the kernel.
On startup, zebra should not assume that the actual SRv6 encap source
address is the default address ( :: ), but should retrieve the actual
source address from the kernel and put it in zebra configuration. In
other words, on startup we expect the actual SRv6 encap source
address and the configured one to be the same.
This commit makes the necessary changes to support the above.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
When writing the SRv6 config from zebra, we must also include the source
address of outer encapsulating IPv6 header.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Add a new CLI command `show segment-routing srv6 manager [json]` to
verify the overall SRv6 state. The current output displays only the
configured source address of outer encapsulating IPv6 header. The output
can be extended in the future to show more information, including
summary SRv6 information and supported capabilities.
Example:
```
r1# show segment-routing srv6 manager
Parameters:
Encapsulation:
Source Address:
Configured: fc00:0:1::1
r1# show segment-routing srv6 manager json
{
"parameters":{
"encapsulation":{
"sourceAddress":{
"configured":"fc00:0:1::1"
}
}
}
}
```
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
- Add a new node `SRV6_ENCAP_NODE` to the CLI graph. This node allows
users to configure encapsulation parameters for SRv6, including the
source address of the outer encapsulating IPv6 header.
- Install a new CLI command `source-address` under the
`SRV6_ENCAP_NODE` node. This command is used to configure the source
address of the outer encapsulating IPv6 header.
- Install a new CLI command `no source-address` under the
`SRV6_ENCAP_NODE` node. This command is used to unset the
source address of the outer encapsulating IPv6 header and restore the
default source address.
Examples:
```
router# segment-routing
router(sr)# srv6
router(srv6)# encapsulation
router(srv6-encap)# source-address fc00:0:1::1
```
```
router# segment-routing
router(sr)# srv6
router(srv6)# encapsulation
router(srv6-encap)# no source-address
```
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Add a bunch of set functions and associated data structure in
zebra_dplane to allow the configuration of the source address for SRv6
encap in the data plane.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Generic Netlink is an extension of Netlink meant for kernel-user space
communications. It supports the dynamic allocation of communication
channels. Kernel and user space applications register their services
with a Generic Netlink controller. The Generic Netlink controller is
responsible for assigning a unique channel number with each service.
Clients who want to use a service query the controller to see if
the service exists and to determine the correct channel number. The
channel number is used to access the requested service.
This commit adds the base functionality to get the channel number
assigned to a specific service. More precisely, this commit adds a
function `genl_resolve_family()` that takes the service name (called
family in the Generic Netlink terminology) as an input parameter and
queries the Generic Netlink controller to get the channel number
assigned with the requested service.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
zebra already supports several Netlink sockets which allow it to
communicate with the kernel. Each Netlink socket has a specific purpose:
we have a socket for incoming events from the kernel, a socket for
programming the dataplane, a socket for the kernel messages, a socket
used as the command channel. All the currently supported sockets are
based on the `NETLINK_ROUTE` protocol.
This commit adds a new Netlink socket that allows zebra to send
commands to the kernel using the `Generic Netlink` protocol.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
The `netlink_socket()` function is used in many places to create and
initialize Netlink sockets. Currently, it can only create
`NETLINK_ROUTE` Netlink sockets.
This commit generalizes the behavior of the `netlink_socket()` function,
enabling it to generate Netlink sockets of any type. Specifically, it
extends the `netlink_socket()` function with a new argument `nl_family`,
which allows developers to specify the Netlink family of the socket to
be created.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Whenever a link up change was detected on a macvlan device where
the linked device wasn't visible in the namespace zebra was
running in, the linked zebra interface was NULL. This was already
handled in the event of a link down, but was ommitted from the
upside. Added the same null check to the up-side.
Signed-off-by: Tomi Salminen <tlsalmin@gmail.com>
The adata variable was being leaked on shutdown since
it was calloc'ed. There is no need to make this dynamic
memory. Just choose a size and use that. Add a bit
of code to ensure that if it's not large enough,
it will just stop and the developer will fix it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
the zebra pseudo wire code was registering a callback
per vrf. These callbacks are not per vrf based. They
are vrf agnostic so this was a mistake. Modify the code
to on startup register once and on shutdown unregister once.
Finally rename the zebra_pw_init and zebra_pw_exit functions
to more properly reflect when they are called.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The route entry created when using a ctx to pass route
entry data backup to the master pthread in zebra is
being leaked. Prevent this from happening.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The NHG_DEL operation is done directly from ZAPI call, whereas
the NHG_ADD operation is done in the rib_nhg meta queue.
This may be problematic when ADD is followed by DEL. Imagine a
scenarion with two protocol NHIDs. <NH1> depends of <NH2> and
<NH3>. The deletion of <NH3> at the protocol level will trigger
2 messages to ZEBRA: NHG_ADD(<NH1>) and NHG_DEL(<NH3>).
Those operations are properly enqueued in ZAPI, but in the end,
the NHG_DEL is executed first. This causes NHG_ADD to unlink an
already freed NHG.
Fix this by consistently enqueuing NHG_DEL and NHG_ADD operations.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add ability for the connected routes to know
if they are a prefix route or not.
sharpd@eva:/work/home/sharpd/frr1$ ip addr show dev dummy1
13: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether aa:93:ce:ce:3f:62 brd ff:ff:ff:ff:ff:ff
inet 192.168.55.1/24 scope global noprefixroute dummy1
valid_lft forever preferred_lft forever
inet 192.168.56.1/24 scope global dummy1
valid_lft forever preferred_lft forever
inet6 fe80::a893:ceff:fece:3f62/64 scope link
valid_lft forever preferred_lft forever
sharpd@eva:/work/home/sharpd/frr1$ sudo vtysh -c "show int dummy1"
Interface dummy1 is up, line protocol is up
Link ups: 0 last: (never)
Link downs: 0 last: (never)
vrf: default
index 13 metric 0 mtu 1500 speed 0 txqlen 1000
flags: <UP,BROADCAST,RUNNING,NOARP>
Type: Ethernet
HWaddr: aa:93:ce:ce:3f:62
inet 192.168.55.1/24 noprefixroute
inet 192.168.56.1/24
inet6 fe80::a893:ceff:fece:3f62/64
Interface Type Other
Interface Slave Type None
protodown: off
sharpd@eva:/work/home/sharpd/frr1$ sudo vtysh -c "show ip route"
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric, t - Table-Direct,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp13s0, 00:00:08
K>* 169.254.0.0/16 [0/1000] is directly connected, virbr2 linkdown, 00:00:08
L>* 192.168.44.1/32 is directly connected, dummy2, 00:00:08
L>* 192.168.55.1/32 is directly connected, dummy1, 00:00:08
C>* 192.168.56.0/24 is directly connected, dummy1, 00:00:08
L>* 192.168.56.1/32 is directly connected, dummy1, 00:00:08
L>* 192.168.119.205/32 is directly connected, enp13s0, 00:00:08
sharpd@eva:/work/home/sharpd/frr1$ ip route show
default via 192.168.119.1 dev enp13s0 proto dhcp metric 100
169.254.0.0/16 dev virbr2 scope link metric 1000 linkdown
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.45.0/24 dev virbr2 proto kernel scope link src 192.168.45.1 linkdown
192.168.56.0/24 dev dummy1 proto kernel scope link src 192.168.56.1
192.168.119.0/24 dev enp13s0 proto kernel scope link src 192.168.119.205 metric 100
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
sharpd@eva:/work/home/sharpd/frr1$ ip route show table 255
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
local 172.17.0.1 dev docker0 proto kernel scope host src 172.17.0.1
broadcast 172.17.255.255 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
local 192.168.44.1 dev dummy2 proto kernel scope host src 192.168.44.1
broadcast 192.168.44.255 dev dummy2 proto kernel scope link src 192.168.44.1
local 192.168.45.1 dev virbr2 proto kernel scope host src 192.168.45.1
broadcast 192.168.45.255 dev virbr2 proto kernel scope link src 192.168.45.1 linkdown
local 192.168.55.1 dev dummy1 proto kernel scope host src 192.168.55.1
broadcast 192.168.55.255 dev dummy1 proto kernel scope link src 192.168.55.1
local 192.168.56.1 dev dummy1 proto kernel scope host src 192.168.56.1
broadcast 192.168.56.255 dev dummy1 proto kernel scope link src 192.168.56.1
local 192.168.119.205 dev enp13s0 proto kernel scope host src 192.168.119.205
broadcast 192.168.119.255 dev enp13s0 proto kernel scope link src 192.168.119.205
local 192.168.122.1 dev virbr0 proto kernel scope host src 192.168.122.1
broadcast 192.168.122.255 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
Fixes: #14952
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The linux kernel can send up a flag that tells us that the
connected address is not a PREFIXROUTE. Add the ability
to note this and pass it up from the data plane.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When allocating big protocol level identifiers, the number range is
big, and when pushing to netlink messages, the first nexthop group
is truncated, whereas the nexthop has been installed on the kernel.
> ubuntu2204(config)# nexthop-group A
> ubuntu2204(config-nh-group)# group 1
> ubuntu2204(config-nh-group)# group 2
> ubuntu2204(config-nh-group)# exi
> ubuntu2204(config)# nexthop-group 1
> ubuntu2204(config-nh-group)# nexthop 192.0.2.130 loop1 enable-proto-nhg-control
> ubuntu2204(config-nh-group)# exi
> ubuntu2204(config)# nexthop-group 2
> ubuntu2204(config-nh-group)# nexthop 192.0.2.131 loop1 enable-proto-nhg-control
> [..]
> 2023/11/24 16:47:40 ZEBRA: [VNMVB-91G3G] _netlink_nexthop_build_group: ID (179687500): group 17968/179687502
> # ip nexthop ls
> id 179687500 group 179687501/179687502 proto 194
Fix this by increasing the buffer size when appending the first number.
Fixes: 8d03bc501b ("zebra: Handle nhg_hash_entry encaps/more debugging")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add three counters that account for the nhg operations
that are using the zebra API with the NHG_ADD and NHG_DEL
commands.
> # show zebra client
> [..]
> Type Add Update Del
> ==================================================
> IPv4 100 0 0
> IPv6 0 0 0
> Redist:v4 0 0 0
> Redist:v6 0 0 0
> NHG 1 1 1
> VRF 3 0 0
> [..]
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If there happens to be a entry in the zebra rib
that has a lower admin distance then a newly received
re, zebra would not notify the upper level protocol
about this happening. Imagine a case where there
is a connected route for say a /32 and bgp receives
a route from a peer that is the same route as the
connected. Since BGP has no network statement and
perceives the route as being `good` bgp will install
the route into zebra. Zebra will look at the new
bgp re and correctly identify that the re is not
something that it will use and do nothing. This
change notices this and sends up a BETTER_ADMIN_WON
route notification.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Configure hash table cleanup with specific free functions for `zrouter.filter_hash`, `zrouter.qdisc_hash`, and `zrouter.class_hash`.
This ensures proper memory cleanup, addressing memory leaks.
The ASan leak log for reference:
```
***********************************************************************************
Address Sanitizer Error detected in tc_basic.test_tc_basic/r1.asan.zebra.15495
=================================================================
==15495==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7fd5660ffd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7fd565afe238 in qcalloc lib/memory.c:105
#2 0x5564521c6c9e in tc_filter_alloc_intern zebra/zebra_tc.c:389
#3 0x7fd565ac49e8 in hash_get lib/hash.c:147
#4 0x5564521c7c74 in zebra_tc_filter_add zebra/zebra_tc.c:409
#5 0x55645210755a in zread_tc_filter zebra/zapi_msg.c:3428
#6 0x5564521127c1 in zserv_handle_commands zebra/zapi_msg.c:4004
#7 0x5564522208b2 in zserv_process_messages zebra/zserv.c:520
#8 0x7fd565b9e034 in event_call lib/event.c:1974
#9 0x7fd565ae142b in frr_run lib/libfrr.c:1214
#10 0x5564520c14b1 in main zebra/main.c:492
#11 0x7fd564ec2c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7fd5660ffd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7fd565afe238 in qcalloc lib/memory.c:105
#2 0x5564521c6c6e in tc_class_alloc_intern zebra/zebra_tc.c:239
#3 0x7fd565ac49e8 in hash_get lib/hash.c:147
#4 0x5564521c784f in zebra_tc_class_add zebra/zebra_tc.c:293
#5 0x556452107ce5 in zread_tc_class zebra/zapi_msg.c:3315
#6 0x5564521127c1 in zserv_handle_commands zebra/zapi_msg.c:4004
#7 0x5564522208b2 in zserv_process_messages zebra/zserv.c:520
#8 0x7fd565b9e034 in event_call lib/event.c:1974
#9 0x7fd565ae142b in frr_run lib/libfrr.c:1214
#10 0x5564520c14b1 in main zebra/main.c:492
#11 0x7fd564ec2c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Direct leak of 12 byte(s) in 1 object(s) allocated from:
#0 0x7fd5660ffd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7fd565afe238 in qcalloc lib/memory.c:105
#2 0x5564521c6c3e in tc_qdisc_alloc_intern zebra/zebra_tc.c:128
#3 0x7fd565ac49e8 in hash_get lib/hash.c:147
#4 0x5564521c753b in zebra_tc_qdisc_install zebra/zebra_tc.c:184
#5 0x556452108203 in zread_tc_qdisc zebra/zapi_msg.c:3286
#6 0x5564521127c1 in zserv_handle_commands zebra/zapi_msg.c:4004
#7 0x5564522208b2 in zserv_process_messages zebra/zserv.c:520
#8 0x7fd565b9e034 in event_call lib/event.c:1974
#9 0x7fd565ae142b in frr_run lib/libfrr.c:1214
#10 0x5564520c14b1 in main zebra/main.c:492
#11 0x7fd564ec2c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
SUMMARY: AddressSanitizer: 228 byte(s) leaked in 3 allocation(s).
***********************************************************************************
```
Signed-off-by: Keelan Cannoo <keelan.cannoo@icloud.com>
Replace `struct list *` with `DLIST(if_connected, ...)`.
NB: while converting this, I found multiple places using connected
prefixes assuming they were IPv4 without checking:
- vrrpd/vrrp.c: vrrp_socket()
- zebra/irdp_interface.c: irdp_get_prefix(), irdp_if_start(),
irdp_advert_off()
(these fixes are really hard to split off into separate commits as that
would require going back and reapplying the change but with the old list
handling)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
a) Rename rib_init to zebra_rib_init() to better follow how
things are named
b) on shutdown cycle through the rib_dplane_q and free
up any contexts sitting in it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>