During internal testing, when the following sequence is followed, two
non default vrfs end up pointing to the same table-id
- Initially vrf201 has table id 1002
- ip link add dev vrf202 type vrf table 1002
- ip link set dev vrf202 up
- ip link set dev <intrerface> master vrf202
This will ideally lead to zebra exit since this is a misconfiguration as
expected.
However if we perform a restart frr.service at this point, we end up
having two vrfs pointing to same table-id and bad things can happen.
This is because in the interface_vrf_change, we incorrectly check for
vrf_lookup_by_id() to evaluate if there is a misconfig. This works well
for a non restart case but not for the startup case.
root@mlx-3700-20:mgmt:/var/log/frr# sudo vtysh -c "sh vrf"
vrf mgmt id 37 table 1001
vrf vrf201 id 46 table 1002
vrf vrf202 id 59 table 1002 >>>>
Fix: in all cases of misconfiguration, exit zebra as expected.
Ticket :#3970414
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
The bgp_duplicate_nexthop test installs routes with nexthop's
flags set to both DUPLICATE and FIB: this should not happen.
The DUPLICATE flag of a nexthop indicates this nexthop is already
used in the same nexthop-group, and there is no need to install it
twice in the system; having the FIB flag set indicates that the
nexthop is installed in the system. This is why both flags should
not be set on the same nexthop.
This case happens at installation time, but can also happen
at update time.
- Fix this by not setting the FIB flag value when the DUPLICATE
flag is present.
- Modify the bgp_duplicate_test to check that the FIB flag is not
present on duplicated nexthops.
- Modify the bgp_peer_type_multipath_relax test.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add a new start option "-K" to libfrr to denote a graceful start,
and use it in zebra and bgpd.
zebra will use this option to denote a planned FRR graceful restart
(supporting only bgpd currently) to wait for a route sync completion
from bgpd before cleaning up old stale routes from the FIB. An optional
timer provides an upper-bounds for this cleanup.
bgpd will use this option to denote either a planned FRR graceful
restart or a bgpd-only graceful restart, and this will drive the BGP
GR restarting router procedures.
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
The `locator` pointer is dereferenced before ensuring it is not NULL.
Fix the issue by checking that the pointer is not NULL before
dereferencing it.
Fixes 1594013
** CID 1594013: Null pointer dereferences (REVERSE_INULL)
/zebra/zebra_srv6.c: 961 in zebra_srv6_sid_compose()
________________________________________________________________________________________________________
*** CID 1594013: Null pointer dereferences (REVERSE_INULL)
/zebra/zebra_srv6.c: 961 in zebra_srv6_sid_compose()
955 struct srv6_locator *locator,
956 uint32_t sid_func)
957 {
958 uint8_t offset, func_len;
959 struct srv6_sid_format *format = locator->sid_format;
960
CID 1594013: Null pointer dereferences (REVERSE_INULL)
Null-checking "locator" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
961 if (!sid_value || !locator)
962 return false;
963
964 if (format) {
965 offset = format->block_len + format->node_len;
966 func_len = format->function_len;
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
The `for` loop starting at line 1848 searches the `func_allocated` array
for a pointer that points to a specific `sid_wide_func` element.
The loop should iterate over all the elements of the `func_allocated`
array and dereference each element to see if it is the one we are
looking for.
Currently, the loop is using the wrong variable to iterate over the
array.
Let's fix this issue by using the correct variable in the loop.
Fixes CID 1594014
Fixes CID 1594016
** CID 1594014: Null pointer dereferences (FORWARD_NULL)
/zebra/zebra_srv6.c: 1860 in release_srv6_sid_func_explicit()
________________________________________________________________________________________________________
*** CID 1594014: Null pointer dereferences (FORWARD_NULL)
/zebra/zebra_srv6.c: 1860 in release_srv6_sid_func_explicit()
1854
1855 /* Lookup SID function in the functions allocated list of EWLIB range */
1856 for (ALL_LIST_ELEMENTS_RO(block->u.usid
1857 .wide_lib[sid_func]
1858 .func_allocated,
1859 node, sid_func_ptr))
CID 1594014: Null pointer dereferences (FORWARD_NULL)
Dereferencing null pointer "sid_wide_func_ptr".
1860 if (*sid_wide_func_ptr == sid_wide_func)
1861 break;
1862
1863 /* Ensure that the SID function is allocated */
1864 if (!sid_wide_func_ptr) {
1865 zlog_warn("%s: failed to release wide SID function %u, function is not allocated",
** CID 1594016: Possible Control flow issues (DEADCODE)
/zebra/zebra_srv6.c: 1871 in release_srv6_sid_func_explicit()
________________________________________________________________________________________________________
*** CID 1594016: Possible Control flow issues (DEADCODE)
/zebra/zebra_srv6.c: 1871 in release_srv6_sid_func_explicit()
1865 zlog_warn("%s: failed to release wide SID function %u, function is not allocated",
1866 __func__, sid_wide_func);
1867 return -1;
1868 }
1869
1870 /* Release the SID function from the EWLIB range */
CID 1594016: Possible Control flow issues (DEADCODE)
Execution cannot reach this statement: "listnode_delete(block->u.us...".
1871 listnode_delete(block->u.usid.wide_lib[sid_func]
1872 .func_allocated,
1873 sid_wide_func_ptr);
1874 zebra_srv6_sid_func_free(sid_wide_func_ptr);
1875 } else {
1876 zlog_warn("%s: function %u is outside ELIB [%u/%u] and EWLIB alloc ranges [%u/%u]",
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
At line 1736, `alloc_mode` is set to `SRV6_SID_ALLOC_MODE_EXPLICIT` or
`SRV6_SID_ALLOC_MODE_DYNAMIC` depending on the `sid_value` variable.
There will never be a case where alloc_mode will be `SRV6_SID_ALLOC_MODE_MAX`
or `SRV6_SID_ALLOC_MODE_UNSPEC`.
Let's replace the `switch(alloc_mode) {...}` with an if-else.
Fixes CID 1594015.
** CID 1594015: (DEADCODE)
/zebra/zebra_srv6.c: 1782 in get_srv6_sid()
/zebra/zebra_srv6.c: 1781 in get_srv6_sid()
________________________________________________________________________________________________________
*** CID 1594015: (DEADCODE)
/zebra/zebra_srv6.c: 1782 in get_srv6_sid()
1776 }
1777
1778 ret = get_srv6_sid_dynamic(sid, ctx, locator);
1779
1780 break;
1781 case SRV6_SID_ALLOC_MODE_MAX:
CID 1594015: (DEADCODE)
Execution cannot reach this statement: "case SRV6_SID_ALLOC_MODE_UN...".
1782 case SRV6_SID_ALLOC_MODE_UNSPEC:
1783 default:
1784 flog_err(EC_ZEBRA_SM_CANNOT_ASSIGN_SID,
1785 "%s: SRv6 Manager: Unrecognized alloc mode %u",
1786 __func__, alloc_mode);
1787 /* We should never arrive here */
/zebra/zebra_srv6.c: 1781 in get_srv6_sid()
1775 return -1;
1776 }
1777
1778 ret = get_srv6_sid_dynamic(sid, ctx, locator);
1779
1780 break;
CID 1594015: (DEADCODE)
Execution cannot reach this statement: "case SRV6_SID_ALLOC_MODE_MAX:".
1781 case SRV6_SID_ALLOC_MODE_MAX:
1782 case SRV6_SID_ALLOC_MODE_UNSPEC:
1783 default:
1784 flog_err(EC_ZEBRA_SM_CANNOT_ASSIGN_SID,
1785 "%s: SRv6 Manager: Unrecognized alloc mode %u",
1786 __func__, alloc_mode);
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
In case of EVPN MH bond, a member port going in
protodown state due to external reason (one case being linkflap),
frr updates the state correctly but upon manually
clearing external reason trigger FRR to reinstate
protodown without any reason code.
Fix is to ensure if the protodown reason was external
and new state is to have protodown 'off' then do no reinstate
protodown.
Ticket: #3947432
Testing:
switch:#ip link show swp1
4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9216 qdisc
pfifo_fast master bond1 state DOWN mode DEFAULT group default qlen
1000
link/ether 1c:34:da:2c:aa:68 brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <linkflap>
switch:#ip link set swp1 protodown off protodown_reason linkflap off
switch:#ip link show swp1
4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9216 qdisc
pfifo_fast master bond1 state DOWN mode DEFAULT group default qlen
1000
link/ether 1c:34:da:2c:aa:68 brd ff:ff:ff:ff:ff:ff
Signed-off-by: Chirag Shah <chirag@nvidia.com>
If using weighted ECMP, the weight for non-recursive next-hop should be
inherited from recursive next-hop.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
In the near future, some daemons may only register SIDs. This may be
the case for the pathd daemon when creating SRv6 binding SIDs.
When a locator is getting deleted at ZEBRA level, the daemon may have
an easy way to find out the SIds to unregister to.
This commit proposes to add the locator name to the SID_SRV6_NOTIFY
message whenever possible. Only case when an allocation failure happens,
the locator will not be present. In all other places, the notify API
at procol levels has the locator name extra-parameter.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
When removing a large number of routes, the linux kernel can take the
cpu for an extended amount of time, leaving a situation where FRR
detects a starvation event.
r1# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [NTFY] sharpd: [M7Q4P-46WDR] vty[5]@# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:55:57.256 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.890085
2024-06-14 12:55:57.256 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:07.802 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 7078ms (cpu time 220ms)
2024-06-14 12:56:25.039 [DEBG] sharpd: [WTN53-GK9Y5] Removed all Items 27.783668
2024-06-14 12:56:25.039 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:56:32.783 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.743524
2024-06-14 12:56:32.783 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:41.447 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 5175ms (cpu time 179ms)
Let's modify the loop in dplane_thread_loop such that after a provider
has been run, check to see if the event should yield, if so, stop
and reschedule this for the future.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Instead of keeping a counter that is independent
of the queue's data structure. Just use the queue's
built-in counter. Ensure that it's pthread safe by
keeping it wrapped inside the mutex for adding/deleting
to the queue.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, when a locator is deleted in zebra, zebra notifies only the
zclient that owns the locator.
With the introduction of SID Manager, the locator is no longer owned by
any client. Instead, the locator is owned by Zebra, and clients can
allocate and release SIDs from the locator using the ZAPI
ZEBRA_SRV6_MANAGER_GET_SID and ZEBRA_SRV6_MANAGER_RELEASE_SID.
Therefore, when a locator is removed in Zebra, we need to notify all
daemons so that they can release/uninstall the SIDs allocated by that
locator.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Send asynchronous notifications to zclients when an SRv6 SID is
allocated/released and when a SID alloc/release operation fails.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Previous commits introduced two new ZAPI operations,
`ZEBRA_SRV6_MANAGER_GET_SRV6_SID` and
`ZEBRA_SRV6_MANAGER_RELEASE_SRV6_SID`. These operations allow a daemon
to interact with the SRv6 SID Manager to get and release an SRv6 SID,
respectively.
This commit extends the SID Manager by adding logic to process the
requests `ZEBRA_SRV6_MANAGER_GET_SRV6_SID` and
`ZEBRA_SRV6_MANAGER_RELEASE_SRV6_SID`, and allocate/release SIDs to
requesting daemons.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add functions to allocate/release SRv6 SIDs. SIDs can be allocated
either explicitly (allocate a specific SID) or dynamically (allocate any
available SID).
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
The previous commits introduced a new operation,
`ZEBRA_SRV6_MANAGER_GET_LOCATOR`, allowing a daemon to request
information about a specific SRv6 locator from the SRv6 SID Manager.
This commit extends the SID Manager to respond to a
`ZEBRA_SRV6_MANAGER_GET_LOCATOR` request and provide the requested
locator information.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add a data structure to represent an SRv6 SID context and the related
management functions (allocate/free).
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add the CLI to choose the SID format of a locator. When the SID format
of a locator is changed, the SIDs allocated from that locator might no
longer be valid (for example, because the new format might involve a
different SID allocation schema). In such a case, it is necessary to
notify all the zclients so that they can withdraw/uninstall the old SIDs
that use the previous format and allocate/install/advertise the new SIDs
based on the new format.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
An SRv6 block is an IPv6 prefix from which SIDs are allocated. This
commit adds support for SRv6 SID blocks. Specifically, it adds a data
structure to store information about an SRv6 block (e.g., its occupancy
status, which SIDs have been allocated and which are available, which
SID format is used for that block, etc.). It also adds some functions to
manage the block (allocate / free / lookup).
These functions will be used in the next commits to support the
allocation of SIDs from a block in the SID Manager.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add functionalities to manage SRv6 SID formats (register / unregister /
lookup) and create two SID formats upon SRv6 Manager initialization:
`uncompressed-f4024` and `usid-f3216`.
In future commits, we will add the CLI to allow the user to choose
between the two formats.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
When displaying a route table in JSON, a table JSON object is storing
all the prefix JSON objects containing the prefix information. This
results in excessive memory allocation for JSON objects, potentially
leading to an out-of-memory error on the machine with large routing
tables.
To Fix the memory consumption issue for the "show ip[v6] route [vrf XX]
json" command, display the prefixes one by one and free the memory of
each JSON object after it has been displayed.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
0e2fc3d67f ("vtysh, zebra: Fix malformed json output for multiple vrfs
in command 'show ip route vrf all json'") has been reverted in the
previous commit. Although the fix was correct, it was consuming too muca
memory when displaying large route tables.
A root JSON object was storing all the JSON objects containing the route
tables, each containing their respective prefixes in JSON objects. This
resulted in excessive memory allocation for JSON objects, potentially
leading to an out-of-memory error on the machine.
To Fix the memory consumption issue for the "show ip[v6] route vrf all
json" command, display the tables one by one and free the memory of each
JSON object after it has been displayed.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
This reverts commit 0e2fc3d67f.
This fix was correct but not optimal for memory consumption at scale.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Fixes:
zebra/zebra_netns_notify.c: In function 'zebra_ns_ready_read':
zebra/zebra_netns_notify.c:266:40: error: implicit declaration of function 'basename' [-Wimplicit-function-declaration]
266 | if (strmatch(VRF_DEFAULT_NAME, basename(netnspath))) {
| ^~~~~~~~
Fixed by including libgen.h, then since basename may modify its
parameter, allocate a copy on the stack, using strdupa, and pass the
temporary string to basename.
According to the man page for basename:
With glibc, one gets the POSIX version of basename() when
<libgen.h> is included, and the GNU version otherwise.
The POSIX version of basename may modify the contents of path,
so we should to pass a copy when calling this function.
[1] https://man7.org/linux/man-pages/man3/basename.3.html
Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
The 'show running-config' does not display the ipv6 source address
when a locator is not configured. Fix this by systematically displaying
the ipv6 source address.
Fixes: 6a0956169b ("zebra: Add encap source address to SRv6 config write function")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Currently zebra does not deny the routes if `ip protocol <proto> route-map
FOO`
commmand is configured with reference to an undefined route-map (FOO in
this case).
However, on FRR restart, in zebra_route_map_check() routes get denied
if route-map name is available but the route-map is not defined. This
change was introduced in fd303a4ba1.
Fix:
When `ip protocol <proto> route-map FOO` CLI is configured with reference to an
undefined route-map FOO, let the processing in ip_protocol_rm_add() and
ip_protocol_rm_del() go through so that zebra can deny the routes instead
of simply returning. This will result in consistent behavior.
Testing Done:
Before fix:
```
spine-1# configure
spine-1(config)# ip protocol bgp route-map rmap7
root@spine-1:mgmt:/var/home/cumulus# vtysh -c "show run" | grep rmap7
ip protocol bgp route-map rmap7
root@spine-1:mgmt:/var/home/cumulus#
spine-1(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, A - Babel, D - SHARP, F - PBR, f - OpenFabric,
Z - FRR,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
C>* 27.0.0.1/32 is directly connected, lo, 02:27:45
B>* 27.0.0.3/32 [20/0] via fe80::202:ff:fe00:21, downlink_1, weight 1, 02:27:35
B>* 27.0.0.4/32 [20/0] via fe80::202:ff:fe00:29, downlink_2, weight 1, 02:27:40
B>* 27.0.0.5/32 [20/0] via fe80::202:ff:fe00:31, downlink_3, weight 1, 02:27:40
B>* 27.0.0.6/32 [20/0] via fe80::202:ff:fe00:39, downlink_4, weight 1, 02:27:40
```
After fix:
```
spine-1(config)# ip protocol bgp route-map route-map67
spine-1(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, A - Babel, D - SHARP, F - PBR, f - OpenFabric,
Z - FRR,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
C>* 27.0.0.1/32 is directly connected, lo, 00:35:03
B 27.0.0.3/32 [20/0] via fe80::202:ff:fe00:21, downlink_1 inactive, weight 1, 00:34:58
B 27.0.0.4/32 [20/0] via fe80::202:ff:fe00:29, downlink_2 inactive, weight 1, 00:34:57
B 27.0.0.5/32 [20/0] via fe80::202:ff:fe00:31, downlink_3 inactive, weight 1, 00:34:57
B 27.0.0.6/32 [20/0] via fe80::202:ff:fe00:39, downlink_4 inactive, weight 1, 00:34:58
spine-1(config)#
root@spine-1:mgmt:/var/home/cumulus# ip route show
root@spine-1:mgmt:/var/home/cumulus#
```
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
Configured with "mpls label bind 1.1.1.1/32 explicit-null", the running
configuration is:
```
!
mpls label bind 1.1.1.1/32 IPv4 Explicit Null
!
```
After this commit, the running configuration is:
```
!
mpls label bind 1.1.1.1/32 explicit-null
!
```
And add the support for the "no" form:
```
anlan(config)# mpls label bind 1.1.1.1/32 explicit-null
anlan(config)# no mpls label bind 1.1.1.1/32 explicit-null
```
Signed-off-by: anlan_cs <anlan_cs@tom.com>