An operator found a situation where zebra was
backing up in a significant way towards BGP
with EVPN changes taking up some serious amounts
of memory. The key lines that would have clued
us in on it were behind a dev build. Let's change
this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Command 'show ip route vrf <vrf_name> json' returns a valid json object,
however if instead of <vrf_name> we specify 'all', we get an invalid json
object, like:
{//vrf1 routes}{//vrf2 routes}{vrf3 routes}
After the fix:
{"vrf1":{//vrf1 routes},"vrf2:{//vrf2 routes},"vrf3":{//vrf3 routes}}
Which is a valid json object, that can be parsed effectively using built-in
modules. The rest of the commands remains unaffected and behave the same.
Signed-off-by: Piotr Suchy <psuchy@akamai.com>
A specific sequence of actions involving the addition and removal of IP
routes and network interfaces can lead to a route installation failure.
The issue occurs under the following conditions:
- Initially, there is no route present via the ens3 interface.
- Adds a route: ip route 10.0.0.0/24 192.168.0.100 ens3
- Removes the same route: no ip route 10.0.0.0/24 192.168.0.100 ens3
- Removes the ens3 interface.
- Re-adds the ens3 interface.
- Again adds the same route: ip route 10.0.0.0/24 192.168.0.100 ens3
- And again removes it: no ip route 10.0.0.0/24 192.168.0.100 ens3
- Shuts down the ens3 interface
- Reactivates the interface
- Adds the route once more: ip route 10.0.0.0/24 192.168.0.100 ens3
The route appears to be rejected.
> # show ip route nexthop
> S>r 10.0.0.0/24 [1/0] (6) via 192.168.0.100, ens3, weight 1, 00:00:01
The commit 35729f38fa ("zebra: Add a timer to nexthop group deletion")
introduced a feature to keep a nexthop-group in Zebra for a certain
period even when it is no longer in use. But if a nexthop-group
interface is removed during this period, the association between the
nexthop-group and the interface is lost in zebra memory. If the
interface is later added back and a route is re-established, the
nexthop-group interface dependency is not correctly reestablished.
As a consequence, the nexthop-group flags remain unset when the
interface is down. Upon the interface's reactivation, zebra does not
reinstall the nexthop-group in the kernel because it is marked as valid
and installed, but in reality, it does not exist in the kernel (it was
removed when the interface was down). Thus, attempts to install a route
via this nexthop-group ID fail.
Stop maintaining a nexthop-group when its associated interface is no
longer present.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Split zebra's vrf_terminate() into disable() and delete() stages.
The former enqueues all events for the dplane thread.
Memory freeing is performed in the second stage.
Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
Recent commit: 6b2554b94a
Exposed, via Address Sanitation, that memory was being
leaked. Unfortunately the CI system did not catch this.
Two pieces of memory were being lost: The zserv client
data structure as well as anything on the client->gr_info_queue.
Clean these up.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, the way zebra works is it creates pthread per client (BGP is
of interest in this case) and this thread loops itself in zserv_read()
to check for any incoming data. If there is one, then it reads,
validates and adds it in the ibuf_fifo signalling the main thread to
process the message. The main thread when it gets a change, processes
the message, and invokes the function pointer registered in the header
command. (Ex: zserv_handlers).
Finally, if all of this was successful, this task reschedules itself and
loops in zserv_read() again
However, if there are already items on ibuf FIFO, that means zebra is
slow in processing. And with the current mechanism if Zebra main is
busy, the ibuf FIFO keeps growing holding up the memory.
Show memory zebra:(Example: 15k streams hoarding ~160 MB of data)
--- qmem libfrr ---
Stream : 44 variable 3432352 15042 161243800
Fix:
Client IO Thread: (zserv_read)
- Stop doing the read events when we know there are X number of items
on the FIFO already.(X - zebra zapi-packets <1-10000> (Default-1000)
- Determine the number of items on the zserv->ibuf_fifo. Subtract this
from the work items and only pull the number of items off that would
take us to X items on the ibuf_fifo again.
- If the number of items in the ibuf_fifo has reached to the maximum
* Either initially when zserv_read() is called (or)
* when processing the remainders of the incoming buffer
the client IO thread is woken by the the zebra main.
Main thread: (zserv_process_message)
If the client ibuf always schedules a wakeup to the client IO to read
more items from the socked buffer. This way we ensure
- Client IO thread always tries to read the socket buffer and add more
items to the ibuf_fifo (until max limit)
- hidden config change (zebra zapi-packets <>) is taken into account
Ticket: #3390099
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Found by coverity. Let's just lock the writeable
amount to see if it is possible. It's ok because
we want to know if we have room *now*. If another
pthread runs it will only remove data from fnc->obuf
and make more room. So this is ok.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Containers inside a choice's case must be treated as presence containers
as they can be explicitly created and deleted. They must have `create`
and `destroy` callbacks, otherwise the internal data they represent may
never be deleted.
The issue can be reproduced with the following steps:
- create an access-list with destination-network params
```
# access-list test seq 1 permit ip any 10.10.10.0 0.0.0.255
```
- delete the `destination-network` container
```
# mgmt delete-config /frr-filter:lib/access-list[name='test'][type='ipv4']/entry[sequence='1']/destination-network
# mgmt commit apply
MGMTD: No changes found to be committed!
```
As the `destination-network` container is non-presence, and all its
leafs are mandatory, mgmtd doesn't see any changes to be commited and
simply updates its YANG data tree without passing any updates to backend
daemons.
This commit fixes the issue by requiring `create` and `destroy`
callbacks for containers inside choice's cases.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
A macvlan interface can have its underlying link-interface in another
namespace (aka. netns). However, by default, zebra does not know the
interface from the other namespaces. It results in a crash the pointer
to the link interface is NULL.
> 6 0x0000559d77a329d3 in zebra_vxlan_macvlan_up (ifp=0x559d798b8e00) at /root/frr/zebra/zebra_vxlan.c:4676
> 4676 link_zif = link_ifp->info;
> (gdb) list
> 4671 struct interface *link_ifp, *link_if;
> 4672
> 4673 zif = ifp->info;
> 4674 assert(zif);
> 4675 link_ifp = zif->link;
> 4676 link_zif = link_ifp->info;
> 4677 assert(link_zif);
> 4678
> (gdb) p zif->link
> $2 = (struct interface *) 0x0
> (gdb) p zif->link_ifindex
> $3 = 15
Fix the crash by returning when the macvlan link-interface is in another
namespace. No need to go further because any vxlan under the macvlan
interface would not be accessible by zebra.
Link: https://github.com/FRRouting/frr/issues/15370
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>