Recent commit: 6b2554b94a
Exposed, via the Address Sanitizer, that memory was being
leaked. Unfortunately the CI system did not catch this.
Two pieces of memory were being lost: the zserv client
data structure, as well as anything on the client->gr_info_queue.
Clean these up.
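A minimal sketch of the shape of the fix, using toy types rather
than the real zserv structures (the actual code uses FRR's
XFREE(), which also NULLs the pointer):

    #include <stdlib.h>
    #include <sys/queue.h>

    /* Toy stand-ins for the real zserv/client_gr_info structures. */
    struct client_gr_info {
            TAILQ_ENTRY(client_gr_info) gr_info;
    };
    struct zserv {
            TAILQ_HEAD(gr_list, client_gr_info) gr_info_queue;
    };

    static void zserv_client_free(struct zserv *client)
    {
            struct client_gr_info *info;

            /* Drain anything still on client->gr_info_queue... */
            while ((info = TAILQ_FIRST(&client->gr_info_queue))) {
                    TAILQ_REMOVE(&client->gr_info_queue, info, gr_info);
                    free(info); /* FRR: XFREE(MTYPE_..., info) */
            }
            /* ...then free the client structure itself. */
            free(client);
    }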
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Create Local routes in FRR:
S 0.0.0.0/0 [1/0] via 192.168.119.1, enp39s0, weight 1, 00:03:46
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:03:51
O 192.168.119.0/24 [110/100] is directly connected, enp39s0, weight 1, 00:03:46
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:03:51
L>* 192.168.119.224/32 is directly connected, enp39s0, 00:03:51
O 192.168.119.229/32 [110/100] via 0.0.0.0, enp39s0 inactive, weight 1, 00:03:46
C>* 192.168.119.229/32 is directly connected, enp39s0, 00:03:46
Create the ability to redistribute local routes.
Modify tests to support this change.
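For example, a daemon that supports redistribution can now pull
these local routes in with configuration along these lines (exact
per-daemon availability may vary):

    router ospf
     redistribute local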
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Include an event ptr-to-ptr in the event_execute() api
call, like the various schedule api calls. This allows the
execute() api to cancel an existing scheduled task if that
task is being executed inline.
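A hedged usage sketch, assuming a prototype along the lines of the
other event_* scheduling calls (see lib/event.h for the real
signature):

    /* Sketch only: passing &t lets event_execute() find and
     * cancel the already-scheduled copy of this task before
     * running it inline, so it cannot fire a second time later. */
    struct event *t = NULL;

    event_add_timer(master, work_fn, ctx, 5, &t); /* scheduled copy */
    /* ... time passes ... */
    event_execute(master, work_fn, ctx, 0, &t);   /* inline run, cancels t */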
Signed-off-by: Mark Stapp <mjs@labn.net>
BGP signals to zebra that an afi has converged immediately
after it has finished processing all routes for a given
afi/safi. This generates events in zebra in this order:
a) Routes received from BGP, placed on the early-rib Meta-Q
b) Signal GR for the afi.
Now imagine that zebra handles the GR signal and immediately
processes routes that are in the actual rib and
removes some routes. This generates:
c) route deletions to the kernel for some number of
routes that may still be in the early-rib Meta-Q
d) processing of the Meta-Q, which re-installs those routes
This is undesirable behavior in zebra: while we may
end up in a correct state, there will be a blip for
some number of routes that happen to be in the
early-rib Meta-Q.
Modify the GR code to have its own processing
entry at the end of the Meta-Q. This will
allow all routes to be processed and ready
for handling by the Graceful Restart code.
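Conceptually (toy enum, not the exact zebra symbols), GR gets the
last subqueue so every queued route is drained before the GR
cleanup runs:

    /* Illustrative only: subqueues are processed in index order,
     * so placing GR last guarantees routes are handled first. */
    enum meta_queue_indexes {
            META_QUEUE_EARLY_ROUTE,
            META_QUEUE_CONNECTED,
            META_QUEUE_BGP,
            META_QUEUE_OTHER,
            META_QUEUE_GR, /* new: GR work, always drained last */
            MQ_SIZE,
    };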
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
After the restructuring of the GR code to allow zebra_gr
to perform per-AFI cleanups, this is no longer necessary.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The GR code in FRR used to wait until all AFIs were complete
before cleaning up the routes from the upper-level protocol.
This can of course lead to some weird situations where, say,
ipv4 finishes and then v6 is stuck waiting for a peer to come
up and never finishes. When v4 finishes it signals zebra that
it is done, but no action is taken at that moment.
Modify the code to allow the zebra_gr.c code to handle a per-AFI
removal, instead of doing it all at the end.
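Roughly the new shape, with hypothetical names standing in for
the zebra_gr.c internals:

    /* Hypothetical sketch: each AFI completion triggers its own
     * sweep; previously we waited until every AFI was done. */
    static void zebra_gr_afi_complete(struct client_gr_info *info,
                                      afi_t afi)
    {
            info->af_enabled[afi] = false;               /* this AFI done */
            zebra_gr_delete_stale_routes_afi(info, afi); /* hypothetical */
    }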
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The zebra_gr code had 3 functions when effectively only
1 was needed. Clean up some code weirdness around
multiple switch statements on the same api->cap,
and consolidate down to caring only about
SAFI_UNICAST, since that is all we care about at the
moment.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We have code that tracks both AFIs and SAFIs,
but we only ever operate on the AFIs. So let's
limit the work being done to something more sensible.
I'm leaving the SAFI being broadcast through the zapi
message, as I am not sure what else should be ripped
out at this point in time.
Finally, re-arrange the zread_client_capabilities function
to remove the multiple levels of function calls that really
serve no purpose.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
By the time this function is called, we have already
ensured that the pointers are good several times.
I like consistency, but this is a bit much.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When GR is running and attempting to clean up a node,
suppose the node that is currently saved (the one we will
come back to) happens to be deleted while zebra has
suspended the GR code because it hit the node limit.
In that case the zebra GR code will just completely stop
processing and potentially leave stale nodes around forever.
Let's just remove this hole and process what we can.
Can you imagine trying to debug this after the fact?
If we remove a node, it now counts toward the
ZEBRA_MAX_STALE_ROUTE_COUNT maximum to process. This should
prevent any non-processing, at the slightly larger cost
of having to look at a few nodes repeatedly.
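A toy sketch of the counting rule (the limit value, 'start', and
the predicates are illustrative; route_next() is the real table
walker):

    #define MAX_PER_PASS 64 /* stands in for ZEBRA_MAX_STALE_ROUTE_COUNT */

    struct route_node *rn;
    int32_t processed = 0;

    /* Every node we delete counts toward the pass limit, so each
     * pass makes bounded progress and then yields, instead of
     * silently stopping when the saved node has been deleted. */
    for (rn = start; rn && processed < MAX_PER_PASS; rn = route_next(rn)) {
            if (route_is_stale(rn)) {       /* hypothetical predicate */
                    delete_stale_route(rn); /* hypothetical */
                    processed++;            /* deletions count too */
            }
    }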
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The info->do_delete variable was being set to true only when
u.val was 1. The problem with this is that u.val is a union,
and the various ways we can invoke this event cause different
values to be written to the union value on the thread.
This makes no sense. Just set the variable to what we want it
to be when we need it to be true, since it was only ever set
during a thread_execute section.
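A self-contained illustration of the union hazard (toy types; the
real union lives on the event/thread structure):

    #include <stdbool.h>

    struct toy_event {
            union {
                    int val;            /* what the old check read */
                    unsigned int sands; /* other call paths write this */
            } u;
    };

    int main(void)
    {
            struct toy_event e = { .u.val = 1 };
            bool do_delete;

            e.u.sands = 42;             /* a different path clobbers val */
            do_delete = (e.u.val == 1); /* fragile: depends on last writer */

            do_delete = true;           /* the fix: set it explicitly */
            return do_delete ? 0 : 1;
    }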
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` leads people to think that
this event system is a pthread, when it is not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is the first in a series of commits whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem of people confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The GR debug logs do all sorts of wonderful stuff,
but they were not actually displaying anything useful to the
operator about which vrf we are operating in.
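For example, a log line of this shape helps (the message text is
illustrative; VRF_LOGNAME(), zlog_debug() and
zebra_route_string() are existing helpers):

    zlog_debug("%s: vrf %s(%u): deleting stale routes for client %s",
               __func__, VRF_LOGNAME(vrf), info->vrf_id,
               zebra_route_string(client->proto));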
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Used mostly for graceful restart, especially for
bgp_show_neighbor_graceful_restart_capability_per_afi_safi().
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
To make sure that C++ code can include them, avoid using reserved
keywords like 'delete' or 'new'.
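The problem in miniature (toy struct):

    #include <stdbool.h>

    struct cap_msg {
            /* bool delete;  <- valid C, but 'delete' is a C++
             * keyword, so a C++ file including this header
             * would fail to compile */
            bool do_delete; /* renamed member works in both */
    };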
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
When a client connects to zebra with GR capabilities and
then restarts, it might disconnect again even before hello is
sent, leading to zebra cores.
GR should be supported only for dynamic neighbors that are
capable of restarting.
Signed-off-by: Santosh P K <sapk@vmware.com>
Somewhat gnarly code flow here that might be leaking memory - can't tell
if it's a test artifact or not, but in any case this reduces the
situations in which we need to alloc a block.
And we don't need to check XCALLOC for success...
And we don't need to null check before XFREE...
Or set XFREE'd pointers to NULL...
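The allocator semantics this relies on, sketched (MTYPE_FOO and
struct foo are illustrative):

    struct foo *p = XCALLOC(MTYPE_FOO, sizeof(*p));
    /* no 'if (!p)' check needed: XCALLOC() aborts on failure
     * rather than returning NULL */

    XFREE(MTYPE_FOO, p);
    /* no NULL guard needed before the call, and no 'p = NULL'
     * after it: XFREE() already resets the pointer */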
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Handle capabilities received from a client. They may contain
GR enable/disable, stale time changes, or RIB update complete
for a given AFI, SAFI and instance. This also has changes for
stale route handling.
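The capability message has roughly this shape (field list recalled
from lib/zclient.h; treat as illustrative):

    struct zapi_cap {
            enum zserv_client_capabilities cap; /* GR enable/disable, ... */
            uint32_t stale_removal_time;        /* stale time changes */
            afi_t afi;
            safi_t safi;                        /* RIB-update-complete scope */
            vrf_id_t vrf_id;
    };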
Signed-off-by: Santosh P K <sapk@vmware.com>
Zebra will have special handling for clients with GR enabled.
When a client disconnects with GR enabled, a stale client
will be created and its RIB will be retained until the stale
timer expires or the client comes back up and updates its RIB.
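A rough sketch of that disconnect path (all names are illustrative
stand-ins for the zebra internals):

    /* On disconnect with GR negotiated: keep the RIB, park the
     * client as stale, and arm the removal timer. Otherwise,
     * sweep the client's routes immediately. */
    if (client->gr_enabled) {
            stale = zebra_gr_stale_client_create(client);
            event_add_timer(zrouter.master, zebra_gr_stale_timer_expiry,
                            stale, stale->stale_removal_time,
                            &stale->t_stale_removal);
    } else {
            zebra_rib_sweep_client(client);
    }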
Co-authored-by: Santosh P K <sapk@vmware.com>
Co-authored-by: Soman K S <somanks@vmware.com>
Signed-off-by: Santosh P K <sapk@vmware.com>