- split NewRpcState object into 2, a Unary and a Streaming variant, which
then allows for the next.
- move all state machine details inside these new state objects
- use a template arg to allow for Streaming state tracking object
creation and deletion w/o requiring this in each specific RPC
hander.
- Code is more rugged by design now.
Thanks to Rafael Zalamena <rzalamena@opensourcerouting.org> for the cleanup
ideas/motivation.
Signed-off-by: Christian Hopps <chopps@labn.net>
Let's clean up the valgrind output even more by calling the protobuf
shutdown function that deallocates all library used memory.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Currently the nexthop tracking code is only sending to the requestor
what it was requested to match against. When the nexthop tracking
code was simplified to not need an import check and a nexthop check
in b8210849b8 for bgpd. It was not
noticed that a longer prefix could match but it would be seen
as a match because FRR was not sending up both the resolved
route prefix and the route FRR was asked to match against.
This code change causes the nexthop tracking code to pass
back up the matched requested route (so that the calling
protocol can figure out which one it is being told about )
as well as the actual prefix that was matched to.
Fixes: #10766
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
sockopt_cork is a no-op function that was cleaned up
in 2017. Since then it's still not being used. At
this point in time there is little point in keeping a
dead function that will not be used because of vagaries
between platforms
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RB-tree and double-linked-list easily support backwards iteration, and
an use case seems to have popped up. Let's make it accessible.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
`json_object_object_add()` adds keys/items to objects/dictionaries.
Useful to have a printfrr based variant for the key there.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The vtysh live logs don't try to buffer messages when vtysh isn't
reading them fast enough. Either the kernel has space and can accept
messages without delay, or it doesn't and we continue on.
While this is intentional (otherwise slow vtysh could block a routing
daemon), at least give the user an indication if messages were dropped.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This provides direct raw log output with full metadata directly at
startup regardless of configuration details.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This was the intent here to begin with, not sure where I managed to
forget this along the way...
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The timestamps used for the live log are wallclock, not monotonic. Also
some fields were left uninitialized.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
While running singlethreaded, the RCU code is "dormant" and rcu_free is
an immediate operation. This results in the log target loop accessing
free'd memory if a log target removes itself while a message is printed
(which is likely to happen on e.g. error conditions.)
Just use frr_each_safe to avoid this issue.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
- rather than coerce `const char *` to std:string&, just pass the
C ptr, as that's what is used anyway.
fixes#10578
Signed-off-by: Christian Hopps <chopps@labn.net>
Don't let open sockets hang for too long. This will fix an issue where a
improperly coded client (e.g. socat) could exaust the amount of open
file descriptors.
Documentation:
https://grpc.github.io/grpc/cpp/md_doc_keepalive.html
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
This issue is applicable to other protocols as well.
When user has used route-map, even though the prefixes are falling
under the permit rule, the prefixes were denied and were shown
as inactive route in zebra.
Reason being the parameter which is of type enum was passed to the api
route_map_get_index and was typecasted to uint8_t *.
This problem is visible in case of Big Endian systems because we are
accessing the most significant byte.
'match_ret' field is an enum in the caller and so it is of 4 bytes,
the typecasting it to 1 byte and passing it to the api made
the api to put the value in the most significant byte
which was already zero previously. Therefore the actual value
RMAP_NOMATCH which was 1 never gets reset in this case.
Therefore the api always returns 'RMAP_NOMATCH' and hence
the prefixes are always denied.
Fixes: #9782
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Call `zlog_file_rotate` for command file lines as well otherwise on
`SIGUSR1` the old descriptor will still be used and no new log file will
be created for the rotation.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
If a operator issues a series of route-map deletions and
then re-adds, *and* this triggers the hash table to realloc
to grow to a larger size, then subsuquent route-map operations
will be against a corrupted hash table.
Why?
Effectively the route-map code was inserting each
route-map <NAME> into a hash for storage. Upon
deletion there is this concept of delayed processing
so the routemap code sets a bit `to-be-processed`
and marks the route-map for deletion. This is
1 entry in the hash table. Then if the operator
recreates the hash, FRR would add another hash
entry. If another deletion happens then there
now are 2 deletion entries that are indistinguishable
from a hash perspective.
FRR stores the deleted name of the route-map so that
any delayed processing can lookup the name and only process
those peers that are related to that route-map name.
This is good as that if in say BGP, we do not want
to reprocess all the peers that don't use the route-map.
Solution:
The whole purpose of the delay of deletion and the
storage of the route-map is to allow the using protocol
the ability to process the route-map at a later time
while still retaining the route-map name( for more efficient
reprocessing ). The problem exists because we are keeping
multiple copies of deletion events that are indistinguishable
from each other causing hash havoc.
The truth is that we only need to keep 1 copy of the
routemap in the table. If the series of events is:
a) delete ( schedule processing )
b) add ( reschedule processing )
Current code ends up processing the route-map two times
and in this event we really just need to reprocess everything
with the new route-map.
If the series of events is:
a) delete (schedule processing )
b) add (reschedule)
c) delete (reschedule)
d) add (reschedule)
All this really points to is that FRR just needs to keep the last
in the series of maps and ensuring that FRR knows that we need
to continue processing the route-map. So in the creation processing
if the hash has an entry for this map, the routemap code knows that
this is a deletion event. Mark this route-map for later processing
if it was marked so. Also in the lookup function do not return
a map if the map found was deleted.
Fixes: #10708
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
On FreeBSD I have noticed that subsuquent calls to clock_gettime(..)
can return an after time that is before first calls value.
This in turn is generating CPU_HOG's because the subtraction
is wrapping into very very large numbers:
2022/02/28 20:12:58 SHARP: [PTDQA-70FG5] start: 35.741981000 now: 35.740581000
2022/02/28 20:12:58 SHARP: [XK9YH-ZD8FA][EC 100663313] CPU HOG: task zclient_read (800744240) ran for 0ms (cpu time 18446744073709550ms)
(Please note I added the first line of debug to figure this issue out).
I have been asked to open a FreeBSD bug report and have done so.
In the mean time I think that it is important that FRR does
not generate bogus CPU HOG's on FreeBSD ( especially since
this may or may not be easily fixed and FRR has no control
over what version of the operating system, operators are
going to be running with FRR.
So, add a bit of specialized code that checks to see if
the after time in FreeBSD is before the now time in
thread_consumed_time and do some quick manipulations
to not have this issue.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This adds the plumbing necessary to yield back a file descriptor to
vtysh. The fd is passed on the command status code bytes through
AF_UNIX SCM_RIGHTS.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Add the ability to inspect the timers and when they will pop
per daemon:
sharpd@eva ~/frr (thread_return_null)> vtysh -c "show thread timers"
Thread timers for zebra:
Showing timers for default
--------------------------
rtadv_timer 00:00:00.520
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.746
if_zebra_speed_update 00:00:02.744
if_zebra_speed_update 00:00:02.745
Showing timers for Zebra dplane thread
--------------------------------------
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Since there are timers that are created based upon doing some
math and we know that unsigned values when doing math and we accidently
subtract a larger number from a smaller number causes the unsigned
number to wrap to very large numbers, let's put in a small catch
in place to see if there are any places in the system that
mistakes are made and FRR is accidently creating a problem
for itself.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
assert when if_lookup_address is passed with
a family that is not AF_INET or AF_INET6 as
that we are dead in the water and this is a
dev escape
Signed-off-by: Donald Sharp <sharpd@nvidia.com>