Commit Graph

4875 Commits

Author SHA1 Message Date
Donald Sharp
7344eb1f62
Merge pull request #10739 from LabNConsulting/chopps/fixgrpc-reorg
grpc, lib: grpc cleanup/reorg
2022-03-15 09:10:03 -04:00
Donald Sharp
d09a7c82c9
Merge pull request #10565 from lyq140/patch-thread
lib: not thread off when schedule
2022-03-15 08:40:41 -04:00
Christian Hopps
48c93061f5 lib: grpc: rework RPC handlers improve code clarity
- split NewRpcState object into 2, a Unary and a Streaming variant, which
  then allows for the next.
- move all state machine details inside these new state objects
  - use a template arg to allow for Streaming state tracking object
    creation and deletion w/o requiring this in each specific RPC
    hander.
- Code is more rugged by design now.

Thanks to Rafael Zalamena <rzalamena@opensourcerouting.org> for the cleanup
ideas/motivation.

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-14 15:54:25 -04:00
Donald Sharp
341e1d6e0a
Merge pull request #10738 from LabNConsulting/chopps/fixgrpc
fixes for grpc module
2022-03-14 14:57:59 -04:00
Christian Hopps
d3074a5207 lib: grpc: use candiate ID to delete rather than pointer to candiate
- also be consistent in candidate IDs being uint64_t

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-14 11:14:12 -04:00
Rafael Zalamena
79c681952e lib: call protobuf clean up on exit
Let's clean up the valgrind output even more by calling the protobuf
shutdown function that deallocates all library used memory.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-03-14 11:14:12 -04:00
Donald Sharp
804013b1f9
Merge pull request #10724 from opensourcerouting/lib-rotate-logs
lib: rotate log file supplied by command line
2022-03-13 10:09:48 -04:00
David Lamparter
32addf8f7a
Merge pull request #10716 from donaldsharp/routemap_rbtree_nonuniq 2022-03-13 15:08:36 +01:00
Donald Sharp
3e553c80ba
Merge pull request #10779 from opensourcerouting/typesafe-backflip
lib: typesafe container reverse iterators
2022-03-13 09:26:26 -04:00
Donatas Abraitis
3b3b5ab350
Merge pull request #10783 from donaldsharp/bgp_zebra_nht
Bgp zebra nht
2022-03-12 21:46:13 +02:00
Donald Sharp
06e4e90132 *: When matching against a nexthop send and process what it matched against
Currently the nexthop tracking code is only sending to the requestor
what it was requested to match against.  When the nexthop tracking
code was simplified to not need an import check and a nexthop check
in b8210849b8 for bgpd.  It was not
noticed that a longer prefix could match but it would be seen
as a match because FRR was not sending up both the resolved
route prefix and the route FRR was asked to match against.

This code change causes the nexthop tracking code to pass
back up the matched requested route (so that the calling
protocol can figure out which one it is being told about )
as well as the actual prefix that was matched to.

Fixes: #10766
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-12 11:18:45 -05:00
Donald Sharp
58c05959d5 bgpd, lib, pimd: Remove sockopt_cork
sockopt_cork is a no-op function that was cleaned up
in 2017.  Since then it's still not being used.  At
this point in time there is little point in keeping a
dead function that will not be used because of vagaries
between platforms

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-12 08:21:16 -05:00
David Lamparter
643ea83be2 lib: add _last and _prev on typesafe RB/DLIST
RB-tree and double-linked-list easily support backwards iteration, and
an use case seems to have popped up.  Let's make it accessible.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-12 13:23:36 +01:00
David Lamparter
543a26848d lib: add %pFXh to print prefix w/o prefixlen
Mostly for pimd, for the time being.  May be removed again if unused.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-11 13:43:19 +01:00
David Lamparter
424ec38499 lib: add JSON printfrr dict-key helper
`json_object_object_add()` adds keys/items to objects/dictionaries.
Useful to have a printfrr based variant for the key there.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-11 13:43:00 +01:00
Donald Sharp
37a9cb28ab
Merge pull request #10749 from opensourcerouting/live-log-polish
lib, vtysh: apply some polish to live-log feature
2022-03-10 19:41:48 -05:00
David Lamparter
a4af82ee2b lib, vtysh: report lost messages on live log
The vtysh live logs don't try to buffer messages when vtysh isn't
reading them fast enough.  Either the kernel has space and can accept
messages without delay, or it doesn't and we continue on.

While this is intentional (otherwise slow vtysh could block a routing
daemon), at least give the user an indication if messages were dropped.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:16 +01:00
David Lamparter
3bcdae106e lib: add monitor:<fd> command line log target
This provides direct raw log output with full metadata directly at
startup regardless of configuration details.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:16 +01:00
David Lamparter
834585bdb9 lib: add a few more bits to live log header
... and add some comments explaining the individual fields.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:16 +01:00
David Lamparter
da2fc19187 lib: support multiple --log options
Allow simultaneously enabling syslog, stdout and/or file logs.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:15 +01:00
David Lamparter
2eda953a2a lib: make live log sockets non-blocking
This was the intent here to begin with, not sure where I managed to
forget this along the way...

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:15 +01:00
David Lamparter
1609a9d636 lib: fix live log fields for crashlog
The timestamps used for the live log are wallclock, not monotonic.  Also
some fields were left uninitialized.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:13 +01:00
Donald Sharp
db0a45d0d6
Merge pull request #10741 from LabNConsulting/chopps/critfixgrpc
critical fixes for grpc
2022-03-07 11:49:48 -05:00
David Lamparter
d03440cab7 lib: fix log target removal when singlethreaded
While running singlethreaded, the RCU code is "dormant" and rcu_free is
an immediate operation.  This results in the log target loop accessing
free'd memory if a log target removes itself while a message is printed
(which is likely to happen on e.g. error conditions.)

Just use frr_each_safe to avoid this issue.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 17:23:12 +01:00
Christian Hopps
fe095adc24 lib: grpc: fix handling of "empty" yang type
- rather than coerce `const char *` to std:string&, just pass the
C ptr, as that's what is used anyway.

fixes #10578

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-06 12:00:22 -05:00
Christian Hopps
83f6fce7d2 lib: grpc: fix shutdown code
fixes #9732

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-06 12:00:17 -05:00
Christian Hopps
c85ecd6405 lib: grpc: initialize uninitialized member variables
fixes #9732, fixes #10578

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-06 07:39:58 -05:00
Christian Hopps
96d434f853 lib: grpc: do not remove candidate entry too soon
Fix from Rafael Zalamena <rzalamena@opensourcerouting.org>

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-06 07:37:52 -05:00
Rafael Zalamena
673e440770 lib: tweak northbound gRPC default timeout
Don't let open sockets hang for too long. This will fix an issue where a
improperly coded client (e.g. socat) could exaust the amount of open
file descriptors.

Documentation:
https://grpc.github.io/grpc/cpp/md_doc_keepalive.html

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-03-06 07:37:52 -05:00
Mobashshera Rasool
a34ce5c5e4 lib: Route-map failed for OSPF routes even for matching prefixes
This issue is applicable to other protocols as well.
When user has used route-map, even though the prefixes are falling
under the permit rule, the prefixes were denied and were shown
as inactive route in zebra.

Reason being the parameter which is of type enum was passed to the api
route_map_get_index and was typecasted to uint8_t *.
This problem is visible in case of Big Endian systems because we are
accessing the most significant byte.

'match_ret' field is an enum in the caller and so it is of 4 bytes,
the typecasting it to 1 byte and passing it to the api made
the api to put the value in the most significant byte
which was already zero previously. Therefore the actual value
RMAP_NOMATCH which was 1 never gets reset in this case.
Therefore the api always returns 'RMAP_NOMATCH' and hence
the prefixes are always denied.

Fixes: #9782
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
2022-03-04 03:57:09 -08:00
Rafael Zalamena
3c1f92018b lib: rotate log file supplied by command line
Call `zlog_file_rotate` for command file lines as well otherwise on
`SIGUSR1` the old descriptor will still be used and no new log file will
be created for the rotation.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-03-03 18:28:08 -03:00
Donald Sharp
5f503e5f5a lib: Fix corruption when routemap delete/add sequence happens
If a operator issues a series of route-map deletions and
then re-adds, *and* this triggers the hash table to realloc
to grow to a larger size, then subsuquent route-map operations
will be against a corrupted hash table.

Why?

Effectively the route-map code was inserting each
route-map <NAME> into a hash for storage.  Upon
deletion there is this concept of delayed processing
so the routemap code sets a bit `to-be-processed`
and marks the route-map for deletion.  This is
1 entry in the hash table.  Then if the operator
recreates the hash, FRR would add another hash
entry.  If another deletion happens then there
now are 2 deletion entries that are indistinguishable
from a hash perspective.

FRR stores the deleted name of the route-map so that
any delayed processing can lookup the name and only process
those peers that are related to that route-map name.
This is good as that if in say BGP, we do not want
to reprocess all the peers that don't use the route-map.

Solution:
The whole purpose of the delay of deletion and the
storage of the route-map is to allow the using protocol
the ability to process the route-map at a later time
while still retaining the route-map name( for more efficient
reprocessing ).  The problem exists because we are keeping
multiple copies of deletion events that are indistinguishable
from each other causing hash havoc.

The truth is that we only need to keep 1 copy of the
routemap in the table.  If the series of events is:
a) delete ( schedule processing )
b) add ( reschedule processing )

Current code ends up processing the route-map two times
and in this event we really just need to reprocess everything
with the new route-map.

If the series of events is:
a) delete (schedule processing )
b) add (reschedule)
c) delete (reschedule)
d) add (reschedule)

All this really points to is that FRR just needs to keep the last
in the series of maps and ensuring that FRR knows that we need
to continue processing the route-map.  So in the creation processing
if the hash has an entry for this map, the routemap code knows that
this is a deletion event.  Mark this route-map for later processing
if it was marked so.  Also in the lookup function do not return
a map if the map found was deleted.

Fixes: #10708
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-02 15:41:54 -05:00
Donald Sharp
91f2045d53 lib: Fix zclient.c enum event to enum zclient_event
zclient.c is using `enum event` let's rename it to a better
named data structure `enum zclient_event`.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-02 09:19:05 -05:00
Donald Sharp
55a70ffb78 lib: Rename enum event to enum vty_event
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-02 09:17:47 -05:00
Donald Sharp
4e2839de64 lib: Fix FreeBSD clock_gettime(CLOCK_THREAD_CPUTIME_ID,..) going backwards
On FreeBSD I have noticed that subsuquent calls to clock_gettime(..)
can return an after time that is before first calls value.
This in turn is generating CPU_HOG's because the subtraction
is wrapping into very very large numbers:

2022/02/28 20:12:58 SHARP: [PTDQA-70FG5]     start: 35.741981000  now: 35.740581000
2022/02/28 20:12:58 SHARP: [XK9YH-ZD8FA][EC 100663313] CPU HOG: task zclient_read (800744240) ran for 0ms (cpu time 18446744073709550ms)

(Please note I added the first line of debug to figure this issue out).

I have been asked to open a FreeBSD bug report and have done so.
In the mean time I think that it is important that FRR does
not generate bogus CPU HOG's on FreeBSD ( especially since
this may or may not be easily fixed and FRR has no control
over what version of the operating system, operators are
going to be running with FRR.

So, add a bit of specialized code that checks to see if
the after time in FreeBSD is before the now time in
thread_consumed_time and do some quick manipulations
to not have this issue.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-01 09:08:32 -05:00
David Lamparter
2821405a69
Merge pull request #10640 from donaldsharp/thread_timers 2022-03-01 11:45:36 +01:00
Russ White
e3aa338256
Merge pull request #10566 from whichbug/master
isisd: use base64 to encode the binary data.
2022-02-28 09:44:47 -05:00
Donald Sharp
5506048083
Merge pull request #10353 from opensourcerouting/vtysh-live-log
lib & vtysh: RFC5424 syslog + vtysh live log display
2022-02-28 09:43:29 -05:00
Donald Sharp
6da9f59f4f
Merge pull request #10664 from opensourcerouting/checksum-iov
lib: make checksum code take iovec for input
2022-02-28 07:28:49 -05:00
David Lamparter
0798d2760d lib: implement terminal monitor for vtysh
Adds a new logging target that sends log messages to vtysh.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-02-28 13:28:43 +01:00
David Lamparter
b2dde56b2c lib: allow returning a file descriptor over vtysh
This adds the plumbing necessary to yield back a file descriptor to
vtysh.  The fd is passed on the command status code bytes through
AF_UNIX SCM_RIGHTS.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-02-28 13:28:40 +01:00
David Lamparter
df45017f48 lib: add accessor for raw timestamp in zlog
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-02-28 13:25:47 +01:00
Donald Sharp
22f31b8c52 lib, vtysh: Add show thread timers command
Add the ability to inspect the timers and when they will pop
per daemon:

sharpd@eva ~/frr (thread_return_null)> vtysh -c "show thread timers"
Thread timers for zebra:

Showing timers for default
--------------------------
  rtadv_timer                                       00:00:00.520
  if_zebra_speed_update                             00:00:02.745
  if_zebra_speed_update                             00:00:02.745
  if_zebra_speed_update                             00:00:02.745
  if_zebra_speed_update                             00:00:02.745
  if_zebra_speed_update                             00:00:02.745
  if_zebra_speed_update                             00:00:02.745
  if_zebra_speed_update                             00:00:02.746
  if_zebra_speed_update                             00:00:02.744
  if_zebra_speed_update                             00:00:02.745

Showing timers for Zebra dplane thread
--------------------------------------

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-28 06:39:07 -05:00
anlan_cs
4d4c404bf6 *: Add necessary new line for output of vty_out()
Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-02-27 10:59:19 +08:00
David Lamparter
89087f23b5 lib: use iovec for checksum code
... to allow checksumming noncontiguous blurbs of data.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-02-26 16:49:12 +01:00
David Lamparter
264c806da9 lib: guard checksum.h against multiple inclusion
checksum.h was throwing errors if it ended up included twice.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-02-26 16:49:12 +01:00
Donald Sharp
e2eff5c3b1 lib: Add a Dev catch for when a timer is set for > 1 year
Since there are timers that are created based upon doing some
math and we know that unsigned values when doing math and we accidently
subtract a larger number from a smaller number causes the unsigned
number to wrap to very large numbers, let's put in a small catch
in place to see if there are any places in the system that
mistakes are made and FRR is accidently creating a problem
for itself.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-25 08:19:07 -05:00
Donald Sharp
cc9f21da22 *: Change thread->func to return void instead of int
The int return value is never used.  Modify the code
base to just return a void instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-23 19:56:04 -05:00
whichbug
ac3133450d isisd: fix #10505 using base64 encoding
Using base64 instead of the raw string to encode
the binary data.

Signed-off-by: whichbug <whichbug@github.com>
2022-02-22 15:27:30 -05:00
Donald Sharp
4d7aae38ab lib: Fix possible usage of uninited data
assert when if_lookup_address is passed with
a family that is not AF_INET or AF_INET6 as
that we are dead in the water and this is a
dev escape

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-22 11:02:15 -05:00