Incorporate into the `show thread cpu` the number of times
we have issued warnings about a particular thread being
too slow.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When generating SLOW_THREAD warnings let's differentiate between
a cpu bound process and a wall bound process. Effectively
a slow thread can now be a process in FRR doing lots of work( cpu bound )
or wall bound ( the cpu is heavy load and a FRR process may be pre-empted
and never scheduled ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet. Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition. And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...
With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.
Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.
Signed-off-by: David Lamparter <equinox@diac24.net>
CR, NL and CRNL are all OK, but CRNL shouldn't get treated as 2
newlines (which causes an empty command to be executed => empty prompt
line.)
Signed-off-by: David Lamparter <equinox@diac24.net>
The FDs are in struct vty, and there's ->fd and ->wfd, which shouldn't
be confused. Passing vty_sock along separately just creates mixups.
Signed-off-by: David Lamparter <equinox@diac24.net>
The logging code writes log messages with a `\n` line ending, meanwhile
the VTY code switches it so you need `\r\n`...
And we don't flush the newline after executing a command either.
After this patch, starting daemons like `zebra/zebra -t` should provide
a nice development/debugging experience with a VTY open right there on
stdio and `log stdout` interspersed.
(This is already documented in the man pages, it just looked like sh*t
previously since the log messages didn't newline correctly.)
Signed-off-by: David Lamparter <equinox@diac24.net>
The return from sockunion2hostprefix tells us if the conversion
succeeded or not. There are places in the code where we
always assume that it just `works`, since it can fail
notice and try to do the right thing.
Please note that failure of this function for most cases
of sockunion2hostprefix is highly highly unlikely as that
the sockunion was already created and tested elsewhere
it's just that this function can fail.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The Solaris code has gone through a deprecation cycle. No-one
has said anything to us and worse of all we don't have any test
systems running Solaris to know if we are making changes that
are breaking on Solaris. Remove it from the system so
we can clean up a bit.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
* use actual error code instead of "false"
* add missing new line
Before:
```
nfware# show interface | include (a]
% Regex compilation error: Success% Bad regexp '(a]'
% Unknown command: show interface | include (a]
```
After:
```
nfware# show interface | include (a]
% Regex compilation error: Unmatched ( or \(
% Bad regexp '(a]'
% Unknown command: show interface | include (a]
```
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When using the default CLI mode, the northbound layer needs to create
a separate transaction to process each YANG-modeled command since
they are supposed to be applied immediately (there's no candidate
configuration nor the "commit" command like in the transactional
CLI). The problem is that configuration transactions have an overhead
associated to them, in big part because of the use of some heavy
libyang functions like `lyd_validate()` and `lyd_diff()`. As of
now this overhead is substantial and doesn't scale well when large
numbers of transactions need to be performed in sequence.
As an example, loading 50k prefix-lists using a single transaction
takes about 2 seconds on a modern CPU. Loading the same 50k
prefix-lists using 50k transactions can take more than an hour
to complete (which is unacceptable by any standard). To fix this
problem, some heavy optimization work needs to be done on libyang and
on the FRR northbound itself too (e.g. perform partial configuration
diffs whenever possible). This, however, should be a long term
effort since these optimizations shouldn't be trivial to implement
and we're far from having the performance numbers we need.
In the meanwhile, this commit introduces a simple but efficient
workaround to alleviate the issue. In short, a new back-off timer
was introduced in the CLI to monitor and detect when too many
YANG-modeled commands are being received at the same time. When
a certain threshold is reached (100 YANG-modeled commands within
one second), the northbound starts to group all subsequent commands
into a single large transaction, which allows them to be processed
much faster (e.g. seconds and not hours). It's essentially a
protection mechanism that creates dynamically-sized transactions
when necessary to prevent performance issues from happening. This
mechanism is enabled both when parsing configuration files and when
reading commands from a terminal.
The downside of this optimization is that, if several YANG-modeled
commands are grouped into the same transaction and at least one of
them fails, the whole transaction is rejected. This is undesirable
since users don't expect transactional behavior when that's not
enabled explicitly. To minimize this issue, the CLI will log all
commands that were rejected whenever that happens, to make the
user aware of what happened and have enough information to fix
the problem. Commands that fail due to parsing errors or CLI-level
validations in general are rejected separately.
Again, this proposed workaround is intended to be temporary. The
goal is to provided a quick fix to issues like #6658 while we work
on better long-term solutions.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
Instead of returning only error codes (e.g. NB_ERR_VALIDATION)
to the northbound clients, do better than that and also return
a human-readable error message. This should make FRR more
automation-friendly since operators won't need to dig into system
logs to find out what went wrong in the case of an error.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The new northbound context structure contains information about
the client performing a configuration transaction. This information
will be made available to all configuration callbacks through the
args->context parameter.
The usefulness of this structure comes from the fact that it can be
used as a communication channel (both input and output) between the
northbound callbacks and the northbound clients. This can be done
through its "client_data" field which contains client-specific data.
This should cover some very specific scenarios where a northbound
callback should perform an action only if the configuration change
is coming from a given client. An example would be sending a PCEP
response to a PCE when an SR-TE policy is created or modified
through the PCEP northbound client (for that to happen, the
northbound callbacks need to have access to the PCEP request ID,
which needs to be available).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Replace sprintf with snprintf where straightforward to do so.
- sprintf's into local scope buffers of known size are replaced with the
equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
buffer followed by strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Rather than doing a f*gly hack for the RPKI code, let's do an on-exit
hook in cmd_node. Also allows replacing some special-casing in the vty
code.
Signed-off-by: David Lamparter <equinox@diac24.net>
And again for the name. Why on earth would we centralize this, just so
people can forget to update it?
Signed-off-by: David Lamparter <equinox@diac24.net>
Same as before, instead of shoving this into a big central list we can
just put the parent node in cmd_node.
Signed-off-by: David Lamparter <equinox@diac24.net>
There is really no reason to not put this in the cmd_node.
And while we're add it, rename from pointless ".func" to ".config_write".
[v2: fix forgotten ldpd config_write]
Signed-off-by: David Lamparter <equinox@diac24.net>
The only nodes that have this as 0 don't have a "->func" anyway, so the
entire thing is really just pointless.
Signed-off-by: David Lamparter <equinox@diac24.net>
RCA:
when client is killed, show running-config command crashes vtysh.
vtysh_client_config function temporarily makes vty->of which is standard output file
pointer to null inorder to suppress output to user.
This call further tries to communicate with each client and when the client
is terminated, socket call fails and hits the exception path to print the
connection has failed using vty_out.
vty_out crashes because vtysh_client_config has temporarily made vty->of
pointer to NULL to supress o/p to user.
Fix:
vty_out function should check for the sanity of vty->of pointer.
If it doesn't exist, this must have hit exception path, so use the
vty->saved_of if exists.
Signed-off-by: Saravanan K <saravanank@vmware.com>
The `--enable-pcreposix` configure option was not actually compiling
properly. Follow pre-existing pattern for inclusion of regex.h
or the pcreposix.h header.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Load the startup configuration directly into the CLI shared candidate
configuration instead of loading it into a private candidate
configuration. This way we don't need to initialize the shared
candidate separately later as a copy of the running configuration,
which is a potentially expensive operation.
Also, make the northbound process SIGHUP correctly even when --tcli
is not used.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Adding a lock to protect the global running configuration doesn't
help much since the FRR daemons are not prepared to process
configuration changes in a pthread that is not the main one (a
whole lot of new protections would be necessary to prevent race
conditions).
This means the lock added by commit 83981138 only adds more
complexity for no benefit. Remove it now to simplify the code.
All northbound clients, including the gRPC one, should either run
in the main pthread or use synchronization primitives to process
configuration transactions in the main pthread.
This reverts commit 83981138fe.
The FRR community has run into an issue where keeping up our
CI system to work with solaris has become a fairly large burden.
We have also sent emails and asked around and have not found
anyone standing up saying that they are using Solaris.
Given the fact that we do not have any comprehensive testing
being done w/ solaris and the fact that we are getting a steady
stream of new features that will never work on solaris and
we cannot find anyone to say that they are using it. Let's
start the drawn out process of deprecating the code.
If in the mean-time someone comes forward with the fact that
they are using it we can then not deprecate it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The correct cast for these is (unsigned char), because "char" could be
signed and thus have some negative value. isalpha & co. expect an int
arg that is positive, i.e. 0-255. So we need to cast to (unsigned char)
when calling any of these.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Adding a read with the address of the thread pointer we want to
use will allow lib/thread.c to properly handle your thread pointers.
Instead we were setting the pointer to NULL before we passed
into the _read and _write thread functions. Remove the NULL
pointer set and just let thread.c handle everything.
vty_stdio_resume and vty_read would blindly add read and write
which would cause vty_event() to drop the thread pointer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We need to be calling snprintfrr() instead of snprintf() in places that
wrap snprintf in some user-exposed way; otherwise the extensions won't
be available for those functions.
Signed-off-by: David Lamparter <equinox@diac24.net>
Even when using the classic CLI mode (i.e. when --tcli is not
used), the northbound code still uses vty->candidate_config
to perform configuration changes. From the perspective of the
user, the running configuration is being edited directly, but
under the hood the northbound layer does a full configuration
transaction for each command. When the running configuration is
edited by a northbound client other than the CLI (e.g. kernel,
gRPC), vty->candidate_config might become outdated, and this can
lead to lots of weird problems. To fix this, always regenerate
vty->candidate_config before each configuration command when
using the classic CLI mode. When using the transactional CLI,
the user needs to update the candidate manually using the "update"
command, otherwise the "commit" command will fail with this error:
"% Candidate configuration needs to be updated before commit".
Fixes some problems reported by Don after moving an interface from
one VRF to another one while zebra is running.
Reported-by: Don Slice <dslice@cumulusnetworks.com>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This makes printfrr extensions available in most of our format strings.
snprintf() is the obvious exception.
Signed-off-by: David Lamparter <equinox@diac24.net>
Add 'no log commands' cli and at the same time add a
--command-log-always to the daemon startup cli.
If --command-log-always is specified then all commands are
auto-logged and the 'no log commands' form of the command
is now ignored.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>