When a node is top-level, we shouldn't stop the whole processing, we
should just skip this single node.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
`darr_avail` returns the available capacity excluding the already
existing terminating NULL byte. Take this into account when using
`darr_avail`. Otherwise, if the error length is a power of 2, the
capacity is never enough and the function stucks in an infinite loop.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
If the initial darr capacity is not enough for the output, the `ap` is
reused multiple times, which is wrong, because it may be altered by
`vsnprintf`. Make a copy of `ap` each time instead of reusing.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
By calling `ly_log_options` with `LY_LOSTORE`, the current code
effectively disables libyang logging and never enables it back. The call
is done to get the current logging options, but we don't really need
that. When looking for a schema node, we don't want neither to log nor
to store the error, so simply set the temporary options to 0.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When a prefix-list entry is updated, current NB code calls the
replacement code multiple times, once per each updated field. It means
that when multiple fields of an entry are changed in a single commit,
the replacement is done with an interim state of a prefix-list instead
of a final one. To fix the issue, we should call the replacement code
once, after all fields of an entry are updated.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When an access-list entry is updated, current NB code calls notification
hooks for each updated field. It means that when multiple fields of an
entry are changed in a single commit, the hooks are run with an interim
state of an access-list instead of a final one. To fix the issue, we
should call the hooks once, after all fields of an entry are updated.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Containers inside a choice's case must be treated as presence containers
as they can be explicitly created and deleted. They must have `create`
and `destroy` callbacks, otherwise the internal data they represent may
never be deleted.
The issue can be reproduced with the following steps:
- create an access-list with destination-network params
```
# access-list test seq 1 permit ip any 10.10.10.0 0.0.0.255
```
- delete the `destination-network` container
```
# mgmt delete-config /frr-filter:lib/access-list[name='test'][type='ipv4']/entry[sequence='1']/destination-network
# mgmt commit apply
MGMTD: No changes found to be committed!
```
As the `destination-network` container is non-presence, and all its
leafs are mandatory, mgmtd doesn't see any changes to be commited and
simply updates its YANG data tree without passing any updates to backend
daemons.
This commit fixes the issue by requiring `create` and `destroy`
callbacks for containers inside choice's cases.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When ordering operations, destroys must always come before other
operations, to correctly cover the change of a "case" in a "choice".
The problem can be reproduced with the following commands:
```
access-list test seq 1 permit 10.0.0.0/8
access-list test seq 1 permit host 10.0.0.1
access-list test seq 1 permit 10.0.0.0/8
```
Before this commit, the order of changes would be the following:
- `access-list test seq 1 permit 10.0.0.0/8`
- `modify` for `ipv4-prefix`
- `access-list test seq 1 permit host 10.0.0.1`
- `destroy` for `ipv4-prefix`
- `modify` for `host`
- `access-list test seq 1 permit 10.0.0.0/8`
- `modify` for `ipv4-prefix`
- `destroy` for `host`
As `destroy` for `host` is called last, it rewrites the fields that were
filled by `modify` callback of `ipv4-prefix`. This commit fixes this
problem by always calling `destroy` callbacks first.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Notifications are sent by mgmtd for each session of a client, so they
should be processed once per each session.
Also, add session_id parameter to an async_notification callback as all
other callbacks have this parameter.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When ordering the NB callbacks according to their priorities, if the
operation is "destroy" we should reverse the order, to destroy the
dependants before the dependencies.
This fixes the crash, that can be reproduced with the following steps:
```
frr# conf term file-lock
frr(config)# affinity-map map bit-position 10
frr(config)# interface test
frr(config-if)# link-params
frr(config-link-params)# affinity map
frr(config-link-params)# exit
frr(config-if)# exit
frr(config)# mgmt commit apply
frr(config)# no affinity-map map
frr(config)# interface test
frr(config-if)# link-params
frr(config-link-params)# no affinity map
frr(config-link-params)# exit
frr(config-if)# exit
frr(config)# mgmt commit apply
```
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Other objects depend on affinity-maps being created before them by using
leafref with require-instance true. Set the priority to ensure that.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Remove adding of line feeds when encondig. We're using these functions
only for encoding binary data for storing in YANG data tree.
According to RFC 7950, section 9.8.2:
```
9.8.2. Lexical Representation
Binary values are encoded with the base64 encoding scheme (see
Section 4 in [RFC4648]).
```
According to mentioned RFC 4648, section 3.1:
```
Implementations MUST NOT add line feeds to base-encoded data unless
the specification referring to this document explicitly directs base
encoders to add line feeds after a specific number of characters.
```
Therefore, line feeds must not be added to the encoded data.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Before this fix would always return empty results b/c there was no
libyang tree to print to output format.
Signed-off-by: Christian Hopps <chopps@labn.net>
Convert only when this is really needed, e.g. `match ip address prefix-list ...`.
Otherwise, we can't have mixed match clauses, like:
```
match ip address prefix-list p1
match evpn route-type prefix
```
This won't work, because the prefix is already converted, and we can't extract
route type, vni, etc. from the original EVPN prefix.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Currently, YANG notification processing is done using a special type of
callbacks registered in backend clients. In this commit, we start using
regular northbound infrastructure instead, because it already has a
convenient way of registering xpath-specific callbacks without the need
for creating additional structures for each necessary notification. We
also now pass a notification data to the callback, instead of a plain
JSON. This allows to use regular YANG library functions for inspecting
notification fields, instead of manually parsing the JSON.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Current code assumes that notification is always sent in stripped JSON
format and therefore notification xpath starts at the third symbol of
notification data. Assuming JSON is more or less fine, because this
representation is internal to FRR, but the assumption about the xpath is
wrong, because it won't work for not top-level notifications. YANG
allows to define notification as a child for some data node deep into
the tree and in this case notification data contains not only the
notification node itself, but also all its parents.
To fix the issue, parse the notification data and get its xpath from its
schema node.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When exiting from a level below the config node, like `router rip`,
vtysh executes a resync by sending "end" and "conf term [file-lock]"
commands to all the daemons. As statet in the description comment, it's
done "in case one of the daemons is somewhere else". I don't think this
actually ever happens, but even if it is, it is a bug in a daemon that
needs to be fixed. This resync was okay before the introduction of
mgmtd, but now it unlocks and locks back the datastores during the
configuration reading process, which can lead to a failure which is
explained in the previous commit.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There exists cases where just honoring the FD_LIMIT size
as given to us by the operating system makes no sense.
Let's just make a switch to allow for this for things
like vtysh and ospfclient which will never have 1k files
open at any given time.
Fixes: #15315
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We don't need to manually load built-in modules. This fixes the
following warning in mgmtd:
```
YANG model "ietf-yang-metadata@*" "*@*"not embedded, trying external file
```
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
mgmtd is supposed to only register CLI callbacks. If configuration
callbacks are registered, they are getting called on startup when mgmtd
reads config files, and they can use infrastructure that is not
initialized on mgmtd, or allocate some memory that is never freed.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
In a non-controlled startup, the rcu data structures were
not being created until after logging could happen. This
is bad. Move it so that the rcu data structures are
created first, before logging( HA! ) can happen.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
External libraries can re-enter the FRR code through a hook function. A
crash occurs when logging from this hook function if the library has
initiated a new pthread, as the FRR RCU context is not initialized for
this thread.
Add frr_pthread_non_controlled_startup() function to initialize a valid
RCU context within a FRR pthread context, originating from an external
pthread.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
It's unlikely that an operator will ever set a fd
limit of over 100k. Let's warn the operator that
things are in a bit of a wonky state.
Fixes: #15280
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Remove operational data check from CLI command. It never works in mgmtd
and it is not needed in backend daemons because it's done in
`lib_vrf_destroy` callback.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, staticd configuration is tightly coupled with VRF existence.
Because of that, it has to use a hack in NB infrastructure to create a
VRF configuration when at least one static route is configured for this
VRF. This hack is incompatible with mgmtd, because mgmtd doesn't execute
configuration callbacks. Because of that, the configuration may become
out of sync between mgmtd and staticd. There are two main cases:
1. Create static route in a VRF. The VRF data node will be created
automatically in staticd by the NB hack, but not in mgmtd.
2. Delete VRF which has some static routes configured. The static route
configuration will be deleted from staticd by the NB hack, but not
from mgmtd.
To fix the problem, decouple configuration of static routes from VRF
configuration. Now it is possible to configure static routes even if the
VRF doesn't exist yet. Once the VRF is created, staticd applies all the
preconfigured routes.
This change also fixes the problem with static routes being preserved in
the system when staticd "control-plane-protocol" container is deleted
but the VRF is still configured.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Create a single registry of default port values that daemons
are using. Most of these are vty ports, but there are some
others for features like ospfapi and zebra FPM.
Signed-off-by: Mark Stapp <mjs@labn.net>
pahole reports that the hash_bucket has 2 4 byte holes
in the data structure. Let's reorganize this a bit
and save 8 bytes per hash_bucket instance.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
It allows people not familiar with libyang and FRR internals to use
mgmtd FE API by looking only at `mgmt_msg_native.h` header. We still use
the same values to avoid a lot of mapping code, and ensure that any
change doesn't slip unnoticed by using static asserts.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Fix reference bandwidth description. It is Kbps, not Mbps.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
- use `apply_finish` callback when possible to avoid multiple applies per commit
- move table range working to the CLI handler
- remove unnecessary conditional compilation
- remove unnecessary boolean conversion
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When the daemon is partially mgmtd-converted, it receives configuration
from vty and mgmtmd simultaneosly. This configuration must be
synchronized.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When duplicating nodes, we should always keep flags, especially the
LYD_NEW flag that indicates not validated data. This allows to select a
new choice's case without the need to explicitly remove the existing one.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Current denominators are not integers and some of them lose precision
because of that, for example, 1e-6 is actually stored as
9.9999999999999995e-07 and 1-e12 is stored as 9.9999999999999998e-13.
When multiplying by such denominators, we receive incorrect values.
Changing denominators to integers and using division instead of
multiplication improves precision and solves the problem.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
It shouldn't be set unless some affinity is configured. NB callbacks
set this flag correctly when necessary.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Use consistent `e_somepath` names for expanded versions of `somepath`.
Also remove all paths from `config.h` and put them into
`lib/config_paths.h` - this is to make more obvious when someone is
doing something probably not quite properly structured.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Both of these belong in `/var/lib`, not `/var/run`.
Rather hilariously, the history read in
`mgmt_history_read_cmt_record_index` was always failing, because it was
doing a `file_exists(MGMTD_COMMIT_FILE_PATH)` check. Which is the wrong
macro - it's `.../commit-%s.json`, including the unprocessed `%s`, which
would never exist.
I guess noone ever tried if this actually works. Cool.
On the plus side, this means I don't have to implement legacy
compatibility for this, since it never worked to begin with.
(SQLite3 DB location is also changed in this commit since it also uses
`DAEMON_DB_DIR`.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
These functions load daemon-specific persistent state from
`/var/lib/frr` and supersede open-coded variants of similar calls in
ospfd, ospf6d and isisd to save GR state and/or sequence numbers.
Unlike the open-coded variants, the save call correctly `fsync()`s the
saved data to ensure disk contents are consistent.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This needs to be used for persistent state, which currently is misplaced
into `/var/run` / `/run` where it gets deleted across reboots.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
CLI for access/prefix list removal was using `nb_cli_apply_changes`
multiple times in the same command. It's fine for regular daemons but
not for mgmtd. Refactor the code to apply changes only once.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, when editing a leaf-list, `nb_candidate_edit` expects to
receive it's xpath without a predicate and the value in a separate
argument, and then creates the full xpath. This hack is complicated,
because it depends on the operation and on the caller being a backend or
not. Instead, let's require to always include the predicate in a
leaf-list xpath. Update all the usages in the code accordingly.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Previously each container created all it's decendents before descending into
the children and repeating the process.
Signed-off-by: Christian Hopps <chopps@labn.net>
- initialize the necessary bit when creating if_link_params
- fix CLI description to mark extended as the default mode
- correctly set mode to extended when using the "no" form of the command
- handle the "show_defaults" parameter correctly in cli_show callback
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When affinity mode is "standard", bit position cannot be greater than
31. Add a "must" statement to the YANG model to validate this, and
remove our custom validation code that does the same.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Change the type of affinity leaf-list in frr-zebra to a leafref with
"require-instance" property set to true. This change tells libyang to
automatically check that affinity-map exists before usage and doesn't
allow it to be deleted if it's referenced. It allows us to remove all
the manual code that is doing the same thing.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Add support of RPKI commands in the VRF configure context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a config that specifies per-deamon log file names.
Move the handy generated list of daemon names from vtysh to lib;
edit the gitignore files to match.
Signed-off-by: Mark Stapp <mjs@labn.net>
When creating an initial tree trunk for oper data walk, if the xpath
represents a leaf, the leaf is created with an incorrect empty value.
If it doesn't actually exist in daemon's oper data, its value is not
overwritten later and an empty value is returned in the result.
For example, when requesting
`/frr-interface:lib/interface[name='eth0']/description`, the result is:
```
{
"frr-interface:lib": {
"interface": [
{
"name": "eth0",
"description": ""
}
]
}
}
```
instead of an empty JSON that it should be.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Like in RESTCONF GET request and NETCONF get-data request, make it
possible to request state-only, config-only, or all data.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently it's the same as get-tree request for the backend, but it is
going to be expanded in the following commits.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Technically changing a leaf from uint16 to uint32 is a NBC change; however,
increasing this to uint32 should not break anyone in reality.
Signed-off-by: Christian Hopps <chopps@labn.net>
Setting this variable to true makes NB ignore only configuration-related
callbacks. CLI-related callbacks are still loaded and executed, so
rename the variable to make it clearer.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Replace operation removes the current data node configuration and sets
the provided value. As current northbound code works only with one
xpath at a time, the operation only makes sense to clear the config of
a container without deleting it itself. However, the next step is to
allow passing JSON-encoded complex values to northbound operations which
will make replace operation much more useful.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, there's a single operation type which doesn't return error
if the object doesn't exists. To be compatible with NETCONF/RESTCONF,
we should support differentiate between DELETE (fails when object
doesn't exist) and REMOVE (doesn't fail if the object doesn't exist).
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, there's no difference between CREATE and MODIFY operations.
To be compatible with NETCONF/RESTCONF, add new CREATE_EXCL operation
that throws an error if the configuration data already exists.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, nb_operation enum means two different things - edit operation
type (frontend part), and callback type (backend part). These types
overlap, but they are not identical. We need to add more operation
types to support NETCONF/RESTCONF integration, so it's better to have
separate enums to identify different entities.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The maximum number of file descriptors in an fd set is limited by
FD_SETSIZE. This limitation is important because the libc macros
FD_SET(), FD_CLR() and FD_ISSET() will invoke a sigabort if the size of
the fd set given to them is above FD_SETSIZE.
We ran into such a sigabort with bgpd because snmp can return an fd set
of size higher than FD_SETSIZE when calling snmp_select_info(). An
unfortunate FD_ISSET() call later causes the following abort:
Received signal 6 at 1701115534 (si_addr 0xb94, PC 0x7ff289a16a7c); aborting...
/lib/x86_64-linux-gnu/libfrr.so.0(zlog_backtrace_sigsafe+0xb3) [0x7ff289d62bba]
/lib/x86_64-linux-gnu/libfrr.so.0(zlog_signal+0x1b4) [0x7ff289d62a1f]
/lib/x86_64-linux-gnu/libfrr.so.0(+0x102860) [0x7ff289da4860]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7ff2899c2520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c) [0x7ff289a16a7c]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16) [0x7ff2899c2476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3) [0x7ff2899a87f3]
/lib/x86_64-linux-gnu/libc.so.6(+0x896f6) [0x7ff289a096f6]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x2a) [0x7ff289ab676a]
/lib/x86_64-linux-gnu/libc.so.6(+0x1350c6) [0x7ff289ab50c6]
/lib/x86_64-linux-gnu/libc.so.6(+0x1366ab) [0x7ff289ab66ab]
/lib/x86_64-linux-gnu/libfrrsnmp.so.0(+0x36f5) [0x7ff2897736f5]
/lib/x86_64-linux-gnu/libfrrsnmp.so.0(+0x3c27) [0x7ff289773c27]
/lib/x86_64-linux-gnu/libfrr.so.0(thread_call+0x1c2) [0x7ff289dbe105]
/lib/x86_64-linux-gnu/libfrr.so.0(frr_run+0x257) [0x7ff289d56e69]
/usr/bin/bgpd(main+0x4f4) [0x560965c40488]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7ff2899a9d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7ff2899a9e40]
/usr/bin/bgpd(_start+0x25) [0x560965c3e965]
in thread agentx_timeout scheduled from /build/make-pkg/output/_packages/cp-routing/src/lib/agentx.c:122 agentx_events_update()
Also, the following error is logged by snmp just before the abort:
snmp[err]: Use snmp_sess_select_info2() for processing large file descriptors
snmp uses a custom struct netsnmp_large_fd_set to work above the limit
imposed by FD_SETSIZE. It is noteworthy that, when calling
snmp_select_info() instead of snmp_select_info2(), snmp uses the same
code working with its custom, large structs, and copy/paste the result
to a regular, libc compatible fd_set. So there should be no downside
working with snmp_select_info2() instead of snmp_select_info().
Replace every use of the libc file descriptors sets by snmp's extended
file descriptors sets in agentx to acommodate for the high number of
file descriptors that can come out of snmp. This should prevent the
abort seen above.
Signed-off-by: Edwin Brossette <edwin.brossette@6wind.com>