Necessary to understand what is going on when bpf_program_load fails
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Since moving get_rate() and get_size() from tc to lib, on some
systems we fail to link because of missing math lib.
Move the functions that require math lib to their own c file
and add -lm to dcb that now use those functions.
../lib/libutil.a(utils.o): In function `get_rate':
utils.c:(.text+0x10dc): undefined reference to `floor'
../lib/libutil.a(utils.o): In function `get_size':
utils.c:(.text+0x1394): undefined reference to `floor'
../lib/libutil.a(json_print.o): In function `sprint_size':
json_print.c:(.text+0x14c0): undefined reference to `rint'
json_print.c:(.text+0x14f4): undefined reference to `rint'
json_print.c:(.text+0x157c): undefined reference to `rint'
Fixes: f3be0e6366 ("lib: Move get_rate(), get_rate64() from tc here")
Fixes: 44396bdfcc ("lib: Move get_size() from tc here")
Fixes: adbe5de966 ("lib: Move sprint_size() from tc here, add print_size()")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Machata says:
====================
Add support to the dcb tool for the following two DCB objects:
- APP, which allows configuration of traffic prioritization rules based on
several possible packet headers.
- DCBX, which is a 1-byte bitfield of flags that configure whether the DCBX
protocol is implemented in the device or in the host, and which version
of the protocol should be used.
Patch #1 adds a new helper for finding a name of a given dsfield value.
This is useful for APP DSCP-to-priority rules, which can use human-readable
DSCP names.
Patches #2, #3 and #4 extend existing interfaces for, respectively, parsing
of the X:Y mappings, for setting a DCB object, and for getting a DCB
object.
In patch #5, support for the command line argument -N / --Numeric is
added. The APP tool later uses it to decide whether to format DSCP values
as human-readable strings or as plain numbers.
Patches #6 and #7 add the subtools themselves and their man pages.
v2:
- Two patches dropped and sent to iproute2 branch as "dcb: Fixes".
This patch set now depends on that one.
- Patch #5:
- Make it -N / --Numeric instead of -n / --no-nice-names
- Rename the flag from no_nice_names to numeric as well
- Patch #6:
- Adjust to s/no_nice_names/numeric/ from another patch.
====================
Signed-off-by: David Ahern <dsahern@kernel.org>
The Linux DCBX object is a 1-byte bitfield of flags that configure whether
the DCBX protocol is implemented in the device or in the host, and which
version of the protocol should be used. Add a tool to access the per-port
Linux DCBX object.
For example:
# dcb dcbx set dev eni1np1 host ieee
# dcb dcbx show dev eni1np1
host ieee
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
DCB APP interfaces are standardized in 802.1q-2018, and allow configuration
of traffic prioritization rules based on several possible headers.
Add a dcb subtool for maintenance and display of the APP table. For
example:
# dcb app add dev eni1np1 dscp-prio 0:0 CS3:3 CS6:6
# dcb app show dev eni1np1
dscp-prio 0:0 CS3:3 CS6:6
# dcb app add dev eni1np1 dscp-prio CS3:4
# dcb app show dev eni1np1
dscp-prio 0:0 CS3:3 CS3:4 CS6:6
# dcb app replace dev eni1np1 dscp-prio CS3:5
# dcb app show dev eni1np1
dscp-prio 0:0 CS3:5 CS6:6
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Some DSCP values can be translated to symbolic names. That may not be
always desirable. Introduce a command-line option similar to other tools,
-N or --Numeric, to suppress this translation.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
The function dcb_get_attribute() assumes that the caller knows the exact
size of the looked-for payload. It also assumes that the response comes
wrapped in an DCB_ATTR_IEEE nest. The former assumption does not hold for
the IEEE APP table, which has variable size. The latter one does not hold
for DCBX, which is not IEEE-nested, and also for any CEE attributes, which
would come CEE-nested.
Factor out the payload extractor from the current dcb_get_attribute() code,
and put into a helper. Then rewrite dcb_get_attribute() compatibly in terms
of the new function. Introduce dcb_get_attribute_va() as a thin wrapper for
IEEE-nested access, and dcb_get_attribute_bare() for access to attributes
that are not nested.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
The function dcb_set_attribute() takes a fully-formed payload as an
argument. For callers that need to build a nested attribute, such as is the
case for DCB APP table, this is not great, because with libmnl, they would
need to construct a separate netlink message just to pluck out the payload
and hand it over to this function.
Currently, dcb_set_attribute() also always wraps the payload in an
DCB_ATTR_IEEE container, because that is what all the dcb subtools so far
needed. But that is not appropriate for DCBX in particular, and in fact a
handful other attributes, as well as any CEE payloads.
Instead, generalize this code by adding parameters for constructing a
custom payload and for fetching the response from a custom response
attribute. Then add dcb_set_attribute_va(), which takes a callback to
invoke in the right place for the nest to be built, and
dcb_set_attribute_bare(), which is similar to dcb_set_attribute(), but does
not encapsulate the payload in an IEEE container. Rewrite
dcb_set_attribute() compatibly in terms of the new functions.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
The function parse_mapping() assumes the key is a number, with a single
configurable exception, which is using "all" to mean "all possible keys".
If a caller wishes to use symbolic names instead of numbers, they cannot
reuse this function.
To facilitate reuse in these situations, convert parse_mapping() into a
helper, parse_mapping_gen(), which instead of an allow-all boolean takes a
generic key-parsing callback. Rewrite parse_mapping() in terms of this
newly-added helper and add a pair of key parsers, one for just numbers,
another for numbers and the keyword "all". Publish the latter as well.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
For formatting DSCP (not full dsfield), it would be handy to be able to
just get the name from the name table, and not get any of the remaining
cruft related to formatting. Add a new entry point to just fetch the
name table string uninterpreted. Use it from rtnl_dsfield_n2a().
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
The json output of the TCA_FLOWER_KEY_MPLS_OPTS attribute was invalid.
Example:
$ tc filter add dev eth0 ingress protocol mpls_uc flower mpls \
lse depth 1 label 100 \
lse depth 2 label 200
$ tc -json filter show dev eth0 ingress
...{"eth_type":"8847",
" mpls":[" lse":["depth":1,"label":100],
" lse":["depth":2,"label":200]]}...
This is invalid as the arrays, introduced by "[", can't contain raw
string:value pairs. Those must be enclosed into "{}" to form valid json
ojects. Also, there are spurious whitespaces before the mpls and lse
strings because of the indentation used for normal output.
Fix this by putting all LSE parameters (depth, label, tc, bos and ttl)
into the same json object. The "mpls" key now directly contains a list
of such objects.
Also, handle strings differently for normal and json output, so that
json strings don't get spurious indentation whitespaces.
Normal output isn't modified.
The json output now looks like:
$ tc -json filter show dev eth0 ingress
...{"eth_type":"8847",
"mpls":[{"depth":1,"label":100},
{"depth":2,"label":200}]}...
Fixes: eb09a15c12 ("tc: flower: support multiple MPLS LSE match")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This to keep compatible with the major tools, ip and tc. Also
document the option in the man page, which was neglected.
Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
DCB socket buffer is allocated in dcb_init(), but never freed(). Free it
in dcb_fini().
Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
dcb currently sends all netlink messages with a type RTM_GETDCB, even the
set ones. Change to the appropriate type.
Fixes: 67033d1c1c ("Add skeleton of a new tool, dcb")
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
To allow building a new suite of DCB tools on an older kernel, carry a copy
of dcbnl.h.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add support in rdma for extack errors to be received
in userspace when sent from kernel, so now netlink extack
error messages sent from kernel would be printed for the
user.
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Before:
# ip nexthop help
Usage: ip nexthop { list | flush } [ protocol ID ] SELECTOR
ip nexthop { add | replace } id ID NH [ protocol ID ]
ip nexthop { get| del } id ID
SELECTOR := [ id ID ] [ dev DEV ] [ vrf NAME ] [ master DEV ]
[ groups ] [ fdb ]
NH := { blackhole | [ via ADDRESS ] [ dev DEV ] [ onlink ]
[ encap ENCAPTYPE ENCAPHDR ] | group GROUP ] }
GROUP := [ id[,weight]>/<id[,weight]>/... ]
ENCAPTYPE := [ mpls ]
ENCAPHDR := [ MPLSLABEL ]
After:
# ip nexthop help
Usage: ip nexthop { list | flush } [ protocol ID ] SELECTOR
ip nexthop { add | replace } id ID NH [ protocol ID ]
ip nexthop { get | del } id ID
SELECTOR := [ id ID ] [ dev DEV ] [ vrf NAME ] [ master DEV ]
[ groups ] [ fdb ]
NH := { blackhole | [ via ADDRESS ] [ dev DEV ] [ onlink ]
[ encap ENCAPTYPE ENCAPHDR ] | group GROUP [ fdb ] }
GROUP := [ <id[,weight]>/<id[,weight]>/... ]
ENCAPTYPE := [ mpls ]
ENCAPHDR := [ MPLSLABEL ]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Match all MPLS fields using smallest and highest possible values.
Test the two ways of specifying MPLS header matching:
* with the basic mpls_{label,tc,bos,ttl} keywords (match only on the
first LSE),
* with the more generic "lse" keyword (allows matching at different
depth of the MPLS label stack).
This test file allows to find problems like the one fixed by
Linux commit 7fdd375e3830 ("net: sched: Fix dump of MPLS_OPT_LSE_LABEL
attribute in cls_flower").
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch allows the user to set and retrieve the
IFLA_MACVLAN_BC_QUEUE_LEN parameter via the bcqueuelen
command line argument
This parameter controls the requested size of the queue for
broadcast and multicast packages in the macvlan driver.
If not specified, the driver default (1000) will be used.
Note: The request is per macvlan but the actually used queue
length per port is the maximum of any request to any macvlan
connected to the same port.
For this reason, the used queue length IFLA_MACVLAN_BC_QUEUE_LEN_USED
is also retrieved and displayed in order to aid in the understanding
of the setting. However, it can of course not be directly set.
Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
Signed-off-by: David Ahern <dsahern@gmail.com>
add_addr_accepted value is not printed if add_addr_signal value is 0.
Fix this properly looking for add_addr_accepted value, instead.
Fixes: 9c3be2c0ee ("ss: mptcp: add msk diag interface support")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
keys_ex is dinamically allocated with calloc on line 770, but
is not freed in case of error at line 823.
Fixes: 081d6c310d ("tc: pedit: Support JSON dumping")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
nlg_ntf is dinamically allocated in mnlg_socket_open(), and is freed on
the out: return path. However, some error paths do not free it,
resulting in memory leak.
This commit fix this using mnlg_socket_close(), and reporting the
correct error number when required.
Fixes: 9b13cddfe2 ("devlink: implement flash status monitoring")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Commit 924c43778a ("man: tc-ct.8: Add manual page for ct tc action")
add man page for tc-ct, but it brings with it a bogus block of text
in the benning of tc-flower man page.
This commit simply removes it.
Fixes: 924c43778a ("man: tc-ct.8: Add manual page for ct tc action")
Reported-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Machata says:
====================
Add support to the dcb tool for the following three DCB objects:
- PFC, for "Priority-based Flow Control", allows configuration of priority
lossiness, and related toggles.
- DCBNL buffer interfaces are an extension to the 802.1q DCB interfaces and
allow configuration of port headroom buffers.
- DCBNL maxrate interfaces are an extension to the 802.1q DCB interfaces
and allow configuration of rate with which traffic in a given traffic
class is sent.
Patches #1-#4 fix small issues in the current DCB code and man pages.
Patch #5 adds new helpers to the DCB dispatcher.
Patches #6 and #7 add support for command line arguments -s and -i. These
enable, respectively, display of statistical counters, and ISO/IEC mode of
rate units.
Patches #8-#10 add the subtools themselves and their man pages.
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
DCBNL maxrate interfaces are an extension to the 802.1q DCB interfaces and
allow configuration of rate with which traffic in a given traffic class is
sent.
Add a dcb subtool to allow showing and tweaking of this per-TC maximum
rate. For example:
# dcb maxrate show dev eni1np1
tc-maxrate 0:25Gbit 1:25Gbit 2:25Gbit 3:25Gbit 4:25Gbit 5:25Gbit 6:100Gbit 7:25Gbit
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
DCBNL buffer interfaces are an extension to the 802.1q DCB interfaces and
allow configuration of port headroom buffers.
Add a dcb subtool to allow showing and tweaking of buffer priority mapping
and buffer sizes. For example:
# dcb buf show dev eni1np1
prio-buffer 0:0 1:0 2:0 3:3 4:0 5:0 6:6 7:0
buffer-size 0:10000 1:0 2:0 3:70000 4:0 5:0 6:10000 7:0
total-size 221072
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allow switching "dcb" into the ISO/IEC mode of units by passing -i.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allow selective display of statistical counters by passing -s.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The DCB buffer object has a settable array of 32-bit quantities, and the
maxrate object of 64-bit ones. Adjust dcb_parse_mapping() and related
helpers to support 64-bit values in mappings, and add appropriate helpers.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
None, one, or many parameters can be given on the command line, but
the current synopsis allows only none or one. Fix it.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
"dcb ets show dev X help" currently shows full "ets" help instead of just
help for the show command. Fix it.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
getopt_long() currently includes "c" and "n" in the short option string.
These probably slipped in as a cut'n'paste, and are not actually accepted.
Remove them.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add pr_out_dev() helper function and use it both by cmd_dev_show_cb()
and by cmd_mon_show_cb().
Dev stats will be added on the next patch to dev context, so
cmd_mon_show_cb() should print the whole dev context and not just dev
handle.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add reload action and reload limit to devlink reload command to enable
the user to select the reload action required and constrains limits on
these actions that he may want to ensure.
The following reload actions are supported:
driver_reinit: driver entities re-initialization, applying
devlink-param and devlink-resource values.
fw_activate: firmware activate.
The uAPI is backward compatible, if the reload action option is omitted
from the reload command, the driver reinit action will be used.
Note that when required to do firmware activation some drivers may need
to reload the driver. On the other hand some drivers may need to reset
the firmware to reinitialize the driver entities. Therefore, the devlink
reload command returns the actions which were actually performed.
By default reload actions are not limited and driver implementation may
include reset or downtime as needed to perform the actions. However, if
reload limit is selected, the driver should perform only if it can do it
while keeping the limit constraints.
Reload limit added:
no_reset: No reset allowed, no down time allowed, no link flap and no
configuration is lost.
Command examples:
$devlink dev reload pci/0000:82:00.0 action driver_reinit
reload_actions_performed:
driver_reinit
$devlink dev reload pci/0000:82:00.0 action fw_activate
reload_actions_performed:
driver_reinit fw_activate
devlink dev reload pci/0000:82:00.1 action driver_reinit -jp
{
"reload": {
"reload_actions_performed": [ "driver_reinit" ]
}
}
devlink dev reload pci/0000:82:00.0 action fw_activate -jp
{
"reload": {
"reload_actions_performed": [ "driver_reinit","fw_activate" ]
}
}
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Petr Machata says:
==================
The DCB tool will have commands that deal with buffer sizes and traffic
rates. TC is another tool that has a number of such commands, and functions
to support them: get_size(), get_rate/64(), s/print_size() and
s/print_rate(). In this patchset, these functions are moved from TC to lib/
for possible reuse and modernized.
s/print_rate() has a hidden parameter of a global variable use_iec, which
made the conversion non-trivial. The parameter was made explicit,
print_rate() converted to a mostly json_print-like function, and
sprint_rate() retired in favor of the new print_rate. Patches #1 and #2
deal with this.
The intention was to treat s/print_size() similarly, but unfortunately two
use cases of sprint_size() cannot be converted to a json_print-like
print_size(), and the function sprint_size() had to remain as a discouraged
backdoor to print_size(). This is done in patch #3.
Patch #4 then improves the code of sprint_size() a little bit.
Patch #5 fixes a buglet in formatting small rates in IEC mode.
Patches #6 and #7 handle a routine movement of, respectively,
get_rate/64() and get_size() from tc to lib.
This patchset does not actually add any new uses of these functions. A
follow-up patchset will add subtools for management of DCB buffer and DCB
maxrate objects that will make use of them.
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
The function get_size() serves for parsing of sizes using a handly notation
that supports units and their prefixes, such as 10Kbit. This will be useful
for the DCB buffer size parsing. Move the function from TC to the general
library, so that it can be reused.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The functions get_rate() and get_rate64() are useful for parsing rate-like
values. The DCB tool will find these useful in the maxrate subtool.
Move them over to lib so that they can be easily reused.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
ISO/IEC units are distinguished from the decadic ones by using a prefixes
like "Ki", "Mi" instead of "K" and "M". The current code inserts the letter
"i" after the decadic unit when in IEC mode. However it does so even when
the prefix is an empty string, formatting 1Kbit in IEC mode as "1000ibit".
Fix by omitting the letter if there is no prefix.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>