VLAN netdevices have two similar attributes: ingress-qos-map and
egress-qos-map. These attributes can be configured with a series of
802.1-priority-to-skb-priority (and vice versa) mappings. A reusable helper
along those lines will be handy for configuration of various
priority-to-tc, tc-to-algorithm, and other arrays in DCB.
Therefore extract the logic to a function parse_mapping(), move to utils.c,
and dispatch to utils.c from iplink_vlan.c. That necessitates extraction of
a VLAN-specific parse_qos_mapping(). Do that, and propagate addattr_l()
return value up, unlike the original.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Receiving a message in libmnl is a somewhat involved operation. Devlink's
mnlg library has an implementation that is going to be handy for other
tools as well. Extract it into a new helper.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allocation of a new netlink message with the two usual headers is reusable
with other netlink netlink message types. Extract it into a helper,
mnlu_msg_prepare(). Take the second header as an argument, instead of
passing in parameters to initialize it, and copy it in.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
This little dance of mnl_socket_open(), option setting, and bind, is the
same regardless of tool. Extract into a new module that should hold helpers
for working with libmnl, mnl_util.c.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The value of a number of booleans is shown as "on" and "off" in the plain
output, and as an actual boolean in JSON mode. Add a function that does
that.
RDMA tool already uses a function named print_on_off(). This function
always shows "on" and "off", even in JSON mode. Since there are probably
very few if any consumers of this interface at this point, migrate it to
the new central print_on_off() as well.
Signed-off-by: Petr Machata <me@pmachata.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Take from the macsec code parse_one_of() and adapt so that it passes the
primary result as the main return value, and error result through a
pointer. That is the simplest way to make the code reusable across data
types without introducing extra magic.
Also from macsec take the specialization of parse_one_of() for parsing
specifically the strings "off" and "on".
Convert the macsec code to the new helpers.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The code for handling batches is largely the same across iproute2 tools.
Extract a helper to handle the batch, and adjust the tools to dispatch to
this helper. Sandwitch the invocation between prologue / epilogue code
specific for each tool.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Manpage:
* Remove the extra "and to ip packets" part from command description
to make it more understandable.
* Redirect packets to eth1, instead of eth0, as told in the
description.
Help string:
* "mpls pop" can be followed by a CONTROL keyword.
* "mpls modify" can also set the MPLS_BOS field.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
* "vlan pop" can be followed by a CONTROL keyword.
* Add missing space in error message.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Vlad Buslov says:
====================
Implement support for terse dump mode which provides only essential
classifier/action info (handle, stats, cookie, etc.). Use new
TCA_DUMP_FLAGS_TERSE flag to prevent copying of unnecessary data from
kernel.
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
Implement support for classifier/action terse dump using new TCA_DUMP_FLAGS
tlv with only available flag value TCA_DUMP_FLAGS_TERSE. Set the flag when
user requested it with following example CLI (-br for 'brief'):
$ tc -s -br filter show dev ens1f0 ingress
filter protocol ip pref 49151 flower chain 0
filter protocol ip pref 49151 flower chain 0 handle 0x1
not_in_hw
action order 1: gact Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
filter protocol ip pref 49152 flower chain 0
filter protocol ip pref 49152 flower chain 0 handle 0x1
not_in_hw
action order 1: gact Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
In terse mode dump only outputs essential data needed to identify the
filter and action (handle, cookie, etc.) and stats, if requested by the
user. The intention is to significantly improve rule dump rate by omitting
all static data that do not change after rule is created.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Modify implementations that return error from action_until->print_aopt()
callback to silently skip actions that don't have their corresponding
TCA_ACT_OPTIONS attribute set (some actions already behave like this).
Print action kind before returning from action_until->print_aopt()
callbacks. This is necessary to support terse dump mode in following patch
in the series.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
On some systems (e.g. current Debian/stable) the inclusion
of utils.h pulled in some other things that may end up
defining __aligned, in a possibly different way than what
we had here.
Use our own definition only if there isn't one already.
Fixes: d5acae244f ("libnetlink: add nl_print_policy() helper")
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Commit 02a261b5ba ("m_mpls: add mac_push action") added a matches()
test for the "mac_push" string before the test for "modify".
This changes the previous behaviour as 'action m' used to match
"modify" while it now matches "mac_push".
Revert to the original behaviour by moving the "mac_push" test after
"modify".
Fixes: 02a261b5ba ("m_mpls: add mac_push action")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Tuong Lien says:
====================
This series adds two new options in the 'iproute2/tipc' command, enabling users
to use the new TIPC encryption features, i.e. the master key and rekeying which
have been recently merged in kernel.
The help menu of the "tipc node set key" command is also updated accordingly:
# tipc node set key --help
Usage: tipc node set key KEY [algname ALGNAME] [PROPERTIES]
tipc node set key rekeying REKEYING
KEY
Symmetric KEY & SALT as a composite ASCII or hex string (0x...) in form:
[KEY: 16, 24 or 32 octets][SALT: 4 octets]
ALGNAME
Cipher algorithm [default: "gcm(aes)"]
PROPERTIES
master - Set KEY as a cluster master key
<empty> - Set KEY as a cluster key
nodeid NODEID - Set KEY as a per-node key for own or peer
REKEYING
INTERVAL - Set rekeying interval (in minutes) [0: disable]
now - Trigger one (first) rekeying immediately
EXAMPLES
tipc node set key this_is_a_master_key master
tipc node set key 0x746869735F69735F615F6B657931365F73616C74
tipc node set key this_is_a_key16_salt algname "gcm(aes)" nodeid 1001002
tipc node set key rekeying 600
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
As supported in kernel, the TIPC encryption rekeying can be tuned using
the netlink attribute - 'TIPC_NLA_NODE_REKEYING'. Now we add the
'rekeying' option correspondingly to the 'tipc node set key' command so
that user will be able to perform that tuning:
tipc node set key rekeying REKEYING
where the 'REKEYING' value can be:
INTERVAL - Set rekeying interval (in minutes) [0: disable]
now - Trigger one (first) rekeying immediately
For example:
$ tipc node set key rekeying 60
$ tipc node set key rekeying now
The command's help menu is also updated with these descriptions for the
new command option.
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David Ahern <dsahern@gmail.com>
In addition to the support of master key in kernel, we add the 'master'
option to the 'tipc node set key' command for user to be able to
specify a key as master key during the key setting. This is carried out
by turning on the new netlink flag - 'TIPC_NLA_NODE_KEY_MASTER'.
For example:
$ tipc node set key "this_is_a_master_key" master
The command's help menu is also updated to give a better description of
all the available options.
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David Ahern <dsahern@gmail.com>
Guillaume Nault says:
====================
This patch series adds the possibility for TC to tunnel Ethernet frames
over MPLS.
Patch 1 allows adding or removing the Ethernet header.
Patch 2 allows pushing an MPLS LSE before the MAC header.
By combining these actions, it becomes possible to encapsulate an
entire Ethernet frame into MPLS, then add an outer Ethernet header
and send the resulting frame to the next hop.
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
Add support for the new TCA_MPLS_ACT_MAC_PUSH action (kernel commit
a45294af9e96 ("net/sched: act_mpls: Add action to push MPLS LSE before
Ethernet header")). This action let TC push an MPLS header before the
MAC header of a frame.
Example (encapsulate all outgoing frames with label 20, then add an
outer Ethernet header):
# tc filter add dev ethX matchall \
action mpls mac_push label 20 ttl 64 \
action vlan push_eth dst_mac 0a:00:00:00:00:02 \
src_mac 0a:00:00:00:00:01
This patch also adds an alias for ETH_P_TEB, since it is useful when
decapsulating MPLS packets that contain an Ethernet frame.
With MAC_PUSH, there's no previous Ethertype to modify. However, the
"protocol" option is still needed, because the kernel uses it to set
skb->protocol. So rename can_modify_ethtype() to can_set_ethtype().
Also add a test suite for m_mpls, which covers the new action and the
pre-existing ones.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add support for the new TCA_VLAN_ACT_POP_ETH and TCA_VLAN_ACT_PUSH_ETH
actions (kernel commit 19fbcb36a39e ("net/sched: act_vlan:
Add {POP,PUSH}_ETH actions"). These action let TC remove or add the
Ethernet at the head of a frame.
Drop an Ethernet header:
# tc filter add dev ethX matchall action vlan pop_eth
Push an Ethernet header (the original frame must have no MAC header):
# tc filter add dev ethX matchall action vlan \
push_eth dst_mac 0a:00:00:00:00:02 src_mac 0a:00:00:00:00:01
Also add a test suite for m_vlan, which covers these new actions and
the pre-existing ones.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
For some devices, updating the flash can take significant time during
operations where no status can meaningfully be reported. This can be
somewhat confusing to a user who sees devlink appear to hang on the
terminal waiting for the device to update.
Recent changes to the kernel interface allow such long running commands
to provide a timeout value indicating some upper bound on how long the
relevant action could take.
Provide a ticking counter of the time elapsed since the previous status
message in order to make it clear that the program is not simply stuck.
Display this message whenever the status message from the kernel
indicates a timeout value. Additionally also display the message if
we've received no status for more than couple of seconds. If we elapse
more than the timeout provided by the status message, replace the
timeout display with "timeout reached".
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
`ip addr` when run under qemu-user-riscv64, fails. This likely is due
to qemu-5.1 not doing translation of RTM_GETNSID calls. Aborting ip
completely is not helpful for the user however. This patch reworks
the error handling.
Before:
rtest:/ # ip a
2: host0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
request send failed: Operation not supported
link/ether 46:3f:2d:88:3d:db brd ff:ff:ff:ff:ff:ffrtest:/ #
Afterwards:
rtest:/ # ip a
2: host0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
rtnl_send(RTM_GETNSID): Operation not supported. Continuing anyway.
link/ether 46:3f:2d:88:3d:db brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.72.147/28 brd 192.168.72.159 scope global host0
valid_lft forever preferred_lft forever
inet6 fe80::443f:2dff:fe88:3ddb/64 scope link
valid_lft forever preferred_lft forever
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
In case of bad entries in /proc/mounts just skip cgroup cache initialization.
Cgroups in output will be shown as "unreachable:cgroup_id".
Fixes: d5e6ee0dac ("ss: introduce cgroup2 cache and helper functions")
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Reported-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Newer kernels can dump per-op policies, so print out the new
mapping attribute to indicate which op has which policy.
v2:
* print out both do/dump policy idx
v3:
* fix userspace API which renumbered after patch rebasing
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
Nikolay Aleksandrov says:
====================
This set adds support for IGMPv3/MLDv2 attributes, they're mostly
read-only at the moment. The only new "set" option is the source address
for S,G entries. It is added in patch 01 (see the patch commit message for
an example). Patch 02 shows a missing flag (fast_leave) for
completeness, then patch 03 shows the new IGMPv3/MLDv2 flags:
added_by_star_ex and blocked. Patches 04-06 show the new extra
information about the entry's state when IGMPv3/MLDv2 are enabled. That
includes its filter mode (include/exclude), source list with timers and
origin protocol (currently only static/kernel), in order to show the new
information the user must use "-d"/show_details.
Here's the output of a few IGMPv3 entries:
dev bridge port ens12 grp 239.0.0.1 src 20.21.22.23 temp filter_mode include proto kernel blocked 0.00
dev bridge port ens12 grp 239.0.0.1 src 8.9.10.11 temp filter_mode include proto kernel blocked 0.00
dev bridge port ens12 grp 239.0.0.1 src 1.2.3.1 temp filter_mode include proto kernel blocked 0.00
dev bridge port ens12 grp 239.0.0.1 temp filter_mode exclude source_list 20.21.22.23/0.00,8.9.10.11/0.00,1.2.3.1/0.00 proto kernel 26.65
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
Print the mdb entry's protocol (i.e. who added it) when it's available if
the user requested to show details (-d). Currently the only possible
values are RTPROT_STATIC (user-space added) or RTPROT_KERNEL
(automatically added by kernel). The value is kernel controlled.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Print the mdb entry's source list when it's available if the user
requested to show details (-d). Each source has an associated timer
which controls if traffic should be forwarded to that S,G entry (if the
timer is non-zero traffic is forwarded, otherwise it's not).
Currently the source list is kernel controlled and can't be changed by
user-space.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Print the mdb entry's filter mode when it's available if the user
requested to show details (-d). It can be either include or exclude.
Currently it's kernel controlled and can't be changed by user-space.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
With IGMPv3/MLDv2 support we have 2 new flags:
- added_by_star_ex: set when the S,G entry was automatically created
because of a *,G entry in EXCLUDE mode
- blocked: set when traffic for the S,G entry for that port has to be
blocked
Both flags are used only on the new S,G entries and are currently kernel
managed, i.e. similar to other flags which can't be set from user-space.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
We're not showing the fast_leave flag when it's set. Currently that can
be only when an mdb entry is being deleted due to fast leave, so it will
only affect mdb monitor.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds the user-space control and dump of mdb entry source
address. When setting the new MDBA_SET_ENTRY_ATTRS nested attribute is
used and inside is added MDBE_ATTR_SOURCE based on the address family.
When dumping we look for MDBA_MDB_EATTR_SOURCE and if present we add the
"src x.x.x.x" output. The source address will be always shown as it's
needed to match the entry to modify it from user-space.
Example:
$ bridge mdb add dev bridge port ens13 grp 239.0.0.1 src 1.2.3.4 permanent vid 100
$ bridge mdb show
dev bridge port ens13 grp 239.0.0.1 src 1.2.3.4 permanent vid 100
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The XFRMA_SET_MARK_MASK attribute can be set in states (4.19+)
It is optional and the kernel default is 0xffffffff
It is the mask of XFRMA_SET_MARK(a.k.a. XFRMA_OUTPUT_MARK in 4.18)
e.g.
./ip/ip xfrm state add output-mark 0x6 mask 0xab proto esp \
auth digest_null 0 enc cipher_null ''
ip xfrm state
src 0.0.0.0 dst 0.0.0.0
proto esp spi 0x00000000 reqid 0 mode transport
replay-window 0
output-mark 0x6/0xab
auth-trunc digest_null 0x30 0
enc ecb(cipher_null)
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
sel src 0.0.0.0/0 dst 0.0.0.0/0
Signed-off-by: Antony Antony <antony@phenome.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add health reporter test command and allow user to trigger a test event.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The recently added DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK allows
userspace to indicate how a device should handle subsections of a flash
component when updating. For example, a flash component might contain
vital data such as PCIe serial number or configuration fields such as
settings that control device bootup.
The overwrite mask allows specifying whether the device should overwrite
these subsections when updating from the provided image. If nothing is
specified, then the update is expected to preserve all vital fields and
configuration.
Add support for specifying the overwrite mask using the new "overwrite"
option to the flash command line.
By specifying "overwrite identifiers", the user request that the flash
update should overwrite any settings in the updated flash component with
settings from the provided flash image
$devlink dev flash pci/0000:af:00.0 file flash_image.bin overwrite identifiers
By specifying "overwrite settings" the user requests that the flash update
should overwrite any settings in the updated flash component with setting
values from the provided flash image.
$devlink dev flash pci/0000:af:00.0 file flash_image.bin overwrite settings
These options may be combined, in which case both subsections will be sent
in the overwrite mask, resulting in a request to overwrite all settings and
identifiers stored in the updated flash components.
$devlink dev flash pci/0000:af:00.0 file flash_image.bin overwrite settings overwrite identifiers
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
These were reported as IPv6-only and ignored:
# ip address add 192.0.2.2/24 dev dummy5 noprefixroute
Warning: noprefixroute option can be set only for IPv6 addresses
# ip address add 224.1.1.10/24 dev dummy5 autojoin
Warning: autojoin option can be set only for IPv6 addresses
This enables them back for IPv4.
Fixes: 9d59c86e57 ("iproute2: ip addr: Organize flag properties structurally")
Signed-off-by: Adel Belhouane <bugs.a.b@free.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Used for tracking neighbour table overflows.
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Remove the extra space between the reported ipoib attrs - use only one
space instead of two.
Fixes: de0389935f ("iplink: Added support for the kernel IPoIB RTNL ops")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The patch exposes statistics for XDP sockets which can be useful for
debugging purposes.
The stats exposed are:
rx dropped
rx invalid
rx queue full
rx fill ring empty
tx invalid
tx ring empty
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>