When table and vrftable are used in SRv6, ip should bail out if table
ids are not valid, and return a proper error message to the user.
Achieve this by simply checking the rtnl_rttable_a2n return value, as we
already do in the rest of iproute.
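The check can be sketched as follows. This is only an illustration:
table_a2n() is a hypothetical stand-in for rtnl_rttable_a2n() that
understands numeric ids and two well-known names, and parse_table_arg()
is an illustrative caller, not the code from this patch.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for rtnl_rttable_a2n(): 0 on success, -1 on error. */
static int table_a2n(unsigned int *id, const char *arg)
{
	char *end;
	unsigned long val;

	if (strcmp(arg, "main") == 0) {
		*id = 254;	/* RT_TABLE_MAIN */
		return 0;
	}
	if (strcmp(arg, "local") == 0) {
		*id = 255;	/* RT_TABLE_LOCAL */
		return 0;
	}

	/* Reject empty and negative input before handing it to strtoul(). */
	if (*arg == '\0' || *arg == '-')
		return -1;

	errno = 0;
	val = strtoul(arg, &end, 0);
	if (errno || *end != '\0' || val > 0xffffffffUL)
		return -1;

	*id = (unsigned int)val;
	return 0;
}

/* Caller side: report the bad argument and bail out instead of going on. */
static int parse_table_arg(const char *keyword, const char *arg,
			   unsigned int *id)
{
	if (table_a2n(id, arg)) {
		fprintf(stderr, "Error: invalid %s \"%s\"\n", keyword, arg);
		return -1;
	}
	return 0;
}
```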
Fixes: 0486388a87 ("add support for table name in SRv6 End.DT* behaviors")
Fixes: 69629b4e43 ("seg6: add support for vrftable attribute in SRv6 End.DT4/DT6 behaviors")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The kernel signals when offload fails using the 'RTM_F_OFFLOAD_FAILED'
flag. Print it to help users understand the offload state of the route.
The "rt_" prefix is used in order to distinguish it from the offload state
of nexthops, similar to "rt_offload" and "rt_trap".
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Since NETLINK_GET_STRICT_CHK was enabled, the kernel rejects commands
that pass a prefix length, e.g.:
ip route get 1.0.0.0/1
Error: ipv4: Invalid values in header for route get request.
ip route get 0.0.0.0/0
Error: ipv4: rtm_src_len and rtm_dst_len must be 32 for IPv4
Since there's no point in setting a rtm_dst_len that we know is going
to be rejected, just force it to the right value if it's passed on
the command line. Print a warning to stderr to notify users.
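A minimal sketch of the idea, not the code from this patch:
route_get_prefix_len() is a hypothetical helper, the warning text is
illustrative, and the 128-bit case for IPv6 is an assumption made by
analogy with the IPv4 error above.

```c
#include <stdio.h>
#include <sys/socket.h>

/*
 * Return the prefix length to place in a route get request.
 * Anything other than the full address length is replaced, with a
 * warning on stderr, because the kernel would reject it anyway.
 */
static int route_get_prefix_len(int family, int bitlen)
{
	/* Assumption: IPv6 requires 128 in the same way IPv4 requires 32. */
	int full = (family == AF_INET6) ? 128 : 32;

	if (bitlen != full) {
		fprintf(stderr,
			"Warning: /%d is not a valid prefix here, using /%d\n",
			bitlen, full);
		return full;
	}
	return bitlen;
}
```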
Bug-Debian: https://bugs.debian.org/944730
Reported-By: Clément 'wxcafé' Hertling <wxcafe@wxcafe.net>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The kernel might truncate VF info in IFLA_VFINFO_LIST. Compare the
expected number of VFs in IFLA_NUM_VF to how many were found in the
list and warn accordingly.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
* Fix PROTO description in help message (mpls isn't a valid argument).
* Remove SRCPORTMIN description from help message since it doesn't
appear in the syntax string.
* Use same keywords in help message and in man page.
* Use the "ethertype" option name (.B ethertype) rather than the
option value (.I ETHERTYPE) in the man page description of
[no]multiproto.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The len8_dlc element is filled by the CAN interface driver and used for CAN
frame creation by the CAN driver when the CAN_CTRLMODE_CC_LEN8_DLC flag is
supported by the driver and enabled via netlink configuration interface.
Add the command line support for cc-len8-dlc for Linux 5.11+
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David Ahern <dsahern@kernel.org>
Necessary to understand what is going on when bpf_program_load fails.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Before:
# ip nexthop help
Usage: ip nexthop { list | flush } [ protocol ID ] SELECTOR
ip nexthop { add | replace } id ID NH [ protocol ID ]
ip nexthop { get| del } id ID
SELECTOR := [ id ID ] [ dev DEV ] [ vrf NAME ] [ master DEV ]
[ groups ] [ fdb ]
NH := { blackhole | [ via ADDRESS ] [ dev DEV ] [ onlink ]
[ encap ENCAPTYPE ENCAPHDR ] | group GROUP ] }
GROUP := [ id[,weight]>/<id[,weight]>/... ]
ENCAPTYPE := [ mpls ]
ENCAPHDR := [ MPLSLABEL ]
After:
# ip nexthop help
Usage: ip nexthop { list | flush } [ protocol ID ] SELECTOR
ip nexthop { add | replace } id ID NH [ protocol ID ]
ip nexthop { get | del } id ID
SELECTOR := [ id ID ] [ dev DEV ] [ vrf NAME ] [ master DEV ]
[ groups ] [ fdb ]
NH := { blackhole | [ via ADDRESS ] [ dev DEV ] [ onlink ]
[ encap ENCAPTYPE ENCAPHDR ] | group GROUP [ fdb ] }
GROUP := [ <id[,weight]>/<id[,weight]>/... ]
ENCAPTYPE := [ mpls ]
ENCAPHDR := [ MPLSLABEL ]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch allows the user to set and retrieve the
IFLA_MACVLAN_BC_QUEUE_LEN parameter via the bcqueuelen
command line argument.
This parameter controls the requested size of the queue for
broadcast and multicast packets in the macvlan driver.
If not specified, the driver default (1000) will be used.
Note: The request is per macvlan but the actually used queue
length per port is the maximum of any request to any macvlan
connected to the same port.
For this reason, the used queue length IFLA_MACVLAN_BC_QUEUE_LEN_USED
is also retrieved and displayed in order to aid in the understanding
of the setting. However, it can of course not be directly set.
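The relation between the requested and the used value can be
illustrated as below. bc_queue_len_used() is a hypothetical helper
that only mirrors the rule stated in the note; the real computation
happens in the kernel driver.

```c
#include <stddef.h>

/*
 * Given the queue lengths requested by every macvlan attached to one
 * port, return the length the port effectively uses: the maximum.
 */
static unsigned int bc_queue_len_used(const unsigned int *requested, size_t n)
{
	unsigned int used = 0;

	for (size_t i = 0; i < n; i++) {
		if (requested[i] > used)
			used = requested[i];
	}
	return used;
}
```

So a macvlan that asks for 250 still sees 4000 as the used length if a
sibling on the same port asked for 4000.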
Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
Signed-off-by: David Ahern <dsahern@gmail.com>
The tools "ip" and "tc" use a flag "use_iec", which indicates whether, when
formatting rate values, the prefixes "K", "M", etc. should refer to powers
of 1024, or powers of 1000. The flag is currently kept as a global variable
in "ip" and "tc", but is nonetheless declared in util.h.
Instead, move the declaration to tool-specific headers ip/ip_common.h and
tc/tc_common.h.
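For readers unfamiliar with the flag, the sketch below shows the kind
of formatting it selects. format_rate() is a hypothetical function,
not the printer used by iproute2, and here the flag is passed as a
parameter rather than read from a global.

```c
#include <stdio.h>

/*
 * Format a rate in bits per second. With use_iec set, prefixes step
 * by 1024 and get an "i" suffix (Ki, Mi, ...); otherwise they step
 * by 1000.
 */
static void format_rate(char *buf, size_t len, unsigned long long bits,
			int use_iec)
{
	static const char prefixes[] = { '\0', 'K', 'M', 'G', 'T' };
	double kilo = use_iec ? 1024.0 : 1000.0;
	double rate = (double)bits;
	size_t i = 0;

	while (i < sizeof(prefixes) - 1 && rate >= kilo) {
		rate /= kilo;
		i++;
	}

	if (i == 0)
		snprintf(buf, len, "%.0fbit", rate);
	else
		snprintf(buf, len, "%.0f%c%sbit", rate, prefixes[i],
			 use_iec ? "i" : "");
}
```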
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
We introduce the "vrftable" attribute for supporting the SRv6 End.DT4 and
End.DT6 behaviors in iproute2.
The "vrftable" attribute indicates the routing table associated with
the VRF device used by SRv6 End.DT4/DT6 for routing IPv4/IPv6 packets.
The SRv6 End.DT4/DT6 is used to implement IPv4/IPv6 L3 VPNs based on Segment
Routing over IPv6 networks in multi-tenant environments.
It decapsulates the received packets and it performs the IPv4/IPv6 routing
lookup in the routing table of the tenant.
The SRv6 End.DT4/DT6 leverages a VRF device in order to force the routing
lookup into the associated routing table using the "vrftable" attribute.
Some examples:
$ ip -6 route add 2001:db8::1 encap seg6local action End.DT4 vrftable 100 dev eth0
$ ip -6 route add 2001:db8::2 encap seg6local action End.DT6 vrftable 200 dev eth0
Standard Output:
$ ip -6 route show 2001:db8::1
2001:db8::1 encap seg6local action End.DT4 vrftable 100 dev eth0 metric 1024 pref medium
JSON Output:
$ ip -6 -j -p route show 2001:db8::2
[ {
"dst": "2001:db8::2",
"encap": "seg6local",
"action": "End.DT6",
"vrftable": 200,
"dev": "eth0",
"metric": 1024,
"flags": [ ],
"pref": "medium"
} ]
v2:
- no changes made: resubmit after pulling out this patch from the kernel
patchset.
v1:
- mixing this patch with the kernel patchset confused patchwork.
Signed-off-by: Paolo Lungaroni <paolo.lungaroni@cnit.it>
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: David Ahern <dsahern@gmail.com>
If multiple ip processes are run at the same time to set up
separate network namespaces, and it is the first time so /run/netns
has to be set up first, and they end up doing it at the same time,
the processes might enter a recursive loop creating thousands of
mount points, which might crash the system depending on resources
available.
Try to take a flock on /run/netns before doing the mount() dance, to
ensure this cannot happen. But do not try too hard, and if it fails
continue after printing a warning, to avoid introducing regressions.
First reported on Debian: https://bugs.debian.org/949235
To reproduce (WARNING: run in a VM to avoid system lockups):
for i in {0..9}
do
strace -e trace=mount -e inject=mount:delay_exit=1000000 ip \
netns add "testnetns$i" 2>&1 | tee "$i.log" &
done
wait
The strace is to ensure the problem always reproduces, to add an
artificial synchronization point after the first mount().
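The locking approach can be sketched like this. lock_dir() and
unlock_dir() are hypothetical helpers written for illustration, not
the functions added by this patch; the point is that a failure to
lock only produces a warning and the caller continues.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Try to take an exclusive flock on a directory.
 * Returns an fd that holds the lock, or -1 after printing a warning.
 */
static int lock_dir(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0) {
		fprintf(stderr, "Warning: cannot open %s: %s\n",
			path, strerror(errno));
		return -1;
	}
	if (flock(fd, LOCK_EX) < 0) {
		fprintf(stderr, "Warning: cannot lock %s: %s\n",
			path, strerror(errno));
		close(fd);
		return -1;
	}
	return fd;
}

/* Release the lock, if one was taken. Safe to call with -1. */
static void unlock_dir(int fd)
{
	if (fd >= 0) {
		flock(fd, LOCK_UN);
		close(fd);
	}
}
```

A caller would take the lock, run the mount sequence whether or not
the lock succeeded, and then call unlock_dir() on the returned fd.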
Reported-by: Etienne Dechamps <etienne@edechamps.fr>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Do not hardcode /usr/lib/ip as a path; allow the library path to be
configured at run time.
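One common way to do this is an environment variable with the old
path as a fallback. The sketch below assumes that approach; the
variable name IP_LIB_DIR and the helper get_ip_lib_dir() are
assumptions for illustration and are not taken from the text above.

```c
#include <stdlib.h>

#define DEFAULT_LIB_DIR "/usr/lib/ip"

/* Return the library directory: the override if set, else the default. */
static const char *get_ip_lib_dir(void)
{
	const char *dir = getenv("IP_LIB_DIR");	/* assumed variable name */

	return (dir && *dir) ? dir : DEFAULT_LIB_DIR;
}
```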
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Gcc-10 complains about possible string length overflow.
This can't happen because the Ethernet address format is always limited to
18 characters or less. Just resize the temp buffer.
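The 18-byte bound follows from the format itself, as this small
sketch shows. format_mac() and MAC_STR_LEN are hypothetical names
used only for illustration.

```c
#include <stdio.h>

/* 6 bytes * 2 hex digits + 5 colons + terminating NUL */
#define MAC_STR_LEN 18

/* Format a 6-byte Ethernet address; returns the string length (17). */
static int format_mac(char *buf, size_t len, const unsigned char mac[6])
{
	return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
			mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```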
Fixes: 70dfb0b883 ("iplink: bridge: export bridge_id and designated_root")
Cc: nikolay@cumulusnetworks.com
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
There are direct calls in libbpf for bpf program load/attach.
So we could just use two wrapper functions for ipvrf and convert
them with libbpf support.
Function bpf_prog_load() is removed as it conflicts with the libbpf
function of the same name.
bpf.c is moved to bpf_legacy.c for later main libbpf support in
iproute2.
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch aims to add basic checking functions for later iproute2
libbpf support.
First we add check_libbpf() in configure to see if we have bpf library
support. By default the system libbpf will be used, but static linking
against a custom libbpf version can be achieved by passing libbpf DESTDIR
to variable LIBBPF_DIR for configure.
Another variable LIBBPF_FORCE is used to control whether to build iproute2
with libbpf. If set to on, then force to build with libbpf and exit if
not available. If set to off, then force to not build with libbpf.
When dynamically linking against libbpf, we can't be sure that the
version we discovered at compile time is actually the one we are
using at runtime. This can lead to hard-to-debug errors. So we add
a new file lib/bpf_glue.c and a helper function get_libbpf_version()
to get correct libbpf version at runtime.
Signed-off-by: Hangbin Liu <haliu@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Instead of rolling a custom on-off printer, use the one added to utils.c.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Instead of rolling a custom on-off printer, use the one added to utils.c.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Instead of rolling a custom on-off printer, use the one added to utils.c.
Note that _print_onoff() has an extra parameter for a JSON-specific flag
name. However that argument is not used, and never was. Therefore when
moving over to print_on_off(), drop this argument.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Invoke parse_on_off() from bridge_slave_parse_on_off() instead of
hand-rolling one. Exit on failure, because the invarg that was invoked here
before would.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Invoke parse_on_off() instead of rolling a custom function.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Currently, the nexthop flags are only printed when the nexthop has a
nexthop device. The offload / trap indication is therefore not printed
for nexthop groups.
Instead, always print the nexthop flags, regardless of whether the nexthop has a
nexthop device or not.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The kernel can now signal that a nexthop is trapping packets instead of
forwarding them. Print the flag to help users understand the offload
state of each nexthop.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The DCB tool will have to provide an interface to a number of fixed-size
arrays. Unlike the egress- and ingress-qos-map, it makes good sense to have
an interface to set all members to the same value. For example to set
strict priority on all TCs besides a select few, or to reset allocated
bandwidth to all zeroes, again besides several explicitly-given ones.
To support this usage, extend the parse_mapping() with a boolean that
determines whether this special use is supported. If "all" is given and
recognized, mapping_cb is called with the key of -1.
Have iplink_vlan pass false for allow_all.
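The "all" handling can be sketched on a single KEY:VALUE token as
below. parse_one_mapping() and set_tc() are hypothetical and much
simpler than parse_mapping(), which walks argc/argv; only the
allow_all flag and the key of -1 come from the description above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define NUM_TC 8

typedef int (*mapping_cb_t)(int key, const char *value, void *data);

/* Parse one "KEY:VALUE" token; "all:VALUE" yields key -1 if allowed. */
static int parse_one_mapping(const char *tok, bool allow_all,
			     mapping_cb_t cb, void *data)
{
	const char *colon = strchr(tok, ':');
	char *end;
	long key;

	if (!colon || colon == tok || colon[1] == '\0')
		return -1;

	if (allow_all && colon - tok == 3 && strncmp(tok, "all", 3) == 0)
		return cb(-1, colon + 1, data);

	errno = 0;
	key = strtol(tok, &end, 0);
	if (errno || end != colon || key < 0)
		return -1;

	return cb((int)key, colon + 1, data);
}

/* Example callback: fill an array of per-TC values. */
static int set_tc(int key, const char *value, void *data)
{
	int *tc = data;
	int v = atoi(value);

	if (key == -1) {
		for (int i = 0; i < NUM_TC; i++)
			tc[i] = v;
		return 0;
	}
	if (key >= NUM_TC)
		return -1;
	tc[key] = v;
	return 0;
}
```

With this shape, "all:5" followed by "2:9" sets every entry to 5 and
then overrides entry 2, which is the usage described above.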
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
VLAN netdevices have two similar attributes: ingress-qos-map and
egress-qos-map. These attributes can be configured with a series of
802.1-priority-to-skb-priority (and vice versa) mappings. A reusable helper
along those lines will be handy for configuration of various
priority-to-tc, tc-to-algorithm, and other arrays in DCB.
Therefore extract the logic to a function parse_mapping(), move to utils.c,
and dispatch to utils.c from iplink_vlan.c. That necessitates extraction of
a VLAN-specific parse_qos_mapping(). Do that, and propagate addattr_l()
return value up, unlike the original.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Take from the macsec code parse_one_of() and adapt so that it passes the
primary result as the main return value, and error result through a
pointer. That is the simplest way to make the code reusable across data
types without introducing extra magic.
Also from macsec take the specialization of parse_one_of() for parsing
specifically the strings "off" and "on".
Convert the macsec code to the new helpers.
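The calling convention can be sketched as follows. These are
simplified versions written for illustration; the parameter lists
are assumptions and the real helpers also print an error message.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of val in list as the primary result.
 * The error status goes through p_err: 0 on match, -EINVAL otherwise.
 */
static int parse_one_of(const char *val, const char *const *list,
			size_t len, int *p_err)
{
	for (size_t i = 0; i < len; i++) {
		if (list[i] && strcmp(val, list[i]) == 0) {
			*p_err = 0;
			return (int)i;
		}
	}
	*p_err = -EINVAL;
	return 0;
}

/* Specialization for "off" / "on": returns 0 or 1. */
static int parse_on_off(const char *val, int *p_err)
{
	static const char *const values[] = { "off", "on" };

	return parse_one_of(val, values, 2, p_err);
}
```

Because the result is the return value, a caller can assign it to a
field of any integer type and check the error separately.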
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
The code for handling batches is largely the same across iproute2 tools.
Extract a helper to handle the batch, and adjust the tools to dispatch to
this helper. Sandwich the invocation between prologue / epilogue code
specific for each tool.
Signed-off-by: Petr Machata <me@pmachata.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
`ip addr` fails when run under qemu-user-riscv64. This is likely due
to qemu-5.1 not doing translation of RTM_GETNSID calls. Aborting ip
completely is not helpful for the user however. This patch reworks
the error handling.
Before:
rtest:/ # ip a
2: host0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
request send failed: Operation not supported
link/ether 46:3f:2d:88:3d:db brd ff:ff:ff:ff:ff:ffrtest:/ #
Afterwards:
rtest:/ # ip a
2: host0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
rtnl_send(RTM_GETNSID): Operation not supported. Continuing anyway.
link/ether 46:3f:2d:88:3d:db brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.72.147/28 brd 192.168.72.159 scope global host0
valid_lft forever preferred_lft forever
inet6 fe80::443f:2dff:fe88:3ddb/64 scope link
valid_lft forever preferred_lft forever
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The XFRMA_SET_MARK_MASK attribute can be set in states (4.19+).
It is optional and the kernel default is 0xffffffff.
It is the mask of XFRMA_SET_MARK (a.k.a. XFRMA_OUTPUT_MARK in 4.18).
e.g.
./ip/ip xfrm state add output-mark 0x6 mask 0xab proto esp \
auth digest_null 0 enc cipher_null ''
ip xfrm state
src 0.0.0.0 dst 0.0.0.0
proto esp spi 0x00000000 reqid 0 mode transport
replay-window 0
output-mark 0x6/0xab
auth-trunc digest_null 0x30 0
enc ecb(cipher_null)
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
sel src 0.0.0.0/0 dst 0.0.0.0/0
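The "output-mark MARK/MASK" notation seen above can be produced by a
formatter along these lines. format_output_mark() is a hypothetical
helper; printing the mask only when one was supplied is an assumption
of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Format "output-mark 0xMARK" or "output-mark 0xMARK/0xMASK". */
static int format_output_mark(char *buf, size_t len, uint32_t mark,
			      bool has_mask, uint32_t mask)
{
	if (has_mask)
		return snprintf(buf, len, "output-mark 0x%x/0x%x",
				(unsigned int)mark, (unsigned int)mask);

	return snprintf(buf, len, "output-mark 0x%x", (unsigned int)mark);
}
```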
Signed-off-by: Antony Antony <antony@phenome.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
These were reported as IPv6-only and ignored:
# ip address add 192.0.2.2/24 dev dummy5 noprefixroute
Warning: noprefixroute option can be set only for IPv6 addresses
# ip address add 224.1.1.10/24 dev dummy5 autojoin
Warning: autojoin option can be set only for IPv6 addresses
Re-enable them for IPv4.
Fixes: 9d59c86e57 ("iproute2: ip addr: Organize flag properties structurally")
Signed-off-by: Adel Belhouane <bugs.a.b@free.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Used for tracking neighbour table overflows.
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Remove the extra space between the reported ipoib attrs - use only one
space instead of two.
Fixes: de0389935f ("iplink: Added support for the kernel IPoIB RTNL ops")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds support for recently
added link IFLA_PROTO_DOWN_REASON attribute.
IFLA_PROTO_DOWN_REASON enumerates reasons
for the already existing IFLA_PROTO_DOWN link
attribute.
$ cat /etc/iproute2/protodown_reasons.d/r.conf
0 mlag
1 evpn
2 vrrp
3 psecurity
$ ip link set dev vx10 protodown on protodown_reason vrrp on
$ ip link show dev vx10
14: vx10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
link/ether f2:32:28:b8:35:ff brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <vrrp>
$ ip -p -j link show dev vx10
[ {
<snip>
"proto_down": true,
"proto_down_reason": [ "vrrp" ]
} ]
$ ip link set dev vx10 protodown_reason mlag on
$ ip link show dev vx10
14: vx10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
link/ether f2:32:28:b8:35:ff brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <mlag,vrrp>
$ ip -p -j link show dev vx10
[ {
<snip>
"proto_down": true,
"proto_down_reason": [ "mlag","vrrp" ]
} ]
$ ip -p -j link show dev vx10
$ ip link set dev vx10 protodown off protodown_reason vrrp off
Error: Cannot clear protodown, active reasons.
$ ip link set dev vx10 protodown off protodown_reason mlag off
$
Note: for some reason the json and non-json key for protodown
are different (protodown and proto_down). I have kept the
same for protodown reason for consistency (protodown_reason and
proto_down_reason).
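How the "<mlag,vrrp>" string relates to the configuration file can be
sketched as below. The sketch assumes that active reasons are held as
a bitmask whose bit positions match the numbers in the file; the name
table is hard-coded here instead of being read from disk, and both
helpers are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Names indexed by bit number, as in the example r.conf above. */
static const char *const reason_names[32] = {
	"mlag", "evpn", "vrrp", "psecurity",
};

/* Append s at offset off without overrunning buf; return new offset. */
static size_t append(char *buf, size_t len, size_t off, const char *s)
{
	if (off < len)
		snprintf(buf + off, len - off, "%s", s);
	return off + strlen(s);
}

/* Render a reason bitmask as "<name,name,...>". */
static void format_protodown_reasons(char *buf, size_t len, uint32_t mask)
{
	size_t off = 0;
	int first = 1;

	off = append(buf, len, off, "<");
	for (int bit = 0; bit < 32; bit++) {
		char num[16];

		if (!(mask & (UINT32_C(1) << bit)))
			continue;
		if (!first)
			off = append(buf, len, off, ",");
		if (reason_names[bit]) {
			off = append(buf, len, off, reason_names[bit]);
		} else {
			/* No name configured: fall back to the bit number. */
			snprintf(num, sizeof(num), "%d", bit);
			off = append(buf, len, off, num);
		}
		first = 0;
	}
	append(buf, len, off, ">");
}
```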
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The XFRMA_SET_MARK_MASK attribute is set in states (4.19+).
It is the mask of XFRMA_SET_MARK (a.k.a. XFRMA_OUTPUT_MARK in 4.18).
Sample output (note the output-mark mask):
ip xfrm state
src 192.1.2.23 dst 192.1.3.33
proto esp spi 0xSPISPI reqid REQID mode tunnel
replay-window 32 flag af-unspec
output-mark 0x3/0xffffff
aead rfc4106(gcm(aes)) 0xENCAUTHKEY 128
if_id 0x1
Signed-off-by: Antony Antony <antony@phenome.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Indenting of 'ip link set' options below 'link-netns' was wrong, they
should be on the same level as the above.
While being at it, fix closing brackets in vf-specific options. Also
write node/port_guid parameters in upper-case without curly braces: They
are supposed to be replaced by values, not put literally.
Fixes: 8589eb4efd ("treewide: refactor help messages")
Fixes: 5a3ec4ba64 ("iplink: Update usage in help message")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch enhances the iplink command to add a proto parameter to
create a PRP device/interface similar to HSR. Both protocols are
quite similar and require a pair of Ethernet interfaces. So re-use
the existing HSR iplink command to create PRP device/interface as
well. Use proto parameter to differentiate the two protocols.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ip maddress add|del takes a MAC address as argument, so insist on
getting a length of ETH_ALEN bytes. This makes sure the passed argument
is actually a MAC address and especially not an IPv4 address which
was previously accepted and silently taken as a MAC address.
While at it, do not print *argv in the error path as this has been
modified by ll_addr_a2n() and doesn't contain the full string anymore,
which can lead to misleading error messages.
Also while at it, replace the hardcoded buffer size with the actual
buffer size using sizeof().
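The length check can be illustrated with a strict parser such as the
one below. parse_mac() is a hypothetical function based on sscanf();
it is not ll_addr_a2n() and does not reproduce its behaviour, it only
shows that demanding exactly ETH_ALEN bytes rejects an IPv4 address.

```c
#include <stdio.h>

#define ETH_ALEN 6	/* defined locally to keep the sketch self-contained */

/* Accept exactly six colon-separated hex bytes; 0 on success, -1 on error. */
static int parse_mac(const char *s, unsigned char mac[ETH_ALEN])
{
	unsigned int b[ETH_ALEN];
	char tail;
	int n;

	/* The trailing %c only converts if something follows the sixth byte. */
	n = sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x%c",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &tail);
	if (n != ETH_ALEN)
		return -1;

	for (int i = 0; i < ETH_ALEN; i++)
		mac[i] = (unsigned char)b[i];
	return 0;
}
```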
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Replace the iproute2 snapshot with a version string which is
autogenerated as part of the build process using git describe.
This will also allow seeing if the version of the command
is built from the same sources as upstream.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
If set, this flag allows creating an SA whose sequence number can cycle
in outbound packets.
Signed-off-by: Petr Vaněk <pv@excello.cz>
Signed-off-by: David Ahern <dsahern@kernel.org>
According to 'ip mptcp help', 'endpoint show' can accept no argument:
ip mptcp endpoint show [ id ID ]
It makes sense to print all endpoints when no filter is used.
So here if the following command is used, all endpoints are printed:
ip mptcp endpoint show
Same as:
ip mptcp endpoint
Fixes: 7e0767cd ("add support for mptcp netlink interface")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>