This patch adds documentation for additional offload modes and
associated parameters in tc-mqprio.
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Sample output:
$ sudo ./ip/ip fou add port 111 ipproto 11
$ sudo ./ip/ip fou add port 222 ipproto 22 -6
$ ./ip/ip fou show
port 222 ipproto 22 -6
port 111 ipproto 11
Signed-off-by: Greg Greenway <ggreenway@apple.com>
GCC version 7.2.1 complains that 'result1' may be used uninitialized in
parse_action_control_slash_spaces(). This should not be possible in
practice, so the actual value 'result1' is initialized with does not
matter.
Signed-off-by: Phil Sutter <phil@nwl.cc>
The function parse_action_control_slash() returns early if 'p' is NULL,
so after the first call to action_a2n(), 'p' is guaranteed not to be
NULL. Otherwise, the assignment '*p = 0' above would dereference the
NULL pointer already anyway, so just drop this check here.
Signed-off-by: Phil Sutter <phil@nwl.cc>
commit 28033ae4e0f ("net: netlink: Update attr validation to require
exact length for some types") introduces a stricter control on attributes
of type NLA_U* and NLA_S*.
Since the tipc tool is sending a family attribute of u32 instead of as
expected u16 the tool is now effectively broken.
We fix this by changing the type of the said attribute.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
As was reported [1], the iproute2 fails to compile on old systems,
in Cong's case, it was Fedora 19, in our case it was RedHat 7.2, which
failed with the following errors during compilation:
ipxfrm.c: In function ‘xfrm_selector_print’:
ipxfrm.c:479:7: error: ‘IPPROTO_MH’ undeclared (first use in this
function)
case IPPROTO_MH:
^
ipxfrm.c:479:7: note: each undeclared identifier is reported only once
for each function it appears in
ipxfrm.c: In function ‘xfrm_selector_upspec_parse’:
ipxfrm.c:1345:8: error: ‘IPPROTO_MH’ undeclared (first use in this
function)
case IPPROTO_MH:
^ make[1]: *** [ipxfrm.o] Error 1
The reason to it is the order of headers files. The IPPROTO_MH field is
set in kernel's UAPI header file (in6.h), but only in case
__UAPI_DEF_IPPROTO_V6 is set before. That define comes from other kernel's
header file (libc-compat.h) and is set in case there are no previous
libc relevant declarations.
In ip code, the include of <netdb.h> causes to indirect inclusion of
<netinet/in.h> and it sets __UAPI_DEF_IPPROTO_V6 to be zero and prevents from
IPPROTO_MH declaration.
This patch takes the simplest possible approach to fix the compilation
error by checking if IPPROTO_MH was defined before and in case it
wasn't, it defines it to be the same as in the kernel.
[1] https://www.spinics.net/lists/netdev/msg463980.html
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Riad Abo Raed <riada@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Any iproute utility that uses any function from lib/utils.c needs
to declare its own resolve_hosts variable instance although it does
not need/use hostname resolving functionality (currently only 'ip'
and 'ss' commands uses this).
The patch declares single common instance of resolve_hosts directly
in utils.c so the existing ones can be removed (the same approach
that is used for timestamp_short).
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
In order to calculate the idleSlope parameter of CBS correctly, users
must take into account the entire packet size, including the overhead
from all layers.
Add some more details to the man page to clarify that, giving one
simple example and pointing users to the correct 802.1Q section for
further clarifications if needed.
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Kernel can now return non-fatal error messages in extack facility.
Update iproute2 to dump to use if present.
- rename nl_dump_ext_err to nl_dump_ext_ack
- rename errmsg to msg
- add call to nl_dump_ext_ack in rtnl_dump_done and __rtnl_talk for
non-error path
Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Using 'ip deleteall' with policies that have marks, fails unless you
eplicitely specify the mark values. This is very uncomfortable when
bulk-deleting policies and states. With this patch all relevant states
and policies are wiped by 'ip deleteall' regardless of their mark
values.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
Socket polices are added to a socket using setsockopt(2). They cannot be
deleted by iproute2. The attempt to delete them causes an error
(EINVAL).
To avoid this unnecessary error message all socket policies are skipped
in xfrm_policy_keep.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
Listing policies on systems with a lot of socket policies can be
confusing due to the number of returned polices. Even if socket polices
are not of interest, they cannot be filtered. This patch adds an option
to filter all socket policies from the output.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
This patch was previously submitted as RFC. Submitting this as
non-RFC now that the classid reservation scheme for hardware
traffic classes and offloads to route packets to a hardware
traffic class are accepted in net-next.
HW traffic classes 0 through 15 are represented using the
reserved classid values :ffe0 - :ffef.
Example:
Match Dst IPv4,Dst Port and route to TC1:
# tc filter add dev eth0 protocol ip parent ffff:\
prio 1 flower dst_ip 192.168.1.1/32\
ip_proto udp dst_port 12000 skip_sw\
hw_tc 1
# tc filter show dev eth0 parent ffff:
filter pref 1 flower chain 0
filter pref 1 flower chain 0 handle 0x1 hw_tc 1
eth_type ipv4
ip_proto udp
dst_ip 192.168.1.1
dst_port 12000
skip_sw
in_hw
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
The Credit Based Shaper (CBS) queueing discipline allows bandwidth
reservation with sub-milisecond precision. It is defined by the
802.1Q-2014 specification (section 8.6.8.2 and Annex L).
The syntax is:
tc qdisc add dev DEV parent NODE cbs locredit <LOCREDIT>
hicredit <HICREDIT> sendslope <SENDSLOPE>
idleslope <IDLESLOPE>
(The order is not important)
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch was previously submitted as RFC. Submitting this as
non-RFC now that the tc/mqprio changes are accepted in net-next.
Adds new mqprio options for 'mode' and 'shaper'. The mode
option can take values for offload modes such as 'dcb' (default),
'channel' with the 'hw' option set to 1. The new 'channel' mode
supports offloading TCs and other queue configurations. The
'shaper' option is to support HW shapers ('dcb' default) and
takes the value 'bw_rlimit' for bandwidth rate limiting. The
parameters to the bw_rlimit shaper are minimum and maximum
bandwidth rates. New HW shapers in future can be supported
through the shaper attribute.
# tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1 1 1 1\
queues 4@0 4@4 hw 1 mode channel shaper bw_rlimit\
min_rate 1Gbit 2Gbit max_rate 4Gbit 5Gbit
# tc qdisc show dev eth0
qdisc mqprio 804a: root tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
queues:(0:3) (4:7)
mode:channel
shaper:bw_rlimit min_rate:1Gbit 2Gbit max_rate:4Gbit 5Gbit
v2: Avoid buffer overrun and minor cleanup.
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
IPvlan supported bridge-only functionality prior to commits
a190d04db937 ('ipvlan: introduce 'private' attribute for all
existing modes.') and fe89aa6b250c ('ipvlan: implement VEPA mode').
These two commits allow to configure the VEPA and private modes now.
This patch adds those options in ip command.
e.g.
bash:~# ip link add link eth0 name ipvl0 type ipvlan mode l2 private
-or-
bash:~# ip link add link eth0 type ipvl0 type ipvlan mode l2 vepa
Also the output will reflect the mode and the mode-flag accordingly.
e.g.
bash:~# ip -details link show ipvl0
4: ipvl0@eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc ...
link/ether 00:1a:11:44:a5:3e brd ff:ff:ff:ff:ff:ff promiscuity 0
ipvlan mode l2 private addrgenmode eui64 numtxqueues 1 ...
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
If Netid or State columns are missing, we must not subtract one
for each of these two columns from the remaining screen width,
while distributing available space to columns. This one
character corresponding to one delimiting space has to be
subtracted only if the columns are actually printed.
Further, in the existing implementation, if the screen width is
an odd number, one additional character is added to the width of
one of the two columns.
But if both are not printed, this filling character needs to be
added somewhere else, in order to have the right spacing
allowing us to fill lines completely.
Address and port fields are printed in pairs (local and remote),
so we can't distribute the space to any of them, because it
would be doubled. Instead, print this additional space to the
right of the Send-Q column, to keep code changes to a minimum.
This is particularly visible with 'ss -f netlink -Z'. Before
this patch, with an 80 column terminal, we have:
$ ss -f netlink -Z|head -n3
Recv-Q Send-Q Local Address:Port Peer Address:Port
0 0 rtnl:evolution-calen/2049 * pr
oc_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
0 0 rtnl:clock-applet/1944 * pr
oc_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
and with an 81 column terminal:
$ ss -f netlink -Z|head -n3
Recv-Q Send-Q Local Address:Port Peer Address:Port
0 0 rtnl:evolution-calen/2049 * pro
c_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
0 0 rtnl:clock-applet/1944 * pro
c_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
After this patch, in both cases, the output is:
$ ss -f netlink -Z|head -n3
Recv-Q Send-Q Local Address:Port Peer Address:Port
0 0 rtnl:evolution-calen/2049 *
proc_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
0 0 rtnl:clock-applet/1944 *
proc_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Both local address and service, and remote address and service
fields are already printed out in netlink_show_one() before we
start printing process context, by calling sock_addr_print()
twice.
At this point, sock_addr_print() has already forced the remote
service field to be 'serv_width' wide -- that is, 'serv_width'
width has already been consumed, before we print process
context.
Hence, it makes no sense to force the display width of process
context to be 'serv_width' wide again: previous prints have
filled up the line already. Remove the width specifier and
prefix with a space instead, to keep this consistent with fields
which are displayed after the first output line.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This patch adds fastopen_no_cookie option to enable/disable TCP fastopen
without a cookie on a per-route basis.
Support in Linux was added with 71c02379c762 (tcp: Configure TFO without
cookie per socket and/or per route).
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Use strtol-based API to parse and validate integer input; atoi() does
not detect errors and may yield undefined behaviour if result can't be
represented.
v2: use get_unsigned() since network namespace is really an unsigned value.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
IP6_TNL_F_ALLOW_LOCAL_REMOTE allows tunnel traffic on ip6tnl devices
where the remote endpoint is a local host address.
Specifying "[no]allow-localremote" controls the
IP6_TNL_F_ALLOW_LOCAL_REMOTE flag on ip6tnl interfaces.
This is the user-space counterpart for kernel
commit 908d140a87a7 ("ip6_tunnel: Allow rcv/xmit even if remote address is a local address")
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
This config maps to IFLA_BRPORT_VLAN_TUNNEL bridge port netlink
flag attribute. This flag enables vlan to tunnel mapping on a bridge
port. It is off by default.
set vlan_tunnel attribute on bridge port vxlan0:
$ip link set dev vxlan0 type bridge_slave vlan_tunnel on
$ip link set dev vxlan0 type bridge_slave vlan_tunnel off
or via bridge command
$bridge link set dev vxlan0 vlan_tunnel on
$bridge link set dev vxlan0 vlan_tunnel off
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
This patch changes ife_prio to ife_tcindex which is right variable to
assign in the argument in this case.
Signed-off-by: Alexander Aring <aring@mojatatu.com>