This patch adds support to lookup a neigh entry
using recently added support in the kernel using RTM_GETNEIGH
example:
$ip neigh get 10.0.2.4 dev test-dummy0
10.0.2.4 dev test-dummy0 lladdr de:ad:be:ef:13:37 PERMANENT
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds support to lookup a bridge fdb entry
using recently added support in the kernel using RTM_GETNEIGH
(and AF_BRIDGE family).
example:
$bridge fdb get 02:02:00:00:00:03 dev test-dummy0 vlan 1002
02:02:00:00:00:03 dev test-dummy0 vlan 1002 master bridge
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The man page of ip-macsec and the existance of the tool makes it seem like
the user could just configure static keys once, and be done with it. That is
not the case. Some form or key management must be done in user space.
Add a note about that.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Conflicts:
devlink/devlink.c
Fixed the conflict by updating the numbering for all new attributes
after the ones in master branch.
Signed-off-by: David Ahern <dsahern@gmail.com>
After commit 6df9c7a06a ("ss: add SK_MEMINFO_DROPS display") ss -m
displays also a drop counter for each socket.
This commit properly document it into the man page.
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Listen to status notifications coming from kernel during flashing and
put them on stdout to inform user about the status.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add document of setting the adaptive-moderation for the ib device.
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Add the command alias "change" to man page.
Don't show it on usage, since it is not commonly used.
Reported-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Matteo Croce <mcroce@redhat.com>
Add document of accessing the QP counter, including bind/unbind a QP
to a counter manually or automatically, and dump counter statistics.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add documentation for the latest options, flags and txtime-delay, to the
taprio manpage.
This also adds an example to run tc in txtime offload mode.
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Document the newly added option (skip_sock_check) on the etf man-page.
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add a man page describing the newly added TC mpls manipulation actions.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
ss by default shows data rates in human-readable form - as Mbps/Gbps etc.
Enhance --numeric mode to show raw values in bps, without conversion.
Signed-of-by: Tomasz Torcz <tomasz.torcz@nordea.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
URL for netem page on sources section points to a no more existent
resource. Fix this using the correct URL.
Fixes: cd72dcf13c ("netem: add man-page")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add a new parameter '-Numeric' to show the number of protocol, scope,
dsfield, etc directly instead of converting it to human readable name.
Do the same on tc and ss.
This patch is based on David Ahern's previous patch.
Suggested-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add nhid option for routes to use nexthop objects by id.
Example:
$ ip nexthop add id 1 via 10.99.1.2 dev veth1
$ ip route add 10.100.1.0/24 nhid 1
$ ip route ls
...
10.100.1.0/24 nhid 1 via 10.99.1.2 dev veth1
Signed-off-by: David Ahern <dsahern@gmail.com>
ctinfo is a tc action restoring data stored in conntrack marks to
various fields. At present it has two independent modes of operation,
restoration of DSCP into IPv4/v6 diffserv and restoration of conntrack
marks into packet skb marks.
It understands a number of parameters specific to this action in
additional to the usual action syntax. Each operating mode is
independent of the other so all options are optional, however not
specifying at least one mode is a bit pointless.
Usage: ... ctinfo [dscp mask [statemask]] [cpmark [mask]] [zone ZONE]
[CONTROL] [index <INDEX>]
DSCP mode
dscp enables copying of a DSCP stored in the conntrack mark into the
ipv4/v6 diffserv field. The mask is a 32bit field and specifies where
in the conntrack mark the DSCP value is located. It must be 6
contiguous bits long. eg. 0xfc000000 would restore the DSCP from the
upper 6 bits of the conntrack mark.
The DSCP copying may be optionally controlled by a statemask. The
statemask is a 32bit field, usually with a single bit set and must not
overlap the dscp mask. The DSCP restore operation will only take place
if the corresponding bit/s in conntrack mark ANDed with the statemask
yield a non zero result.
eg. dscp 0xfc000000 0x01000000 would retrieve the DSCP from the top 6
bits, whilst using bit 25 as a flag to do so. Bit 26 is unused in this
example.
CPMARK mode
cpmark enables copying of the conntrack mark to the packet skb mark. In
this mode it is completely equivalent to the existing act_connmark
action. Additional functionality is provided by the optional mask
parameter, whereby the stored conntrack mark is logically ANDed with the
cpmark mask before being stored into skb mark. This allows shared usage
of the conntrack mark between applications.
eg. cpmark 0x00ffffff would restore only the lower 24 bits of the
conntrack mark, thus may be useful in the event that the upper 8 bits
are used by the DSCP function.
Usage: ... ctinfo [dscp mask [statemask]] [cpmark [mask]] [zone ZONE]
[CONTROL] [index <INDEX>]
where :
dscp MASK is the bitmask to restore DSCP
STATEMASK is the bitmask to determine conditional restoring
cpmark MASK mask applied to restored packet mark
ZONE is the conntrack zone
CONTROL := reclassify | pipe | drop | continue | ok |
goto chain <CHAIN_INDEX>
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
while at it, fix missing square bracket near 'ptype' and a typo in the
action description (it's -> its).
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add man page to describe additional set netns command
for rdma device.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allow to limit 'ip xfrm {state|policy} list' output to a certain address
family and to delete all states/policies by family.
Although preferred_family was already set in filters, the filter
function ignored it. To enable filtering despite the lack of other
selectors, filter.use has to be set if family is not AF_UNSPEC.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch updates the tc-bpf.8 example application for changes to the
struct bpf_elf_map definition. In it's current form, things compile, but
the resulting object file is rejected by the verifier when attempting to
load it through tc.
Signed-off-by: Lucas Siba <lucas.siba@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
[ dropped the unnecessary flags initialization on commit ]
This patch adds support for the VLAN bridge binding flag that is
provided in net-next kernel by the series merged by 1ab839281cf7
("net-support-binding-vlan-dev-link-state-to-vlan-member-bridge-ports")
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds support for binding FOU ports using iproute2.
Kernel-support was added in 1713cb37bf67 ("fou: Support binding FoU
socket").
The parse function now handles new arguments for setting the
binding-related attributes, while the print function writes the new
attributes if they are set. Also, the man page has been updated.
v2->v3:
* Remove redundant ll_init_map()-calls (thanks David Ahern).
v1->v2 (all changes suggested by David Ahern):
* Fix reverse Christmas tree ordering.
* Remove redundant peer_port_set-variable, it is enough to check
peer_port.
* Add proper error handling of invalid local/peer addresses.
* Use interface name and not index.
* Remove updating fou-header file, it is already done.
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add support for manipulating and showing the vlan_stats_per_port bridge
option which can be toggled only when there are no port VLANs
configured. Also update the man page with the new option.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Interfaces take a 'if_id' which is an interface id which can be set on
an xfrm policy as its interface lookup key (XFRMA_IF_ID).
Signed-off-by: Matt Ellison <matt@arroyo.io>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This adds support for the newly added fwmark option to CAKE, which allows
overriding the tin selection from the per-packet firewall marks. The fwmark
field is a bitmask that is applied to the fwmark to select the tin.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Update the 'rdma link' man page with 'link add/delete' info.
Signed-off-by: Steve Wise <larrystevenwise@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Update man page to reflect the changes made in Linux.
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add a man page describing tipc link broadcast command get and set
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David Ahern <dsahern@gmail.com>
This adds configuration for the IFLA_BRPORT_MCAST_TO_UCAST flag that
allows multicast packets to be replicated as unicast packets.
Signed-off-by: Tobias Jungel <tobias.jungel@bisdn.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
We already print src_vni for a fdb entry when present.
This patch adds the ability to set src_vni on a fdb
entry. When not specified, kernel will use vni specified
on the vxlan device. This can be used on a vxlan fdb entry
when the vxlan device is in external or collect metadata
mode.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Default colors are not contrast enough on dark backround
and this functionality, which uses more suitable colors
is hidden in the code.
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Add a man page describing devlink health's command set. Also add a
reference link from devlink main man page.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add new command for updating flash of devices via devlink API.
Example:
$ cp flash-boot.bin /lib/firmware/
$ devlink dev flash pci/0000:05:00.0 file flash-boot.bin
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch simply changes the description of the mcast_flood flag
with "flood" instead of "be flooded with" to avoid confusion, and be
consistent with the description of the flooding flag, which "Controls
whether a given port will *flood* unicast traffic for which there is
no FDB entry."
At the same time, fix the documentation for the "flood" flag which
is incorrectly described as "flooding on" or "flooding off".
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Also show socket class_id/priority used by classful qdisc.
Kernel report this together with tclass since commit
("inet_diag: fix reporting cgroup classid and fallback to priority")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
AF_XDP is an address family that is optimized for high performance
packet processing.
This patch adds AF_XDP support to ss(8) so that sockets can be queried
and monitored.
Example:
$ sudo ss --xdp -e -p -m
Recv-Q Send-Q Local Address:Port Peer Address:Port
0 0 enp134s0f0:q20 *
users:(("xdpsock",pid=17787,fd=3)) ino:39424 sk:4
rx(entries:2048)
tx(entries:2048)
umem(id:1,size:8388608,num_pages:2048,chunk_size:2048,headroom:0,ifindex:7,
qid:20,zc:0,refs:1)
fr(entries:2048)
cr(entries:2048) skmem:(r0,rb212992,t0,tb212992,f0,w0,o0,bl0,d0)
0 0 enp24s0f0:q0 *
users:(("xdpsock",pid=17780,fd=3)) ino:37384 sk:5
rx(entries:2048)
tx(entries:2048)
umem(id:0,size:8388608,num_pages:2048,chunk_size:2048,headroom:0,ifindex:6,
qid:0,zc:1,refs:1)
fr(entries:2048)
cr(entries:2048) skmem:(r0,rb212992,t0,tb212992,f0,w0,o0,bl0,d0)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
ip tracks namespaces with dummy files in /var/run/netns/, but can't see
namespaces created with other tools.
Creating the dummy file and bind mounting the correct procfs entry will
make ip aware of that namespace.
Add an ip netns subcommand to automate this task.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reviewed-by: Andrea Claudi <aclaudi@redhat.com>
Tested-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
ip l add dev tun type gretap external
ip r a 10.0.0.1 encap ip dst 192.168.152.171 id 1000 dev gretap
For gretap example when the command set the id but don't set the
TUNNEL_KEY flags. There is no key field in the send packet
User can set flags with key, csum, seq
ip r a 10.0.0.1 encap ip dst 192.168.152.171 id 1000 key csum dev gretap
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Change the id parameter of the tunnel_key set action from mandatory to
optional.
Some tunneling protocols (e.g. GRE) specify the id as an optional field.
Signed-off-by: Adi Nissim <adin@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ip xfrm state show currently dumps keys unconditionally. This limits its
use in logging, as security information can be leaked.
This patch adds a nokeys option to ip xfrm ( state show | monitor ), which
prevents the printing of keys. This allows ip xfrm state show to be used
in logging without exposing keys.
Signed-off-by: Benedict Wong <benedictwong@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Pass the same parameters Lintian uses in Debian.
$ make check
<...>
Checking manpages for syntax errors...
<standard input>:48: warning: macro `Q' not defined
Error in tc-taprio.8
Makefile:27: recipe for target 'check' failed
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
.Q does not exist so groff complains and the "queues" word is actually
not displayed.
Fixes: 579acb4bc5 ("taprio: Add manpage for tc-taprio(8)")
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
groff stiff complains about unbreakable lines:
96: warning [p 2, 3.0i]: can't break line
Indent it some more.
Fixes: 7f5047524c ("man: ss.8: break and indent long line")
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ip rule add from all iif gretap tun_id 2000 lookup 200
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ip/rtpr mentioned in man as bash script is actually posix shell script
(doesn't require to use bash).
Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
DECnet belongs in the history museum of dead protocols along
with Appletalk and IPX.
Linux support has outlived its natural life and the time has
come to remove it from iproute2. Dead code is a source
of bugs and exploits.
If anyone actually has DECnet running on some old distribution
they can just keep to the old version of iproute2.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Added support for filtering based on port ranges.
UAPI changes have been accepted into net-next.
Example:
1. Match on a port range:
-------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
prio 1 flower ip_proto tcp dst_port 20-30 skip_hw\
action drop
$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
eth_type ipv4
ip_proto tcp
dst_port 20-30
skip_hw
not_in_hw
action order 1: gact action drop
random type none pass val 0
index 1 ref 1 bind 1 installed 85 sec used 3 sec
Action statistics:
Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
2. Match on IP address and port range:
--------------------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port 100-200\
skip_hw action drop
$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0 handle 0x2
eth_type ipv4
ip_proto tcp
dst_ip 192.168.1.1
dst_port 100-200
skip_hw
not_in_hw
action order 1: gact action drop
random type none pass val 0
index 2 ref 1 bind 1 installed 58 sec used 2 sec
Action statistics:
Sent 920 bytes 20 pkt (dropped 20, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
v6:
Modified to change json output format as object for sport/dport.
"dst_port":{
"start":2000,
"end":6000
},
"src_port":{
"start":50,
"end":60
}
v5:
Simplified some code and used 'sscanf' for parsing. Removed
space in output format.
v4:
Added man updates explaining filtering based on port ranges.
Removed 'range' keyword.
v3:
Modified flower_port_range_attr_type calls.
v2:
Addressed Jiri's comment to sync output format with input
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The different encapsulation types are described in ENCAP_*
non-terminals, but ENCAP definition lists them without the ENCAP_
prefix. Fix this for consistency.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
All rdma-related man pages list each other in SEE ALSO section, only
rdma-resource.8 is missing. Add it for the sake of consistency.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
IPX has been depracted then removed from upstream kernels.
Drop support from ip route as well.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
When disabling a flag, one needs to AND with the inverse not the flag
itself. Otherwise specifying for instance 'home -nodad' will effectively
clear the flags variable.
While being at it, simplify the code a bit by merging common parts of
negated and non-negated case branches. Also allow for the "special
cases" to be inverted, too.
Fixes: f73ac674d0 ("ip: change flag names to an array")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add a note to 'nexthop' description stating the maximum number of
nexthops per command and pointing at 'append' command as a workaround.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Allow to set the DF bit behaviour for outgoing IPv4 packets: it can be
always on, inherited from the inner header, or, by default, always off,
which is the current behaviour.
v2:
- Indicate in the man page what DF refers to, using RFC 791 wording
(David Ahern)
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allow to set the DF bit behaviour for outgoing IPv4 packets: it can be
always on, inherited from the inner header, or, by default, always off,
which is the current behaviour.
v2:
- Indicate in the man page what DF refers to, using RFC 791 wording
(David Ahern)
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The region field was not added to the devlink man page.
Fixes: 8b4fbf0bed ("devlink: Add support for devlink-region access")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
". If" gets interpreted as a macro, so move the period to the previous
line:
33: warning: macro `If' not defined
Fixes: 141b55f854 ("Add SKB Priority qdisc support in tc(8)")
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Fixes groff warning:
ss.8 92: warning [p 2, 2.8i]: can't break line
And makes the line also more readable.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Currently when we add geneve with "ttl inherit", we only set ttl to 0, which
is actually use whatever default value instead of inherit the inner protocol's
ttl value.
To make a difference with ttl inherit and ttl == 0, we add an attribute
IFLA_GENEVE_TTL_INHERIT in kernel commit 52d0d404d39dd ("geneve: add ttl
inherit support"). Now let's use "ttl inherit" to inherit the inner
protocol's ttl, and use "ttl auto" to means "use whatever default value",
the same behavior with ttl == 0.
v2:
1) remove IFLA_GENEVE_TTL_INHERIT defination in if_link.h as it's already
updated.
2) Still use addattr8() so we can enable/disable ttl inherit, as Michal
suggested.
v3: Update man page
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds support for the new backup port option that can be set
on a bridge port. If the port's carrier goes down all of the traffic
gets redirected to the configured backup port. We add the following new
arguments:
$ ip link set dev brport type bridge_slave backup_port brport2
$ ip link set dev brport type bridge_slave nobackup_port
$ bridge link set dev brport backup_port brport2
$ bridge link set dev brport nobackup_port
The man pages are updated respectively.
Also 2 minor style adjustments:
- add missing space to bridge man page's state argument
- use lower starting case for vlan_tunnel in ip-link man page (to be
consistent with the rest)
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Adds new option extern_learn to set NTF_EXT_LEARNED flag
on neigh entries.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This documents the parameters and provides an example of usage.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allow matching on options in Geneve tunnel headers.
The options can be described in the form
CLASS:TYPE:DATA/CLASS_MASK:TYPE_MASK:DATA_MASK, where CLASS is
represented as a 16bit hexadecimal value, TYPE as an 8bit
hexadecimal value and DATA as a variable length hexadecimal value.
e.g.
# ip link add name geneve0 type geneve dstport 0 external
# tc qdisc add dev geneve0 ingress
# tc filter add dev geneve0 protocol ip parent ffff: \
flower \
enc_src_ip 10.0.99.192 \
enc_dst_ip 10.0.99.193 \
enc_key_id 11 \
geneve_opts 0102:80:1122334421314151/ffff:ff:ffffffffffffffff \
ip_proto udp \
action mirred egress redirect dev eth1
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While at it also add missing text for proxy in the man page.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add support for the new sticky flag that can be set on fdbs and update the
man page.
CC: David Ahern <dsahern@gmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
IPVLAN and IPVTAP are using the same functions and parameters. So we can
just add a new link_util with id ipvtap. Others are the same.
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Conflicts:
ip/iproute_lwtunnel.c
In addition to merge conflict between bd59e5b151 and 94a8722f2f,
updated the code added by the latter commit based on the change of the
former (ie., added ret = to the new rta_addattr_l).
Signed-off-by: David Ahern <dsahern@gmail.com>
Extend slotting with support for non-uniform distributions. This is
similar to netem's non-uniform distribution delay feature.
Syntax:
slot distribution DISTRIBUTION DELAY JITTER [packets MAX_PACKETS] \
[bytes MAX_BYTES]
The syntax and use of the distribution table is the same as in the
non-uniform distribution delay feature. A file DISTRIBUTION must be
present in TC_LIB_DIR (e.g. /usr/lib/tc) containing numbers scaled by
NETEM_DIST_SCALE. A random value x is selected from the table and it
takes DELAY + ( x * JITTER ) as delay. Correlation between values is not
supported.
Examples:
Normal distribution delay with mean = 800us and stdev = 100us.
> tc qdisc add dev eth0 root netem slot distribution normal \
800us 100us
Optionally set the max slot size in bytes and/or packets.
> tc qdisc add dev eth0 root netem slot distribution normal \
800us 100us bytes 64k packets 42
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Slotting is a crude approximation of the behaviors of shared media such
as cable, wifi, and LTE, which gather up a bunch of packets within a
varying delay window and deliver them, relative to that, nearly all at
once.
It works within the existing loss, duplication, jitter and delay
parameters of netem. Some amount of inherent latency must be specified,
regardless.
The new "slot" parameter specifies a minimum and maximum delay between
transmission attempts.
The "bytes" and "packets" parameters can be used to limit the amount of
information transferred per slot.
Examples of use:
tc qdisc add dev eth0 root netem delay 200us \
slot 800us 10ms bytes 64k packets 42
A more correct example, using stacked netem instances and a packet limit
to emulate a tail drop wifi queue with slots and variable packet
delivery, with a 200Mbit isochronous underlying rate, and 20ms path
delay:
tc qdisc add dev eth0 root handle 1: netem delay 20ms rate 200mbit \
limit 10000
tc qdisc add dev eth0 parent 1:1 handle 10:1 netem delay 200us \
slot 800us 10ms bytes 64k packets 42 limit 512
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Since CAKE now has three different settings that can be overridden by tc
filters (priority and host and flow hashes), documenting how they work is
probably a good idea.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Allow for -color={never,auto,always} to have colored output disabled,
enabled only if stdout is a terminal or enabled regardless of stdout
state.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add missing --pretty and --json options, correct --zero to --zeros and
correct the mess around --scan/--interval including broken man page
formatting.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This was the only bit missing in comparison to devlink help text.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Versioning scheme of Linux and iproute2 is similar, therefore the
referenced kernel versions are likely to confuse readers. Clarify this
by prefixing each kernel version by 'Linux' prefix.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
sch_skbprio is a qdisc that prioritizes packets according to their skb->priority
field. Under congestion, it drops already-enqueued lower priority packets to
make space available for higher priority packets. Skbprio was conceived as a
solution for denial-of-service defenses that need to route packets with
different priorities as a means to overcome DoS attacks.
Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com>
Reviewed-by: Michel Machado <michel@digirati.com.br>
Signed-off-by: David Ahern <dsahern@gmail.com>
Update the list of architectures supporting eBPF JIT as of Linux 4.18.
Also mention the Linux version where support for a particular
architecture was introduced. Finally, reformat the list of architectures
as a bullet list in order to make it more readable.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch makes sch_cake's gso/gro splitting configurable
from userspace.
To disable breaking apart superpackets in sch_cake:
tc qdisc replace dev whatever root cake no-split-gso
to enable:
tc qdisc replace dev whatever root cake split-gso
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add matching on tos/ttl of the IP tunnel headers.
For example, here's decap rule that matches on the tunnel tos:
tc filter add dev vxlan_sys_4789 protocol ip parent ffff: prio 10 flower \
enc_src_ip 192.168.10.2 enc_dst_ip 192.168.10.1 enc_key_id 100 enc_dst_port 4789 enc_tos 0x30 \
src_mac e4:11:22:33:44:70 dst_mac e4:11:22:33:44:50 \
action tunnel_key unset \
action mirred egress redirect dev eth0_0
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allow to set tos and ttl for the tunnel.
For example, here's encap rule that sets tos to the tunnel:
tc filter add dev eth0_0 protocol ip parent ffff: prio 10 flower \
src_mac e4:11:22:33:44:50 dst_mac e4:11:22:33:44:70 \
action tunnel_key set src_ip 192.168.10.1 dst_ip 192.168.10.2 id 100 dst_port 4789 tos 0x30 \
action mirred egress redirect dev vxlan_sys_4789
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This is consistent with the other multi-word parameters. Also change the
JSON output to be consistent with way it is formatted for the other
options.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
sch_cake is intended to squeeze the most bandwidth and latency out of even
the slowest ISP links and routers, while presenting an API simple enough
that even an ISP can configure it.
Example of use on a cable ISP uplink:
tc qdisc add dev eth0 cake bandwidth 20Mbit nat docsis ack-filter
To shape a cable download link (ifb and tc-mirred setup elided)
tc qdisc add dev ifb0 cake bandwidth 200mbit nat docsis ingress wash besteffort
Cake is filled with:
* A hybrid Codel/Blue AQM algorithm, "Cobalt", tied to an FQ_Codel
derived Flow Queuing system, which autoconfigures based on the bandwidth.
* A novel "triple-isolate" mode (the default) which balances per-host
and per-flow FQ even through NAT.
* An deficit based shaper, that can also be used in an unlimited mode.
* 8 way set associative hashing to reduce flow collisions to a minimum.
* A reasonable interpretation of various diffserv latency/loss tradeoffs.
* Support for zeroing diffserv markings for entering and exiting traffic.
* Support for interacting well with Docsis 3.0 shaper framing.
* Support for DSL framing types and shapers.
* Support for ack filtering.
* Extensive statistics for measuring, loss, ecn markings, latency variation.
Various versions baking have been available as an out of tree build for
kernel versions going back to 3.10, as the embedded router world has been
running a few years behind mainline Linux. A stable version has been
generally available on lede-17.01 and later.
sch_cake replaces a combination of iptables, tc filter, htb and fq_codel
in the sqm-scripts, with sane defaults and vastly simpler configuration.
Cake's principal author is Jonathan Morton, with contributions from
Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen, Sebastian Moeller,
Ryan Mounce, Tony Ambardar, Dean Scarff, Nils Andreas Svee, Dave Täht,
and Loganaden Velvindron.
Testing from Pete Heist, Georgios Amanakis, and the many other members of
the cake@lists.bufferbloat.net mailing list.
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
Devlink region allows access to driver defined address regions.
Each device can create its supported address regions and register
them. A device which exposes a region will allow access to it
using devlink.
This support allows reading and dumping regions snapshots as well
as presenting information such as region size and current available
snapshots.
A snapshot represents a memory image of a region taken by the driver.
If a device collects a snapshot of an address region it can be later
exposed using devlink region read or dump commands.
This functionality allows for future analyses on the snapshots.
The dump command is designed to read the full address space of a
region or of a snapshot unlike the read command which allows
reading only a specific section in a region/snapshot indicated by
an address and a length, current support is for reading and dumping
for a previously taken snapshot ID.
New commands added:
devlink region show [ DEV/REGION ]
devlink region delete DEV/REGION snapshot SNAPSHOT_ID
devlink region dump DEV/REGION [ snapshot SNAPSHOT_ID ]
devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ]
address ADDRESS length length
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add an initial manpage for tc-etf covering all config options, basic
concepts and operation modes.
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Fix 2 typos on the man page of the CBS qdisc.
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Allow setting tunnel options using the act_tunnel_key action.
Options are expressed as class:type:data and multiple options
may be listed using a comma delimiter.
# ip link add name geneve0 type geneve dstport 0 external
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_proto udp \
action tunnel_key \
set src_ip 10.0.99.192 \
dst_ip 10.0.99.193 \
dst_port 6081 \
id 11 \
geneve_opts 0102:80:00800022,0102:80:00800022 \
action mirred egress redirect dev geneve0
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add support for configuration parameters set and show.
Each parameter can be either generic or driver-specific.
The user can retrieve data on these configuration parameters by devlink
param show command and can set new value to a configuration parameter
by devlink param set command.
The configuration parameters can be set in different configuration
modes:
runtime - set while driver is running, no reset required.
driverinit - applied while driver initializes, requires restart
driver by devlink reload command.
permanent - written to device's non-volatile memory, hard reset
required to apply.
New commands added:
devlink dev param show [DEV name PARAMETER]
devlink dev param set DEV name PARAMETER value VALUE
cmode { permanent | driverinit | runtime }
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds support for the new isolated port option which, if set,
would allow the isolated ports to communicate only with non-isolated
ports and the bridge device. The option can be set via the bridge or ip
link type bridge_slave commands, e.g.:
$ ip link set dev eth0 type bridge_slave isolated on
$ bridge link set dev eth0 isolated on
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds support for OUTPUT_MARK in xfrm state to exercise the
functionality added by kernel commit 077fbac405bf
("net: xfrm: support setting an output mark.").
Sample output-
(with mark and output-mark)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
mark 0x10000/0x3ffff output-mark 0x20000
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
(with mark only)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
mark 0x10000/0x3ffff
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
(with output-mark only)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
output-mark 0x20000
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
(no mark and output-mark)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
v1->v2: Moved the XFRMA_OUTPUT_MARK print after XFRMA_MARK in
xfrm_xfrma_print() as mentioned by Lorenzo
v2->v3: Fix one help formatting error as mentioned by Lorenzo.
Keep mark and output-mark on the same line and add man page info as
mentioned by David.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds basic support for Qualcomm rmnet devices.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Update the man pages for the resource attributes as well
as the driver-specific attributes.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Currently there is no way to log offloading errors if the rule is not
explicitly marked as skip_sw, making it hard for other applications such
as Open vSwitch to log why a given could not be offloaded.
This patch adds support for signaling the kernel that more verbose
logging is wanted, which now will include such messages.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
As the kernel code says, limit is actually the amount of packets it can
hold queued at a time, as per:
static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
struct sk_buff **to_free)
{
...
if (unlikely(sch->q.qlen >= sch->limit))
return qdisc_drop_all(skb, sch, to_free);
So lets fix the description of the field in the man page.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Users have reported a regression due to ip now dropping capabilities
unconditionally.
zerotier-one VPN and VirtualBox use ambient capabilities in their
binary and then fork out to ip to set routes and links, and this
does not work anymore.
As a workaround, do not drop caps if CAP_NET_ADMIN (the most common
capability used by ip) is set with the INHERITABLE flag.
Users that want ip vrf exec to work do not need to set INHERITABLE,
which will then only set when the calling program had privileges to
give itself the ambient capability.
Fixes: ba2fc55b99 ("Drop capabilities if not running ip exec vrf with libcap")
Signed-off-by: Luca Boccassi <bluca@debian.org>
Currently, iproute allows setting those flags, but it's impossible to
clear them, since their current value is fetched from the kernel and
then we OR in the additional flags passed on the command line.
Add no* variants to allow clearing them.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
GRE tunnels are currently only documented together with IPIP and SIT
tunnels, but they actually have very different configuration
options. Let's separate them.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
Ignore options "peer-offset" and "offset" when creating sessions. Keep
them when dumping sessions in order to avoid breaking external scripts.
"peer-offset" has always been a noop in iproute2. "offset" is now
ignored in Linux 4.16 (and was broken before that).
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add initial support for oneline mode in tc; actions, filters and qdiscs
will be gradually updated in the follow-up patches.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The original problem was that a simple call to 'ss' leads to loading of
sctp_diag kernel module which might not be desired. While searching for
a workaround, it became clear how inconvenient it is to exclude a single
socket table from being queried.
This patch allows to prefix an item passed to '-A' parameter with an
exclamation mark to inverse its meaning.
Signed-off-by: Phil Sutter <phil@nwl.cc>
ip vrf exec requires root or CAP_NET_ADMIN, CAP_SYS_ADMIN and
CAP_DAC_OVERRIDE. It is not possible to run unprivileged commands like
ping as non-root or non-cap-enabled due to this requirement.
To allow users and administrators to safely add the required
capabilities to the binary, drop all capabilities on start if not
invoked with "vrf exec".
Update the manpage with the requirements.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This has to be a second match statement to the same u32 filter, not a
second one (which tc-filter doesn't support at all).
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
NTF_EXT_LEARNED can be set by a user on bridge fdb entry.
Provide a bridge command option to allow a user to set
NTF_EXT_LEARNED on a bridge fdb entry.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add missing documentation of the memory_limit fq_codel parameter and the
ce_threshold codel and fq_codel parameters.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
Conflicts:
bridge/mdb.c
Updated bridge/bridge.c per removal of check_if_color_enabled by commit
1ca4341d2c ("color: disable color when json output is requested")
Signed-off-by: David Ahern <dsahern@gmail.com>
add support to match on ip_proto, sport and dport ranges.
For ip_proto, this patch currently enumerates, tcp, udp and sctp.
This list can be extended in the future.
example:
$ip rule add sport 666-777 dport 999 ip_proto tcp table 100
$ip rule show
0: from all lookup local
32765: from all ip_proto 6 sport 666-777 dport 999 lookup 100
32766: from all lookup main
32767: from all lookup default
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Modify 'ip rule' command to notice when the kernel passes
to us the originating protocol.
Add code to allow the `ip rule flush protocol XXX`
command to be accepted and properly handled.
Modify the documentation to reflect these code changes.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The commit calls a new tc ematch for using netfilter xtable matches.
This allows early classification as well as mirroning/redirecting traffic
based on logic implemented in netfilter extensions.
Current supported use case is classification based on the incoming IPSec
state used during decpsulation using the 'policy' iptables extension
(xt_policy).
The matcher uses libxtables for parsing the input parameters.
Example use for matching an IPSec state with reqid 1:
tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: \
basic match 'ipt(-m policy --dir in --pol ipsec --reqid 1)' \
action drop
This is the user-space counter part of kernel commit ccc007e4a746
("net: sched: add em_ipt ematch for calling xtables matches")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
For IP-in-IP tunnels, one can specify the [no]allow-localremote command
when configuring a device. Under the hood, this flips the
IP6_TNL_F_ALLOW_LOCAL_REMOTE flag on the netdevice. However, ip6gretap
and ip6erspan devices, where the flag is also relevant, are not IP-in-IP
tunnels, and thus there's no way to configure the flag on these
netdevices. Therefore introduce the command to link_gre6 as well.
The original support was introduced in commit 21440d19d9
("ip: link_ip6tnl.c/ip6tunnel.c: Support IP6_TNL_F_ALLOW_LOCAL_REMOTE flag")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Implement an option (-b) to execute RDMAtool commands
from supplied file. This follows the same model as
in use for ip and devlink tools, by expecting
every new command to be on new line.
These commands are expected to be without any -*
(e.g. -d, -j, e.t.c) global flags, which should be
called externally.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Implement the -color option; in this case -co is ambiguous
since it was already used for -conf.
For now this just means putting device name in color.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Document color option, and no longer have restriction on json
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Make bridge work like other iproute2 commands and accept
same json and pretty flags.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add description for -json and -pretty options.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
If the kernel receives a negative nsid it will automatically assign
the next available nsid. In this case alloc_netid() will set min and
max to 0 for ird_alloc(). And when max == 0 idr_alloc() will interpret
this as the maximum range, i.e. specific to nsids it will try to find
an id in the range [0,INT_MAX). This is intentionally supported in the
kernel for nsids.
Commit acbe9118ce ("ip netns: use strtol() instead of atoi()")
regressed ip netns in that respect although previously the use-case
was either accidentally supported or opaquely supported such that it
triggered the original commit. From what I can gather it went as
follows before: atoi() was called with a string indicating a negative
value which caused it to return -1 which was passed to the
kernel. Let's make it less opaque by introducing the keyword "auto":
ip netns set <netns-name> auto
will cause nsid to be set to -1 and the kernel will select an available
nsid.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Spartan version of resource tracking documentation.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
During qdisc creation it is possible to specify shared block for bot
ingress and egress. Pass this values to kernel according to the command
line options.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
So far, qdisc was the only handle that could be used to manipulate
filters. Kernel added support for using block to manipulate it. So add
the support to use block index to manipulate filters. The magic
TCM_IFINDEX_MAGIC_BLOCK indicates the block index is in use.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Zero value in min/max_tx_rate has a special meaning of no rate limit,
document it.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
netdevsim is a new software device for testing kernel APIs
without any hardware attached. Allow users to create such
devices.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Lintian detected the following formatting errors:
man/man8/devlink-sb.8.gz 230: warning: macro `b' not defined
man/man8/ip-link.8.gz 1243: warning: macro `in-8' not defined
(possibly missing space after `in')
man/man8/tc-u32.8.gz `R' is a string (producing the registered sign),
not a macro.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The filesytem paths to these scripts might be different on various
distros, so don't mention it in the manpages. It is not really useful
information anyway.
Originally submitted as Debian bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561424
Reported-by: jidanni@jidanni.org
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Trying to set a label longer than 15 characters returns an error:
RTNETLINK answers: Numerical result out of range
Document the limit in the manpage.
Originally reported as a Debian bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=661886
Reported-by: Gabor Kiss <kissg@ssg.ki.iif.hu>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
A Debian user suggested adding more network-related keywords to the
ip manpage, so that manpage-scraping and indexing software like
apropos can do a better job of categorizing the programs.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=877983
Suggested-by: Lynoure Braakman <lynoure@gmail.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Documentation should be distribution-agnostic - any specific quirks
should be handled by downstream maintainers, if necessary.
Remove mentions of Debian paths and package names.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The patch adds support for configuring the erspan v2, for both
ipv4 and ipv6 erspan implementation. Three additional fields
are added: 'erspan_ver' for distinguishing v1 or v2, 'erspan_dir'
for specifying direction of the mirrored traffic, and 'erspan_hwid'
for users to set ERSPAN engine ID within a system.
As for manpage, the ERSPAN descriptions used to be under GRE, IPIP,
SIT Type paragraph. Since IP6GRE/IP6GRETAP also supports ERSPAN,
the patch removes the old one, creates a separate ERSPAN paragrah,
and adds an example.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
veth and vxcan both create a vitual tunnel between a pair of virtual network
devices. This patch adds the content for the now supported vxcan netdevices
and the documentation to create peer devices for vxcan and veth.
Additional remove 'can' that accidently was on the list of link types which
can be created by 'ip link add' as 'can' devices are real network devices.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
v3:
Rebase and use out() instead of printf().
v2:
Print the path MTU immediately after the MSS, as it is easier to parse
for humans (suggested by Neal Cardwell).
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The patch adds 'external' option to support collect metadata
gre6 tunnel. The 'external' keyword is already used to set the
device into collect metadata mode such as vxlan, geneve, ipip,
etc. This patch extends support for ipv6 gre and gretap.
Example of L3 and L2 gre device:
bash:~# ip link add dev ip6gre123 type ip6gre external
bash:~# ip link add dev ip6gretap123 type ip6gretap external
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Validate the upper limit for gso_max_size, valid range is [0-65,536]
inclusive. Fix minor whitespace in iplink man page.
Signed-off-by: Solio Sarabia <solio.sarabia@intel.com>
This allows sending GSO maximum values when configuring a device.
The values are advisory. Most devices will ignore them but for some
pseudo devices such as veth pairs they can be set.
Example:
# ip link add dev vm1 type veth peer name vm2 gso_max_size 32768
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Commit 6bbe5e6290 ("man: tc-csum.8: Fix example") changed both source
and destination IP addresses in example code but missed to update the
example's description accordingly.
Fixes: 6bbe5e6290 ("man: tc-csum.8: Fix example")
Signed-off-by: Phil Sutter <phil@nwl.cc>
For all files in iproute2 which do not have an obvious license
identification, mark them with SPDK GPL-2
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adapts the tc command line interface to allow bandwidth limits
to be specified as a percentage of the interface's capacity.
Adding this functionality requires passing the specified device string to
each class/qdisc which changes the prototype for a couple of functions: the
.parse_qopt and .parse_copt interfaces. The device string is a required
parameter for tc-qdisc and tc-class, and when not specified, the kernel
returns ENODEV. In this patch, if the user tries to specify a bandwidth
percentage without naming the device, we return an error from userspace.
Signed-off-by: Nishanth Devarajan<ndev2021@gmail.com>
This patch adds documentation for additional offload modes and
associated parameters in tc-mqprio.
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>