3411 Commits

Author SHA1 Message Date
Nikolay Aleksandrov
9208b4e7c9 bridge: add support for the multicast flood flag
Recently a new per-port flag was added which controls the flooding of
unknown multicast. This patch adds support for controlling it via iproute2.
It also updates the man pages with information about the new flag.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
2016-10-17 05:29:24 -07:00
Jakub Kicinski
87e46a5198 tc: cls_bpf: handle skip_sw and skip_hw flags
Add support for controlling hardware offload using (now standard)
skip_sw and skip_hw flags in cls_bpf.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
2016-10-17 05:27:59 -07:00
Nikolay Aleksandrov
660afec25f bridge: vlan: remove wrong stats help
When I did the per-vlan stats iproute2 support, I left out a hunk from a
previous version of the patch that was using a special subcommand "stats".
Since the latest version uses the -s switch, remove the help for the stats
subcommand.

Fixes: 7abf5de677 ("bridge: vlan: add support to display per-vlan statistics")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
2016-10-17 05:22:47 -07:00
Stephen Hemminger
7409334b87 ip: macvlan style cleanup
Break long lines.
2016-10-12 15:23:27 -07:00
michael-dev@fami-braun.de
f33b727610 iproute2: macvlan: add "source" mode
Adjust the iproute2 utility to support the new macvlan link type mode called
"source".

Example of commands that can be applied:
  ip link add link eth0 name macvlan0 type macvlan mode source
  ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11
  ip link set link dev macvlan0 type macvlan macaddr del 00:11:11:11:11:11
  ip link set link dev macvlan0 type macvlan macaddr flush
  ip -details link show dev macvlan0

Based on previous work of Stefan Gula <steweg@gmail.com>

Signed-off-by: Michael Braun <michael-dev@fami-braun.de>

Cc: steweg@gmail.com
2016-10-12 15:22:14 -07:00
Lucas Bates
a40995d1c7 man pages: add man page for skbmod action
Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:21:55 -07:00
Stephen Hemminger
ec2e005fe5 tc_filter: style cleanup
Break long lines and whitespace changes.
2016-10-12 15:21:13 -07:00
Jamal Hadi Salim
120f556d15 tc filters: add support to get individual filters by handle
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol ip \
u32 match u32 0 0 flowid 1:1 \
action ok
sudo $TC filter add dev $ETH parent ffff: prio 1 protocol ip \
u32 match ip protocol 1 0xff flowid 1:10 \
action ok

now dump to see all rules..
$TC -s filter ls dev $ETH parent ffff: protocol ip
 ....
filter pref 1 u32
filter pref 1 u32 fh 801: ht divisor 1
filter pref 1 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:10  (rule hit 0 success 0)
  match 00010000/00ff0000 at 8 (success 0 )
        action order 1: gact action drop
         random type none pass val 0
         index 6 ref 1 bind 1 installed 4 sec used 4 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

filter pref 2 u32
filter pref 2 u32 fh 800: ht divisor 1
filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 336 success 336)
  match 00000000/00000000 at 0 (success 336 )
        action order 1: gact action pass
         random type none pass val 0
         index 5 ref 1 bind 1 installed 38 sec used 4 sec
        Action statistics:
        Sent 24864 bytes 336 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
 ....

..get filter 801::800
$TC -s filter get dev $ETH parent ffff: protocol ip \
handle 801:0:800 prio 2  u32

 ....
filter parent ffff: protocol ip pref 1 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:10  (rule hit 260 success 130)
  match 00010000/00ff0000 at 8 (success 130 )
        action order 1: gact action drop
         random type none pass val 0
         index 6 ref 1 bind 1 installed 348 sec used 0 sec
        Action statistics:
        Sent 11440 bytes 130 pkt (dropped 130, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
 ....

..get other one
$TC -s filter get dev $ETH parent ffff: protocol ip \
handle 800:0:800 prio 2  u32

....
filter parent ffff: protocol ip pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 514 success 514)
  match 00000000/00000000 at 0 (success 514 )
        action order 1: gact action pass
         random type none pass val 0
         index 5 ref 1 bind 1 installed 506 sec used 4 sec
        Action statistics:
        Sent 35544 bytes 514 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
....

..try something that doesn't exist
$TC -s filter get dev $ETH parent ffff: protocol ip  handle 800:0:803 prio 2  u32

.....
RTNETLINK answers: No such file or directory
We have an error talking to the kernel
.....

Note: NLM_F_ECHO is added for backward compatibility. Old kernels (before
Eric's patch) will not respond without it, and newer kernels (after Eric's
patch) will ignore it.
In old kernels there is a side effect:
In addition to a response to the GET you will receive an event (if you do tc mon).
But this is still better than what it was before (not working at all).

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:14:47 -07:00
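The u32 keys in the dumps of 120f556d15, such as "match 00010000/00ff0000 at 8", read as value/mask at a byte offset into the IP header. Byte 9 of an IPv4 header is the protocol field, so within the big-endian word at offset 8 it sits under the mask 00ff0000. A minimal sketch of that comparison follows. It is not the kernel's classifier code; the function names are hypothetical, and the usual value/mask semantics are assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read the 32-bit big-endian word that starts at byte offset 'off'. */
uint32_t load_be32(const uint8_t *hdr, size_t off)
{
    return ((uint32_t)hdr[off] << 24) | ((uint32_t)hdr[off + 1] << 16) |
           ((uint32_t)hdr[off + 2] << 8) | (uint32_t)hdr[off + 3];
}

/*
 * Hypothetical helper: true when the word at 'off' agrees with 'val'
 * in every bit selected by 'mask' (assumed value/mask semantics).
 */
bool u32_key_matches(const uint8_t *hdr, uint32_t val, uint32_t mask,
                     size_t off)
{
    return ((load_be32(hdr, off) ^ val) & mask) == 0;
}
```

With this reading, the key 00000000/00000000 at 0 matches every packet, which is consistent with the catch-all "match u32 0 0" rule in the commit message.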
Stephen Hemminger
557b705445 tc: skbmod style cleanup
break long lines
2016-10-12 15:12:51 -07:00
Jamal Hadi Salim
46871dc9c6 man pages: Add tc-ife to Makefile
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Lucas Bates
d491a3480f man pages: update ife action to include tcindex
Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Jamal Hadi Salim
da65128998 actions: add skbmod action
This action is intended to be an upgrade from a usability perspective
from pedit (as well as operational debuggability).
Compare this:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action pedit munge offset -14 u8 set 0x02 \
    munge offset -13 u8 set 0x15 \
    munge offset -12 u8 set 0x15 \
    munge offset -11 u8 set 0x15 \
    munge offset -10 u16 set 0x1515 \
    pipe

to:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbmod dmac 02:15:15:15:15:15

Or worse, try to debug a policy with destination mac, source mac and
ethertype. Then make that a hundred rules and you'll get my point.

The most important ethernet use case at the moment is when redirecting or
mirroring packets to a remote machine. The dst mac address needs a rewrite
so that the packet neither confuses nor gets dropped by an interconnecting
(learning) switch, and is not dropped by the target machine (which looks at
the dst mac).

In the future, common use cases of pedit can be migrated to this action
(for example, different fields in ip v4/6, transports like tcp/udp/sctp,
etc). For this first cut, it allows modifying the basic ethernet header.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
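The pedit and skbmod commands in da65128998 write the same six bytes. An Ethernet header is 14 bytes (destination MAC, source MAC, ethertype), so offsets -14 to -9 relative to the IP header cover the destination MAC. The sketch below only illustrates that byte layout; both function names are hypothetical and neither is the kernel implementation of the actions.

```c
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN 14 /* dst MAC (6) + src MAC (6) + ethertype (2) */

/*
 * Mirrors the five pedit "munge" steps: writes at negative offsets
 * from the start of the IP header ('l3').
 */
void set_dmac_pedit_style(uint8_t *l3)
{
    l3[-14] = 0x02;
    l3[-13] = 0x15;
    l3[-12] = 0x15;
    l3[-11] = 0x15;
    l3[-10] = 0x15; /* the u16 0x1515 at offset -10 covers two bytes */
    l3[-9] = 0x15;
}

/* Mirrors "skbmod dmac": one write of the whole destination MAC. */
void set_dmac_whole(uint8_t *frame, const uint8_t mac[6])
{
    memcpy(frame, mac, 6);
}
```

Both paths leave the frame with destination MAC 02:15:15:15:15:15. The second states the intent in one place, which is the readability argument the commit message makes.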
Craig Dillabaugh
883c6708e4 action gact: list pipe as a valid action
Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Jamal Hadi Salim
8da6ff35cd actions ife: Introduce encoding and decoding of tcindex metadata
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Roman Mashak
1b600f4b54 ife: improve help text
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Roman Mashak
57ee4430f9 ife: print prio, mark and hash as unsigned
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Roman Mashak
9a56cca3f3 ife action: allow specifying index in hex
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2016-10-12 15:09:52 -07:00
Stephen Hemminger
e147161b1a ip: iprule style cleanup
Trivial whitespace cleanup to iprule
2016-10-09 19:29:24 -07:00
Hangbin Liu
ca89c52143 ip rule: add selector support
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
2016-10-09 19:25:59 -07:00
Hangbin Liu
cb294a1de6 ip rule: merge ip rule flush and list, save together
iprule_flush() and iprule_list_or_save() both call
rtnl_wilddump_request() and rtnl_dump_filter(), so merge them,
just as other files do.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
2016-10-09 19:25:59 -07:00
Stephen Hemminger
6773bcc227 iplink: cleanup style errors
Fix long strings causing checkpatch warnings
2016-10-09 19:24:38 -07:00
Moshe Shemesh
56e9f0ab19 ip link: Add support to configure SR-IOV VF to vlan protocol 802.1ad (VST QinQ)
Introduce a new API that exposes a list of vlans per VF (IFLA_VF_VLAN_LIST),
giving the ability for user-space application to specify it for the VF as
an option to support 802.1ad (VST QinQ).

We introduce struct vf_vlan_info, which extends struct vf_vlan and adds
an optional VF VLAN proto parameter.
Default VLAN-protocol is 802.1Q.

Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
compatibility with older kernel versions.

Suitable ip link tool command examples:
 - Set vf vlan protocol 802.1ad (S-TAG)
	ip link set eth0 vf 1 vlan 100 proto 802.1ad
 - Set vf vlan S-TAG and vlan C-TAG (VST QinQ)
	ip link set eth0 vf 1 vlan 100 proto 802.1ad vlan 30 proto 802.1Q
 - Set vf to VST (802.1Q) mode
	ip link set eth0 vf 1 vlan 100 proto 802.1Q
 - Or by omitting the new parameter (backward compatible)
	ip link set eth0 vf 1 vlan 100

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
2016-10-09 19:17:15 -07:00
Eric Dumazet
39f8caeb96 tc: fq: display unthrottle latency
In Linux 4.9, the fq packet scheduler got a new stat:

unthrottle_latency, in nanosecond units.

Gives a good indication of system load or timer implementation
latencies.

Signed-off-by: Eric Dumazet <edumazet@google.com>
2016-10-09 19:15:13 -07:00
Shmulik Ladkani
4654173e90 tc: m_vlan: Add vlan modify action
The 'vlan modify' action allows replacing an existing 802.1q tag
according to user-provided settings.
It accepts the same arguments as the 'vlan push' action.

For example, this replaces vid 6 with vid 5:

 # tc filter add dev veth0 parent ffff: pref 1 protocol 802.1q \
      basic match 'meta(vlan mask 0xfff eq 6)' \
      action vlan modify id 5 continue

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
2016-10-09 19:11:34 -07:00
Nikolay Aleksandrov
590bf22a34 ipmroute: add support for age dumping
Add support to dump the mroute cache entry age if the show_stats (-s)
switch is provided.
Example:
$ ip -s mroute
(0.0.0.0, 239.10.10.10)          Iif: eth0       Oifs: eth0
  0 packets, 0 bytes, Age  245.44

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
2016-10-09 19:09:31 -07:00
Stephen Hemminger
b96306f8d9 Merge branch 'master' into net-next
2016-10-09 19:04:50 -07:00
Stephen Hemminger
63ec17a3da v4.8.0
2016-10-09 19:00:11 -07:00
Anton Aksola
e29a8e0537 iproute2: build nsid-name cache only for commands that need it
The calling of netns_map_init() before command parsing introduced
a performance issue with large number of namespaces.

As commands such as add, del and exec do not need to iterate through
/var/run/netns, it would be good not to build the cache before executing
these commands.

Example:
unpatched:
time seq 1 1000 | xargs -n 1 ip netns add

real    0m16.832s
user    0m1.350s
sys    0m15.029s

patched:
time seq 1 1000 | xargs -n 1 ip netns add

real    0m3.859s
user    0m0.132s
sys    0m3.205s

Signed-off-by: Anton Aksola <aakso@iki.fi>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2016-10-09 18:56:47 -07:00
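The idea in e29a8e0537 is to defer an expensive initialisation until a command needs it. A minimal sketch of that pattern is below. It is not the iproute2 code: build_cache_once(), cmd_skips_cache() and dispatch() are hypothetical names, and the scan of /var/run/netns is replaced by a counter.

```c
#include <stdbool.h>
#include <string.h>

static bool cache_built;
static unsigned int build_count; /* how often the "expensive" scan ran */

/* Build the cache on first use only. */
void build_cache_once(void)
{
    if (cache_built)
        return;
    build_count++; /* placeholder for the directory scan */
    cache_built = true;
}

/* The commit message names add, del and exec as not needing the cache. */
bool cmd_skips_cache(const char *cmd)
{
    return strcmp(cmd, "add") == 0 || strcmp(cmd, "del") == 0 ||
           strcmp(cmd, "exec") == 0;
}

/* Decide per command, after parsing, whether to pay for the cache. */
void dispatch(const char *cmd)
{
    if (!cmd_skips_cache(cmd))
        build_cache_once();
    /* ... run the command ... */
}
```

Because each "ip netns add" is a separate process, skipping the scan removes a cost that otherwise grows with the number of existing namespaces on every invocation, which matches the timings quoted in the commit message.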
Stephen Hemminger
d99272470a update headers from pre 4.9 (net-next)
2016-10-09 18:55:58 -07:00
Stephen Hemminger
d54e3ab985 Merge branch 'master' into net-next
2016-10-09 18:53:52 -07:00
Sushma Sitaram
58d93d0030 tc: f_u32: Fill in 'linkid' provided by user
Currently, 'linkid' input by the user is parsed but 'handle' is appended to the netlink message.

# tc filter add dev enp1s0f1 protocol ip parent ffff: prio 99 u32 ht 800: \
	order 1 link 1: offset at 0 mask 0f00 shift 6 plus 0 eat match ip \
	protocol 6 ff

resulted in:
filter protocol ip pref 99 u32 fh 800::1 order 1 key ht 800 bkt 0
  match 00060000/00ff0000 at 8
    offset 0f00>>6 at 0  eat

This patch results in:
filter protocol ip pref 99 u32 fh 800::1 order 1 key ht 800 bkt 0 link 1:
  match 00060000/00ff0000 at 8
    offset 0f00>>6 at 0  eat

Signed-off-by: Sushma Sitaram <sushma.sitaram@intel.com>
2016-10-09 18:51:00 -07:00
anuradhak
afd3921ea9 bridge: Fix garbled json output seen if a vlan filter is specified
json objects were started but not completed if the fdb vlan did not
match the specified filter vlan.

Sample output:
$ bridge -j fdb show vlan 111
[{
        "mac": "44:38:39:00:69:88",
        "dev": "br0",
        "vlan": 111,
        "master": "br0",
        "state": "permanent"
    }
]
$ bridge -j fdb show vlan 100
[]
$

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2016-10-09 18:49:32 -07:00
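The bug fixed in afd3921ea9 is one of ordering: a JSON object was opened before the vlan filter was checked. A minimal sketch of the corrected order follows. The struct, the function name and the reduced field set are hypothetical and not the bridge tool's code, and a filter value of 0 is assumed to mean "no filter".

```c
#include <stdio.h>

/* Illustrative entry type with a reduced set of fields. */
struct fdb_entry {
    const char *mac;
    int vlan;
};

/*
 * Write the entries as a JSON array. The filter test comes before
 * anything is printed for an entry, so a skipped entry cannot leave
 * a half-written object behind.
 */
void dump_fdb_json(FILE *out, const struct fdb_entry *e, size_t n,
                   int filter_vlan)
{
    int first = 1;
    size_t i;

    fputs("[", out);
    for (i = 0; i < n; i++) {
        if (filter_vlan && e[i].vlan != filter_vlan)
            continue;
        fprintf(out, "%s{\"mac\":\"%s\",\"vlan\":%d}",
                first ? "" : ",", e[i].mac, e[i].vlan);
        first = 0;
    }
    fputs("]", out);
}
```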
Igor Ryzhov
6cf2609ddb fix netlink message length checks
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2016-10-09 18:48:30 -07:00
Hangbin Liu
22a84711f4 ip: Use specific slave id
The original bond/bridge/vrf and slaves use the same id, which confuses
people. Using bond/bridge/vrf_slave as the id name makes the code clearer.

Acked-by: Phil Sutter <psutter@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
2016-09-22 16:39:55 -07:00
Hangbin Liu
77089b583a misc/ss: tcp cwnd should be unsigned
tcp->snd_cwd is a u32, but ss treats it like a signed int. This may
result in negative bandwidth calculations.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
2016-09-22 16:39:08 -07:00
Hangbin Liu
d1f338b318 misc/ss: tcp cwnd should be unsigned
tcp->snd_cwd is a u32, but ss treats it like a signed int. This may
result in negative bandwidth calculations.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
2016-09-22 16:38:22 -07:00
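The effect described in 77089b583a and d1f338b318 can be reproduced in isolation. The sketch below assumes a bandwidth estimate of the form cwnd * mss * 8 / rtt, which is an illustrative formula and not a quote of the ss source; both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Buggy variant: the u32 window is squeezed into an int first.
 * Converting a value above INT_MAX to int is implementation-defined;
 * on common two's-complement platforms it becomes negative.
 */
double bw_signed(uint32_t cwnd_raw, uint32_t mss, double rtt_sec)
{
    int cwnd = (int)cwnd_raw;

    return (double)cwnd * mss * 8.0 / rtt_sec;
}

/* Fixed variant: the window stays unsigned all the way through. */
double bw_unsigned(uint32_t cwnd, uint32_t mss, double rtt_sec)
{
    return (double)cwnd * mss * 8.0 / rtt_sec;
}
```

For small windows the two agree. They diverge only once the window exceeds INT_MAX, where the signed variant can report a negative rate.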
Lorenzo Colitti
ec75249b14 ss: Support displaying and filtering on socket marks.
This allows the user to dump sockets with a given mark (via
"fwmark = 0x1234/0x1234" or "fwmark = 12345", etc.), and to
display the socket marks of dumped sockets.

The relevant kernel commits are: d545caca827b ("net: inet: diag:
expose the socket mark to privileged processes.") and
a52e95abf772 ("net: diag: allow socket bytecode filters to
match socket marks")

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2016-09-22 16:34:40 -07:00
Alexei Starovoitov
4bfe682536 iptnl: add support for collect_md flag in IPv4 and IPv6 tunnels
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2016-09-21 16:36:24 -07:00
Stephen Hemminger
a9c990b6d7 Merge branch 'master' into net-next
2016-09-21 16:35:56 -07:00
Jiri Benc
1f4c51c0e4 tunnels: use macros for IPv6 address comparison
Replace open coded comparison of IPv6 addresses with appropriate macros.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
2016-09-21 16:35:05 -07:00
Liping Zhang
c44003f7e7 ipmonitor: fix ip monitor can't work when NET_NS is not enabled
In ip monitor, netns_map_init checks whether getnsid is supported.
But when /proc/self/ns/net does not exist, we just print an error
message and exit. So the user can no longer use ip monitor when
CONFIG_NET_NS is disabled:
  # ip monitor
  open("/proc/self/ns/net"): No such file or directory

If opening "/proc/self/ns/net" fails, set have_rtnl_getnsid to false.

Fixes: d652ccbf81 ("netns: allow to dump and monitor nsid")
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2016-09-21 16:32:44 -07:00
Neal Cardwell
2f0f9aef94 ss: output TCP BBR diag information
Dump useful TCP BBR state information from a struct tcp_bbr_info that
was grabbed using the inet_diag API.

We tolerate info that is shorter or longer than expected, in case the
kernel is older or newer than the ss binary. We simply print the
minimum of what is expected from the kernel and what is provided from
the kernel. We use the same trick as that used for struct tcp_info:
when the info from the kernel is shorter than we hoped, we pad the end
with zeroes, and don't print fields if they are zero.

The BBR output looks like:
  bbr:(bw:1.2Mbps,mrtt:18.965,pacing_gain:2.88672,cwnd_gain:2.88672)

The motivation here is to be consistent with DCTCP, which looks like:
  dctcp(ce_state:23,alpha:23,ab_ecn:23,ab_tot:23)

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
2016-09-21 16:29:35 -07:00
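The length-tolerant copy described in 2f0f9aef94 (take the minimum of the expected and provided sizes, pad the rest with zeroes) can be sketched as follows. The struct here is an illustrative stand-in rather than the kernel's struct tcp_bbr_info, and copy_info() is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a kernel-provided info struct; fields are illustrative. */
struct demo_info {
    uint32_t bw_lo;
    uint32_t bw_hi;
    uint32_t min_rtt;
    uint32_t pacing_gain;
    uint32_t cwnd_gain;
};

/*
 * Copy at most sizeof(*dst) bytes from a payload of 'len' bytes.
 * A shorter payload (older kernel) leaves the tail zeroed; a longer
 * payload (newer kernel) is truncated to the fields we know.
 */
void copy_info(struct demo_info *dst, const void *payload, size_t len)
{
    size_t n = len < sizeof(*dst) ? len : sizeof(*dst);

    memset(dst, 0, sizeof(*dst));
    memcpy(dst, payload, n);
}
```

A printer that skips zero-valued fields, as the commit message describes, then omits whatever the kernel did not supply.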
Stephen Hemminger
16c2a51dc4 update bpf.h
2016-09-21 16:28:56 -07:00
Hangbin Liu
bffb68b6c2 ip route: check ftell, fseek return value
ftell() may return -1 on error, which is not handled and
therefore passes a negative offset to fseek(). The return code of
fseek() is also not checked.

Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
2016-09-20 09:52:35 -07:00
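The checks described in bffb68b6c2 can be sketched with a small save-and-restore helper. This is not the iproute2 code; peek_byte() is a hypothetical function that reads one byte and then returns the stream to where it was, checking both ftell() and fseek().

```c
#include <stdio.h>

/*
 * Return the next byte of 'fp' without consuming it, or -1 on error
 * or end of file.
 */
int peek_byte(FILE *fp)
{
    long pos = ftell(fp);
    int c;

    if (pos == -1L) { /* for example, the stream is not seekable */
        perror("ftell");
        return -1;
    }

    c = fgetc(fp);
    if (c == EOF)
        return -1;

    if (fseek(fp, pos, SEEK_SET) != 0) {
        perror("fseek");
        return -1;
    }
    return c;
}
```

Without the first check, a failed ftell() would hand -1 to fseek() as the offset, which is the situation the commit message describes.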
Stephen Hemminger
36923f4e69 Merge branch 'master' into net-next
2016-09-20 09:50:53 -07:00
Mahesh Bandewar
b7c1488034 ip: (ipvlan) introduce L3s mode
The new mode 'l3s' can be set like -

  ip link add link <master> dev <IPvlan-slave> type ipvlan mode l3s

  e.g. ip link add link eth0 dev ipvl0 type ipvlan mode l3s

Also did some trivial code restructuring.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
2016-09-20 09:50:45 -07:00
Davide Caratti
f20f5f7990 macsec: fix input range of 'icvlen' parameter
The maximum possible ICV length in a MACsec frame is 16 octets, not 32.
Fix get_icvlen() accordingly, so that a proper error message is displayed
when the input 'icvlen' is greater than 16.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
2016-09-20 09:48:26 -07:00
Jiri Benc
e2cfe5501f vxlan: group address requires net device
This is now enforced in the kernel, check also in iproute to get a better
error message.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
2016-09-20 09:46:41 -07:00
Davide Caratti
087dec7fcf tc: don't accept qdisc 'handle' greater than ffff
Since get_qdisc_handle() truncates the input value to 16 bits, return an
error and print "invalid qdisc ID" when the input 'handle' parameter needs
more than 16 bits to be stored.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
2016-09-20 09:44:59 -07:00
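The range check described in 087dec7fcf can be sketched as below. This is a simplified stand-in for get_qdisc_handle(), not a copy of it; parse_qdisc_handle() is a hypothetical name, and the handle is assumed to be a hexadecimal major number, optionally followed by ':', stored in the upper 16 bits of a 32-bit value.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse "MAJ" or "MAJ:" (hex). Returns 0 and stores MAJ << 16 on
 * success, -1 on invalid input. Without the range check, an input
 * such as "10001:" would silently lose its top bit when shifted
 * into a 32-bit handle.
 */
int parse_qdisc_handle(uint32_t *h, const char *str)
{
    char *end;
    unsigned long maj = strtoul(str, &end, 16);

    if (end == str) /* no hex digits at all */
        return -1;
    if (maj > 0xffff) /* does not fit in 16 bits */
        return -1;
    if (*end != ':' && *end != '\0')
        return -1;

    *h = (uint32_t)maj << 16;
    return 0;
}
```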
Phil Sutter
003f0fde69 iproute: fix documentation for ip rule scan order
Looks like the real issue is missing definition of priority.
2016-09-20 09:36:45 -07:00