Commit Graph

3411 Commits

Author SHA1 Message Date
Stephen Hemminger
ba2a2124ec update to net-next headers (pre 4.10 rc)
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2016-12-12 15:26:55 -08:00
Stephen Hemminger
2a56c090e4 Merge branch 'master' into net-next 2016-12-12 15:24:40 -08:00
Stephen Hemminger
ae0969c893 v4.9.0 2016-12-12 15:07:42 -08:00
Stephen Hemminger
dc5622cb66 update to 4.9 release headers
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2016-12-12 15:06:18 -08:00
David Ahern
eebc7cc192 Makefile: really suppress printing of directories
Makefile adds --no-print-directory to MAKEFLAGS if VERBOSE is not
defined however Config always defines VERBOSE. Update the check to
whether VERBOSE is 0.

Fixes: 57bdf8b764 ("Make builds default to quiet mode")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
2016-12-09 12:49:44 -08:00
Simon Horman
eb3b5696f1 tc: flower: support matching on ICMP type and code
Support matching on ICMP type and code.

Example usage:

tc qdisc add dev eth0 ingress

tc filter add dev eth0 protocol ip parent ffff: flower \
	indev eth0 ip_proto icmp type 8 code 0 action drop

tc filter add dev eth0 protocol ipv6 parent ffff: flower \
	indev eth0 ip_proto icmpv6 type 128 code 0 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-09 12:46:34 -08:00
Simon Horman
6910d65661 tc: flower: introduce enum flower_endpoint
Introduce enum flower_endpoint and use it instead of a bool
as the type for paramatising source and destination.

This is intended to improve read-ability and provide some type
checking of endpoint parameters.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-09 12:45:59 -08:00
Daniel Borkmann
c7272ca720 bpf: add initial support for attaching xdp progs
Now that we made the BPF loader generic as a library, reuse it
for loading XDP programs as well. This basically adds a minimal
start of a facility for iproute2 to load XDP programs. There
currently only exists the xdp1_user.c sample code in the kernel
tree that sets up netlink directly and an iovisor/bcc front-end.

Since we have all the necessary infrastructure in place already
from tc side, we can just reuse its loader back-end and thus
facilitate migration and usability among the two for people
familiar with tc/bpf already. Sharing maps, performing tail calls,
etc works the same way as with tc. Naturally, once kernel
configuration API evolves, we will extend new features for XDP
here as well, resp. extend dumping of related netlink attributes.

Minimal example:

  clang -target bpf -O2 -Wall -c prog.c -o prog.o
  ip [-force] link set dev em1 xdp obj prog.o       # attaching
  ip [-d] link                                      # dumping
  ip link set dev em1 xdp off                       # detaching

For the dump, intention is that in the first line for each ip
link entry, we'll see "xdp" to indicate that this device has an
XDP program attached. Once we dump some more useful information
via netlink (digest, etc), idea is that 'ip -d link' will then
display additional relevant program information below the "link/
ether [...]" output line for such devices, for example.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2016-12-09 12:44:12 -08:00
Daniel Borkmann
fb24802b9c bpf: check for owner_prog_type and notify users when differ
Kernel commit 21116b7068b9 ("bpf: add owner_prog_type and accounted mem
to array map's fdinfo") added support for telling the owner prog type in
case of prog arrays. Give a notification to the user when they differ,
and the program eventually fails to load.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2016-12-09 12:44:12 -08:00
Thomas Graf
0f74d0f3a9 bpf: Fix number of retries when growing log buffer
The log buffer is automatically grown when the verifier output does not
fit into the default buffer size. The number of growing attempts was
not sufficient to reach the maximum buffer size so far.

Perform 9 iterations to reach max and let the 10th one fail.

j:0     i:65536         max:16777215
j:1     i:131072        max:16777215
j:2     i:262144        max:16777215
j:3     i:524288        max:16777215
j:4     i:1048576       max:16777215
j:5     i:2097152       max:16777215
j:6     i:4194304       max:16777215
j:7     i:8388608       max:16777215
j:8     i:16777216      max:16777215

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2016-12-09 12:42:11 -08:00
Roi Dayan
6566ca8cdb devlink: Add option to set and show eswitch inline mode
This is needed for some HWs to do proper macthing and steering.
Possible values are none, link, network, transport.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2016-12-09 12:41:03 -08:00
Roi Dayan
a93b6bb3a2 devlink: Add usage help for eswitch subcommand
Add missing usage help for devlink dev eswitch subcommand.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2016-12-09 12:40:52 -08:00
Stephen Hemminger
3dd0bb51d7 update kernel headers from net-next
Net-next now closed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2016-12-09 12:40:07 -08:00
Stephen Hemminger
e6fee79104 Merge branch 'master' into net-next 2016-12-09 12:38:51 -08:00
Stephen Hemminger
e49aef96bb update kernel headers 2016-12-09 12:38:35 -08:00
Stephen Hemminger
b95e5c55a9 Revert "devlink: Add usage help for eswitch subcommand"
This reverts commit 11f4cd31d2.
2016-12-09 12:37:39 -08:00
Stephen Hemminger
d646916993 Revert "devlink: Add option to set and show eswitch inline mode"
This reverts commit b9dcf9c282.

Intended for net-next
2016-12-09 12:37:19 -08:00
Simon Horman
6bd5b80cdc tc: flower: make use of flower_port_attr_type() safe and silent
Make use of flower_port_attr_type() safe:
* flower_port_attr_type() may return a valid index into tb[] or -1.
  Only access tb[] in the case of the former.
* Do not access null entries in tb[]

Also make usage silent - it is valid for ip_proto to be invalid,
for example if it is not specified as part of the filter.

Fixes: a1fb0d4842 ("tc: flower: Support matching on SCTP ports")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-05 10:13:26 -08:00
Simon Horman
61dff9ac10 tc: flower: correct name of ip_proto parameter to flower_parse_port()
This corrects a typo.

Fixes: a1fb0d4842 ("tc: flower: Support matching on SCTP ports")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-05 10:13:26 -08:00
Simon Horman
6ad7e60c1f tc: flower: document SCTP ip_proto
Add SCTP ip_proto to help text and man page.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-05 10:13:26 -08:00
Simon Horman
730381fede tc: flower: remove references to eth_type in manpage
Remove references to eth_type and ether_type (spelling error) in
the tc flower manpage.

Also correct formatting of boldface text with whitespace.

Cc: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2016-12-02 14:59:43 -08:00
Stephen Hemminger
143a704bf8 update kernel headers from net-next
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2016-12-02 14:54:33 -08:00
Stephen Hemminger
f2df31170f Merge branch 'master' into net-next 2016-12-02 14:19:08 -08:00
Simon Horman
1dd0cca7fa ss: initialise variables outside of for loop
Initialise for loops outside of for loops. GCC flags this as being
out of spec unless C99 or C11 mode is used.

With this change the entire tree appears to compile cleanly with -Wall.

$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2
...
$ make
...
ss.c: In function ‘unix_show_sock’:
ss.c:3128:4: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
...

Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-12-02 14:17:09 -08:00
Amir Vadai
d57639a475 tc/act_tunnel: Introduce ip tunnel action
This action could be used before redirecting packets to a shared tunnel
device, or when redirecting packets arriving from a such a device.

The 'unset' action is optional. It is used to explicitly unset the
metadata created by the tunnel device during decap. If not used, the
metadata will be released automatically by the kernel.
The 'set' operation, will set the metadata with the specified values for
the encap.

For example, the following flower filter will forward all ICMP packets
destined to 11.11.11.2 through the shared vxlan device 'vxlan0'. Before
redirecting, a metadata for the vxlan tunnel is created using the
tunnel_key action and it's arguments:

$ tc filter add dev net0 protocol ip parent ffff: \
    flower \
      ip_proto 1 \
      dst_ip 11.11.11.2 \
    action tunnel_key set \
      src_ip 11.11.0.1 \
      dst_ip 11.11.0.2 \
      id 11 \
    action mirred egress redirect dev vxlan0

Signed-off-by: Amir Vadai <amir@vadai.me>
2016-12-02 14:12:09 -08:00
Amir Vadai
bb9b63b18e tc/cls_flower: Classify packet in ip tunnels
Introduce classifying by metadata extracted by the tunnel device.
Outer header fields - source/dest ip and tunnel id, are extracted from
the metadata when classifying.

For example, the following will add a filter on the ingress Qdisc of shared
vxlan device named 'vxlan0'. To forward packets with outer src ip
11.11.0.2, dst ip 11.11.0.1 and tunnel id 11. The packets will be
forwarded to tap device 'vnet0':

$ tc filter add dev vxlan0 protocol ip parent ffff: \
    flower \
      enc_src_ip 11.11.0.2 \
      enc_dst_ip 11.11.0.1 \
      enc_key_id 11 \
      dst_ip 11.11.11.1 \
    action mirred egress redirect dev vnet0

Signed-off-by: Amir Vadai <amir@vadai.me>
2016-12-02 14:12:09 -08:00
Amir Vadai
aab0f61043 libnetlink: Introduce rta_getattr_be*()
Add the utility functions rta_getattr_be16() and rta_getattr_be32(), and
change existing code to use it.

Signed-off-by: Amir Vadai <amir@vadai.me>
2016-12-02 14:12:09 -08:00
Phil Sutter
039b3620cf ss: unix_show: No need to initialize members of calloc'ed structs
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:47 -08:00
Phil Sutter
b710a72254 ss: Make sstate_namel local to scan_state()
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:47 -08:00
Phil Sutter
1882c0db02 ss: Make sstate_name local to sock_state_print()
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:47 -08:00
Phil Sutter
96d45daa92 ss: Make unix_state_map local to unix_show()
Also make it const, since there won't be any write access happening.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:47 -08:00
Phil Sutter
2f938ce1fa ss: Get rid of single-fielded struct snmpstat
A struct with only a single field does not make much sense. Besides
that, it was used by print_summary() only.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:47 -08:00
Phil Sutter
6b224dad23 ss: Get rid of useless goto in handle_follow_request()
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
b3535dd61d ss: Make slabstat_ids local to get_slabstat()
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
95eafe438a ss: Make some variables function-local
addrp_width and screen_width are used in main() only, so no need to have
them globally available.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
b25bad2ffe ss: Make user_ent_hash_build_init local to user_ent_hash_build()
By having it statically defined, there is no need for it to be global.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
86dfa1be4a ss: Make tmr_name local to tcp_timer_print()
It's used only there, so no need to have it globally defined.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
0cb74a8610 ss: Turn generic_proc_open() wrappers into macros
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
f25062e9e7 ss: Eliminate unix_use_proc()
This function is used only at a single place anymore, so replace the
call to it by it's content, which makes that specific part of
unix_show() consistent with e.g. tcp_show().

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
2d0e538f3e ss: Drop list traversal from unix_stats_print()
Although this complicates the dedicated procfs-based code path in
unix_show() a bit, it's the only sane way to get rid of unix_show_sock()
output diverging from other socket types in that it prints all socket
details in a new line.

As a side effect, it allows to eliminate all procfs specific code in
the same function.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
5f27ac1db9 ss: introduce proc_ctx_print()
This consolidates identical code in three places. While the function
name is not quite perfect as there is different proc_ctx printing code
in netlink_show_one() as well, I sadly didn't find a more suitable one.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
be7e4d20b9 ss: Use sockstat->type in all socket types
Unix sockets used that field already to hold info about the socket type.
By replicating this approach in all other socket types, we can get rid
of protocol parameter in inet_stats_print() and have sock_state_print()
figure things out by itself.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
4519999708 ss: Add missing tab when printing UNIX details
When dumping UNIX sockets and show_details is active but not show_mem
(ss -xne), the socket details are printed without being prefixed by tab.
Fix this by printing the tab character when either one of '-e' or '-m'
has been specified.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
6babc649a9 ss: Drop empty lines in UDP output
When dumping UDP sockets and show_tcpinfo (-i) is active but not
show_mem (-m), print_tcpinfo() does not output anything leading to an
empty line being printed after every socket. Fix this by skipping the
call to print_tcpinfo() and the previous newline printing in that case.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Phil Sutter
36df1a6e92 ss: Mark fall through in arg parsing switch()
As there is a certain chance of overlooking this, better add a comment
to draw readers' attention.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-12-02 14:07:46 -08:00
Yuchung Cheng
b6c7fc61fa ss: print new tcp_info fields: busy, rwnd-limited, sndbuf-limited times
Dump some new fields added to tcp_info in v4.10: tcpi_busy_time,
tcpi_rwnd_limited, tcpi_sndbuf_limited.

Example output for a flow busy for 110ms but never measurably limited by
receive window or send buffer:
   busy:110ms

Example output for a flow usually limited by receive window:
   busy:111ms rwnd_limited:101ms(91.0%)

Example output for a flow sometimes limited by send buffer:
   busy:50ms sndbuf_limited:10ms(20.0%)

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
2016-12-01 11:00:28 -08:00
Neal Cardwell
2f579872fb ss: print new tcp_info fields: delivery_rate and app_limited
Dump the new delivery_rate and delivery_rate_app_limited fields that
were added to tcp_info in Linux v4.9.

Example output:
  pacing_rate 65.7Mbps delivery_rate 62.9Mbps

And for the application-limited case this looks like:
  pacing_rate 1031.1Mbps delivery_rate 87.4Mbps app_limited

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
2016-12-01 11:00:28 -08:00
Cyrill Gorcunov
41fe6c34de ss: Add inet raw sockets information gathering via netlink diag interface
unix, tcp, udp[lite], packet, netlink sockets already support diag
interface for their collection and killing. Implement support
for raw sockets.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
2016-12-01 10:55:56 -08:00
Cyrill Gorcunov
9f66764e30 libnetlink: Add test for error code returned from netlink reply
In case if some diag module is not present in the system,
say the kernel is not modern enough, we simply skip the
error code reported. Instead we should check for data
length in NLMSG_DONE and process unsupported case.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
2016-12-01 10:55:56 -08:00
Stephen Hemminger
bf9a0aff36 Update kernel headers for XDP and tcp_info
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2016-12-01 10:52:30 -08:00