Commit Graph

4528 Commits

Author SHA1 Message Date
Leon Romanovsky
99da90326e rdma: Protect dev_map_lookup from wrong input
Despite the fact that all callers to dev_map_lookup are ensuring that
there is always device name prior to call to that function, it is better
and safer to check that in the dev_map_lookup itself.

Fixes: 40df8263a0 ("rdma: Add dev object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Leon Romanovsky
0fc8c30b4e rdma: Reduce scope of _dev_map_lookup call
There is no external users of _dev_map_lookup function,
so let's limit its scope to be local.

Fixes: 40df8263a0 ("rdma: Add dev object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
William Tu
1e4be52bf2 erspan: add erspan usage description
The patch adds erspan usage description, so 'ip link help erspan'
and 'ip link help ip6erspan' shows the options.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-27 07:47:35 -08:00
Serhey Popovych
08ede25fda ip/tunnel: No need to free answer after rtnl_talk() on error
Since rtnl_talk() never returns with answer buffer allocated
on error we do not need to release it manually. After this
initializing answer with NULL before rtnl_talk() is useless.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2017-12-26 09:07:43 -08:00
Serhey Popovych
1ed8a5ca87 utils: ll_addr: Handle ARPHRD_IP6GRE in ll_addr_n2a()
ll_addr_n2a() correctly prints tunnel endpoints for gre, ipip, sit
and ip6tnl, but not for ip6gre. Fix this by adding ARPHRD_IP6GRE to
IPv6 tunnel endpoing address conversion.

Before:
-------

$ ip link show
...
18: ip6tnl0: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default
    link/tunnel6 :: brd ::
19: ip6gre0: <NOARP> mtu 1456 qdisc noop state DOWN mode DEFAULT group default
    link/gre6 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 brd \
00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00

After:
------

$ ip link show
...
18: ip6tnl0: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default
    link/tunnel6 :: brd ::
19: ip6gre0: <NOARP> mtu 1456 qdisc noop state DOWN mode DEFAULT group default
    link/gre6 :: brd ::

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2017-12-26 09:07:42 -08:00
William Tu
2897636267 erspan: add erspan version II support
The patch adds support for configuring the erspan v2, for both
ipv4 and ipv6 erspan implementation.  Three additional fields
are added: 'erspan_ver' for distinguishing v1 or v2, 'erspan_dir'
for specifying direction of the mirrored traffic, and 'erspan_hwid'
for users to set ERSPAN engine ID within a system.

As for manpage, the ERSPAN descriptions used to be under GRE, IPIP,
SIT Type paragraph.  Since IP6GRE/IP6GRETAP also supports ERSPAN,
the patch removes the old one, creates a separate ERSPAN paragrah,
and adds an example.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-20 15:18:04 -08:00
David Ahern
f82517f80d Update headers from 4.15-rc3
Update kernel headers to commit f39a5c01c3d2 ("Merge branch
'nfp-flower-add-Geneve-tunnel-support'")

Signed-off-by: David Ahern <dsahern@gmail.com>
2017-12-20 15:17:59 -08:00
Alexander Zubkov
9135c4d603 iproute: "list/flush/save default" selected all of the routes
When running "ip route list default" and not specifying address family,
one will get all of the routes instead of just default only. The same
is for "exact default" and "match default".

It behaves in such a way because default route with unspecified family
has the same all-zeroes value like no prefix specified at all. Thus
following code blindly ignores the fact, that prefix was actually
specified.

This patch adds the flag PREFIXLEN_SPECIFIED to the default route too.
And then checks its value when filtering routes.

Signed-off-by: Alexander Zubkov <green@msu.ru>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-19 08:23:09 -08:00
Alexander Zubkov
ed7fdc950d iproute: list/flush/save filter also by metric
Metric is one of the "unique key" fields of the route in Linux. But
still one can not use its value in filter while running ip list.
Because of this writing checks in scripts for example is incovenient.

Signed-off-by: Alexander Zubkov <green@msu.ru>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-19 08:16:10 -08:00
Serhey Popovych
560cf61253 link_vti6: Always add local/remote endpoint attributes
All tunnels already support for parsing/adding zero
endpoints and vti6 isn't an exception.

This check was added as part of commit 2a80154fde
(vti6: fix local/remote any addr handling) and looks
too restrictive as purpose of change is to avoid
endpoint configuration from uninitialized data.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-19 08:14:59 -08:00
Serhey Popovych
95614cc8a3 link_ip6tnl: Use IN6ADDR_ANY_INIT to initialize local/remote endpoints
Use specialized helper to initialize endpoint addresses with
zeros instead of open coding this. This unifies initialization
style with other ipv6 tunnel variants (i.e. gre6 and vti6).

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-19 08:14:01 -08:00
Serhey Popovych
1f44b93744 ip/tunnel: Use tnl_parse_key() to parse tunnel key
It is added with
commit a7ed1520ee ("ip/tunnel: introduce tnl_parse_key()")
to avoid code duplication in ip6?tunnel.c.

Reuse it for gre/gre6 and vti/vti6 tunnel rtnl
configuration interface with the same purpose
it is used in tunnel ioctl interface in ip6?tunnel.c.

While there change type of key variables from
unsigned integer to __be32 to reflect nature of the
value they store and place error message in
tnl_parse_key() on a single line to make single
call to fprintf().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-19 08:14:01 -08:00
Serhey Popovych
dac9ff35ea iplink: Kill redundant network device name checks
Since commit 625df645b7 (Check user supplied interface name lengths)
iplink_parse() validates network device name using check_ifname()
helpers.

Remove redundant "name" length checks from iplink_parse() callers.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-19 08:12:41 -08:00
Serhey Popovych
f88becf35e iplink: Process "alias" parameter correctly
Do not stop parameters processing after "alias" parameter: it might
not be a last one. Seems copy pasted from "type" parameter code.

Check it's length does not exceed IFALIASZ - 1. Better we warn
than get RTNL error.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-19 08:11:38 -08:00
Serhey Popovych
b7ea12ae43 iplink: Improve index parameter handling
Correctly check for valid network device index supplied on
command line: indexes are always greather than zero. Check
for duplicate "index" argument.

Initialize @index to 0 to simplify handling it in iplink_modify().
Other callers (link_veth.c, iplink_vxcan.c) already did so.

No need to initialize ifi_index with 0 since it is already
initialized at the @struct req initialization time and not
modified in iplink_parse().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-19 08:11:38 -08:00
Stephen Hemminger
bd9cea5d8c utils: fix makeargs stack overflow
The makeargs() function did not handle end of string correctly
and would reference past end of string.

Found by fuzzing with ASAN.

Reported-by:Bug Basher <iamliketohack@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-18 11:19:48 -08:00
Stephen Hemminger
5073581835 ss: fix crash with invalid command input file
If given an invalid input file with -F flag, ss would crash.
Examples of invalid input are line to long, or null file.

Found by fuzzing with ASAN.

Reported-by:Bug Basher <iamliketohack@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-18 11:18:55 -08:00
Stephen Hemminger
ae8e1cb83b ip: validate vlan value for vlan info
The VLAN tag must be 0..4095 to be valid.
Better to trap it here.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-16 13:14:38 -08:00
Serhey Popovych
a6addd5cdc ip: gre: fix IFLA_GRE_LINK attribute sizing
Attribute IFLA_GRE_LINK is 32 bit long, not 8 bit.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2017-12-16 10:08:54 -08:00
Serhey Popovych
9aceaad71b ip/tunnel: Use get_addr() instead of get_prefix() for local/remote endpoints
Manual page ip-link(8) states that both local and remote accept
IPADDR not PREFIX. Use get_addr() instead of get_prefix() to
parse local/remote endpoint address correctly.

Force corresponding address family instead of using preferred_family
to catch weired cases as shown below.

Before this patch it is possible to create tunnel with commands:

  ip    li add dev ip6gre2 type ip6gre local fe80::1/64 remote fe80::2/64
  ip -4 li add dev ip6gre2 type ip6gre local 10.0.0.1/24 remote 10.0.0.2/24

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2017-12-16 10:08:54 -08:00
Serhey Popovych
57daab1e70 ip/tunnel: Unify setup and accept zero address for local/remote endpoints
It is fully legal to submit zero (INADDR_ANY/IN6ADDR_ANY_INIT)
value for local and/or remote endpoints for all tunnel drivers:
no need additionally check this in userspace.

Note that all tunnel specific code already can pass zero address
to the kernel.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
2017-12-16 10:08:54 -08:00
Oliver Hartkopp
1eccc57341 ip: add vxcan/veth to ip-link man page
veth and vxcan both create a vitual tunnel between a pair of virtual network
devices. This patch adds the content for the now supported vxcan netdevices
and the documentation to create peer devices for vxcan and veth.

Additional remove 'can' that accidently was on the list of link types which
can be created by 'ip link add' as 'can' devices are real network devices.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-16 10:04:33 -08:00
Roman Mashak
3d791a326b ss: add missing path MTU parameter
v3:
   Rebase and use out() instead of printf().
v2:
   Print the path MTU immediately after the MSS, as it is easier to parse
   for humans (suggested by Neal Cardwell).

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-16 10:02:34 -08:00
Stephen Hemminger
2c6aaad949 include: qdisc offload defines
UAPI changes from upstream:
	net: sched: Add TCA_HW_OFFLOAD
	pkt_sched: Remove TC_RED_OFFLOADED from uapi

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-16 10:00:43 -08:00
Stephen Hemminger
c189177efc Merge branch 'master' into net-next 2017-12-14 21:19:54 -08:00
William Tu
6231c5bec6 gre6: add collect metadata support
The patch adds 'external' option to support collect metadata
gre6 tunnel.  The 'external' keyword is already used to set the
device into collect metadata mode such as vxlan, geneve, ipip,
etc.  This patch extends support for ipv6 gre and gretap.
Example of L3 and L2 gre device:
bash:~# ip link add dev ip6gre123 type ip6gre external
bash:~# ip link add dev ip6gretap123 type ip6gretap external

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
2017-12-14 21:19:49 -08:00
Chris Mi
83cf5bc73b tc: fix command "tc actions del" hang issue
If command is RTM_DELACTION, a non-NULL pointer is passed to rtnl_talk().
Then flag NLM_F_ACK is not set on n->nlmsg_flags and netlink_ack() will
not be called. Command tc will wait for the reply for ever.

Fixes: 86bf43c7c2 ("lib/libnetlink: update rtnl_talk to support malloc buff at run time")
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-14 21:17:04 -08:00
Stephen Hemminger
08f9d166c3 iplink: add definitions for GSO_MAX
Until kernel exports these, add GSO_MAX values into iplink
rather than assuming they are UINT_MAX + 1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-14 18:22:56 -08:00
Solio Sarabia
051274b4db iplink: validate maximum gso_max_size
Validate the upper limit for gso_max_size, valid range is [0-65,536]
inclusive. Fix minor whitespace in iplink man page.

Signed-off-by: Solio Sarabia <solio.sarabia@intel.com>
2017-12-14 18:12:14 -08:00
Jiri Pirko
1876ab0779 tc: fix json array closing
Fixes: 2704bd6255 ("tc: jsonify actions core")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
2017-12-13 18:16:27 -08:00
Oliver Hartkopp
7827b37603 ip: add vxcan to help text
Add missing tag 'vxcan' inside the help text which was missing in commit
efe459c76d ('ip: link add vxcan support').

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
2017-12-13 18:16:22 -08:00
Phil Dibowitz
7b17832445 Show 'external' link mode in output
Recently `external` support was added to the tunnel drivers, but there is no way
to introspect this from userspace. This adds support for that.

Now `ip -details link` shows it:

```
7: tunl60@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group
default qlen 1
    link/tunnel6 :: brd :: promiscuity 0
    ip6tnl external any remote :: local :: encaplimit 0 hoplimit 0 tclass 0x00 flowlabel 0x00000 (flowinfo 0x00000000) addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
```

Signed-off-by: Phil Dibowitz <phil@ipom.com>
2017-12-13 18:15:51 -08:00
Stephen Hemminger
3a2fbf007b Merge branch 'master' into net-next 2017-12-12 12:12:20 -08:00
Davide Caratti
88b428f03f tc: bash-completion: add missing 'classid' keyword
users of 'matchall' filter can specify a value for the class id: update
bash-completion accordingly.

Fixes: b32c0b64fa ("tc: bash-completion: Add support for matchall")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2017-12-12 12:11:37 -08:00
Stefano Brivio
87b1a7aec7 ss: Implement automatic column width calculation
Group fitting fields into lines and space them equally using the
remaining screen width for each line. If columns don't fit on
one line, break them into the least possible amount of lines and
keep them aligned across lines.

This is done by:
 - recording the length of the longest item in each column during
   formatting and buffering (which was added in the previous patch)
 - fitting as many fields as possible on each line of output
 - distributing the remaining padding space equally between the
   columns

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
2017-12-12 12:11:37 -08:00
Stefano Brivio
691bd854bf ss: Buffer raw fields first, then render them as a table
This allows us to measure the maximum field length for each
column before printing fields and will permit us to apply
optimal field spacing and distribution. Structure of the output
buffer with chunked allocation is described in comments.

Output is still unchanged, original spacing is used.

Running over one million sockets with -tul options by simply
modifying main() to loop 50,000 times over the *_show()
functions, buffering the whole output and rendering it at the
end, with 10 UDP sockets, 10 TCP sockets, while throwing
output away, doesn't show significant changes in execution time
on my laptop with an Intel i7-6600U CPU:

- before this patch:
$ time ./ss -tul > /dev/null
real	0m29.899s
user	0m2.017s
sys	0m27.801s

- after this patch:
$ time ./ss -tul > /dev/null
real	0m29.827s
user	0m1.942s
sys	0m27.812s

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
2017-12-12 12:11:37 -08:00
Stefano Brivio
59f46b7b5b ss: Introduce columns lightweight abstraction
Instead of embedding spacing directly while printing contents,
logically declare columns and functions to buffer their content,
to print left and right spacing around fields, to flush them to
screen, and to print headers.

This makes it a bit easier to handle layout changes and prepares
for full output buffering, needed for optimal spacing in field
output layout.

Columns are currently set up to retain exactly the same output
as before. This needs some slight adjustments of the values
previously calculated in main(), as the width value introduced
here already includes the width of left delimiters and spacing
is not explicitly printed anymore whenever a field is printed.
These calculations will go away altogether once automatic width
calculation is implemented.

We can also remove explicit printing of newlines after the final
content for a given line is printed, flushing the last field on
a line will cause field_flush() to print newlines where
appropriate.

No changes in output expected here.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
2017-12-12 12:11:37 -08:00
Stefano Brivio
90351722cb ss: Replace printf() calls for "main" output by calls to helper
This is preparation work for output buffering, which will allow
us to use optimal spacing and alignment of logical "columns".

The new out() function is just a re-implementation of a typical
libc's printf(), except that the return value of vfprintf() is
ignored as no callers use it. This implementation will be
replaced in the next patches to provide column width adjustment
and adequate spacing.

All printf() calls that output parts of the socket list are now
replaced by calls to out(). Output of summary and version is
excluded from this.

No functional differences here, output not affected.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
2017-12-12 12:11:37 -08:00
Stephen Hemminger
81724d6142 Merge branch 'master' into net-next 2017-12-11 16:06:11 -08:00
Stephen Hemminger
4b072e9b49 uapi: tun add eBPF based queue selection method
Upstream commit 96f84061620c6325a2ca9a9a05b410e6461d03c3
    tun: add eBPF based queue selection method

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-11 16:03:27 -08:00
Stephen Hemminger
b7f5fd3698 uapi: add access to snd_cwnd and other sock_ops
From upstream kernel commit f19397a5c65665d66e3866b42056f1f58b7a366b
    bpf: Add access to snd_cwnd and others in sock_ops

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-11 16:01:17 -08:00
Roman Mashak
9f1a9ae888 ss: remove duplicate assignment
Fixes: 8250bc9ff4 ("ss: Unify inet sockets output")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-11 15:56:10 -08:00
Stephen Hemminger
c2db423f7c iplink: allow configuring GSO max values
This allows sending GSO maximum values when configuring a device.
The values are advisory. Most devices will ignore them but for some
pseudo devices such as veth pairs they can be set.

Example:
	# ip link add dev vm1 type veth peer name vm2 gso_max_size 32768

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2017-12-08 21:33:08 -08:00
Stephen Hemminger
5c6e3478ac Merge branch 'master' into net-next 2017-12-08 21:32:33 -08:00
Michal Privoznik
3572e01a09 tc: util: Don't call NEXT_ARG_FWD() in __parse_action_control()
Not all callers want parse_action_control*() to advance the
arguments. For instance act_parse_police() does the argument
advancing itself.

Fixes: e67aba5595 ("tc: actions: add helpers to parse and print control actions")
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2017-12-08 10:29:01 -08:00
Wei Wang
00ac78d39c ss: print tcpi_rcv_ssthresh
tcpi_rcv_ssthresh is an important stats when debugging receive side
behavior.
Add it to the ss output.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
2017-12-08 10:27:57 -08:00
Stephen Hemminger
39be47fb5e update headers from 4.15-rc2
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-12-05 17:30:29 -08:00
Phil Sutter
6bf156415a man: tc-csum.8: Fix inconsistency in example description
Commit 6bbe5e6290 ("man: tc-csum.8: Fix example") changed both source
and destination IP addresses in example code but missed to update the
example's description accordingly.

Fixes: 6bbe5e6290 ("man: tc-csum.8: Fix example")
Signed-off-by: Phil Sutter <phil@nwl.cc>
2017-11-29 10:14:51 -08:00
Stephen Hemminger
b38778bb5e update bpf header from net-next
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-11-28 18:16:51 -08:00
Stephen Hemminger
f6351157b9 Merge branch 'master' into net-next 2017-11-28 09:53:28 -08:00