Commit Graph

138 Commits

Author SHA1 Message Date
Hangbin Liu
86bf43c7c2 lib/libnetlink: update rtnl_talk to support malloc buff at run time
This is an update for 460c03f3f3 ("iplink: double the buffer size also in
iplink_get()"). After update, we will not need to double the buffer size
every time when VFs number increased.

With call like rtnl_talk(&rth, &req.n, NULL, 0), we can simply remove the
length parameter.

With call like rtnl_talk(&rth, nlh, nlh, sizeof(req), I add a new variable
answer to avoid overwrite data in nlh, because it may has more info after
nlh. also this will avoid nlh buffer not enough issue.

We need to free answer after using.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
2017-10-26 12:29:29 +02:00
Phil Sutter
625df645b7 Check user supplied interface name lengths
The original problem was that something like:

| strncpy(ifr.ifr_name, *argv, IFNAMSIZ);

might leave ifr.ifr_name unterminated if length of *argv exceeds
IFNAMSIZ. In order to fix this, I thought about replacing all those
cases with (equivalent) calls to snprintf() or even introducing
strlcpy(). But as Ulrich Drepper correctly pointed out when rejecting
the latter from being added to glibc, truncating a string without
notifying the user is not to be considered good practice. So let's
excercise what he suggested and reject empty, overlong or otherwise
invalid interface names right from the start - this way calls to
strncpy() like shown above become safe and the user has a chance to
reconsider what he was trying to do.

Note that this doesn't add calls to check_ifname() to all places where
user supplied interface name is parsed. In many cases, the interface
must exist already and is therefore looked up using ll_name_to_index(),
so if_nametoindex() will perform the necessary checks already.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2017-10-02 08:01:21 -07:00
Stephen Hemminger
731e28cc28 Merge branch 'master' into net-next 2017-09-01 14:15:31 -07:00
Michal Kubecek
460c03f3f3 iplink: double the buffer size also in iplink_get()
Commit 72b365e8e0 ("libnetlink: Double the dump buffer size") increased
the buffer size for "ip link show" command to 32 KB to handle NICs with
large number of VFs. With "dev" filter, a different code path is taken and
iplink_get() still uses only 16 KB buffer.

The size of 32768 is not very future-proof as NICs supporting 120-128 VFs
are already in use so that single RTM_NEWLINK message in the dump can
exceed 30000 bytes. But it's what rtnl_talk() and rtnl_dump_filter_l() use
so let's be consistent. Once this proves insufficient, all three sizes
should be increased.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
2017-09-01 14:15:00 -07:00
Michal Kubecek
6599162b95 iplink: check for message truncation in iplink_get()
If message length exceeds maxlen argument of rtnl_talk(), it is truncated
to maxlen but unlike in the case of truncation to the length of local
buffer in rtnl_talk(), the caller doesn't get any indication of a problem.

In particular, iplink_get() passes the truncated message on and parsing it
results in various warnings and sometimes even a segfault (observed with
"ip link show dev ..." for a NIC with 125 VFs).

Handle message truncation in iplink_get() the same way as truncation in
rtnl_talk() would be handled: return an error.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
2017-09-01 14:15:00 -07:00
William Tu
9a1381d509 gre: add support for ERSPAN tunnel
The patch adds ERSPAN type II tunnel support. The implementation is
based on the draft at
 https://tools.ietf.org/html/draft-foschiano-erspan-01.

One of the purposes is for Linux box to be able to receive ERSPAN
monitoring traffic sent from the Cisco switch, by creating a ERSPAN
tunnel device. In addition, the patch also adds ERSPAN TX, so traffic
can also be encapsulated into ERSPAN and sent out.

The implementation reuses the key as ERSPAN session ID, and
field 'erspan' as ERSPAN Index fields:
./ip link add dev ers11 type erspan seq key 100 erspan 123 \
		local 172.16.1.200 remote 172.16.1.100

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Meenakshi Vohra <mvohra@vmware.com>
2017-08-23 10:06:54 -07:00
Julien Fortin
e4a1216aeb ip: iplink.c: open/close json obj for ip -brief -json link show dev DEV
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
2017-08-17 18:02:40 -07:00
Stephen Hemminger
89ec74a3ea remove duplicated #include's
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-07-18 17:17:15 -07:00
Jakub Kicinski
1b5e809466 bpf: allow requesting XDP HW offload
Let XDP link set command request that the program be offloaded.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2017-06-27 16:13:55 -07:00
Jakub Kicinski
1468381415 bpf: add xdpdrv for requesting XDP driver mode
Allow user to select XDP DRV_MODE flag by using xdpdrv keyword
instead of xdp or xdpgeneric.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2017-06-27 16:13:55 -07:00
Eli Cohen
5a3ec4ba64 iplink: Update usage in help message
Add to usage message a description of how to configure Infiniband node
and port GUIDs. Also modify the man page to emphasize the GUIDs are
configured for Infiniband VFs.

Fixes: d91fb3f4c7 ("Add support for configuring Infiniband GUIDs")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
2017-06-05 12:29:36 -07:00
David Ahern
63891c7013 ip address: Change print_linkinfo_brief to take filter as an input
Change print_linkinfo_brief to take the filter as an input arg.
If the arg is NULL, use the global filter in ipaddress.c.

Signed-off-by: David Ahern <dsahern@gmail.com>
2017-05-30 17:54:03 -07:00
Daniel Borkmann
a872b870a5 bpf: add support for generic xdp
Follow-up to commit c7272ca720 ("bpf: add initial support for
attaching xdp progs") to also support generic XDP. This adds an
indicator for loaded generic XDP programs when programs are loaded
as shown in c7272ca720, but the driver still lacks native XDP
support.

  # ip link
  [...]
  3: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdpgeneric qdisc [...]
      link/ether 0c:c4:7a:03:f9:25 brd ff:ff:ff:ff:ff:ff
  [...]

In case the driver does support native XDP, but the user wants
to load the program as generic XDP (e.g. for testing purposes),
then this can be done with the same semantics as in c7272ca720,
but with 'xdpgeneric' instead of 'xdp' command for loading:

  # ip -force link set dev eno1 xdpgeneric obj xdp.o

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David S. Miller <davem@davemloft.net>
2017-05-01 09:28:19 -07:00
Stephen Hemminger
bb6ab47b16 iplink: whitespace cleanup
Break lines to conform to 80 col guideline.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-05-01 09:13:09 -07:00
Zhang Shengju
432b92a702 iplink: add support for IFLA_CARRIER attribute
Add support to set IFLA_CARRIER attribute.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
2017-05-01 09:06:54 -07:00
Robert Shearman
837552b445 iplink: add support for afstats subcommand
Add support for new afstats subcommand. This uses the new
IFLA_STATS_AF_SPEC attribute of RTM_GETSTATS messages to show
per-device, AF-specific stats. At the moment the kernel only supports
MPLS AF stats, so that is all that's implemented here.

The print_num function is exposed from ipaddress.c to be used for
printing the new stats so that the human-readable option, if set, can
be respected.

Example of use:

    $ ./ip/ip -f mpls link afstats dev eth1
    3: eth1
        mpls:
            RX: bytes  packets  errors  dropped  noroute
            9016       98       0       0        0
            TX: bytes  packets  errors  dropped
            7232       113      0       0

Signed-off-by: Robert Shearman <rshearma@brocade.com>
2017-03-10 08:44:55 -08:00
Nikolay Aleksandrov
94f1a22aa7 iplink: add support for xstats subcommand
This patch adds support for a new xstats link subcommand which uses the
specified link type's new parse/print_ifla_xstats callbacks to display
extended statistics.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-02-18 16:36:01 -08:00
Daniel Borkmann
c7272ca720 bpf: add initial support for attaching xdp progs
Now that we made the BPF loader generic as a library, reuse it
for loading XDP programs as well. This basically adds a minimal
start of a facility for iproute2 to load XDP programs. There
currently only exists the xdp1_user.c sample code in the kernel
tree that sets up netlink directly and an iovisor/bcc front-end.

Since we have all the necessary infrastructure in place already
from tc side, we can just reuse its loader back-end and thus
facilitate migration and usability among the two for people
familiar with tc/bpf already. Sharing maps, performing tail calls,
etc works the same way as with tc. Naturally, once kernel
configuration API evolves, we will extend new features for XDP
here as well, resp. extend dumping of related netlink attributes.

Minimal example:

  clang -target bpf -O2 -Wall -c prog.c -o prog.o
  ip [-force] link set dev em1 xdp obj prog.o       # attaching
  ip [-d] link                                      # dumping
  ip link set dev em1 xdp off                       # detaching

For the dump, intention is that in the first line for each ip
link entry, we'll see "xdp" to indicate that this device has an
XDP program attached. Once we dump some more useful information
via netlink (digest, etc), idea is that 'ip -d link' will then
display additional relevant program information below the "link/
ether [...]" output line for such devices, for example.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2016-12-09 12:44:12 -08:00
Zhang Shengju
6bd1ea28c5 link: add team and team_slave link type
Add missing team and team_slave link type.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
2016-11-29 14:03:00 -08:00
Stephen Hemminger
6773bcc227 iplink: cleanup style errors
Fix long strings causing checkpatch warnings
2016-10-09 19:24:38 -07:00
Moshe Shemesh
56e9f0ab19 ip link: Add support to configure SR-IOV VF to vlan protocol 802.1ad (VST QinQ)
Introduce a new API that exposes a list of vlans per VF (IFLA_VF_VLAN_LIST),
giving the ability for user-space application to specify it for the VF as
an option to support 802.1ad (VST QinQ).

We introduce struct vf_vlan_info, which extends struct vf_vlan and adds
an optional VF VLAN proto parameter.
Default VLAN-protocol is 802.1Q.

Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
compatibility with older kernel versions.

Suitable ip link tool command examples:
 - Set vf vlan protocol 802.1ad (S-TAG)
	ip link set eth0 vf 1 vlan 100 proto 802.1ad
 - Set vf vlan S-TAG and vlan C-TAG (VST QinQ)
	ip link set eth0 vf 1 vlan 100 proto 802.1ad vlan 30 proto 802.1Q
 - Set vf to VST (802.1Q) mode
	ip link set eth0 vf 1 vlan 100 proto 802.1Q
 - Or by omitting the new parameter (backward compatible)
	ip link set eth0 vf 1 vlan 100

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
2016-10-09 19:17:15 -07:00
Hangbin Liu
22a84711f4 ip: Use specific slave id
The original bond/bridge/vrf and slaves use same id, which make people
confused. Use bond/bridge/vrf_slave as id name will make code more clear.

Acked-by: Phil Sutter <psutter@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
2016-09-22 16:39:55 -07:00
Phil Sutter
08c0466b11 ip-link: add missing {min,max}_tx_rate to help text
These vf options are described in man page already, they're just missing
in help output.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-08-17 13:54:31 -07:00
Davide Caratti
fd4df5b211 ip {link,address}: add 'macsec' item to TYPE list
fix output of "ip address help" and "ip link help". Update TYPE list in man
pages ip-address.8 and ip-link.8 as well.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
2016-07-28 11:12:39 -07:00
Phil Sutter
d17b136f7d Use C99 style initializers everywhere
This big patch was compiled by vimgrepping for memset calls and changing
to C99 initializer if applicable. One notable exception is the
initialization of union bpf_attr in tc/tc_bpf.c: changing it would break
for older gcc versions (at least <=3.4.6).

Calls to memset for struct rtattr pointer fields for parse_rtattr*()
were just dropped since they are not needed.

The changes here allowed the compiler to discover some unused variables,
so get rid of them, too.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
2016-07-20 12:05:24 -07:00
Phil Sutter
771a242890 iplink: List valid 'type' argument in ip link help text
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-07-20 12:04:34 -07:00
Stephen Hemminger
ef0a738c8d ip: link style cleanup
break long lines and other trivial changes
2016-07-15 11:31:20 -07:00
Eli Cohen
d91fb3f4c7 Add support for configuring Infiniband GUIDs
Add two NLA's that allow configuration of Infiniband node or port GUIDs
by referencing the IPoIB net device set over the physical function. The
format to be used is as follows:

ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70
ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78

Signed-off-by: Eli Cohen <eli@mellanox.com>
2016-07-15 11:25:36 -07:00
David Ahern
104444c201 ip link/addr: Add support for vrf keyword
Add vrf keyword to 'ip link' and 'ip addr' commands (common list code).

Allows:
1. Adding a link to a VRF
       $ ip link set NAME vrf NAME

   Removing a link from a VRF still uses 'ip link set NAME nomaster'

2. Showing links associated with a VRF:
       $ ip link show vrf NAME

3. List addresses associated with links in a VRF
       $ ip -br addr show vrf red

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
2016-07-06 21:28:31 -07:00
Phil Sutter
0aae23468a Fix MAC address length check
I forgot to change the variable in the conditional, too.

Fixes: 8fe58d5894 ("iplink: Check address length via netlink")
Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-27 10:48:35 -07:00
Phil Sutter
8fe58d5894 iplink: Check address length via netlink
This is a feature which was lost during the conversion to netlink
interface: If the device exists and a user tries to change the link
layer address, query the kernel for the old address first and reject the
new one if sizes differ.

This patch adds the same check when setting VF address by assuming same
length as PF device.

Note that at least for VFs the check can't be done in kernel space since
struct ifla_vf_mac lacks a length field and due to netlink padding the
exact size can't be communicated to the kernel.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-21 08:50:14 -07:00
Phil Sutter
a89193a7d6 iplink: Add missing variable initialization
Without this, we might feed garbage to the kernel when the address is
shorter than expected.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-06-21 08:50:14 -07:00
Stephen Hemminger
56f5daac98 ip: code cleanup
Run all the ip code through checkpatch and have it fix the obvious stuff.
2016-03-21 11:52:19 -07:00
Phil Sutter
5c2ea5b8c0 iplink: fix help text syntax
Get rid of extraneous closing brackets and while here, merge the double
netns parameter.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2016-03-02 11:23:51 -08:00
Hiroshi Shimamoto
b6d77d9ee3 iplink: Support VF Trust
Add IFLA_VF_TRUST message to trust the VF.
PF can accept some privileged operation from the trusted VF.
For example, ixgbe PF doesn't allow to enable VF promiscuous mode until
the VF is trusted because it may hurt performance.

To trust VF.
 # ip link set dev eth0 vf 1 trust on

To untrust VF.
 # ip link set dev eth0 vf 1 trust off

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
2016-03-02 09:26:24 -08:00
Stephen Hemminger
2505780c20 Merge branch 'net-next' 2016-01-18 09:37:45 -08:00
Roopa Prabhu
f921f567d1 iplink: replace exit with return
This patch replaces exits with returns in iplink
command. Helps to continue on errors when
invoked with ip -force -batch.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
2016-01-11 08:23:27 -08:00
Bjørn Mork
8e12bc0a9d iplink: support show and set of "addrgenmode random"
"random" is a new IPv6 addrgenmode, enabling "stable_secret" type
addresses with an auto-generated secret.

$ ip link set eth0 addrgenmode random

$ ip -d link show dev eth0
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:21:86:a3:25:7d brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode random

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
2016-01-06 09:20:59 -08:00
Bjørn Mork
8e098dd81a iplink: support setting addrgenmode stable_secret
It is possible to switch to another addrgenmode after setting a
valid secret.  Allow switching back without reconfiguring the
secret for completeness.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
2016-01-06 09:20:59 -08:00
Christoph Schulz
8aacb9bbbd ip: allow using a device "help" (or a prefix thereof)
Device names that match "help" or a prefix thereof should be allowed anywhere
a device name can be used. Note that a suitable keyword ("dev" or "name", the
latter for "ip tunnel") has to be used in these cases to resolve ambiguities.

Signed-off-by: Christoph Schulz <develop@kristov.de>
Reported-by: Leonhard Preis <leonhard@pre.is>
Reported-by: Wilhelm Wijkander <lists@0x5e.se>
2015-10-07 10:35:17 +01:00
Phil Sutter
940a96e6ca ip-link: do not support 'ip link add dev help'
Commit 0532555 ('Support "ip link add help" for rtnl_link API') added a
check for specified help parameter. Though due to the place where it has
been added to, it is not possible anymore to force a given parameter to
be interpreted as interface name by prefixing it with 'dev '. Fix this
by forcing whatever follows 'dev' to be presumed as interface name.

Signed-off-by: Phil Sutter <phil@nwl.cc>
2015-09-23 16:00:48 -07:00
Stephen Hemminger
f1e225beef Merge branch 'master' into net-next 2015-08-31 16:32:10 -07:00
Andy Gospodarek
5d295bb8e1 add support for brief output for link and addresses
This adds support for slightly less output than is normally provided by
'ip link show' and 'ip addr show'.  This is a bit better when you have a
host with lots of interfaces.  Sample output:

$ ip -br link show
lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
p2p1             UP             08:00:27:ee:0b:3b <BROADCAST,MULTICAST,UP,LOWER_UP>
p7p1             UP             08:00:27:9d:62:9f <BROADCAST,MULTICAST,UP,LOWER_UP>
p8p1             DOWN           08:00:27:dc:d8:ca <NO-CARRIER,BROADCAST,MULTICAST,UP>
p9p1             UP             08:00:27:76:d9:75 <BROADCAST,MULTICAST,UP,LOWER_UP>
p7p1.100@p7p1    UP             08:00:27:9d:62:9f <BROADCAST,MULTICAST,UP,LOWER_UP>

$ ip -br -4 addr show
lo               UNKNOWN        127.0.0.1/8
p2p1             UP             192.168.56.2/24
p7p1             UP             70.0.0.1/24
p8p1             DOWN           80.0.0.1/24
p9p1             UP             10.0.5.15/24
p7p1.100@p7p1    UP             200.0.0.1/24

$ ip -br -6 addr show
lo               UNKNOWN        ::1/128
p2p1             UP             fe80::a00:27ff:feee:b3b/64
p7p1             UP             7000::1/8 fe80::a00:27ff:fe9d:629f/64
p8p1             DOWN           8000::1/8
p9p1             UP             fe80::a00:27ff:fe76:d975/64
p7p1.100@p7p1    UP             fe80::a00:27ff:fe9d:629f/64

$ ip -br addr show p7p1
p7p1             UP             70.0.0.1/24 7000::1/8 fe80::a00:27ff:fe9d:629f/64

v2: Now with color support!
v3: Better field width estimation (except netdev names to keep output at a
decent width) and whitespace fixup.

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
2015-08-31 16:24:10 -07:00
Stephen Hemminger
6c5ffb9a2c iplink: cleanup whitespace and checkpatch issues
Mostly just use of {} and whitespace.
2015-08-25 15:57:04 -07:00
David Ahern
15faa0a30b add support for VRF device
Allow user to create a vrf device and specify its table binding.
Based on the iplink_vlan implementation.

Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
2015-08-23 10:14:16 -07:00
Stephen Hemminger
dfc3d015f6 Merge branch 'master' into net-next 2015-08-23 10:09:46 -07:00
Zhang Shengju
6843d36e3d ip-link: cut one level indentation
Cut one level indentation to make things easier to read.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
2015-08-19 16:09:09 -07:00
Stephen Hemminger
9a6422c243 Merge branch 'master' into net-next 2015-08-13 19:42:41 -07:00
Zhang Shengju
ff1e35edf5 ip-link: enhance prompt message
Enhance promtp message for 'spoofchk' and 'query_rss' flag, and fix a
typo.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
2015-08-13 14:10:22 -07:00
Stephen Hemminger
4b942cb1df Merge branch 'master' into net-next 2015-08-12 09:09:43 -07:00