Commit Graph

2115 Commits

Author SHA1 Message Date
Stephen Hemminger
cbb99f7fbe Update to latest kernel headers
Also add tipc_netlink.h for later TIPC support
2015-05-21 14:41:11 -07:00
David Ward
357c45ad3a tc: gred: Adopt the term VQ in the command syntax and output
In the GRED kernel source code, both of the terms "drop parameters"
(DP) and "virtual queue" (VQ) are used to refer to the same thing.
Each "DP" is better understood as a "set of drop parameters", since
it has values for limit, min, max, avpkt, etc. This terminology can
result in confusion when creating a GRED qdisc having multiple DPs.
Netlink attributes and struct members with the DP name seem to have
been left intact for compatibility, while the term VQ was otherwise
adopted in the code, which is more intuitive.

Use the VQ term in the tc command syntax and output (but maintain
compatibility with the old syntax).

Rewrite the usage text to be concise and similar to other qdiscs.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
2015-05-21 14:16:03 -07:00
David Ward
eb6d7d6af1 tc: gred: Handle unsigned values properly in option parsing/printing
DPs, def_DP, and DP are unsigned values that are sent and received
in TCA_GRED_* netlink attributes; handle them properly when they
are parsed or printed. Use MAX_DPs as the initial value for def_DP
and DP, and fix the operator used for bounds checking them.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
2015-05-21 14:16:03 -07:00
David Ward
1693a4d392 tc: gred: Improve parameter/statistics output
Make the output more consistent with the RED qdisc, and only show
details/statistics if the appropriate flag is set when calling tc.

Show the parameters used with "gred setup". Add missing statistics
"pdrop" and "other". Fix format specifiers for unsigned values.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
2015-05-21 14:16:03 -07:00
David Ward
a77905ef6a tc: gred: Print usage text if no arguments appear after "gred"
This is more helpful to the user, since the command takes two forms,
and the message that would otherwise appear about missing parameters
assumes one of those forms.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
2015-05-21 14:16:03 -07:00
David Ward
d73e0408e2 tc: gred: Fix whitespace issues in code
Signed-off-by: David Ward <david.ward@ll.mit.edu>
2015-05-21 14:16:03 -07:00
David Ward
7bf17a2264 tc: red: Mark "bandwidth" parameter as optional in usage text
Signed-off-by: David Ward <david.ward@ll.mit.edu>
2015-05-21 14:16:03 -07:00
David Ward
d93c909a4c tc: red, gred: Notify when using the default value for "bandwidth"
The "bandwidth" parameter is optional, but ensure the user is aware
of its default value, to proactively avoid configuration problems.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
2015-05-21 14:16:03 -07:00
David Ward
6c99695da2 tc: red, gred: Fix format specifier in burst size warning
burst is an unsigned value.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
2015-05-21 14:16:03 -07:00
David Ward
9d9a67c756 tc: red, gred: Rename overloaded variable wlog
It is used when parsing three different parameters, only one of
which is Wlog. Change the name to make the code less confusing.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
2015-05-21 14:16:03 -07:00
Vadim Kochan
699589f6df man ip-link: Remove extra GROUP explanation
Remove double explanation of GROUP option from 'ip link set' section.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
2015-05-14 15:38:10 -07:00
Lennert Buytenhek
2c0feda8be man ip-link: Add missing lowpan link type
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
2015-05-11 09:19:22 -07:00
Daniel Borkmann
ec6f5abcea tc: minor cleanup on ingress
Fix whitespacing and remove the unnecessary condition.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2015-05-11 09:18:10 -07:00
Eric Dumazet
3bf5445c5e ss: dctcp changes
Missing space before dctcp: markers.

With dctcp, cwnd=2 is pretty common, just display cwnd value even
if cwnd has this value, it makes parsing easier.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
2015-05-11 09:16:43 -07:00
Eric Dumazet
656e8fdd2d ss: small optim in tcp_show_info()
Kernel can give us smaller tcp_info than our.

We copy the kernel provided structure and fill with 0
the remaining part.

Lets clear only the missing part to save some cycles, as we intend to
slightly increase tcp_info size in the future.

Signed-off-by: Eric Dumazet <edumazet@google.com>
2015-05-11 09:15:08 -07:00
Thomas Graf
38a7f26828 route: Add missing newline in helptext
Signed-off-by: Thomas Graf <tgraf@suug.ch>
2015-05-11 09:14:44 -07:00
WANG Cong
285e7768e8 tc: fill in handle before checking argc
When deleting a specific basic filter with handle,
tc command always ignores the 'handle' option, so
tcm_handle is always 0 and kernel deletes all filters
in the selected group. This is wrong, we should respect
'handle' in cmdline.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
2015-05-11 09:13:20 -07:00
Thomas Graf
f7dd7e5e71 iproute2: Fix typo in get_prefix_1()
Fixes a typo in get_prefix_1() which broke the prefix default
names { default | any | all }.

The most obvious fallout from this bug was:

	$ ip route add default via 1.1.1.1
	Error: an inet prefix is expected rather than "default".

Fixes: dacc5d4197 ("add basic mpls support to iproute")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
2015-05-11 09:11:53 -07:00
Stephen Hemminger
906cafe3ff ip: fix exit code for addrlabel
The exit code for ip label was not correct.
The return from the command function is negated and turned into
the exit code on failure.
2015-05-07 08:11:30 -07:00
Stephen Hemminger
076ae7089a ip: fix exit code for rule failures
If ip rule command fails talking to kernel, exit code should be 2.
The sub-command is called by cmd loop and the exit code is negative
of return value from the command callback.
2015-05-07 08:11:30 -07:00
Stephen Hemminger
d58ba4ba2a ip: return correct exit code on route failure
If kernel complains about ip route request, exit status should be
2 not 1.

This fixes regression introduced by:
commit 42ecedd4ba
Author: Roopa Prabhu <roopa@cumulusnetworks.com>
Date:   Tue Mar 17 19:26:32 2015 -0700

    fix ip -force -batch to continue on errors
2015-05-07 08:11:30 -07:00
Stephen Hemminger
ce743da171 ip: document exit code
The ip command has always had a consistent exit status
document it so that developers see it.
2015-05-07 08:11:28 -07:00
Vlad Zolotarov
6c55c8c461 ip link set vf: Added "query_rss" command
Add a new option to toggle the ability of querying the RSS configuration of a specific VF.

VF RSS information like RSS hash key may be considered sensitive on some devices where
this information is shared between VF and PF and thus its querying may be prohibited by default.

This new option allows a system administrator with privileges to modify a PF state
to control if the above VF querying is allowed or not.

For example:
 To enable RSS querying of VF[0] of ethX:
 >> ip link set dev ethX vf 0 query_rss on

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-05-04 09:08:26 -07:00
Vadim Kochan
8916ccf66c ip link: Add group in usage() for 'ip link delete'
Show deleting by group in 'ip link help' output:

...
ip link delete { DEVICE | dev DEVICE | group DEVGROUP } type TYPE [ ARGS ]
...

Also show separately DEVICE option in { } list.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
2015-05-04 09:00:59 -07:00
Vadim Kochan
7f74cf6de0 man ip-link: Add deleting links by group
Indicate possibility deleting virtual links by group.

Also changed the alignment of 'ip link delete' args
descriptions, to look like similary to 'ip link set'.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
2015-05-04 09:00:52 -07:00
Vadim Kochan
57ff5a1096 ss: Fix wrong filter behaviour
Fixed applying family & socket type filters.
It was not possible to select UDP & UNIX sockets together.

Now selected families are ORed.

The problem was that filters were combined by AND.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Reported-By: Mihai Moldovan <ionic@ionic.de>
2015-05-04 08:58:47 -07:00
Daniel Borkmann
d937a74b6d tc: {m, f}_ebpf: add option for dumping verifier log
Currently, only on error we get a log dump, but I found it useful when
working with eBPF to have an option to also dump the log on success.
Also spotted a typo in a header comment, which is fixed here as well.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
2015-05-04 08:43:08 -07:00
Mathias Nyman
d7bd2db52c ip: Add color output option
It is hard to quickly find what you are looking for in the output of the
ip command. Color helps.

This patch adds a '-c' flag to highlight these with individual colors:
  - interface name
  - ip address
  - mac address
  - up/down state

Signed-off-by: Mathias Nyman <m.nyman@iki.fi>
Tested-by: Yegor Yefremov <yegorslists@googlemail.com>
2015-05-04 08:39:17 -07:00
Stephen Hemminger
aeedd8e1e7 update headers to reflect BPF changes
Reclone sanitized headers from 4.1-rc
2015-04-29 12:33:24 -07:00
Daniel Borkmann
279d6a8ba7 examples: bpf: fix ld offs to have same prog loaded on ingress/egress
Fix up the eBPF example program to match our kernel fix in a166151cbe33 ("bpf:
fix bpf helpers to use skb->mac_header relative offsets"). Tested on ingress
and egress paths.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
2015-04-27 16:42:32 -07:00
Daniel Borkmann
4bd624467b tc: built-in eBPF exec proxy
This work follows upon commit 6256f8c9e4 ("tc, bpf: finalize eBPF
support for cls and act front-end") and takes up the idea proposed by
Hannes Frederic Sowa to spawn a shell (or any other command) that holds
generated eBPF map file descriptors.

File descriptors, based on their id, are being fetched from the same
unix domain socket as demonstrated in the bpf_agent, the shell spawned
via execvpe(2) and the map fds passed over the environment, and thus
are made available to applications in the fashion of std{in,out,err}
for read/write access, for example in case of iproute2's examples/bpf/:

  # env | grep BPF
  BPF_NUM_MAPS=3
  BPF_MAP1=6        <- BPF_MAP_ID_QUEUE (id 1)
  BPF_MAP0=5        <- BPF_MAP_ID_PROTO (id 0)
  BPF_MAP2=7        <- BPF_MAP_ID_DROPS (id 2)

  # ls -la /proc/self/fd
  [...]
  lrwx------. 1 root root 64 Apr 14 16:46 0 -> /dev/pts/4
  lrwx------. 1 root root 64 Apr 14 16:46 1 -> /dev/pts/4
  lrwx------. 1 root root 64 Apr 14 16:46 2 -> /dev/pts/4
  [...]
  lrwx------. 1 root root 64 Apr 14 16:46 5 -> anon_inode:bpf-map
  lrwx------. 1 root root 64 Apr 14 16:46 6 -> anon_inode:bpf-map
  lrwx------. 1 root root 64 Apr 14 16:46 7 -> anon_inode:bpf-map

The advantage (as opposed to the direct/native usage) is that now the
shell is map fd owner and applications can terminate and easily reattach
to descriptors w/o any kernel changes. Moreover, multiple applications
can easily read/write eBPF maps simultaneously.

To further allow users for experimenting with that, next step is to add
a small helper that can get along with simple data types, so that also
shell scripts can make use of bpf syscall, f.e to read/write into maps.

Generally, this allows for prepopulating maps, or any runtime altering
which could influence eBPF program behaviour (f.e. different run-time
classifications, skb modifications, ...), dumping of statistics, etc.

Reference: http://thread.gmane.org/gmane.linux.network/357471/focus=357860
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
2015-04-27 16:39:23 -07:00
Nicolas Dichtel
505f91869f mroute: remove invalid check against NLM_F_MULTI
This flag is only for the netlink protocol (multi-part messages), no reason
to reject messages without it.

Note that this flag was removed by the following kernel patches (v3.14)
65886f439ab0 ipmr: fix mfc notification flags
f518338b1603 ip6mr: fix mfc notification flags

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2015-04-27 11:41:46 -07:00
Nicolas Dichtel
b765eda924 libnamespaces: fix warning about syscall()
The warning was:
In file included from namespace.c:14:0:
../include/namespace.h: In function ‘setns’:
../include/namespace.h:37:2: warning: implicit declaration of function ‘syscall’ [-Wimplicit-function-declaration]

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2015-04-27 11:41:46 -07:00
Nicolas Dichtel
afa5158f02 tc: fix compilation warning on 32bits arch
The warning was:
m_simple.c: In function ‘parse_simple’:
m_simple.c:142:4: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘size_t’ [-Wformat]

Useful to be able to compile with -Werror.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2015-04-27 11:41:46 -07:00
Vadim Kochan
46679bbbe8 tc util: Fix possible buffer overflow when print class id
Use correct handle buffer length.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
2015-04-20 10:06:02 -07:00
Nicolas Dichtel
782cf01dc0 ipxfrm: wrong nl msg sent on deleteall cmd
XFRM netlink family is independent from the route netlink family. It's wrong
to call rtnl_wilddump_request(), because it will add a 'struct ifinfomsg' into
the header and the kernel will complain (at least for xfrm state):

netlink: 24 bytes leftover after parsing attributes in process `ip'.

Reported-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2015-04-20 10:04:20 -07:00
Nicolas Dichtel
d652ccbf81 netns: allow to dump and monitor nsid
Two commands are added:
 - ip netns list-id
 - ip monitor nsid

A cache is also added to remember the association between the iproute2 netns
name (from /var/run/netns/) and the nsid.
To avoid interfering with the rth socket, a new rtnl socket (rtnsh) is used to
get nsid (we may send rtnl request during listing on rth).

Example:
$ ip netns list-id
nsid 0 (iproute2 netns name: foo)
$ ip monitor nsid
Deleted nsid 0 (iproute2 netns name: foo)
nsid 16 (iproute2 netns name: bar)

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2015-04-20 10:02:38 -07:00
Pavel Šimerda
b1410e0ab1 lnstat: dump to stdout, not stderr
See also:

 * https://bugzilla.redhat.com/show_bug.cgi?id=736332

Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
2015-04-20 09:58:46 -07:00
Pavel Šimerda
e7e2913fe4 lnstat: run indefinitely by default
See also:

 * https://bugzilla.redhat.com/show_bug.cgi?id=977845

Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
2015-04-20 09:58:11 -07:00
Pavel Šimerda
a51842dcd7 cbq: fix find syntax in example
Without modification, using the example resulted in the following error:

[root@localhost sbin]# cbq restart
find: warning: you have specified the -maxdepth option after a
non-option argument (, but options are not positional (-maxdepth affects
tests specified before it as well as those specified after it).  Please
specify options before other arguments.

find: warning: you have specified the -maxdepth option after a
non-option argument (, but options are not positional (-maxdepth affects
tests specified before it as well as those specified after it).  Please
specify options before other arguments.

**CBQ: failed to compile CBQ configuration!

See also:

 * https://bugzilla.redhat.com/show_bug.cgi?id=539232

Reported-by: Mads Kiilerich <mads@kiilerich.com>
Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
2015-04-20 09:57:14 -07:00
Pavel Šimerda
11a3e5c4b3 ip-xfrm: support 'proto any' with 'sport' and 'dport'
When creating an IPsec SA that sets 'proto any' (IPPROTO_IP) and
specifies 'sport' and 'dport' at the same time in selector, the
following error is issued:

"sport" and "dport" are invalid with proto=ip

However using IPPROTO_IP with ports is completely legal and necessary
when one wants to share the SA on both TCP and UDP. One of the
applications requiring sharing SAs is 3GPP IMS AKA authentication.

See also:

 * https://bugzilla.redhat.com/show_bug.cgi?id=497355

Reported-by: Jiří Klimeš <jklimes@redhat.com>
Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
2015-04-20 09:56:44 -07:00
Pavel Šimerda
06ec9039c3 turn Makefile more distribution friendly
Changes:

 * Accept directory settings from environment.
 * Remove redundant ROOTDIR variable.
 * Set KERNEL_INCLUDE default to '/usr/include'.
 * Use CFLAGS from environemnt.

Note: In the long term it might be better to improve the configure
script to generate those parts of the Makefile in a manner similar
to autoconf. It might be even practical to autotoolize the package.

Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
2015-04-20 09:53:24 -07:00
Felix Fietkau
b8d5c9a71b tc: add support for connmark action
Add ability to add the netfilter connmark support.

Typical usage:
...lets tag outgoing icmp with mark 0x10..
iptables -tmangle -A PREROUTING -p icmp -j CONNMARK --set-mark 0x10
..add on ingress of $ETH an extractor for connmark...
tc filter add dev $ETH parent ffff: prio 4 protocol ip \
u32 match ip protocol 1 0xff \
flowid 1:1 \
action connmark continue
...if the connmark was 0x11, we police to a ridic rate of 10Kbps
tc filter add dev $ETH parent ffff: prio 5 protocol ip \
handle 0x11 fw flowid 1:1 \
action police rate 10kbit burst 10k

Other ways to use the connmark is to supply the zone, index and
branching choice. Refer to help.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2015-04-13 10:49:45 -07:00
Stephen Hemminger
94f665387e update kernel headers and add tc_connmark.h
Needed for later tc action patches
2015-04-13 10:49:33 -07:00
Andy Gospodarek
aa05b988f5 iproute2: unify naming for entries offloaded to hardware
The kernel now has the capability to offload FDB and FIB entries to hardware.
It is important to let users know if table entries are also offloaded to
hardware.  Currently offloaded FDB entries are indicated by the existence of
the flag 'external' on the entry as of the following commit:

commit 28467b7f3f
Author: Scott Feldman <sfeldma@gmail.com>
Date:   Thu Dec 4 09:57:15 2014 +0100

    bridge/fdb: add flag/indication for FDB entry synced from offload device

When the patch to add support for indicating that FIB entries were also
offloaded as posted to netdev by Scott Feldman it became clear that 'external'
would not be an ideal name for routes.  There could definitely be confusion
about what this might mean since many routes are to external networks -- a
collision/confusion that did not happen with FDB.

Scott Feldman asked me to check with others and build concensus around a name.
After speaking with several people about this I am proposing we refer to both
FDB and FIB entries that are currently backed by hardware (based on the work
done in rocker) with the flag 'offload' appended to the end ofthe entry.

Some people liked the string 'external,' others liked 'hardware,' but the point
is to communicate that these routes are available to something that will will
offload the forwarding normally done by the kernel.  Since the term 'offload'
is used so frequently it seems appropriate to use the same language in
ip/bridge output.

The term 'offload' also seems to resonate with many of the people who have
responded on Scott's original thread or to those who I reached out to directly
and did respond to my query, so it seems we have reached consensus that it
should be the term used going forward.

v2: rebased against net-next branch

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: John W. Linville <linville@tuxdriver.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Scott Feldman <sfeldma@gmail.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
2015-04-13 09:40:46 -07:00
Stephen Hemminger
93531fac41 Merge branch 'master' into net-next 2015-04-13 09:39:46 -07:00
Stephen Hemminger
672acc7238 fix whitespace 2015-04-13 09:39:34 -07:00
Stephen Hemminger
aed6d85d15 v4.0.0 2015-04-13 08:55:11 -07:00
Nicolas Dichtel
4c7d9a5888 ipnetns: add a runtime check for RTM_GETNSID support
The goal of this patch is to test during the runtime if the command RTM_GETNSID
is supported by the kernel.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2015-04-13 08:50:10 -07:00
Nicolas Dichtel
5a2ce86823 Revert "ip netns: Fix rtnl error while print netns list"
This reverts commit d116ff3414.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2015-04-13 08:50:10 -07:00