587 Commits

Author SHA1 Message Date
Vadim Kochan
4612d04d6b tc class: Show class names from file
tc can read class names from the file /etc/iproute2/cls_names
and use them when showing class info:

    # tc/tc -nm class show dev lo
	class htb 1:10 parent 1:1 leaf 10: prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
	class htb 1:1 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
	class htb web#1:20 parent 1:1 leaf 20: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
	class htb 1:2 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
	class htb 1:30 parent 1:1 leaf 30: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
	class htb voip#1:40 parent 1:2 leaf 40: prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
	class htb 1:50 parent 1:2 leaf 50: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
	class htb 1:60 parent 1:2 leaf 60: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b

or specify the file via a path:

    # tc/tc -nm -cf /tmp/cls_names class show dev lo

The class names file has a simple "maj:min  name" structure:

1:20    web
1:40    voip

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
2015-03-15 12:27:40 -07:00
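
The "maj:min  name" file format described in commit 4612d04d6b can be read with a few lines of code. The following Python sketch is illustrative only and is not the C code tc uses; the function names parse_cls_names and show_classid are hypothetical, and treating '#' as a comment marker is an assumption.

```python
def parse_cls_names(text):
    """Parse 'maj:min  name' lines into a {(major, minor): name} dict."""
    names = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line:
            continue
        classid, name = line.split(None, 1)
        major, minor = classid.split(":")
        # tc class ids are written in hexadecimal
        names[(int(major, 16), int(minor, 16))] = name.strip()
    return names


def show_classid(names, major, minor):
    """Return 'name#maj:min' when a name is known, else plain 'maj:min'."""
    plain = f"{major:x}:{minor:x}"
    name = names.get((major, minor))
    return f"{name}#{plain}" if name else plain


NAMES = parse_cls_names("1:20    web\n1:40    voip\n")
print(show_classid(NAMES, 0x1, 0x20))  # web#1:20
print(show_classid(NAMES, 0x1, 0x10))  # 1:10
```

Classes without an entry in the file fall back to the plain id, which matches the mixed output shown in the commit message.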
Daniel Borkmann
32caee9fc7 m_bpf: remove irrelevant help lines
Left-overs from copying this over from cls_bpf. ;) Let's remove them.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
2015-02-27 19:00:51 -08:00
Jiri Pirko
86ab59a666 tc: add support for BPF based actions
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
2015-02-05 10:38:13 -08:00
Jiri Pirko
1d129d191a tc: push bpf common code into separate file
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
2015-02-05 10:38:13 -08:00
Jamal Hadi Salim
564663b4ca actions: Get vlan action to work in pipeline
When specified in a graph such as:
action vlan ... action foobar
the vlan action chewed more than it could swallow.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2015-01-13 17:22:44 -08:00
Vadim Kochan
67e1d73be1 tc: Allow to easily change network namespace
Added a new '-netns' option to simplify executing the following command:

    ip netns exec NETNS tc OPTIONS COMMAND OBJECT

    to

    tc -n[etns] NETNS OPTIONS COMMAND OBJECT

e.g.:

    tc -net vnet0 qdisc

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
2014-12-27 10:22:34 -08:00
Vadim Kochan
d954b34a1f tc class: Show classes as ASCII graph
Added a new '-g[raph]' option which shows classes in a graph view.

For now, only generic stats info output is supported.

e.g.:

$ tc/tc -g class show dev tap0
+---(1:2) htb rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
|    +---(1:40) htb prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
|    +---(1:50) htb rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
|    |    +---(1:51) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
|    |
|    +---(1:60) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
|
+---(1:1) htb rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
     +---(1:10) htb prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
     +---(1:20) htb prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
     +---(1:30) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b

$ tc/tc -g -s class show dev tap0
+---(1:2) htb rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
|    |    Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
|    |    rate 0bit 0pps backlog 0b 0p requeues 0
|    |
|    +---(1:40) htb prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
|    |          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
|    |          rate 0bit 0pps backlog 0b 0p requeues 0
|    |
|    +---(1:50) htb rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
|    |    |     Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
|    |    |     rate 0bit 0pps backlog 0b 0p requeues 0
|    |    |
|    |    +---(1:51) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
|    |               Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
|    |               rate 0bit 0pps backlog 0b 0p requeues 0
|    |
|    +---(1:60) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
|               Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
|               rate 0bit 0pps backlog 0b 0p requeues 0
|
+---(1:1) htb rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
     |    Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
     |    rate 0bit 0pps backlog 0b 0p requeues 0
     |
     +---(1:10) htb prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
     |          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
     |          rate 0bit 0pps backlog 0b 0p requeues 0
     |
     +---(1:20) htb prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
     |          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
     |          rate 0bit 0pps backlog 0b 0p requeues 0
     |
     +---(1:30) htb prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
                Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
                rate 0bit 0pps backlog 0b 0p requeues 0

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
2014-12-27 10:16:51 -08:00
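
The graph view introduced in commit d954b34a1f is essentially a depth-first walk over the class hierarchy. The sketch below shows one way to draw such a tree in Python; it is not tc's implementation, the render function and the TREE dictionary are hypothetical, and it omits the per-class attributes and the blank spacer lines of the real output.

```python
def render(tree, parent, indent=""):
    """Return lines drawing the children of `parent` as an ASCII tree."""
    lines = []
    children = tree.get(parent, [])
    for i, child in enumerate(children):
        last = i == len(children) - 1
        lines.append(f"{indent}+---({child})")
        # Keep a vertical bar under this node while siblings remain.
        lines.extend(render(tree, child, indent + ("     " if last else "|    ")))
    return lines


# Hypothetical hierarchy: parent id -> list of child class ids.
TREE = {
    "root": ["1:2", "1:1"],
    "1:2": ["1:40", "1:50"],
    "1:50": ["1:51"],
    "1:1": ["1:10"],
}
print("\n".join(render(TREE, "root")))
```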
Stephen Hemminger
5c2c10b17e Merge branch 'net-next' 2014-12-24 12:23:00 -08:00
Stephen Hemminger
3d0b7439df whitespace cleanup
Remove all trailing whitespace and space before tabs.
2014-12-20 15:47:17 -08:00
Stephen Hemminger
c9b8aef6ae Merge branch 'master' into net-next 2014-12-09 16:33:59 -08:00
Stephen Hemminger
b2e116d6c3 tc: minor spelling fixes 2014-12-03 19:28:34 -08:00
Jiri Pirko
8b1c0216d8 tc: add support for vlan tc action
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Cong Wang <cwang@twopensource.com>
2014-12-03 09:29:21 -08:00
Stephen Hemminger
edd3979272 emp: fix warning on deprecated bison directive
emp_ematch.y:12.1-13: warning: deprecated directive, use ‘%name-prefix’ [-Wdeprecated]
 %name-prefix="ematch_"
 ^^^^^^^^^^^^^
2014-10-09 08:31:10 -07:00
Jamal Hadi Salim
863ecb04b4 discourage use of direct policer interface
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2014-10-09 08:26:57 -07:00
Jamal Hadi Salim
287bf3a990 route classifier support for multiple actions
route can now use the action syntax

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2014-10-09 08:26:57 -07:00
Jamal Hadi Salim
08139c2ffb tcindex classifier support for multiple actions
tcindex can now use the action syntax

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2014-10-09 08:26:56 -07:00
Andy Furniss
a07c6d6135 add missing underscore to man page and example nf_mark ematch
The man page and the "fail" example are missing an underscore in the
nf_mark ematch.

eg.

tc filter add dev eth0 parent ffff:  basic match 'meta(nfmark gt 24)'
classid 2:4

meta: unknown meta id

... >>meta(nfmark gt 24)<< ...
... meta(>>nfmark<< gt 24)...
Usage: meta(OBJECT { eq | lt | gt } OBJECT)
where: OBJECT  := { META_ID | VALUE }
        META_ID := id [ shift SHIFT ] [ mask MASK ]

Example: meta(nfmark gt 24)
          meta(indev shift 1 eq "ppp")
          meta(tcindex mask 0xf0 eq 0xf0)

For a list of meta identifiers, use meta(list).
Illegal "ematch"

meta(list) does correctly show nf_mark and the above test works with
nf_mark.

Signed-off-by: Andy Furniss <adf.lists@gmail.com>
2014-10-09 08:24:00 -07:00
Jamal Hadi Salim
10f5a375ea rsvp classifier support for multiple actions
Example setup:

sudo tc qdisc del dev eth0 root handle 1:0 prio
sudo tc qdisc add dev eth0 root handle 1:0 prio

sudo tc filter add dev eth0 pref 10 proto ip parent 1:0 \
rsvp session 10.0.0.1 ipproto icmp \
classid 1:1  \
action police rate 1kbit burst 90k pipe \
action ok

tc -s filter show dev eth0 parent 1:0

filter protocol ip pref 10 rsvp
filter protocol ip pref 10 rsvp fh 0x0001100a flowid 1:1 session
10.0.0.1 ipproto icmp
        action order 1:  police 0x5 rate 1Kbit burst 23440b mtu 2Kb
action pipe overhead 0b
ref 1 bind 1
        Action statistics:
        Sent 98000 bytes 1000 pkt (dropped 0, overlimits 761 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: gact action pass
         random type none pass val 0
         index 2 ref 1 bind 1 installed 60 sec used 3 sec
        Action statistics:
        Sent 74578 bytes 761 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Tested-by: John Fastabend <john.r.fastabend@intel.com>
2014-09-29 08:47:33 -07:00
Jamal Hadi Salim
954de6c72b actions: BugFix action stats to display with -s
Was broken by commit 288abf513f
Let's not be too clever and have a separate call to print flushed
actions info.

Broken looks like:
root@moja-1:~# tc actions add  action drop index 4
root@moja-1:~# tc -s actions ls action gact

    action order 0: gact action drop
     random type none pass val 0
     index 4 ref 1 bind 0 installed 9 sec used 4 sec

The fixed version looks like:
    action order 0: gact action drop
     random type none pass val 0
     index 4 ref 1 bind 0 installed 9 sec used 4 sec
         Sent 108948 bytes 1297 pkts (dropped 1297, overlimits 0)

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2014-09-29 08:47:19 -07:00
Jay Vosburgh
3757185b29 tc/netem: loss gemodel options fixes
First, the default value for 1-k is documented as being 0, but is
currently being set to 1 (100%).  This causes all packets to be dropped
in the good state if 1-k is not explicitly specified.  Fix this by setting
the default to 0.

	Second, the 1-h option is parsed correctly, however, the kernel is
expecting "h", not 1-h.  Fix this by inverting the "1-h" percentage before
sending to and after receiving from the kernel.  This does change the
behavior, but makes it consistent with the netem documentation and the
literature on the Gilbert-Elliott model, which refer to "1-h" and "1-k,"
not "h" or "k" directly.

	Last, fix a minor formatting issue for the options reporting.

Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
2014-08-04 10:15:10 -07:00
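
The "1-h" inversion described in commit 3757185b29 amounts to a single subtraction in each direction. The Python sketch below works on plain percentages for readability; how tc actually encodes probabilities for the kernel is not shown here, and the function names are hypothetical.

```python
DEFAULT_ONE_MINUS_K = 0.0  # per the commit: the documented default for 1-k is 0


def h_to_kernel(one_minus_h_percent):
    """Convert the user-facing '1-h' percentage to the 'h' the kernel expects."""
    if not 0.0 <= one_minus_h_percent <= 100.0:
        raise ValueError("percentage out of range")
    return 100.0 - one_minus_h_percent


def h_from_kernel(h_percent):
    """Convert the kernel's 'h' back to the '1-h' value shown to the user."""
    return 100.0 - h_percent


print(h_to_kernel(100.0))                # 0.0
print(h_from_kernel(h_to_kernel(30.0)))  # 30.0
```

Applying the same inversion when sending and when receiving keeps what the user types and what tc prints in agreement.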
Yang Yingliang
aeb199d5ce fq: allow options of fair queue set to ~0U
Some options of fair queue cannot be set to ~0U. As a result, maxrate
cannot be reset to unlimited, because that requires the value ~0U. Allow
these options to be set to ~0U.

Tested by the following command:
 # tc qdisc add dev eth4 root handle 1: fq limit 2000 flow_limit 200 maxrate 100mbit quantum 2000 initial_quantum 1600
 # tc -s -d qdisc show
qdisc fq 1: dev eth4 root refcnt 2 limit 2000p flow_limit 200p buckets 1024 quantum 2000 initial_quantum 1600 maxrate 100Mbit
 Sent 1492 bytes 10 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  1 flows (0 inactive, 0 throttled)
  0 gc, 0 highprio, 0 throttled

 # tc qdisc change dev eth4 root handle 1: fq limit 4294967295 flow_limit 4294967295 maxrate 34359738360 quantum 4294967295 initial_quantum 4294967295
 # tc -s -d qdisc show
qdisc fq 1: dev eth4 root refcnt 2 limit 4294967295p flow_limit 4294967295p buckets 1024 quantum 4294967295 initial_quantum 4294967295
 Sent 38372 bytes 216 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  2 flows (1 inactive, 0 throttled)
  0 gc, 2 highprio, 7 throttled

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
2014-06-09 12:42:36 -07:00
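
The maxrate value 34359738360 used in the test for commit aeb199d5ce is exactly ~0U bytes per second expressed in bits. The Python sketch below works through that arithmetic. It assumes maxrate is carried as a 32-bit bytes-per-second value in which ~0U means "unlimited", which is consistent with maxrate disappearing from the second listing; the function names are hypothetical.

```python
UINT32_MAX = 0xFFFFFFFF  # the C expression ~0U for a 32-bit unsigned int


def maxrate_bits_to_attr(bits_per_second):
    """Convert a bit rate to a bytes-per-second value that fits in 32 bits."""
    bytes_per_second = bits_per_second // 8
    if bytes_per_second > UINT32_MAX:
        raise OverflowError("rate does not fit in 32 bits")
    return bytes_per_second


def describe_maxrate(attr):
    """Assumption: ~0U is the sentinel for 'no rate limit'."""
    return "unlimited" if attr == UINT32_MAX else f"{attr * 8}bit"


print(maxrate_bits_to_attr(34359738360) == UINT32_MAX)  # True
print(describe_maxrate(UINT32_MAX))                     # unlimited
print(describe_maxrate(12_500_000))                     # 100000000bit
```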
Sergey V. Lobanov
3ff10e82c1 Fixed 'tc qdisc show' for tbf when latency<0
When limit < burst, the computed latency becomes negative, for example:
 # tc qdisc add dev eth0 root handle 1: tbf limit 100K burst 256K rate 256kbit
 # tc qdisc show
 qdisc tbf 1: dev eth0 root refcnt 2 rate 256Kbit burst 256Kb lat 4290.0s

If latency<0 there is no reason to show it. Limit will be printed instead of
latency when latency<0:
 # tc qdisc show
 qdisc tbf 1: dev eth0 root refcnt 2 rate 256Kbit burst 256Kb limit 100Kb

Signed-off-by: Sergey V. Lobanov <sergey@lobanov.in>
2014-05-28 17:08:16 -07:00
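
The odd "lat 4290.0s" in the example for commit 3ff10e82c1 can be reproduced by hand. The Python sketch below assumes that latency is derived as (limit - burst) / rate and that the negative result ends up in an unsigned 32-bit microsecond value; both are assumptions made for illustration, not a copy of the tc source.

```python
def tbf_latency_us(limit_bytes, burst_bytes, rate_bytes_per_s):
    """Latency in microseconds implied by limit, burst and rate (may be negative)."""
    return round(1_000_000 * (limit_bytes - burst_bytes) / rate_bytes_per_s)


def as_u32(value):
    """Reinterpret a signed integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


# limit 100K, burst 256K, rate 256kbit (tc sizes use 1024, rates use 1000)
lat = tbf_latency_us(100 * 1024, 256 * 1024, 256_000 // 8)
print(lat)                          # -4992000
print(f"{as_u32(lat) / 1e6:.1f}s")  # 4290.0s
```

A latency of about -5 seconds wraps around to roughly 4290 seconds, which is why printing the limit instead is the more useful choice.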
Jamal Hadi Salim
288abf513f actions: correctly report the number of actions flushed
This also fixes a long standing bug of not sanely reporting the
action chain ordering

Sample scenario test

on window 1(event window):
run "tc monitor" and observe events

on window 2:
sudo tc actions add action drop index 10
sudo tc actions add action ok index 12
sudo tc actions ls action gact
sudo tc actions flush action gact

See the event window reporting two entries
(doing another listing should show empty generic actions)

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2014-05-28 16:54:31 -07:00
Jamal Hadi Salim
9282d08d93 actions: keyword flowid or classid terminates action pipeline
scenario testcase:

TC="sudo ./tc/tc"
DEV="dev eth0"
$TC qdisc del $DEV ingress
$TC qdisc add $DEV ingress
$TC filter add $DEV parent ffff: protocol ip u32 match ip src 10.0.0.0/24 action police rate 6Mbit burst 6Mbit drop flowid :1
$TC filter add $DEV parent ffff: protocol ip u32 match ip dst 10.0.0.0/24 action police rate 1Gbit burst 1Gbit pass flowid :1
$TC -s filter ls $DEV parent ffff: protocol ip
$TC qdisc del $DEV ingress
$TC qdisc add $DEV ingress
$TC filter add $DEV parent ffff: protocol ip u32 match ip src 10.0.0.0/24 flowid 1:1 action police rate 6Mbit burst 6Mbit drop
$TC filter add $DEV parent ffff: protocol ip u32 match ip dst 10.0.0.0/24 flowid 1:2 action police rate 1Gbit burst 1Gbit pass

$TC -s filter ls $DEV parent ffff: protocol ip
$TC qdisc del $DEV ingress
$TC qdisc add $DEV ingress
$TC filter add $DEV parent ffff: protocol ip pref 10 \
u32 match ip protocol 1 0xff \
flowid 1:10 \
action skbedit mark 11 \
action police rate 10kbit burst 10k pipe index 1 \
action skbedit mark 12 \
action police rate 20kbit burst 20k pipe index 2 \
action mirred egress mirror dev dummy0

$TC -s filter ls $DEV parent ffff: protocol ip
$TC qdisc del $DEV ingress
$TC qdisc add $DEV ingress
$TC filter add $DEV parent ffff: protocol ip pref 10 \
u32 match ip protocol 1 0xff \
action skbedit mark 11 \
action police rate 10kbit burst 10k pipe index 1 \
action skbedit mark 12 \
action police rate 20kbit burst 20k pipe index 2 \
action mirred egress mirror dev dummy0 \
flowid 1:10

$TC -s filter ls $DEV parent ffff: protocol ip

Reported-by: Seann Herdejurgen <seann@herdejurgen.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2014-05-28 16:54:28 -07:00
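
The behaviour named in the title of commit 9282d08d93 can be pictured as a small tokenizer rule: collect actions until flowid or classid appears, then hand the remaining tokens back to the filter parser. The Python sketch below illustrates that rule only; split_actions is a hypothetical helper and does not mirror the structure of tc's parser.

```python
TERMINATORS = {"flowid", "classid"}


def split_actions(tokens):
    """Split filter tokens into (list of actions, remaining tokens)."""
    actions, current = [], None
    for pos, tok in enumerate(tokens):
        if tok in TERMINATORS:
            # The pipeline ends here; the caller parses flowid/classid.
            if current:
                actions.append(current)
            return actions, tokens[pos:]
        if tok == "action":
            if current:
                actions.append(current)
            current = []
        elif current is not None:
            current.append(tok)
        else:
            raise ValueError(f"unexpected token {tok!r} before 'action'")
    if current:
        actions.append(current)
    return actions, []


CMD = "action skbedit mark 11 action police rate 10kbit burst 10k pipe index 1 flowid 1:10"
acts, rest = split_actions(CMD.split())
print(len(acts), rest)  # 2 ['flowid', '1:10']
```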
Jamal Hadi Salim
cacba03b10 Remove unnecessary debug statement
Reported-by: Seann Herdejurgen <seann@herdejurgen.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2014-05-28 16:54:26 -07:00
Natanael Copa
dd9cc0ee81 iproute2: various header include fixes for compiling with musl libc
We need limits.h for LONG_MIN and LONG_MAX, sys/param.h for MIN and
sys/select for struct timeval.

This fixes the following compile errors with musl libc:

f_bpf.c: In function 'bpf_parse_opt':
f_bpf.c:181:12: error: 'LONG_MIN' undeclared (first use in this function)
   if (h == LONG_MIN || h == LONG_MAX) {
            ^
...

tc_util.o: In function `print_tcstats2_attr':
tc_util.c:(.text+0x13fe): undefined reference to `MIN'
tc_util.c:(.text+0x1465): undefined reference to `MIN'
tc_util.c:(.text+0x14ce): undefined reference to `MIN'
tc_util.c:(.text+0x154c): undefined reference to `MIN'
tc_util.c:(.text+0x160a): undefined reference to `MIN'
tc_util.o:tc_util.c:(.text+0x174e): more undefined references to `MIN' follow
...

tc_stab.o: In function `print_size_table':
tc_stab.c:(.text+0x40f): undefined reference to `MIN'
...

fdb.c:247:30: error: 'ULONG_MAX' undeclared (first use in this function)
        (vni >> 24) || vni == ULONG_MAX)
                              ^

lnstat.h:28:17: error: field 'last_read' has incomplete type
  struct timeval last_read;  /* last time of read */
                 ^

Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
2014-05-28 16:51:39 -07:00
Andreas Greve
6e2e5ec28b fix print_ipt: segfault if more than one filter with action -j MARK.
BUG: tc filter show ... produce a segmentation fault if more than one
filter rule with action -j MARK exists.

Reason: In print_ipt(...) xtables is initialized with a
pointer to the static struct tcipt_globals at xtables_init_all().
Later on the fields .opts and .options_offset of tcipt_globals are
modified. The call of xtables_free_opts(1) at the end of print_ipt(...)
does not restore the original values of tcipt_globals for the
modified fields. It only frees some allocated memory and sets
.opts to NULL. This leads to a segmentation fault when print_ipt()
is called for the next filter rule with action -j MARK.

Fix: Clone tcipt_globals on the stack as tmp_tcipt_globals and
use the clone instead of tcipt_globals, so that tcipt_globals is not
modified.

Signed-off-by: Andreas Greve <andreas.greve@a-greve.de>
2014-05-13 13:10:31 -07:00
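
The fix in commit 6e2e5ec28b follows a general pattern: work on a private copy of shared state so that a later call starts from pristine values. The Python sketch below shows the pattern with a dictionary standing in for the static struct; TEMPLATE and print_rule are hypothetical names, and the real fix copies a C struct rather than a dictionary.

```python
import copy

# Stands in for the static struct tcipt_globals described in the commit.
TEMPLATE = {"opts": ["orig"], "options_offset": 0}


def print_rule(rule_name):
    """Process one rule without touching the shared TEMPLATE."""
    work = copy.deepcopy(TEMPLATE)   # private clone, like tmp_tcipt_globals
    work["opts"].append(rule_name)   # modifications stay local
    work["options_offset"] += 256
    summary = f"{rule_name}: {len(work['opts'])} opts"
    work["opts"] = None              # mimic cleanup that leaves .opts NULL
    return summary


print(print_rule("MARK-1"))  # MARK-1: 2 opts
print(print_rule("MARK-2"))  # MARK-2: 2 opts
```

Because the second call clones an untouched template, it sees the same starting state as the first.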
Terry Lam
ac74bd2a71 support for Heavy Hitter Filter (HHF) qdisc
$tc qdisc add dev eth0 hhf help
Usage: ... hhf [ limit PACKETS ] [ quantum BYTES]
               [ hh_limit NUMBER ]
               [ reset_timeout TIME ]
               [ admit_bytes BYTES ]
               [ evict_timeout TIME ]
               [ non_hh_weight NUMBER ]

$tc -s -d qdisc show dev eth0
qdisc hhf 8005: root refcnt 32 limit 1000p quantum 1514 hh_limit 2048
reset_timeout 40.0ms admit_bytes 131072 evict_timeout 1.0s non_hh_weight 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
    drop_overlimit 0 hh_overlimit 0 tot_hh 0 cur_hh 0

HHF qdisc parameters:
- limit: max number of packets in qdisc (default 1000)
- quantum: max deficit per RR round (default 1 MTU)
- hh_limit: max number of HHs to keep states (default 2048)
- reset_timeout: time to reset HHF counters (default 40ms)
- admit_bytes: counter thresh to classify as HH (default 128KB)
- evict_timeout: threshold to evict idle HHs (default 1s)
- non_hh_weight:  DRR weight for mice (default 2)

Signed-off-by: Terry Lam <vtlam@google.com>
2014-05-09 12:10:47 -07:00
Jay Vosburgh
8f9672af7a tc/netem: fix loss state display and p14 parsing
The display of the entire netem loss state is shown as if it
were gemodel state, as the loss state information is assigned to the
wrong pointer.  Correct this by assigning the loss state to the correct
pointer.

	Additionally, attempting to set netem loss state will result in
random values in the p14 state probability because the option value
passed to the kernel by tc netem is not parsed or initialized.  Fix this
by supplying a default value of 0 for p14 and parsing the p14 value if
one is supplied.

Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
2014-05-09 12:06:58 -07:00
Hiroaki SHIMODA
4d4da09e00 htb: Move direct_qlen code part to htb_parse_opt().
The direct_qlen option applies to qdisc operations, but it was
implemented in htb_parse_class_opt(), which is only called for
class operations.

Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
2014-03-21 14:20:06 -07:00
WANG Cong
1c9af05071 pedit: do not print debugging information by default
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
2014-02-10 14:43:52 -08:00
Yang Yingliang
dad2f72bef netem: add 64bit rates support
netem supports 64bit rates starting from linux-3.13.
Add 64bit rate support to the tc tools.

tc qdisc show dev eth0
qdisc netem 1: dev eth4 root refcnt 2 limit 1000 rate 35Gbit

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Eric Dumazet <edumazet@google.com>
2014-01-20 12:32:15 -08:00
Yang Yingliang
a01de0a336 tbf: support sending burst/mtu to kernel directly
To avoid loss of precision when converting burst to buffer in userspace,
send burst/mtu to the kernel directly.

Kernel commit 2e04ad424b ("sch_tbf: add TBF_BURST/TBF_PBURST attribute")
lets the kernel handle burst/mtu.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
2014-01-20 12:32:14 -08:00
Vijay Subramanian
80dd880dd0 PIE: Proportional Integral controller Enhanced
Proportional Integral controller Enhanced (PIE) is a scheduler to address the
bufferbloat problem.

"We present here a lightweight design, PIE (Proportional Integral controller
Enhanced), that can effectively control the average queueing latency to a target
value. Simulation results, theoretical analysis and Linux testbed results have
shown that PIE can ensure low latency and achieve high link utilization under
various congestion situations. The design does not require per-packet
timestamp, so it incurs very small overhead and is simple enough to implement
in both hardware and software."

For more information, please see technical paper about PIE in the IEEE
Conference on High Performance Switching and Routing 2013. A copy of the paper
can be found at ftp://ftpeng.cisco.com/pie/.

Please also refer to the IETF draft submission at
http://tools.ietf.org/html/draft-pan-tsvwg-pie-00

All relevant code, documents and test scripts and results can be found at
ftp://ftpeng.cisco.com/pie/.

For problems with the iproute2/tc or Linux kernel code, please contact Vijay
Subramanian (vijaynsu@cisco.com or subramanian.vijay@gmail.com) or Mythili Prabhu
(mysuryan@cisco.com).

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: Mythili Prabhu <mysuryan@cisco.com>
CC: Dave Taht <dave.taht@bufferbloat.net>
2014-01-09 22:50:47 -08:00
Stephen Hemminger
ef056b2190 Merge branch 'master' into net-next-for-3.13 2014-01-09 22:44:17 -08:00
Jamal Hadi Salim
f24a7e7205 dont skip action order
attached.

cheers,
jamal
commit 58d78f9f6447df324cdeb99262442c5e3f1f924b
Author: Jamal Hadi Salim <jhs@mojatatu.com>
Date:   Sun Dec 22 10:34:18 2013 -0500

    dont skip displaying of action chains or lists by TCA_ACT_MAX_PRIO

    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-12-28 10:57:34 -08:00
Jamal Hadi Salim
b159a7f1ae allow batch gets of actions
Attached.

cheers,
jamal
commit c5f30cabef14c951596210b96bc9b423b0d39592
Author: Jamal Hadi Salim <hadi@mojatatu.com>
Date:   Sun Dec 22 10:24:17 2013 -0500

    Allow batching of action gets
    Example:
    ----
    tc actions get \
    action gact index 100 \
    action gact index 4
    ----

    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-12-28 10:57:34 -08:00
Jamal Hadi Salim
352f6f97be simple print newline
attached.

cheers,
jamal
commit d7869e6167c3553e93e254940b0647032b40fed8
Author: Jamal Hadi Salim <jhs@mojatatu.com>
Date:   Sun Dec 22 07:46:28 2013 -0500

    print new line at the end for aesthetics

    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-12-28 10:57:34 -08:00
Jamal Hadi Salim
4bfb21ca20 policer - retire old syntax
attached.

cheers,
jamal
commit b82057d9ec851a8aba8a295b959190ef5098f330
Author: Jamal Hadi Salim <jhs@mojatatu.com>
Date:   Sat Dec 21 17:00:11 2013 -0500

    After a decade of trying to deprecate the old policer syntax,
    I believe it is time to kill it. The kernel build option for the old
    policer has been gone for at least 5 years now (although backward
    compatibility is still there). Being backward compatible meant
    hijacking the keyword "action" and was obstructing policies like:

    tc filter add dev eth0 parent ffff: protocol ip pref 10 \
    u32 match ip protocol 1 0xff flowid 1:10 \
    action skbedit mark 1 \
    action police rate 10kbit burst 10k pipe \
    action skbedit mark 2 \
    action police rate 20kbit burst 20k pipe \
    action mirred egress mirror dev dummy0

    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-12-28 10:57:34 -08:00
Jamal Hadi Salim
02b1d345b7 skbedit print missing metadata
skbedit should print the index and other generic metadata info

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-12-28 10:57:34 -08:00
Jamal Hadi Salim
64b7db4db7 skbedit to default to pipe
Allow skbedit to be used as is in an action chain by default
without need to specify pipe

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-12-28 10:57:34 -08:00
Stephen Hemminger
4d98ab00de Fix FSF address in file headers 2013-12-06 15:05:07 -08:00
Eric Dumazet
8cecdc2837 tc: more user friendly rates
Display more user friendly rates.

10Mbit is more readable than 10000Kbit

Before :
class htb 1:2 root prio 0 rate 10000Kbit ceil 10000Kbit ...

After:
class htb 1:2 root prio 0 rate 10Mbit ceil 10Mbit ...

Signed-off-by: Eric Dumazet <edumazet@google.com>
2013-12-02 23:48:11 -08:00
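
The idea in commit 8cecdc2837 is to pick the largest unit that still represents the rate exactly. The Python sketch below is a simplified illustration, not tc's formatting routine: format_rate is a hypothetical name, it only scales when the value divides evenly by 1000, and tc's own rules for inexact values may differ.

```python
def format_rate(bits_per_second):
    """Format a bit rate using the largest unit that divides it evenly."""
    units = ["", "K", "M", "G", "T"]
    rate, i = bits_per_second, 0
    while i < len(units) - 1 and rate >= 1000 and rate % 1000 == 0:
        rate //= 1000
        i += 1
    return f"{rate}{units[i]}bit"


print(format_rate(10_000_000))  # 10Mbit
print(format_rate(1_500))       # 1500bit
```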
Yang Yingliang
ddc6243e9a tbf: add 64bit rates support
tbf supports 64bit rates starting from linux-3.13.
Add 64bit rate support to the tc tools.

tc qdisc show dev eth0
qdisc tbf 1: root refcnt 2 rate 40000Mbit burst 230000b peakrate 50000Mbit minburst 87500b lat 50.0ms

This is a followup to ("htb: support 64bit rates").

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: Eric Dumazet <edumazet@google.com>
2013-12-02 23:46:56 -08:00
Eric Dumazet
8334bb325d htb: support 64bit rates
Starting from linux-3.13, we can break the 32bit limitation of
rates on HTB qdisc/classes.

Prior limit was 34.359.738.360 bits per second.

lpq83:~# tc -s qdisc show dev lo ; tc -s class show dev lo
qdisc htb 1: root refcnt 2 r2q 2000 default 1 direct_packets_stat 0 direct_qlen 6000
 Sent 6591936144493 bytes 149549182 pkt (dropped 0, overlimits 213757419 requeues 0)
 rate 39464Mbit 114938pps backlog 0b 15p requeues 0
class htb 1:1 root prio 0 rate 50000Mbit ceil 50000Mbit burst 200000b cburst 0b
 Sent 6591942184547 bytes 149549310 pkt (dropped 0, overlimits 0 requeues 0)
 rate 39464Mbit 114938pps backlog 0b 15p requeues 0
 lended: 149549310 borrowed: 0 giants: 0
 tokens: 336 ctokens: -164

Signed-off-by: Eric Dumazet <edumazet@google.com>
2013-11-22 17:36:18 -08:00
Daniel Borkmann
d05df6861f tc: add cls_bpf frontend
This is the iproute2 part of the kernel patch "net: sched:
add BPF-based traffic classifier".

[Will re-submit later again for iproute2 when window for
 -next submissions opens.]

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
2013-10-30 16:45:05 -07:00
Nigel Kukard
9bea14ff6b Fix tc stats when using -batch mode
There are two global variables in tc/tc_class.c:
__u32 filter_qdisc;
__u32 filter_classid;

These are not re-initialized for each line received in -batch mode:
class show dev eth0 parent 1: classid 1:1
class show dev eth0 parent 1: classid 1:1
Error: duplicate "classid": "1:1" is the second value.

This patch fixes the issue by initializing the two globals when we
enter print_class().

Signed-off-by: Nigel Kukard <nkukard@lbsd.net>
2013-10-30 16:37:07 -07:00
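
The bug fixed in commit 9bea14ff6b is a classic stale-global problem: state set while handling one batch line leaks into the next. The Python sketch below reproduces the effect with a module-level variable; class_show and its reset_state switch are hypothetical and only model the duplicate check, not the real command.

```python
filter_classid = 0  # module-level state, like the globals in tc/tc_class.c


def class_show(args, reset_state):
    """Parse one batch line; reset_state=True models the fixed behaviour."""
    global filter_classid
    if reset_state:
        filter_classid = 0
    it = iter(args)
    for tok in it:
        if tok == "classid":
            value = next(it)
            if filter_classid:
                raise ValueError(f'duplicate "classid": "{value}" is the second value.')
            major, minor = value.split(":")
            filter_classid = (int(major, 16) << 16) | int(minor or "0", 16)
    return filter_classid


LINE = "class show dev eth0 parent 1: classid 1:1".split()
print(hex(class_show(LINE, reset_state=True)))  # 0x10001
print(hex(class_show(LINE, reset_state=True)))  # 0x10001
```

Calling class_show(LINE, reset_state=False) after these two lines raises the duplicate error, because the value from the previous line is still in place.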
Stephen Hemminger
734c0ca2ca htb: remove old unused duplicate qdisc name
Alexey had htb2 as name for version in ancient code.
2013-10-27 12:28:38 -07:00
Stephen Hemminger
0a502b21e3 Fix handling of qdisc without options
Some qdiscs, like htb, want parse_qopt to be called even if no options
are present. Fixes regression caused by:

e9e78b0db0 is the first bad commit
commit e9e78b0db0
Author: Stephen Hemminger <stephen@networkplumber.org>
Date:   Mon Aug 26 08:41:19 2013 -0700

    tc: allow qdisc without options
2013-10-27 12:26:47 -07:00
Jamal Hadi Salim
e26520e5c1 action: typo nat fix
If you taketh you giveth.
I went the LinuxWay and copied this for m_simple.c and noticed
this one typo (I wonder where it came from?;->).

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-09-30 21:31:40 -07:00
Jamal Hadi Salim
087f46ee4e tc: introduce simple action
The simple action has been in the kernel for years as an
example. This complements it with user space control.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-09-30 21:29:34 -07:00
Stephen Hemminger
af60cf40c9 Merge branch 'net-next-3.11' 2013-09-23 13:16:48 -07:00
Eric Dumazet
b43f331828 htb: add support for direct_qlen attribute
TCA_HTB_DIRECT_QLEN attribute is supported since linux-3.10

HTB classes use an internal pfifo queue whose limit was not reported
by tc; its value was inherited from the device tx_queue_len at setup time.

With this patch, tc displays the value and can change it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
2013-09-20 09:48:13 -07:00
Eric Dumazet
8f7574edd8 tc: support TCA_STATS_RATE_EST64
Since linux-3.11, the rate estimator can provide TCA_STATS_RATE_EST64
when the rate (bytes per second) is above 2^32 (~34 Gbits)

Change tc to use this attribute for high rates.

Signed-off-by: Eric Dumazet <edumazet@google.com>
2013-09-20 09:46:33 -07:00
Eric Dumazet
bc113e46a3 pkt_sched: fq: Fair Queue packet scheduler
Support for FQ packet scheduler

$ tc qd add dev eth0 root fq help
Usage: ... fq [ limit PACKETS ] [ flow_limit PACKETS ]
              [ quantum BYTES ] [ initial_quantum BYTES ]
              [ maxrate RATE  ] [ buckets NUMBER ]
              [ [no]pacing ]

$ tc -s -d qd
qdisc fq 8002: dev eth0 root refcnt 32 limit 10000p flow_limit 100p
buckets 256 quantum 3028 initial_quantum 15140
 Sent 216532416 bytes 148395 pkt (dropped 0, overlimits 0 requeues 14)
 backlog 0b 0p requeues 14
  511 flows (511 inactive, 0 throttled)
  110 gc, 0 highprio, 0 retrans, 1143 throttled, 0 flows_plimit

limit	: max number of packets on whole Qdisc (default 10000)

flow_limit : max number of packets per flow (default 100)

quantum : the max deficit per RR round (default is 2 MTU)

initial_quantum : initial credit for new flows (default is 10 MTU)

maxrate : max per flow rate (default : unlimited)

buckets : number of RB trees (default : 1024) in hash table.
               (consumes 8 bytes per bucket)

[no]pacing : disable/enable pacing (default is enable)

Usage :

tc qdisc add dev $ETH root fq

tc qdisc del dev $ETH root 2>/dev/null
tc qdisc add dev $ETH root handle 1: mq
for i in `seq 1 4`
do
  tc qdisc add dev $ETH parent 1:$i est 1sec 4sec fq
done

Signed-off-by: Eric Dumazet <edumazet@google.com>
2013-09-20 09:43:40 -07:00
Jesper Dangaard Brouer
3e92ff522a linklayer interface between kernel and tc/userspace
This iproute2 tc patch is connected to the kernel
 - commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling)

The rate table calculated by tc has been replaced in the kernel
and is no longer used for lookups.

This happened in kernel release v3.8 caused by kernel
 - commit 56b765b79 ("htb: improved accuracy at high rates").
This change unfortunately caused breakage of tc overhead and
linklayer parameters.

 Kernel overhead handling got fixed in kernel v3.10 by
 - commit 01cb71d2d47 (net_sched: restore "overhead xxx" handling)

 Kernel linklayer handling got fixed in kernel v3.11 by
 - commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling)

The linklayer fix introduced a struct change that allows the linklayer
attribute to be transferred between tc and the kernel. This patch makes
use of this linklayer attribute.

The linklayer setting is transferred to the kernel, and the linklayer
setting received from the kernel is printed with a "linklayer"
prefix when listing the current configuration.  The default
TC_LINKLAYER_ETHERNET is only printed in detailed output mode.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2013-09-03 08:21:24 -07:00
Stephen Hemminger
e9e78b0db0 tc: allow qdisc without options
Pfifo_fast needs no options. So don't force it to have parsing code.
2013-08-26 08:41:19 -07:00
Stephen Hemminger
b8a45897b9 More minor spelling fixes 2013-08-04 15:10:05 -07:00
Stephen Hemminger
a3aa47a559 Make tc and ip batch mode consistent
Change the code for tc and ip so that batch mode is handled
the same.
2013-07-16 10:04:05 -07:00
Eric Dumazet
a303853e84 get_rate: detect 32bit overflows
On Mon, 2013-06-03 at 16:36 +0100, Ben Hutchings wrote:

> Oops, I read this as being strtol() currently, not strtod().  Currently
> '1.5gbit' will work, but this change will break that.  So I think you
> need to keep bps as a double.

Arg

> Then here I think the check should be *rate != floor(bps), i.e. accept
> rounding down of a non-integer number of bytes but any other change is
> assumed to be overflow.

Thanks Ben, here is v4 then ;)

[PATCH v4] get_rate: detect 32bit overflows

Current rate limit is 34.359.738.360 bit per second, and
unfortunately 40Gbps links are above it.

overflows in get_rate() are currently not detected, and some
users are confused. Let's detect this and complain.
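
The check discussed in this thread can be sketched as follows. This is a minimal Python illustration, not the iproute2 C code: the helper name `to_u32_rate` is hypothetical, and it assumes the rate is stored as a 32-bit count of bytes per second, which is what the stated limit of 34,359,738,360 bit/s implies ((2^32 - 1) * 8).

```python
import math

U32_MAX = 0xFFFFFFFF  # largest value a 32-bit rate field can hold


def to_u32_rate(bits_per_second: float) -> int:
    """Convert bit/s to a 32-bit bytes/s value, rejecting overflow.

    Hypothetical helper: it compares the truncated 32-bit result with
    floor(bps), so a fractional number of bytes is accepted but a
    wrapped value is reported as an overflow.
    """
    bps = bits_per_second / 8.0      # bytes per second, kept as a float
    rate = int(bps) & U32_MAX        # what a 32-bit store would keep
    if rate != math.floor(bps):
        raise OverflowError("rate does not fit in 32 bits")
    return rate
```

With this sketch a 1.5 Gbit/s rate is accepted, while a 40 Gbit/s rate raises an error instead of silently wrapping.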

Note that some qdisc are ready to get extended range, but this will
need additional attributes and new iproute2

With help from Ben Hutchings

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
2013-06-07 09:24:56 -07:00
Stephen Hemminger
22fa92e367 htb: fix indentation
iproute2 uses kernel style indenting
2013-06-07 08:54:45 -07:00
Eric Dumazet
44f1ff0afc htb: report overhead attribute
"tc class show dev ..." omits the overhead attribute for HTB.

After patch I have :

tc class add dev $DEV parent 1: classid 1:1 est 1sec 4sec htb \
    rate 12Mbit mtu 1500 quantum 1514 overhead 20

tc class show dev $DEV
class htb 1:1 root prio 0 rate 12000Kbit overhead 20 ceil 12000Kbit
burst 1500b cburst 1500b

Signed-off-by: Eric Dumazet <edumazet@google.com>
2013-06-07 08:53:53 -07:00
Alexander Duyck
cfa292defa iproute2: act_ipt fix xtables breakage on older versions.
In trying to build on a RHEL6.3 I ran into several build issues that are
addressed in this patch.

The first is that xtables_merge_options only has 3 parameters.  It appears
this is how this code was originally.  As such for the case where the version
is less than 6 I am assuming it would be correct to maintain the original
setup that only had 3 parameters being passed instead of 4.

I also ran into an issue with the define for __ALIGN_KERNEL not being present.
I believe this may be due to the fact that __ALIGN_KERNEL was moved into a
separate header from ALIGN after the UAPI changes.  In order to just cover all
of the bases I have moved the main definition for the macros into
__ALIGN_KERNEL_MASK and __ALIGN_KERNEL and if ALIGN is also needed then it is
just a direct redefine to __ALIGN_KERNEL.

Cc: Hasan Chowdhury <shemonc@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-05-01 08:01:47 -07:00
Stephen Hemminger
e7b24b67db Fix build when shared libraries are disabled
On some platforms, shared libraries are not used. The stub code
needs some updating to not generate errors.
2013-03-13 08:29:59 -07:00
Kees van Reeuwijk
3bed7bb7e7 iproute2: clearer error messages for fifo and tbf qdiscs
Clearer error messages for fifo and tbf qdiscs:
- Say who is complaining
- Don't just say a parameter is bad, show the offending parameter
- Be clearer about duplicate parameters vs illegal pairs of parameters
- Try to give multiple error messages rather than let the user discover the errors one by one
- When there are parameter aliases, try to use the variant that was used, or at least mention them all

Note that in the old version an empty parameter list to tbf would just cause an explain() message
without a specific error message. By simply removing the relevant error check, the code now
handles this error more gracefully by printing an error message for all mandatory parameters.
It still prints the explain() message.

Signed-off-by: Kees van Reeuwijk <reeuwijk@few.vu.nl>
2013-02-21 08:34:34 -08:00
Stephen Hemminger
d1f28cf181 ip: make local functions static 2013-02-12 11:38:35 -08:00
Benjamin Poirier
5ab3a4de5e Use pkg-config to obtain xtables.h path
On openSUSE 12.2 (at least) xtables.h is not installed in the system-wide
include dir but in /usr/include/iptables-1.4.16.3/. This results in the
following build failure:
em_ipset.c:26:21: fatal error: xtables.h: No such file or directory

Other includers of xtables.h already call out to pkg-config
2013-02-11 09:19:54 -08:00
Johannes Naab
e72ca3fbb0 iproute2: tc netem rate: allow negative packet/cell overhead
by fixing the parsing of command-line tokens

Signed-off-by: Johannes Naab <jn@stusta.de>
2013-02-04 09:06:50 -08:00
Jamal Hadi Salim
852d51222d iproute2: act_ipt fix xtables breakage
Fixes breakage with xtables API starting with version 1.4.10

Signed-off-by: Hasan Chowdhury <shemonc@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
2013-01-16 08:14:48 -08:00
Strake
5bd9dd49ae include needed files
Needed to build iproute2 with musl
2012-12-23 11:49:06 -08:00
Mike Frysinger
e4fc4ada33 allow pkg-config to be customized
Rather than hard coding `pkg-config`, use ${PKG_CONFIG} so people can
override it to their specific version (like when cross-compiling).

This is the same way the upstream pkg-config code works.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2012-11-11 16:21:34 -08:00
Matt Burgess
92905c6e0d iproute2-3.6.0 assumes presence of iptables
Hi,

When compiling iproute2-3.6.0 on a host that doesn't have iptables available, I get the following error:

gcc -Wall -Wstrict-prototypes -O2 -I../include -DRESOLVE_HOSTNAMES
-DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE
-DCONFIG_GACT -DCONFIG_GACT_PROB -DYY_NO_INPUT   -c -o em_ipset.o
em_ipset.c
em_ipset.c:26:21: fatal error: xtables.h: No such file or directory

Fixed by the following patch, which guards the building of em_ipset.o on
the presence of suitable headers.

Thanks,

Matt.
2012-10-03 08:51:29 -07:00
Rostislav Lisovy
7b5f30e14f Ematch used to classify CAN frames according to their identifiers
This ematch enables effective filtering of CAN frames (AF_CAN) based
on CAN identifiers with masking of compared bits. Implementation
utilizes bitmap based classification for standard frame format (SFF)
which is optimized for minimal overhead.

Signed-off-by: Rostislav Lisovy <lisovy@gmail.com>
2012-08-20 13:11:55 -07:00
Dan Kenigsberg
f1675d615b utils: invarg: msg precedes the faulty arg
Fix all calls that reversed the argument order.

Signed-off-by: Dan Kenigsberg <danken@redhat.com>
2012-08-17 13:35:36 -07:00
Florian Westphal
8194411a42 tc: add ipset ematch
example usage:
tc filter add dev $dev parent $id: basic match not ipset'(foobar src)' ..

also updates iproute2/ematch_map, else tc complains:
Error: Unable to find ematch "ipset" in /etc/iproute2/ematch_map
Please assign a unique ID to the ematch kind the suggested entry is:
        8       ipset

when trying to use this ematch.

(text ematch (5) only exists in kernel, a vlan ematch (6) exists neither in
 kernel nor userspace, but kernel headers define TCF_EM_VLAN == 6).
2012-08-13 08:33:50 -07:00
Li Wei
6cef544b96 tc: man: change man page and comment to conform to code's behavior.
Since the get_rate() code incorrectly interpreted a bare number, the
behavior is not the same as the man page and comment described.

We need to change the man page and comment to stay compatible with the
existing usage by scripts.
2012-07-12 09:05:28 -07:00
Li Wei
424adc19bf tc: filter: validate filter priority in userspace.
Because we use the high 16 bits of tcm_info to pass the prio value to
the kernel, its range is [0, 0xffff]. Without validation
in tc, when the user passes a larger (>65535) priority, the actual priority
set in the kernel would confuse the user.

So, add validation to ensure prio is in range.
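
A small sketch of why the validation matters. This is illustrative Python, not the tc source: the helper names are hypothetical, and placing the protocol in the low 16 bits of tcm_info is an assumption made for the example.

```python
def truncated_prio(prio: int) -> int:
    """Priority that survives when prio is shifted into the high 16 bits
    of a 32-bit field without any range check."""
    return ((prio << 16) & 0xFFFFFFFF) >> 16


def pack_tcm_info(prio: int, protocol: int) -> int:
    """Pack prio (high 16 bits) and protocol (low 16 bits, assumed layout),
    rejecting priorities that would not fit."""
    if not 0 <= prio <= 0xFFFF:
        raise ValueError(f"priority {prio} is outside [0, 65535]")
    return (prio << 16) | (protocol & 0xFFFF)
```

Without the check, a priority of 65537 would quietly become 1.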
2012-07-10 15:39:30 -07:00
Hiroaki SHIMODA
690b11f4a6 tc: u32: Fix firstfrag filter.
With the current firstfrag filter, all non-fragmented packets are matched.
firstfrag should check the MF bit.
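
The distinction can be sketched against the IPv4 flags/fragment-offset field (the 16-bit word at byte offset 6 of the header). This Python snippet only illustrates the header semantics; the function name is hypothetical and it is not the u32 selector code.

```python
IP_MF = 0x2000        # "more fragments" flag
IP_OFFMASK = 0x1FFF   # 13-bit fragment offset


def is_first_fragment(frag_field: int) -> bool:
    """True only for the first fragment of a fragmented packet:
    MF is set and the fragment offset is zero. The DF and reserved
    bits are ignored."""
    return (frag_field & (IP_MF | IP_OFFMASK)) == IP_MF
```

Checking only for a zero offset would also match every unfragmented packet, which is the behavior the commit describes.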

Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
2012-07-10 15:39:02 -07:00
Hiroaki SHIMODA
1d62f99fe2 tc: u32: Fix icmp_code off.
The off of icmp_code is not 20 but 21. Also offmask should be 0 unless
nexthdr+ is specified.
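
To see where offset 21 comes from, consider a packet whose IPv4 header has no options (20 bytes, an assumption of this sketch). The ICMP type is the first byte after the header and the code is the second. The helper name below is hypothetical.

```python
IP_HEADER_LEN = 20  # IPv4 header without options (assumed)


def icmp_type_and_code(packet: bytes) -> tuple:
    """Return (type, code) read from offsets 20 and 21 of the packet."""
    return packet[IP_HEADER_LEN], packet[IP_HEADER_LEN + 1]


# Dummy IPv4 header followed by an ICMP header: type 3, code 1, checksum 0.
packet = bytes(IP_HEADER_LEN) + bytes([3, 1, 0, 0])
print(icmp_type_and_code(packet))  # (3, 1)
```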

Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
2012-07-10 15:39:02 -07:00
Li Wei
3c4f545633 tc: prio: Perform more strict check on priomap.
Since band numbers count from zero, band must be less than
opt.bands.
2012-06-18 12:25:08 -07:00
Vijay Subramanian
50a3ec3c46 tc-codel: Update usage text
codel can take 'noecn' as an option. This also makes it consistent with the
manpage.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
2012-05-24 15:02:05 -07:00
Eric Dumazet
c3524efc14 fq_codel: Fair Queue Codel AQM
Fair Queue Codel packet scheduler

Principles :

- Packets are classified (internal classifier or external) on flows.
- This is a Stochastic model (as we use a hash, several flows might
                              be hashed on same slot)
- Each flow has a CoDel managed queue.
- Flows are linked onto two (Round Robin) lists,
  so that new flows have priority on old ones.

- For a given flow, packets are not reordered (CoDel uses a FIFO)
- head drops only.
- ECN capability is on by default.
- Very low memory footprint (64 bytes per flow)

tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
                      [ target TIME ] [ interval TIME ] [ noecn ]
                      [ quantum BYTES ]

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Van Jacobson <van@pollere.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Changli Gao <xiaosuo@gmail.com>
2012-05-22 14:17:49 -07:00
Eric Dumazet
185d88f99b tc_codel: Controlled Delay AQM
An implementation of CoDel AQM, from Kathleen Nichols and Van Jacobson.

http://queue.acm.org/detail.cfm?id=2209336

The main input of this AQM is no longer queue size in bytes or packets, but the
delay packets stay in the (FIFO) queue.

As we don't have infinite memory, we can still drop packets in enqueue()
in case of massive load, but the aim of CoDel is to drop packets in
dequeue(), using a control law based on two simple parameters :

target : target sojourn time (default 5ms)
interval : width of moving time window (default 100ms)

Selected packets are dropped, unless ECN is enabled and packets can get
ECN mark instead.
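
As a rough illustration of the control law, the CoDel article linked above spaces successive drops by interval / sqrt(count) while the queue stays above target. The Python sketch below shows only that spacing; the function name is hypothetical and it is not the kernel implementation.

```python
import math


def next_drop_time(now_ms: float, interval_ms: float, count: int) -> float:
    """Time of the next drop while in the dropping state.

    count is the number of drops since the dropping state was entered;
    the gap between drops shrinks with the square root of count.
    """
    return now_ms + interval_ms / math.sqrt(count)
```

With the default 100ms interval, the gap is 100ms after the first drop and 50ms after the fourth.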

Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ]
                          [ interval TIME ] [ ecn ]

qdisc codel 10: parent 1:1 limit 2000p target 3.0ms interval 60.0ms ecn
 Sent 13347099587 bytes 8815805 pkt (dropped 0, overlimits 0 requeues 0)
 rate 202365Kbit 16708pps backlog 113550b 75p requeues 0
  count 116 lastcount 98 ldelay 4.3ms dropping drop_next 816us
  maxpacket 1514 ecn_mark 84399 drop_overlimit 0

CoDel must be seen as a base module, and should be used keeping in mind
there is still a FIFO queue. So a typical setup will probably need a
hierarchy of several qdiscs and packet classifiers to be able to meet
whatever constraints a user might have.

One possible example would be to use fq_codel, which combines Fair
Queueing and CoDel, in replacement of sfq / sfq_red.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dave Taht <dave.taht@bufferbloat.net>
2012-05-22 14:13:52 -07:00
Vijay Subramanian
1070205dc0 tc-netem: Add support for ECN packet marking
This patch provides support for marking packets with ECN instead of
dropping them with netem. This makes it possible to make use of the
netem ECN marking feature that was added recently to the kernel.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
2012-05-22 14:10:21 -07:00
Christoph J. Thompson
5c434a9e5a iproute2 - Fix up and simplify variables pointing to install directories
Define where the iproute2 config files are located.
Get rid of trailing slashes for paths in several files.

Signed-off-by: Christoph J. Thompson <cjsthompson@gmail.com>
2012-04-12 09:49:10 -07:00
Stephen Hemminger
ff24746cca Convert to use rta_getattr_ functions
Use new functions (inspired by libmnl) to do type-safe access
of routing attributes
2012-04-10 08:47:55 -07:00
Anton Danilov
90d98edf39 csum action, fix typo 2012-03-15 14:24:59 -07:00
Andreas Henriksson
f526af995e iproute: fix tc -iec display of Mibit rates
As reported by Thomas Mühlgrabner <muehltom@cable.vol.at>
in http://bugs.debian.org/662979 :

 When showing htb class configuration with "tc -iec class show",
 the output for Mibit is actually the value for bit.
 Example: configure a class with a ceil of 1000Mibit.
 Output states 1048576000 Mibit.

The cause is missing parentheses in the display code of tc....

(Please also note that a lower value of 100Mibit will be displayed
as 102400 Kibit, which I think is kind of ugly.)
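
The effect of the missing parentheses can be reproduced in a few lines. This is a Python sketch of the arithmetic only; the function names are hypothetical, and the exact expression in tc's display code is assumed rather than quoted.

```python
MIB = 1024.0 * 1024.0  # bits per Mibit


def mibit_buggy(rate_bits: float) -> float:
    # Without parentheses, "/" and "*" evaluate left to right,
    # so the division is immediately undone by the multiplication.
    return rate_bits / 1024.0 * 1024.0


def mibit_fixed(rate_bits: float) -> float:
    # With parentheses the rate is divided by the full Mibit factor.
    return rate_bits / (1024.0 * 1024.0)


ceil_bits = 1000 * MIB          # a ceil of 1000Mibit expressed in bit/s
print(mibit_buggy(ceil_bits))   # 1048576000.0
print(mibit_fixed(ceil_bits))   # 1000.0
```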

Reported-by: Thomas Mühlgrabner <muehltom@cable.vol.at>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
2012-03-10 09:13:58 -08:00
Yegor Yefremov
8ced4fcd50 iproute2: cleanup dependencies
LIBNETLINK will be defined in the main Makefile, so
both ../lib/libnetlink.a ../lib/libutil.a will be
automatically appended during linking. Otherwise
../lib/libnetlink.a ../lib/libutil.a will appear
twice during linking.

Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
2012-02-27 08:27:54 -08:00
Petr Sabata
e2a4536a43 iproute2: tc - mqprio formatted print fix
Just a minor correction of mqprio printf()'s.

Reported-by: Petr Písař <ppisar@redhat.com>
Signed-off-by: Petr Šabata <contyk@redhat.com>
2012-02-22 15:23:12 -08:00
Stephen Hemminger
d798a0483e red: add missing include math.h
red now uses pow() function.
2012-02-06 09:45:50 -08:00
Vijay Subramanian
14a1c164d1 netem: Fail cleanly if user input is wrong
(Resending patch since it looks like my earlier mail did not make it to
netdev).

netem reordering requires that the delay parameter be given. Currently, if no
delay is given, tc prints the error message but still installs the qdisc. Fix
this by printing the usage and failing cleanly.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
2012-01-20 11:21:58 -08:00
Eric Dumazet
1b6f0bb5be gred: support TCA_GRED_MAX_P attribute
TCA_GRED_MAX_P permits to express high resolution probabilities.

New output (on 3.3+ kernel) :

disc gred 9442: root refcnt 17
 DP:0 (prio 1) Average Queue 0b Measured Queue 0b
	 Packet drops: 0 (forced 0 early 0)
	 Packet totals: 20 (bytes 2584)
 limit 31460b min 3000b max 9000b ewma 5 probability 0.05 Scell_log 15

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2012-01-20 08:12:24 -08:00
Eric Dumazet
650252d8c3 choke: support TCA_CHOKE_MAX_P
TCA_CHOKE_MAX_P permits to express high resolution RED probability.

tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 8sec choke \
	limit 90 ecn min 10 max 30 probability 0.05 bandwidth 10Mbit

Before patch :

tc -s -d qdisc show dev eth3
qdisc ... limit 90p min 10p max 30p ecn ewma 3 Plog 19 Scell_log 13

After :

qdisc ... limit 90p min 10p max 30p ecn ewma 3 probability 0.05
Scell_log 13

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2012-01-20 08:12:23 -08:00
Eric Dumazet
6987ecf083 sfq: add optional RED on top of SFQ
Adds an optional Random Early Detection on each SFQ flow queue.

Traditional SFQ limits count of packets, while RED permits to also
control number of bytes per flow, and adds ECN capability as well.

1) We dont handle the idle time management in this RED implementation,
since each 'new flow' begins with a null qavg. We really want to address
backlogged flows.

2) if headdrop is selected, we try to ecn mark first packet instead of
currently enqueued packet. This gives faster feedback for tcp flows
compared to traditional RED [ marking the last packet in queue ]

Example of use :

tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 4sec sfq \
	limit 3000 headdrop flows 512 divisor 16384 \
	redflowlimit 100000 min 8000 max 60000 probability 0.20 ecn

qdisc sfq 10: parent 1:1 limit 3000p quantum 1514b depth 127 headdrop
flows 512/16384 divisor 16384
 ewma 6 min 8000b max 60000b probability 0.2 ecn
 prob_mark 0 prob_mark_head 4876 prob_drop 6131
 forced_mark 0 forced_mark_head 0 forced_drop 0
 Sent 1175211782 bytes 777537 pkt (dropped 6131, overlimits 11007
requeues 0)
 rate 99483Kbit 8219pps backlog 689392b 456p requeues 0

In this test, with 64 netperf TCP_STREAM sessions, 50% using ECN enabled
flows, we can see number of packets CE marked is smaller than number of
drops (for non ECN flows)

If same test is run, without RED, we can check backlog is much bigger.

qdisc sfq 10: parent 1:1 limit 3000p quantum 1514b depth 127 headdrop
flows 512/16384 divisor 16384
 Sent 1148683617 bytes 795006 pkt (dropped 0, overlimits 0 requeues 0)
 rate 98429Kbit 8521pps backlog 1221290b 841p requeues 0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2012-01-20 08:12:22 -08:00
Eric Dumazet
54a2fce832 red: fix adaptive spelling
Reported-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2012-01-20 08:12:21 -08:00
Eric Dumazet
e7e4abea3e red: Add adaptative algo
Enable Adaptative RED algo, using :

tc qdisc  ... red limit BYTES ... adaptative ...

Support of high precision probability/max_p setting and reporting, with
support of old kernels.

With a new kernel, "Plog ..." is replaced in tc output by "probability
value" :

qdisc red 10: dev eth3 parent 1:1 limit 360Kb min 30Kb max 90Kb ecn ewma
5 probability 0.09 Scell_log 15
2012-01-19 14:45:20 -08:00
Hagen Paul Pfeifer
6b8dc4deea tc: netem rate shaping and cell extension
This patch adds rate shaping as well as cell support. The link rate can be
specified via the rate option. Three optional arguments control the cell
knobs: packet-overhead, cell-size, cell-overhead. To rate-limit the eth0 root
queue to 5kbit/s, with a 20 byte packet overhead, 100 byte cell size and
a 5 byte per cell overhead:

	tc qdisc add dev eth0 root netem rate 5kbit 20 100 5
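
One way to read the three cell knobs is sketched below. This is an assumption about how the accounting works (add the packet overhead, round up to whole cells, charge each cell its size plus overhead), written as illustrative Python with hypothetical function names, not code taken from netem.

```python
def scheduled_bytes(length: int, packet_overhead: int = 0,
                    cell_size: int = 0, cell_overhead: int = 0) -> int:
    """Bytes charged for one packet under the assumed cell model."""
    length += packet_overhead
    if cell_size:
        cells = -(-length // cell_size)            # round up to whole cells
        length = cells * (cell_size + cell_overhead)
    return length


def transmit_seconds(length: int, rate_bit: int, **knobs) -> float:
    """Time the packet occupies the link at rate_bit bits per second."""
    return scheduled_bytes(length, **knobs) * 8 / rate_bit
```

For the command above, a 1000 byte packet becomes 1020 bytes with overhead, which needs 11 cells of 105 bytes, or 1155 bytes on the wire under this model.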

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
2012-01-19 14:28:27 -08:00
Jan Engelhardt
8e91a80d97 iproute2: fix calling up the xt action
Upstream: has not been sent yet

Requesting the xt action never succeeded because it registered
using the wrong name.
2012-01-03 15:07:38 -08:00
Jan Engelhardt
d7aa57d450 iproute2: proper detection of libxtables position and flags
Upstream: not sent yet

Any tests involving iptables _MUST_ utilize pkg-config to find the
proper locations of the installation.
2012-01-03 15:05:25 -08:00
Stephen Hemminger
155ad8023b ematch: fix warning about unused input()
Use existing compile flag to indicate that input() is not used
by tc ematch, fixes compiler warning.
2012-01-03 13:55:59 -08:00
Stephen Hemminger
5761f04fb8 ematch: fix warning about yyerror and const
yyerror() should take const char * on current bison.
2012-01-03 13:55:00 -08:00
Stephen Hemminger
cd70f3f522 libnetlink: remove unused junk callback
Both rtnl_talk and rtnl_dump had a callback for handling portions
of netlink message that do not match the correct pid or seq.
But this callback was never used by any part of iproute2 so remove
it.
2011-12-28 10:37:12 -08:00
Eric Dumazet
d060de7f8d netem: fix a typo in explain()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2011-12-24 11:21:33 -08:00
Stephen Hemminger
3c7950af59 netem: add support for 4 state and GE loss model
Incorporate support for new loss models.
2011-12-22 17:08:11 -08:00
Eric Dumazet
841fc7bc98 red: harddrop support and cleanups
Add harddrop support (kernel support added a long time ago), and various
cleanups.

min BYTES, max BYTES are now optional and follow Sally Floyd's
recommendations.

By the way, our default 2% probability is a bit low, Sally recommends 10%.
Not a big deal if upcoming adaptative algo is deployed.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2011-12-08 16:43:18 -08:00
Eric Dumazet
ab15aeacf5 red: make burst optional
Documentation advises to set burst to (min+min+max)/(3*avpkt)

Let tc do this automatically if user doesnt provide burst himself.
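
The documented rule of thumb is easy to compute. The sketch below is illustrative Python with a hypothetical function name; the use of integer division is an assumption of the example.

```python
def default_red_burst(qmin: int, qmax: int, avpkt: int) -> int:
    """Burst in packets from the documented formula
    (min + min + max) / (3 * avpkt), with min/max in bytes."""
    return (qmin + qmin + qmax) // (3 * avpkt)


print(default_red_burst(30000, 90000, 1000))  # 50
```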

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2011-12-01 09:23:49 -08:00
Eric Dumazet
0cf67ead7b red: give a hint about burst value
Check for burst values that are too small.

Reported-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2011-12-01 09:23:43 -08:00
Thomas Jarosch
fcbd0165fc tc: Use correct variable type for get_distribution() result
get_distribution() returns an int.

cppcheck reported:
[tc/q_netem.c:243]: (style) Checking if unsigned variable 'dist_size' is less than zero.

The mismatch actually rendered the error checking
after get_distribution() ineffective.
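
Why the error check never fired can be shown by emulating C integer storage. This Python sketch uses ctypes fixed-width types to stand in for the C variables; the helper names are hypothetical.

```python
import ctypes


def store_unsigned(result: int) -> int:
    """Value seen after storing an int result in an unsigned 32-bit variable."""
    return ctypes.c_uint32(result).value


def store_signed(result: int) -> int:
    """Value seen after storing the same result in a signed 32-bit variable."""
    return ctypes.c_int32(result).value


error = -1  # a typical "failed" return value
print(store_unsigned(error) < 0)  # False: the check can never trigger
print(store_signed(error) < 0)    # True
```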

Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
2011-11-23 14:46:24 -08:00
Thomas Jarosch
a3da01c519 tc: Remove unused variable 'res'.
Detected by cppcheck.

Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
2011-11-23 14:46:21 -08:00
Stephen Hemminger
93ba481acb cleanup ematch yacc files
make clean needs to remove all the yacc output files for ematch.
2011-11-02 16:39:36 -07:00
Michal Soltys
41f6004139 HFSC (7) & (8) documentation + assorted changes
This patch adds detailed documentation for HFSC scheduler. It roughly
follows HFSC paper, but tries to not rely too much on math side of things.
Post-paper/Linux specific subjects (timer resolution, ul service curve, etc.)
are also discussed.

I've read it many times over, but it's a lengthy chunk of text - so try
to be understanding in case I made some mistakes.

tc-hfsc(7): explains algorithm in detail (very long)
tc-hfsc(8): explains command line options briefly
tc(8): adds references to new man pages
Makefile: adds man7 directory to install target
q_hfsc.c: minimal help text changes, consistency with tc-hfsc(8)

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2011-11-02 16:33:50 -07:00
Mike Frysinger
aa48b5931a tc: fix parallel build file with lex/yacc
Building iproute2 in parallel might hit the race failure:
	emp_ematch.l:2:30: fatal error: emp_ematch.yacc.h:
		No such file or directory
	make[1]: *** [emp_ematch.lex.o] Error 1

This is because we currently allow the yacc/lex files to generate and
compile in parallel.  So add a simple dependency to make sure yacc has
finished before we attempt to compile the lex output.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2011-10-18 15:02:21 -07:00
Thomas Jarosch
1a6543c56b Fix memory leak of lname variable in get_target_name()
Detected by cppcheck.

Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
2011-10-07 11:17:10 -07:00
Thomas Jarosch
9f1ba57016 Fix wrong sanity check in choke_parse_opt()
Detected by cppcheck.

Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
2011-10-07 11:17:03 -07:00
Thomas Jarosch
6d5ee98a7c Fix wrong comparison in cmp_print_eopt()
Detected by cppcheck.

Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
2011-10-07 11:16:15 -07:00
Dan McGee
4f3626f920 xt: only unset fields if m is non NULL 2011-08-31 12:18:49 -07:00
Florian Westphal
05fb9184f2 tc: filter: fix default 'protocol all' on little-endian platforms
When specifying filters without the 'protocol' keyword, tc will
default to 'protocol all'.

Unfortunately, this missed a byte-ordering conversion.
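
The missing conversion can be illustrated with the standard htons() call. This is a Python sketch, not the tc code; `wire_bytes` is a hypothetical helper that shows how a 16-bit value is laid out in host memory.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # "protocol all" from the Linux headers


def wire_bytes(host_value: int) -> bytes:
    """Bytes of a 16-bit value as stored in native (host) byte order."""
    return struct.pack("=H", host_value)


# After htons() the in-memory bytes are in network order on any platform.
print(wire_bytes(socket.htons(ETH_P_ALL)))  # b'\x00\x03'
```

On a little-endian host, storing ETH_P_ALL without htons() would put the bytes in the opposite order, which is why only those platforms were affected.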
2011-08-31 10:55:13 -07:00
Stephen Hemminger
c441bd4c1b Add QFQ scheduler
Basic configuration support for QFQ.
Still need to add manual page.
2011-07-13 13:46:34 -07:00
Stephen Hemminger
be181323c1 Remove redundant limits.h
redo.
2011-07-13 09:49:17 -07:00
Andreas Henriksson
73de5d9680 iproute2: Fix building xt module against xtables version 6
iptables/xtables apparently changed API again.... Now you need to pass
an extra parameter (orig_opts) which was not needed before.

Sprinkle some lovely pre-processor magic to be compatible with both older
and new versions. In the beginning of times XTABLES_VERSION_CODE didn't
exist. Then it was (0x10000 * major + 0x100 * minor + patch) when it was
first introduced (according to git), but now it's at 6...
Don't know what official iptables releases have defined it to over time.
Let's just hope none of the older versions that have the define set
higher than 6 are still around.... so only the "current" versioning
scheme is supported.... let's see how long this lasts now.
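
The clash between the two numbering schemes is easy to see numerically. The Python sketch below evaluates the older formula quoted above; the function name and the sample version number are chosen only for illustration.

```python
def legacy_xtables_version_code(major: int, minor: int, patch: int) -> int:
    """Older scheme: 0x10000 * major + 0x100 * minor + patch."""
    return 0x10000 * major + 0x100 * minor + patch


code = legacy_xtables_version_code(1, 4, 10)
print(hex(code))  # 0x1040a
print(code > 6)   # True: any legacy code compares higher than the new value 6
```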

For the API change in xtables, see:
http://git.netfilter.org/cgi-bin/gitweb.cgi?p=iptables.git;a=commitdiff;h=600f38db82548a683775fd89b6e136673e924097

Signed-off-by: Andreas Henriksson <andreas@fatal.se>
2011-07-11 10:18:14 -07:00
Petr Sabata
5582c0cffd iproute2: Remove unreachable code
This patch removes unreachable, useless code.

Signed-off-by: Petr Sabata <contyk@redhat.com>
2011-07-11 10:13:51 -07:00
Stephen Hemminger
49dff8c88c xt match: fix set-never-used warning 2011-06-29 15:59:41 -07:00
Stephen Hemminger
02ee3dbc78 skbedit: fix set-never-used warning 2011-06-29 15:59:02 -07:00
Stephen Hemminger
bf808cbf84 tc: fix set never used warning in red 2011-06-20 14:34:30 -07:00
Stephen Hemminger
bcd7abddd4 tc filter: fix dport/sport in pretty print output
Problem reported by Peter Lebbing on Debian.
The decode of source and destination port filters in pretty print
mode was backwards.
2011-05-19 09:19:17 -07:00
John Fastabend
892eba309f iproute2: improve mqprio inputs for queue offsets and counts
This changes mqprio input format to be more user friendly.

Old usage,

 # ./tc/tc qdisc add dev eth3 root mqprio help
Usage: ... mqprio [num_tc NUMBER] [map P0 P1...]
                  [offset txq0 txq1 ...] [count cnt0 cnt1 ...] [hw 1|0]

New usage,

 # ./tc/tc qdisc add dev eth3 root mqprio help
Usage: ... mqprio [num_tc NUMBER] [map P0 P1 ...]
                  [queues count1@offset1 count2@offset2 ...] [hw 1|0]
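
The new count@offset notation can be parsed along these lines. This is a Python sketch of the input format only, with a hypothetical function name; it is not the parser used by tc.

```python
def parse_queues(tokens):
    """Turn ["8@0", "8@8"] into [(count, offset), ...]."""
    pairs = []
    for token in tokens:
        count_text, sep, offset_text = token.partition("@")
        if not sep:
            raise ValueError(f"expected count@offset, got {token!r}")
        pairs.append((int(count_text), int(offset_text)))
    return pairs


print(parse_queues(["8@0", "8@8"]))  # [(8, 0), (8, 8)]
```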

Suggested-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
2011-04-26 14:59:32 -07:00
John Fastabend
914953046a iproute2: tc add mqprio qdisc support
Add mqprio qdisc support. Output matches the following,

qdisc mq 0: dev eth1 root
qdisc mq 0: dev eth2 root
qdisc mqprio 8001: dev eth3 root  tc 8 map 0 1 2 3 4 5 6 7 1 1 1 1 1 1 1 1
             queues:(0:7) (8:15) (16:23) (24:31) (32:39) (40:47) (48:55) (56:63)

And usage is,

Usage: ... mclass [num_tc NUMBER] [map P0 P1...]
                  [offset txq0 txq1 ...] [count cnt0 cnt1 ...] [hw 1|0]

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
2011-04-12 14:28:19 -07:00
Juliusz Chroboczek
d7f3299d59 tc : SFB flow scheduler
Supports SFB qdisc (included in linux-2.6.39)

1) Setup phase : accept non default parameters

2) dump information

qdisc sfb 11: parent 1:11 limit 1 max 25 target 20
  increment 0.00050 decrement 0.00005 penalty rate 10 burst 20 (600000ms 60000ms)
 Sent 47991616 bytes 521648 pkt (dropped 549245, overlimits 549245 requeues 0)
 rate 7193Kbit 9774pps backlog 0b 0p requeues 0
  earlydrop 0 penaltydrop 0 bucketdrop 0 queuedrop 549245 childdrop 0 marked 0
  maxqlen 0 maxprob 0.00000 avgprob 0.00000

Signed-off-by: Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2011-04-12 14:27:37 -07:00
Stephen Hemminger
59a935d204 Update email address of netem 2011-04-12 14:24:01 -07:00
Stephen Hemminger
d7ac9ad4f4 Fix warning in u32 from assignment in conditional 2011-04-12 14:23:39 -07:00
Eric Dumazet
f3f28c2126 sfq: add divisor support
In 2.6.39, we can build SFQ queues with a given hash table size,
2011-02-25 12:59:53 -08:00
Stephen Hemminger
a4eca97cff CHOKe scheduler
TC commands for CHOKe qdisc
2011-01-31 09:09:50 -08:00
Gregoire Baron
3822cc986c tc: add ACT_CSUM action support (csum)
Add the iproute2 support for the ACT_CSUM action. Can be used as
following, certainly in conjunction with the ACT_PEDIT action (pedit):

 # In order to DNAT (stateless) IPv4 packet from 192.168.1.100 to
 #  0x12345678 (18.52.86.120), and update the IPv4 header checksum and
 #  the UDP checksum (the last one, only if the packet is UDP).
tc filter add dev eth0 prio 1 protocol ip parent ffff: \
  u32 match ip src 192.168.1.100/32 flowid :1 \
    action pedit munge offset 16 u32 set 0x12345678 \
      pipe csum ip and udp

 # In order to alter destination address of IPv6 TCP packets from fc00::1
 #  and correct the TCP checksum (nothing happened? except maybe for
 #  checksums in the TCP payload ...).
tc filter add dev eth0 prio 1 protocol ipv6 parent ffff: \
  u32 match ip6 src fc00::1/128 match ip6 protocol 0x06 0xff flowid :1 \
    action pedit munge offset 24 u32 set 0x12345678 \
      pipe csum tcp
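
The mapping from the 32-bit value written by pedit to the dotted-quad address in the first example can be checked with the standard library. In an IPv4 header without options, offset 16 is the destination address field. The helper name below is hypothetical.

```python
import ipaddress


def u32_to_ipv4(value: int) -> str:
    """Render a 32-bit big-endian value as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value))


print(u32_to_ipv4(0x12345678))  # 18.52.86.120
```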
2010-12-01 11:17:46 -08:00
Changli Gao
7162c92148 iproute2: tc: f_flow: add key rxhash
We can use rxhash to classify the traffic into flows. As rxhash maybe
supplied by NIC or RPS, it is cheaper.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
2010-11-30 09:57:36 -08:00
Mike Frysinger
be3c4d4f3c m_xt: stop using xtables_set_revision()
iptables dropped the xtables_set_revision() function around version 1.4.9,
so set the rev directly ourselves.  This should be compatible back to the
original version m_xt itself is designed for.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2010-11-30 09:48:38 -08:00
Stephen Hemminger
cb4bd0ec8d Fix GRED options clearing
Bug reported where priorities of GRED DP's are ignored.
The option parsing sets opt then memset was clearing these
values.
2010-08-25 09:04:55 -07:00
Stephen Hemminger
e3d153c1fb Fix byte order of ether address match for u32
The u32 key match was incorrect byte order when using ether source
or destination address matching.
2010-08-02 11:55:30 -07:00
Andreas Henriksson
02833d1b38 tc: make symbols loaded from tc action modules global.
Fixes problems with xtables based MARK target ("ipt" module).
When tc loads the "ipt" (xt) module it kept the symbols local,
this made loading of libxtables not find the required struct.

Currently ipt/xt is the only tc action module.
iproute2 never seems to call dlclose.
Hopefully the modules don't export more symbols than needed.

In this situation, hopefully the RTLD_GLOBAL flag won't hurt us.

I've been using this patch in the Debian package of iproute for
the last 3 weeks and no one has complained.
( This fixes http://bugs.debian.org/584898 )

Signed-off-by: Andreas Henriksson <andreas@fatal.se>
2010-08-02 09:54:59 -07:00
Stephen Hemminger
4b45abd1f0 Fix NULL pointer reference when using basic match
If basic match has no tree of matches underneath
then print_ematch would core dump.
2010-07-29 18:03:35 -07:00
Petr Lautrbach
0156412215 iproute: fix tc generating ipv6 priority filter
This patch adds ipv6 filter priority/traffic class function
static int parse_ip6_class(int *argc_p, char ***argv_p, struct tc_u32_sel *sel)
shifting filter value to 5th bit and ignoring "at" as header position
is exactly given.

Signed-off-by: Petr Lautrbach <plautrba@redhat.com>
2010-07-23 12:29:35 -07:00
Mike Frysinger
bf512683e0 tc: revert "echo" in install target
The recent commit "iproute2: add option to build m_xt as a tc module"
(ab814d6355) looks like it wrongly included debug changes in the
install target.  So drop the `echo` so the tc binary actually gets
installed again.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2010-07-23 12:28:25 -07:00
Bart Trojanowski
608a96c727 fix build issues with flex ver 2.5
When building on an old environment, the flex generated
tc/emp_ematch.lex.c file would not compile.  The error given was:

emp_ematch.lex.c:1686: error: expected ‘;’, ‘,’ or ‘)’ before numeric constant

The emp_ematch.l uses 'str' as a start symbol name, and  flex would create
a '#define str 1' statement.  This particular version of flex,
unfortunately, used 'str' as names of string variables in the generated
parser functions.  This is line 1686 in the generated file:

YY_BUFFER_STATE ematch__scan_string (yyconst char * str )

This patch just substitutes 'str' for 'lexstr' in emp_ematch.l to avoid
the collision.
2010-04-22 15:27:42 -07:00
Andreas Henriksson
ab814d6355 iproute2: add option to build m_xt as a tc module (v3)
This will build the xt module (action ipt) of tc as a
shared object that is linked at runtime by tc if used,
rather than built into tc.

This is similar to how the atm qdisc support
is handled (q_atm.so).

Signed-off-by: Andreas Henriksson <andreas@xxxxxxxx>
2010-04-12 11:40:29 -07:00
Stephen Hemminger
edaaa11e5a Workaround missing ALIGN() macro. 2010-03-29 17:37:49 -07:00
Stephen Hemminger
1b84ad557e Remove mirred debug message
Other commands are quiet if successful. mirred action had leftover
debug message.
2010-03-29 17:32:37 -07:00
Stephen Hemminger
609ceb807d Workaround missing ALIGN() macro
XT_ALIGN() calls ALIGN macro but ALIGN is in kernel source not userspace.
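
A workaround of this kind is commonly written as a guarded fallback
definition. The sketch below is an assumption about the general shape, not
the exact code of this commit; `align_up()` is a hypothetical helper added
only to show the macro in use.

```c
#include <stddef.h>

/* Provide ALIGN() only when no included header has defined it. */
#ifndef ALIGN
/* Round x up to the next multiple of a; a must be a power of two. */
#define ALIGN(x, a) (((x) + ((a) - 1)) & ~((a) - 1))
#endif

/* Hypothetical helper showing the macro applied to a size. */
static size_t align_up(size_t size, size_t boundary)
{
	return ALIGN(size, boundary);
}
```

With this fallback, `align_up(5, 8)` rounds up to 8 and `align_up(16, 8)`
stays at 16.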
2010-03-29 15:17:48 -07:00
Andreas Henriksson
12ddfff76c iproute2: detect iptables modules dir in configure.
Try to automatically detect iptables modules directory.

Make the configure script look for iptables modules.
This also makes it possible to specify it on the
command line while building via "make IPT_LIB_DIR=/foo/bar".

Signed-off-by: Andreas Henriksson <andreas@fatal.se>
2010-03-29 15:10:20 -07:00
jamal
e906975a53 skbedit: use get_u32 for parsing mark
parsing a mark as a classid allows for acceptance of strange
informal input.

cheers,
jamal
commit aad0da6507ff8a95a63ed8e529c05f52be5b0e75
Author: Jamal Hadi Salim <hadi@cyberus.ca>
Date:   Mon Feb 15 06:45:29 2010 -0500

    skbedit: use get_u32 for parsing mark

    get_u32 is the more appropriate parser for a mark.
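
    As a rough illustration of why a plain 32-bit parser is stricter, here
    is a sketch of such a parser. `parse_u32_strict()` is a hypothetical
    stand-in written for this example; it is not the iproute2 get_u32()
    source.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical strict parser: accepts only a complete unsigned number
 * that fits in 32 bits. Returns 0 on success, -1 on any other input.
 */
static int parse_u32_strict(uint32_t *val, const char *arg, int base)
{
	unsigned long res;
	char *end;

	if (!arg || !*arg)
		return -1;

	errno = 0;
	res = strtoul(arg, &end, base);

	/* Reject empty conversions and trailing characters such as ":10". */
	if (end == arg || *end != '\0')
		return -1;

	/* Reject overflow of unsigned long and values wider than 32 bits. */
	if (res == ULONG_MAX && errno == ERANGE)
		return -1;
	if (res > 0xFFFFFFFFUL)
		return -1;

	*val = (uint32_t)res;
	return 0;
}
```

    Input shaped like a class ID, for example "1:10", is rejected by this
    kind of parser because of the trailing ":10".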

    Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
2010-03-03 16:35:30 -08:00
Hagen Paul Pfeifer
f703129d34 tc: add new queue discipline: head drop fifo
This adds the required changes to gain access to
the head drop classful queuing discipline named
pfifo_head_drop. Unlike pfifo or pfifo_fast,
this queuing discipline drops the first packet
in the case of queue congestion. As a result, the queue
always contains the freshest packets.

To replace the current root queueing discipline
for eth0:
$ tc qdisc replace dev eth0 root pfifo_head_drop

And show statistics:
$ tc -s qdisc show dev eth0
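
The head drop behaviour described above can be sketched in user space with a
small ring buffer. Everything here is hypothetical and for illustration only
(the names `hd_fifo`, `hd_enqueue`, `hd_dequeue` and the limit of 3); it is
not the kernel qdisc code.

```c
#include <stddef.h>

#define HD_LIMIT 3	/* assumed tiny queue limit for the example */

struct hd_fifo {
	int pkt[HD_LIMIT];	/* packets represented as plain ints */
	size_t head;		/* index of the oldest queued packet */
	size_t len;		/* number of queued packets */
	unsigned int drops;	/* packets dropped from the head */
};

/* Enqueue a packet; when the queue is full, drop the oldest one first. */
static void hd_enqueue(struct hd_fifo *q, int pkt)
{
	if (q->len == HD_LIMIT) {
		q->head = (q->head + 1) % HD_LIMIT;
		q->len--;
		q->drops++;
	}
	q->pkt[(q->head + q->len) % HD_LIMIT] = pkt;
	q->len++;
}

/* Dequeue the oldest packet; returns -1 when the queue is empty. */
static int hd_dequeue(struct hd_fifo *q, int *pkt)
{
	if (q->len == 0)
		return -1;
	*pkt = q->pkt[q->head];
	q->head = (q->head + 1) % HD_LIMIT;
	q->len--;
	return 0;
}
```

Enqueuing packets 1, 2, 3 and 4 into an empty queue with this limit drops
packet 1, so the queue then holds 2, 3 and 4.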

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
2010-03-03 16:15:44 -08:00
Florian Westphal
8d8de1139c tc: remove stale code
remove unused #define and "ok" statements.

Signed-off-by: Florian Westphal <fwestphal@astaro.com>
2010-01-21 10:13:01 -08:00
Florian Westphal
ddf216c863 tc: red, gred, tbf: more helpful error messages
$ tc qdisc add dev eth1 root tbf
RTNETLINK answers: Invalid argument

$ tc qdisc add dev eth1 root red
RTNETLINK answers: Invalid argument

with patch:
$ tc qdisc add dev eth1 root red
Required parameter (min, max, burst, limit, avpkt) is missing

$ tc qdisc add dev eth1 root tbf
Usage: ... tbf limit BYTES burst BYTES[/BYTES] rate KBPS ...

Signed-off-by: Florian Westphal <fw@strlen.de>
2010-01-21 10:12:57 -08:00
Mike Frysinger
73152614bc tc: respect LDFLAGS for %.so targets
Since there aren't any targets that currently use this pattern rule, this
is more of a proactive fix.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2010-01-21 10:05:39 -08:00
Jamal Hadi Salim
e04dd30a38 skbedit: Add support to mark packets
This adds support for setting the skb mark.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
2009-12-26 11:12:43 -08:00
Stephen Hemminger
985f4578c6 Fix warning about strtod() return value 2009-12-26 10:20:50 -08:00
Andreas Henriksson
a36ceb85d7 Add new (iptables 1.4.5 compatible) tc/ipt/xt module.
Add a new cleaned up m_xt.c based on m_xt_old.c
The new m_xt.c has been updated to use the new names and new api
that xtables exposes in iptables 1.4.5.
All the old internal api cruft has also been dropped.

Additionally, a configure script test is added to check for
the new xtables api and set the TC_CONFIG_XT flag in Config.
(tc/Makefile already handles this flag in previous commit.)

Signed-off-by: Andreas Henriksson <andreas@fatal.se>
2009-12-26 10:09:27 -08:00
Andreas Henriksson
80d689d055 Keep the old tc/ipt/xt module for compatibility.
Move the file and rename the configure flags.
The file is being kept around for iptables < 1.4.5 compatibility.

Signed-off-by: Andreas Henriksson <andreas@fatal.se>
2009-12-26 10:09:26 -08:00
Patrick McHardy
c90308ffc7 f_fw: fix compat mode
The kernel takes a lack of options as indication that the fw classifier
should operate in compatibility mode, where marks are mapped directly to
classids.

Commit e22b42a (tc mask patch) broke this by adding an empty TCA_OPTIONS
attribute even if no handle is specified. Restore the old behaviour.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-12-01 16:20:01 -08:00
Stephen Hemminger
232642c28c Remove Changes: comments
Discourage developers from putting change log in comments
now that software has been under change control for 5 years.
2009-12-01 15:49:48 -08:00
Mike Frysinger
05b4f8492b tc: remove dlfcn.h from files that don't need it
A bunch of source files look like they're copy & pasted from other files,
and some include header files that they don't actually need.  Since dlfcn
has very specific usage (and is a pain on a static-only system), drop it
where it isn't really needed.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-11-13 14:14:07 -08:00
Mike Frysinger
f2e27cfb01 support static-only systems
The iptables code supports a "no shared libs" mode where it can be used
without requiring dlfcn related functionality.  This adds similar support
to iproute2 so that it can easily be used on systems like nommu Linux (but
obviously with a few limitations -- no dynamic plugins).

Rather than modify every location that uses dlfcn.h, I hooked the dlfcn.h
header with stub functions when shared library support is disabled.  Then
symbol lookup is done via a local static lookup table (which is generated
automatically at build time) so that internal symbols can be found.
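
A static lookup table of this kind might look roughly like the sketch below.
The table contents, the symbol names and the `static_dlsym()` function are
all hypothetical; in the commit the table is generated at build time, whereas
here it is written by hand.

```c
#include <stddef.h>
#include <string.h>

/* One entry of the static symbol table: a name and its address. */
struct static_sym {
	const char *name;
	void *addr;
};

/* Hypothetical internal symbols that plugins would normally export. */
static int demo_qdisc_util = 1;
static int demo_filter_util = 2;

/* Hand-written here; assumed to be generated at build time in practice. */
static const struct static_sym static_symtab[] = {
	{ "demo_qdisc_util", &demo_qdisc_util },
	{ "demo_filter_util", &demo_filter_util },
};

/* Stub replacement for dlsym(): search the static table by name. */
static void *static_dlsym(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(static_symtab) / sizeof(static_symtab[0]); i++) {
		if (strcmp(static_symtab[i].name, name) == 0)
			return static_symtab[i].addr;
	}
	return NULL;
}
```

Callers keep asking for symbols by name, but only symbols compiled into the
binary and listed in the table can be found.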

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-11-10 10:44:20 -08:00
Mike Frysinger
729cbe84b8 tc/q_atm.so: respect LDFLAGS
The q_atm.so target defines its own link target, but it doesn't respect the
$(LDFLAGS) variable.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2009-08-06 14:50:08 -07:00
Stephen Hemminger
1558971d43 fix handling of GRED DPs args 2009-05-26 15:58:05 -07:00
Denys Fedoryshchenko
f4a8b23d39 Filter class output by classid
Sometimes while dividing bandwidth by classes it is useful to see how some
specific class is doing, live.

With my simple patch it is possible to run
watch -n1 "tc -s -d class show dev eth0.2022 classid 1:1520"
and get live statistics on how packets are queued or dropped, and how much
bandwidth is used (if an estimator is defined) for a specific class.

Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb>
2009-05-26 15:20:26 -07:00
Stephen Hemminger
ebde878097 Allow default DP of zero in gred
To emulate WRED behaviour, allow default DP of zero.
2009-05-26 15:15:01 -07:00
Stephen Hemminger
d13cee6d59 Add IPV6 match pretty print 2009-05-26 15:14:29 -07:00
Stephen Hemminger
b4d41f41b6 Add u32 extension to match on ether source/destination
Use existing u32 mechanism to match based on Ethernet header.
No need for protocol that already exists.
2009-04-15 15:39:34 -07:00
Thomas Graf
ff213c4bf2 cgroup support
Stephen,

iproute2 part of the cgroup classifier that has been included upstream
for a while. Please apply.
2009-04-13 13:38:33 -07:00
Stephen Hemminger
9fce67dd46 Remove goto chain
The selector logic is clearer with if / else if
2009-04-03 09:44:04 -07:00
Stephen Hemminger
52d6a85050 remove duplicate limits.h 2009-03-27 11:07:46 -07:00
Petr Jediný
10494d2724 Changing commandline help text to be more uniform... 2009-03-27 11:05:44 -07:00
Stephen Hemminger
44e50c8e78 Add missing limits.h
Need limits.h to get INT_MIN on Debian
2009-03-01 20:36:38 -08:00
Denys Fedoryschenko
a589dcda9c Fix memory leak in local options
This change was forgotten by Stephen in the last release

Signed-off-by: Denys Fedoryschenko <denys@visp.net.lb>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
2009-02-19 09:04:06 -08:00
Jamal Hadi Salim
63c7d26f94 Breakage noticed when debian upgraded to xtables (iptables > 1.4.1)
Many thanks to Yevgeny Kosarzhevsky <yevg@pisem.net> for reporting
and a lot of testing

Thanks to Jan Engelhardt <jengelh@medozas.de> for a lot of advice
Thanks to Denys Fedoryschenko <denys@visp.net.lb> for some sample
code that he tried and thanks to Andreas Henriksson <andreas@fatal.se>
(who maintains iproute2 on debian) for the persistent followup.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
2009-02-19 09:02:13 -08:00
Stephen Hemminger
46a6573259 fix uninitialized memory in tc_skbedit
Original from: Alexander Duyck <alexander.h.duyck@intel.com>

A bug was found in which the memory for the tc_skbedit struct was being
used without first being initialized to 0.  This is an alternative version
of the original fix, using an initializer rather than memset.
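
The two ways of zeroing a struct can be compared in a short sketch. The
struct `demo_sel` and both helper functions are hypothetical stand-ins, not
the tc_skbedit code.

```c
#include <string.h>

/* Hypothetical stand-in for a small parameter struct. */
struct demo_sel {
	unsigned int index;
	unsigned int flags;
	int action;
};

/* Zero the struct at the point of declaration with an initializer. */
static struct demo_sel make_with_initializer(void)
{
	struct demo_sel sel = { 0 };

	return sel;
}

/* Zero the struct after declaration with an explicit memset(). */
static struct demo_sel make_with_memset(void)
{
	struct demo_sel sel;

	memset(&sel, 0, sizeof(sel));
	return sel;
}
```

Both leave every member at zero. The initializer form ties the zeroing to the
declaration, so it cannot be skipped by later code changes.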

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2009-02-19 08:59:06 -08:00
Patrick McHardy
c86f34942a iproute: add DRR support
add DRR support

This patch adds support for the DRR scheduler I just sent
to iproute.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-01-27 16:11:39 -08:00
Stephen Hemminger
bdc213423a Fix leftovers from earlier change
Still had references to l_name.
2009-01-07 17:20:14 -08:00
Denys Fedoryshchenko
6e34e7dc0a Fix tc/m_ipt memory leaks
1) According to the iptables sources, optind has to be set to 0. If it is set
to 1, it will mess things up in batch mode. Also, in the iptables sources I
noticed that ->tflags and ->used need to be reset.

2) Since target->t = fw_calloc(1, size); allocates memory in function build_st,
it has to be freed at the end, or in batch mode we will have a memory leak.
TODO: It probably must be freed in all "return -1" cases in parse_ipt after
build_st. I am not sure about this; it is up to Stephen.

3) new_name was malloc'ed, but not freed.
2009-01-06 19:46:11 -08:00
Alexander Duyck
fe1a34fa81 add support for multiq qdisc
Add support for multiq qdisc
This patch adds the ability to configure the multiq qdisc.  Since the qdisc
does not require any input, it pulls the number of bands directly from the
device it is attached to as root.

usage: tc qdisc add dev <DEV> root handle <HANDLE> multiq

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2009-01-06 19:29:25 -08:00
Alexander Duyck
f72a7aab0c add support for skbedit action
Provides ability to edit queue_mapping field
Provides ability to edit priority field

usage: action skbedit [queue_mapping QUEUE_MAPPING] [priority PRIORITY]
	at least one option must be selected, or both at the same time

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2009-01-06 19:27:03 -08:00
Stephen Hemminger
3a99df7074 tc filter help should just print usage
Doing tc filter help should end argument processing.
This prevents extraneous messages.  Reported by Marcela Maslanova
2008-10-13 07:00:48 -07:00
Stephen Hemminger
bc7d1bd88d Fix duplicate return
Get rid of dead code
2008-09-19 08:49:07 -07:00
Andreas Henriksson
5e3bb534ae iproute: DESTDIR vs LIBDIR.
Hello Rafael Almeida.

I noticed your patch adding DESTDIR support in the latest iproute2 release.
Much appreciated! Soon the debian packages might be able to move to actually
using "make install" rather than its own installation procedure when
building packages. I've noticed something that will break though....

Debian packages usually set DESTDIR=debian/tmp/ and package the contents
of that directory as if it were the root file system. This will break
the /usr/lib/{tc,ip}/ module loading, because the DESTDIR (/usr) will be
/whatever-the-build-path-was/debian/tmp/lib/{tc,ip}/.
I believe others usually call this the LIBDIR to make the separation between
DESTDIR being the (possibly temporary) place things are put when the build is
done, and LIBDIR (and others) being used for actual runtime paths.

I'm attaching a patch that I think fixes this, but would be really happy if
you could have a look at it to verify I'm not screwing something up.

--
Regards,
Andreas Henriksson

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
2008-09-17 22:04:02 -07:00
Jussi Kivilinna
839c8456fb add generic size table for qdiscs
Patch adds a generic size table that is similar to the rate table, with the
difference that the size table stores the link layer packet size.

Based on patch by Patrick McHardy
 http://marc.info/?l=linux-netdev&m=115201979221729&w=2

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
2008-09-17 21:57:15 -07:00
Patrick McHardy
87953940f9 cls_flow: add perturbation support
commit 337628b9aca63fda7622701191d6304c83438909
Author: Patrick McHardy <kaber@trash.net>
Date:   Fri Jul 4 04:54:56 2008 +0200

    cls_flow: add perturbation support

    Signed-off-by: Patrick McHardy <kaber@trash.net>

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
2008-09-17 21:53:37 -07:00
Stephen Hemminger
5a67f8f9d3 Update to 2.6.27 API
The one issue was the old multiqueue API, so that is handled
by tc_util.h
2008-09-15 12:05:11 -07:00
Denys Fedoryshchenko
11bbe7fd11 long/ulong iproute-git fix
This patch fixes bug in Metadata ematch attributes parser

strtoul returns ULONG_MAX on error, not LONG_MAX
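
A minimal sketch of the corresponding check follows. The helper name
`strtoul_overflowed()` is hypothetical and is not taken from the ematch
parser.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 1 if strtoul() reported overflow for text, 0 otherwise. */
static int strtoul_overflowed(const char *text)
{
	unsigned long value;

	errno = 0;
	value = strtoul(text, NULL, 0);

	/* On overflow strtoul() returns ULONG_MAX and sets errno to ERANGE. */
	return value == ULONG_MAX && errno == ERANGE;
}
```

Comparing the result against LONG_MAX instead would miss the overflow case,
since that is the sentinel of strtol(), not strtoul().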

Patch attached as file
2008-07-31 15:25:15 -07:00
Rafael Almeida
b514b3587e Fixed installation when changing DESTDIR
After changing the DESTDIR the installed binaries have some issues
due to hard coded paths. For example, using distributions on NetEm
would segfault.

I've changed iplink.c and tc_util.c so they are now aware of DESTDIR.
Along with that change I needed to change the main Makefile so it
defines the DESTDIR macro when calling gcc.

I also changed the paths so that during the installation sbin, etc,
share and lib directories are created directly inside of the DESTDIR,
instead of creating a usr directory inside that. That's the behaviour
of most packages out there, so I think most users will be expecting
that to happen.
2008-07-25 13:40:19 -07:00
Patrick McHardy
ae76106841 tc: don't set protocol field on filter delete
> # tc filter show dev eth1 | grep 4:29:d1
> filter parent 1: protocol ip pref 5 u32 fh 4:29:d1 order 209 key ht 4
> bkt 29 flowid 1:b7aa
>
> # tc filter del dev eth1 parent 1: pref 5 handle 4:29:d1 u32
> RTNETLINK answers: Invalid argument
> We have an error talking to the kernel
>
> after rollback to package"sys-apps/iproute2-2.6.24.20080108" all
> deleted normal...

The current iproute version uses "protocol all" by default
if it's not specified. This is actually only useful for creating
new filters; on deletion an unset protocol is treated as a wildcard.
2008-06-23 09:09:45 -07:00
Stephen Hemminger
b6da1afc73 ematch related bugfix and cleanup
Bugfix: use strtoul rather than strtol for bstrtol to handle large key/mask.
Deinline larger functions to save space.
2008-05-29 11:54:19 -07:00
jamal
1750abe2ba Infrastructure for pretty printing
And last for now ..

cheers,
jamal

[PATCH 3/3] [TC/U32] Infrastructure for pretty printing

This patch makes it easy to add pretty printers of different protocols.
For starters it makes use of ipv4 and raw printers.
Add more later ...

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
2008-05-09 15:50:12 -07:00
jamal
eefcbc7206 Expose the filter protocol
makes protocol accessible ..

cheers,
jamal

[PATCH 2/3] [TC/FILTERS] Expose the filter protocol

Expose the filter protocol so it can be used by underlying
classifiers when they need it.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
2008-05-09 15:44:46 -07:00
Stephen Hemminger
44dcfe8201 Change formatting of u32 back to default
Don't break scripts that depend on previous offset/value format.
Introduce a new -pretty flag for decoding, and (*gasp*) document
the formatting arguments.
2008-05-09 15:42:34 -07:00
Patrick McHardy
083a5f00a1 Fix classifier help
commit c504ffd627ac211eebf5ed34ef0fbfd7f1dbb347
Author: Patrick McHardy <kaber@trash.net>
Date:   Wed Mar 26 07:38:43 2008 +0100

    [IPROUTE]: Fix classifier help

    The new check whether the user has specified a protocol makes
    "ip filter <type> help" fail with "protocol is required".

    This could be fixed by moving it further down, but a more user-friendly
    way is to simply use ETH_P_ALL as default if nothing is specified.

    Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-17 10:07:02 -07:00
Jesper Dangaard Brouer
292f29b42c ATM cell alignment.
Introducing the function that does the ATM cell alignment, and
modifying tc_calc_rtable() to use this based upon a linklayer
parameter.
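
The alignment itself is simple arithmetic: round the packet up to whole
cells, then charge the full cell size for each. The sketch below assumes the
usual ATM constants of 53 bytes per cell and 48 bytes of payload, defined
locally under hypothetical names rather than taken from atm.h. The function
name `align_to_atm()` is also illustrative.

```c
/* Assumed ATM constants: 53-byte cells carrying 48 bytes of payload. */
#define DEMO_ATM_CELL_SIZE	53
#define DEMO_ATM_CELL_PAYLOAD	48

/*
 * Return the number of bytes a packet of the given size occupies on an
 * ATM link: a whole number of cells, each counted at full cell size.
 */
static unsigned int align_to_atm(unsigned int size)
{
	unsigned int cells = size / DEMO_ATM_CELL_PAYLOAD;

	/* A partially filled last cell still costs a full cell. */
	if (size % DEMO_ATM_CELL_PAYLOAD)
		cells++;

	return cells * DEMO_ATM_CELL_SIZE;
}
```

For example, a 49-byte packet needs two cells and therefore counts as 106
bytes on the wire, while a 48-byte packet fits in one cell and counts as 53.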

Modified from original to use constants from atm.h and
fix all the usages of rtable in same patch.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
2008-04-17 10:04:31 -07:00
Stephen Hemminger
1a5bd776a2 In police, fix uninitialized "overhead" variable.
Bug introduced by myself in an earlier patch series.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
2008-04-17 09:12:38 -07:00
Jesper Dangaard Brouer
f71f75f39b police, implement overhead parameter parsing.
For police, implement overhead parameter parsing.

The change is ABI (Application Binary Interface) backward compatible
with older kernels, but only takes effect from kernel 2.6.24 onward.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
2008-04-01 11:27:42 -07:00
Jesper Dangaard Brouer
2a1f78b376 CBQ, doc usage of overhead parameter.
CBQ remember to doc usage of overhead parameter.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
2008-04-01 11:27:35 -07:00
Jesper Dangaard Brouer
08fd01843f CBQ, implement overhead parameter parsing.
For CBQ, implement overhead parameter parsing.

The change is ABI (Application Binary Interface) backward compatible
with older kernels, but only takes effect from kernel 2.6.24 onward.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
2008-04-01 11:27:25 -07:00
Jesper Dangaard Brouer
1db5e2ec13 CBQ use matches() function instead of strcmp().
Change CBQ to use matches() function instead of strcmp().

This resembles the usage in other parse functions, and allows
partial command parameter matching.
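
Partial matching of this kind is essentially a prefix comparison. The sketch
below uses a hypothetical helper, `prefix_matches()`, written for this
example; it is meant to show the idea and is not a copy of the iproute2
matches() source.

```c
#include <string.h>

/*
 * Return 0 when cmd is a prefix of pattern (or equal to it), non-zero
 * otherwise. This mirrors the strcmp() convention that 0 means match.
 */
static int prefix_matches(const char *cmd, const char *pattern)
{
	size_t len = strlen(cmd);

	/* Input longer than the keyword can never be a prefix of it. */
	if (len > strlen(pattern))
		return -1;

	return memcmp(pattern, cmd, len);
}
```

With strcmp() only the full keyword "overhead" is accepted, whereas a prefix
comparison also accepts abbreviations such as "over". Note that in this
sketch an empty string is a prefix of every keyword.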

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
2008-04-01 11:27:17 -07:00