Commit Graph

4528 Commits

Author SHA1 Message Date
David Ahern
31ae2912f7 libnetlink: Rename rtnl_wilddump_* to rtnl_linkdump_*
Rename rtnl_wilddump_req_filter to rtnl_linkdump_req_filter,
rtnl_wilddump_request to rtnl_linkdump_req and
rtnl_wilddump_req_filter_fn to rtnl_linkdump_req_filter_fn.

In all cases drop the type argument which at this point is only
RTM_GETLINK and hardcode in the functions.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:39:08 -07:00
David Ahern
efb0b383d9 libnetlink: Convert GETNSID dumps to use rtnl_nsiddump_req
Add rtnl_nsiddump_req for namespace id dumps using the proper rtgenmsg
as the header. Convert existing RTM_GETNSID dumps to use it.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:39:04 -07:00
David Ahern
ff41db8a75 libnetlink: Convert GETNEIGHTBL dumps to use rtnl_neightbldump_req
Add rtnl_neightbldump_req for neighbor table dumps using the proper ndtmsg
as the header. Convert existing RTM_GETNEIGHTBL dumps to use it.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:39:02 -07:00
David Ahern
9e0ab19c4d libnetlink: Convert GETNEIGH dumps to use rtnl_neighdump_req
Add rtnl_neighdump_req for neighbor dumps using the proper ndmsg
as the header. Convert existing rtnl_wilddump_request for RTM_GETNEIGH
to use it.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:38:59 -07:00
David Ahern
b05d9a3d58 libnetlink: Convert GETRULE dumps to use rtnl_ruledump_req
Add rtnl_ruledump_req for fib fule dumps using the proper fib_rule_hdr
as the header. Convert existing RTM_GETRULE dumps to use it.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:38:56 -07:00
David Ahern
ddee16bc96 libnetlink: Convert GETNETCONF dumps to use rtnl_netconfdump_req
Add rtnl_netconfdump_req for netconf dumps using the proper netconfmsg
as the header. Convert existing RTM_GETNETCONF dumps to use it.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:38:34 -07:00
David Ahern
9dbe6df411 libnetlink: Convert GETMDB dumps to use rtnl_mdbdump_req
Add rtnl_mdbdump_req for mdb dumps using the proper br_port_msg as
the header. Convert existing RTM_GETMDB dumps to use it.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:38:31 -07:00
David Ahern
393600231a libnetlink: Convert GETADDRLABEL dumps to use rtnl_addrlbldump_req
Add rtnl_addrlbldump_req for address label dumps using the proper
ifaddrlblmsg as the header. Convert existing RTM_GETADDRALBEL dumps
to use it.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:38:29 -07:00
David Ahern
bfb27dfaac libnetlink: Convert GETROUTE dumps to use rtnl_routedump_req
Add rtnl_routedump_req for route dumps using the proper rtmsg
as the header. Convert existing RTM_GETROUTE dumps to use it.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:38:27 -07:00
David Ahern
46917d0895 libnetlink: Convert GETADDR dumps to use rtnl_addrdump_req
Add rtnl_addrdump_req for address dumps using the proper ifaddrmsg
as the header. Convert existing RTM_GETADDR dumps to use it.

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 18:38:21 -07:00
Eelco Chaudron
5ac138324e tc_util: Add support for showing TCA_STATS_BASIC_HW statistics
Add support for showing hardware specific counters to easy
troubleshooting hardware offload.

$ tc -s filter show dev enp3s0np0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 2.0.0.0
  src_ip 1.0.0.0
  ip_flags nofrag
  in_hw
        action order 1: mirred (Egress Redirect to device eth1) stolen
        index 1 ref 1 bind 1 installed 0 sec used 0 sec
        Action statistics:
        Sent 534884742 bytes 8915697 pkt (dropped 0, overlimits 0 requeues 0)
        Sent software 187542 bytes 4077 pkt
        Sent hardware 534697200 bytes 8911620 pkt
        backlog 0b 0p requeues 0
        cookie 89173e6a44447001becfd486bda17e29

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 14:45:33 -07:00
Pieter Jansen van Vuuren
56155d4df8 tc: f_flower: add geneve option match support to flower
Allow matching on options in Geneve tunnel headers.

The options can be described in the form
CLASS:TYPE:DATA/CLASS_MASK:TYPE_MASK:DATA_MASK, where CLASS is
represented as a 16bit hexadecimal value, TYPE as an 8bit
hexadecimal value and DATA as a variable length hexadecimal value.

e.g.
 # ip link add name geneve0 type geneve dstport 0 external
 # tc qdisc add dev geneve0 ingress
 # tc filter add dev geneve0 protocol ip parent ffff: \
     flower \
       enc_src_ip 10.0.99.192 \
       enc_dst_ip 10.0.99.193 \
       enc_key_id 11 \
       geneve_opts 0102:80:1122334421314151/ffff:ff:ffffffffffffffff \
       ip_proto udp \
       action mirred egress redirect dev eth1

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-02 14:39:55 -07:00
Roopa Prabhu
51eb02254b ipneigh: update man page and help for router
While at it also add missing text for proxy in the man page.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-10-01 17:36:35 -07:00
Nikolay Aleksandrov
c3ded6e4a0 bridge: fdb: add support for sticky flag
Add support for the new sticky flag that can be set on fdbs and update the
man page.

CC: David Ahern <dsahern@gmail.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-09-28 10:52:22 -07:00
David Ahern
d9c0be4e97 Update kernel headers
Update kernel headers to commit
a804e5e218754 ("selftests: forwarding: test for bridge sticky flag")

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-09-28 10:51:15 -07:00
Roopa Prabhu
c2cd14acc7 ipneigh: support setting of NTF_ROUTER on neigh entries
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-09-28 09:53:08 -07:00
David Ahern
7b2e200679 Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-09-28 09:52:41 -07:00
Stephen Hemminger
b45e300024 libnetlink: don't return error on success
Change to error handling broke normal code.

Fixes: c60389e4f9 ("libnetlink: fix leak and using unused memory on error")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-25 10:08:48 +02:00
Stephen Hemminger
5dc2204c01 testsuite: add libmnl
Supporting external ack requires libmnl now.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-25 09:59:37 +02:00
Petr Vorel
8804a8c0d3 Makefile: Add check target
Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-25 09:56:40 +02:00
Lorenzo Bianconi
c1360e3b48 iplink_vxlan: take into account preferred_family creating vxlan device
Take into account the configured preferred_family if neither saddr or
daddr are provided since otherwise vxlan kernel module will use IPv4 as
default remote inet family neglecting the one provided by userspace.
This behaviour was originally in commit 97d564b90c ("vxlan: use
preferred address family when neither group or remote is specified").
The issue can be triggered with the following reproducer:

$ip -6 link add vxlan1 type vxlan id 42 dev enp0s2 \
     proxy nolearning l2miss l3miss
$bridge fdb add 46:47:1f:a7:1c:25 dev vxlan1 dst 2000::2
RTNETLINK answers: Address family not supported by protocol

Fixes: 1e9b8072de ("iplink_vxlan: Get rid of inet_get_addr()")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-25 09:52:56 +02:00
Hangbin Liu
fa1e658e84 iplink: fix incorrect any address handling for ip tunnels
After commit d42c7891d2 ("utils: Do not reset family for default, any,
all addresses"), when call get_addr() for any/all addresses, we will set
addr->flags to ADDRTYPE_INET_UNSPEC if family is AF_INET/AF_INET6, which
makes is_addrtype_inet() checking passed and assigns incorrect address
to kernel. The ip link cmd will return error like:

]# ip link add ipip1 type ipip local any remote 1.1.1.1
RTNETLINK answers: Numerical result out of range

Fix it by using is_addrtype_inet_not_unspec() to avoid unspec addresses.

geneve, vxlan are not affected as they use AF_UNSPEC family when call
get_addr()

Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: d42c7891d2 ("utils: Do not reset family for default, any, all addresses")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-21 11:28:33 -07:00
Stephen Hemminger
11152f0a0d Makefile: add help target
Add help target to Makefile

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-21 09:15:26 -07:00
Petr Vorel
133c1a6c87 testsuite: Warn about empty $(IPVERS)
alltests target requires having symlink created by configure target
(default target). Without that there is no test being run.

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-21 08:59:52 -07:00
Petr Vorel
3537633dcf testsuite: Generate generate_nlmsg when needed
Commit 886f2c43 added generate_nlmsg.c. Running alltests
target, which uses the binary required to run 'make -C tools' before.

Fixes: 886f2c43 testsuite: Generate nlmsg blob at runtime

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-21 08:59:52 -07:00
Petr Vorel
f15836faec testsuite: Fix missing generate_nlmsg
Commit ad23e152 caused generate_nlmsg to be always missing:

$ make alltests
make: ./tools/generate_nlmsg: Command not found

Create testclean: to remove only results directory.

Fixes: ad23e152 testsuite: remove all temp files and implement make clean

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-21 08:59:52 -07:00
Hangbin Liu
88272775e2 iplink: add ipvtap support
IPVLAN and IPVTAP are using the same functions and parameters. So we can
just add a new link_util with id ipvtap. Others are the same.

Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-09-20 17:53:56 -07:00
David Ahern
34212c73b7 Merge branch 'iproute2-master' into iproute2-next
Conflicts:
	ip/iproute_lwtunnel.c

In addition to merge conflict between bd59e5b151 and 94a8722f2f,
updated the code added by the latter commit based on the change of the
former (ie., added ret = to the new rta_addattr_l).

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-09-20 17:53:27 -07:00
Leon Romanovsky
d090fbf33b rdma: Fix representation of PortInfo CapabilityMask
The port capability mask represents IBTA PortInfo specification,
but as it is written in description of kernel commit 2f944c0fbf58
("RDMA: Fix storage of PortInfo CapabilityMask in the kernel"),
the bit 26 was mistakenly overwritten.

The rdmatool followed it too and mislead users by presenting wrong
value. Since it never showed proper value, we update the whole
port_cap_mask to comply with IBTA and show real HW values.

Fixes: da990ab40a ("rdma: Add link object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-17 08:59:13 -07:00
Stephen Hemminger
c60389e4f9 libnetlink: fix leak and using unused memory on error
If an error happens in multi-segment message (tc only)
then report the error and stop processing further responses.
This also fixes refering to the buffer after free.

The sequence check is not necessary here because the
response message has already been validated to be in
the window of the sequence number of the iov.

Reported-by: Mahesh Bandewar <mahesh@bandewar.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Mahesh Bandewar <maheshb@google.com>
2018-09-17 08:58:21 -07:00
Toke Høiland-Jørgensen
2153e01f36 q_cake: Also print nonat, nowash and no-ack-filter keywords
Similar to the previous patch for no-split-gso, the negative keywords for
'nat', 'wash' and 'ack-filter' were not printed either. Add those well.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-14 11:32:46 -07:00
Hangbin Liu
92bba4ed40 bridge/mdb: fix missing new line when show bridge mdb
The bridge mdb show is broken on current iproute2. e.g.
]# bridge mdb show
34: br0  veth0_br  224.1.1.2  temp 34: br0  veth0_br  224.1.1.1  temp

After fix:
]# bridge mdb show
34: br0  veth0_br  224.1.1.2  temp
34: br0  veth0_br  224.1.1.1  temp

v2: Use json print lib as Stephen suggested.
v3: No need to use is_json_context() as print_string() could handle both cases.
v4: use new function print_nl() to print new line in non-json mode.

Reported-by: Ying Xu <yinxu@redhat.com>
Fixes: c7c1a1ef51 ("bridge: colorize output and use JSON print library")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-13 16:02:33 -07:00
Toke Høiland-Jørgensen
b914fe5f1c q_cake: Add printing of no-split-gso option
When the GSO splitting was turned into dual split-gso/no-split-gso options,
the printing of the latter was left out. Add that, so output is consistent
with the options passed.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-12 12:59:38 -07:00
Stephen Hemminger
b85076cd74 lib: introduce print_nl
Common pattern in iproute commands is to print a line seperator
in non-json mode. Make that a simple function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-11 08:29:33 -07:00
Phil Sutter
bd59e5b151 ip-route: Fix segfault with many nexthops
It was possible to crash ip-route by adding an IPv6 route with 37
nexthop statements. A simple reproducer is:

| for i in `seq 37`; do
| 	nhs="nexthop via 1111::$i "$nhs
| done
| ip -6 route add 3333::/64 $nhs

The related code was broken in multiple ways:

* parse_one_nh() assumed that rta points to 4kB of storage but caller
  provided just 1kB. Fixed by passing 'len' parameter with the correct
  value.

* Error checking of rta_addattr*() calls in parse_one_nh() and called
  functions was completely absent, so with above fix in place output
  flood would occur due to parser looping forever.

While being at it, increase message buffer sizes to 4k. This allows for
at most 144 nexthops.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 12:14:50 -07:00
Caleb Raitto
40c2916fda tc/mqprio: Print extra info on invalid args.
Print the name of the argument that wasn't understood.

Signed-off-by: Caleb Raitto <caraitto@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 12:14:00 -07:00
Stephen Hemminger
ae775666cf genl: remove unnecessary extern
extern not necessary on function prototype.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 11:53:07 -07:00
Stephen Hemminger
ad618b7984 tc/fifo: remove unnecessary prototype
The prototype for prio_print_opt is already in tc_util.h

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 11:50:22 -07:00
Stephen Hemminger
0f36267485 bridge: fix vlan show formatting
The output of vlan show was broken previous change to use json_print.
Clean the code up and return to original format.

Note: the JSON syntax has changed to make the bridge vlan
show more like other outputs (e.g. ip -j li show).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 11:48:06 -07:00
Stephen Hemminger
2ed82667b8 bridge: use print_json for some outputs
Rather than using is_json_context(), use the print_string functions
which handle both cases.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 11:47:11 -07:00
Stephen Hemminger
f5fc738736 bridge: minor change to mdb print
Get port ifname once rather than on both sides of if(is_json_context).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 11:47:11 -07:00
Caleb Raitto
781ee3270d man: Change numtc to num_tc
The argument parser only accepts num_tc:

https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/tree/tc/q_mqprio.c#n55

Signed-off-by: Caleb Raitto <caraitto@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-09-10 11:47:11 -07:00
Stephen Hemminger
27886a1241 uapi: update ib_verbs
Merge current uapi from 4.19-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-08-31 15:03:49 -07:00
David Ahern
b555ff737a Merge branch 'netem-slot-param' into iproute2-next
Yousuk Seung  says:

====================

This series adds support for the new "slot" netem parameter for
slotting. Slotting is an approximation of shared media that gather up
packets within a varying delay window before delivering them nearly at
once.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-30 11:08:43 -07:00
Yousuk Seung
588dd51e2c q_netem: slotting with non-uniform distribution
Extend slotting with support for non-uniform distributions. This is
similar to netem's non-uniform distribution delay feature.

Syntax:
   slot distribution DISTRIBUTION DELAY JITTER [packets MAX_PACKETS] \
      [bytes MAX_BYTES]

The syntax and use of the distribution table is the same as in the
non-uniform distribution delay feature. A file DISTRIBUTION must be
present in TC_LIB_DIR (e.g. /usr/lib/tc) containing numbers scaled by
NETEM_DIST_SCALE. A random value x is selected from the table and it
takes DELAY + ( x * JITTER ) as delay. Correlation between values is not
supported.

Examples:
  Normal distribution delay with mean = 800us and stdev = 100us.
  > tc qdisc add dev eth0 root netem slot distribution normal \
    800us 100us

  Optionally set the max slot size in bytes and/or packets.
  > tc qdisc add dev eth0 root netem slot distribution normal \
    800us 100us bytes 64k packets 42

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-30 11:08:19 -07:00
Dave Taht
b6268fbd58 q_netem: support delivering packets in delayed time slots
Slotting is a crude approximation of the behaviors of shared media such
as cable, wifi, and LTE, which gather up a bunch of packets within a
varying delay window and deliver them, relative to that, nearly all at
once.

It works within the existing loss, duplication, jitter and delay
parameters of netem. Some amount of inherent latency must be specified,
regardless.

The new "slot" parameter specifies a minimum and maximum delay between
transmission attempts.

The "bytes" and "packets" parameters can be used to limit the amount of
information transferred per slot.

Examples of use:

tc qdisc add dev eth0 root netem delay 200us \
        slot 800us 10ms bytes 64k packets 42

A more correct example, using stacked netem instances and a packet limit
to emulate a tail drop wifi queue with slots and variable packet
delivery, with a 200Mbit isochronous underlying rate, and 20ms path
delay:

tc qdisc add dev eth0 root handle 1: netem delay 20ms rate 200mbit \
         limit 10000
tc qdisc add dev eth0 parent 1:1 handle 10:1 netem delay 200us \
         slot 800us 10ms bytes 64k packets 42 limit 512

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-30 11:07:46 -07:00
Dave Taht
abf70ef494 tc: support conversions to or from 64 bit nanosecond-based time
Using a 32 bit field to represent time in nanoseconds results in a
maximum value of about 4.3 seconds, which is well below many observed
delays in WiFi and LTE, and barely in the ballpark for a trip past the
Earth's moon, Luna.

Using 64 bit time fields in nanoseconds allows us to simulate
network diameters of several hundred light-years. However, only
conversions to and from ns, us, ms, and seconds are provided.

The iproute2 64 bit api uses signed values for time. Being able to
represent positive or negative time allows us to calculate +/- deltas
between, for example, the CLOCK_TAI and CLOCK_REALTIME clocks.

Time related utility functions in tc_util.c are moved to lib/utils.c.

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-30 11:04:38 -07:00
David Ahern
c4e0ea8e9b Merge branch 'iproute2-master' into iproute2-next
Signed-off-by: David Ahern <dsahern@gmail.com>
2018-08-30 11:04:05 -07:00
Florent Fourcot
2bfe28710e tc/htb: remove unused variable
Since introduction of htb module, this variable has never been used.

Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-08-30 08:00:45 -07:00
Mahesh Bandewar
5d5586b058 iproute: make clang happy
These are primarily fixes for "string is not string literal" warnings
/ errors (with -Werror -Wformat-nonliteral). This should be a no-op
change. I had to replace couple of print helper functions with the
code they call as it was becoming harder to eliminate these warnings,
however these helpers were used only at couple of places, so no
major change as such.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-08-30 07:58:09 -07:00