When sending accumulated compound command results in an error, check the
'force' option before exiting. Move the return code check after putting
batch bufs and freeing iovs to prevent a memory leak. Break from the loop
instead of returning the error code, to allow cleanup at the end of the
batch function. Don't reset the ret code on each iteration.
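The control flow described above can be sketched roughly as follows. This is
a simplified illustration, not the actual tc batch code: run_batches(),
send_batch() and the malloc'd buffer are hypothetical stand-ins for the real
batch bufs and iovs.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for sending one accumulated batch to the kernel;
 * it simply reports the status it was given. */
static int send_batch(int status)
{
	return status;
}

/* Sends up to `count` batches and returns the last error seen (0 if none).
 * `sent` receives the number of batches actually attempted. */
int run_batches(const int *statuses, int count, bool force, int *sent)
{
	int ret = 0;	/* set once, outside the loop, so an error is not lost */

	*sent = 0;
	for (int i = 0; i < count; i++) {
		char *buf = malloc(64);	/* stands in for batch bufs and iovs */
		int err;

		if (!buf) {
			ret = -1;
			break;
		}
		err = send_batch(statuses[i]);
		(*sent)++;

		/* Release per-iteration resources before looking at the result. */
		free(buf);

		if (err != 0) {
			ret = err;
			if (!force)
				break;	/* fall through to the common cleanup */
		}
	}

	/* Common end-of-function cleanup would go here. */
	return ret;
}
```

With 'force' set the loop keeps going after a failure, but the error is
still reported to the caller once the loop ends.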
Fixes: 485d0c6001 ("tc: Add batchsize feature for filter and actions")
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds support for OUTPUT_MARK in xfrm state to exercise the
functionality added by kernel commit 077fbac405bf
("net: xfrm: support setting an output mark.").
Sample output:
(with mark and output-mark)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
mark 0x10000/0x3ffff output-mark 0x20000
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
(with mark only)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
mark 0x10000/0x3ffff
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
(with output-mark only)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
output-mark 0x20000
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
(no mark and output-mark)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
v1->v2: Moved the XFRMA_OUTPUT_MARK print after XFRMA_MARK in
xfrm_xfrma_print() as mentioned by Lorenzo
v2->v3: Fix one help formatting error as mentioned by Lorenzo.
Keep mark and output-mark on the same line and add man page info as
mentioned by David.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds basic support for Qualcomm rmnet devices.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
As mentioned in the ip-address man page, an address label must
be equal to the device name or prefixed by the device name
followed by a colon. Currently the only check on this input is
to see if the device name appears at the beginning of the label
string.
This commit adds an additional check to ensure that the label is either
identical to the device name or continues with a colon right after it.
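A minimal sketch of such a check, assuming a hypothetical helper name
label_matches_dev() (the function in the actual patch may differ):

```c
#include <stdbool.h>
#include <string.h>

/* Returns true when `label` equals `dev` or is `dev` followed by ':'.
 * Hypothetical helper written for illustration only. */
bool label_matches_dev(const char *label, const char *dev)
{
	size_t len = strlen(dev);

	/* The label must start with the device name... */
	if (strncmp(label, dev, len) != 0)
		return false;

	/* ...and then either end there or continue with a colon. */
	return label[len] == '\0' || label[len] == ':';
}
```

Under this sketch "eth0" and "eth0:1" are accepted for device eth0, while
"eth0foo" is rejected even though it begins with the device name.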
Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
In commit 9a362cc71a, a new userspace header
(i.e. rdma/rdma_user_cm.h -> linux/in6.h)
is included before the kernel space header
(i.e. utils.h -> resolv.h -> netinet/in.h).
This leaves some IP header definitions out of sync, and compilation fails
with a redefinition error for some IP structs.
This commit reorders the includes to keep the headers in sync.
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add support for:
BGP
ISIS
OSPF
RIP
EIGRP
Routing protocols to iproute2.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Since commit 94f6a80 on net-next, the TIPC_NLA_LINK_NAME attribute is
expected to be retrieved and validated via the TIPC_NLA_LINK nesting entry
in tipc_nl_node_get_link().
Accordingly, the TIPC_NLA_LINK_NAME value passed by the
tipc link get command must follow the above hierarchy.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The 'link-netnsid' argument needs a number. Add 'link-netns' when the user
wants to use the iproute2 netns name instead of the nsid.
Example:
ip link add ipip1 link-netns foo type ipip remote 10.16.0.121 local 10.16.0.249
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When iproute2 has a name for the nsid, let's display it. It's more
user friendly than a number.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Since commit 049c58539f ("devlink: mnlg: Add support for extended ack")
devlink requires NETLINK_{CAP,EXT}_ACK. This prevents devlink from
working with older kernels that don't support these features.
host # ./devlink/devlink
Failed to connect to devlink Netlink
Fixes: 049c58539f ("devlink: mnlg: Add support for extended ack")
Cc: Arkadi Sharshevsky <arkadis@mellanox.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Parse and display those attributes.
Example:
ip l a type dummy
ip netns add foo
ip monitor link&
ip l s dummy1 netns foo
Deleted 6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 66:af:3a:3f:a0:89 brd ff:ff:ff:ff:ff:ff new-nsid 0 new-ifindex 6
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Currently, calling 'ip xfrm monitor all' will
actually invoke the 'all-nsid' command because the
soft-match for 'all-nsid' occurs before the precise
match for 'all'. This patch rearranges the checks
so that the 'all' command, itself an alias for
invoking 'ip xfrm monitor' with no argument, can
be called consistent with the syntax for other ip
commands that accept an 'all'.
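The ordering problem can be illustrated with a small sketch. The helper
soft_match() is a hypothetical prefix matcher written for this example, and
the two classify functions only model the order of the checks, not the real
ip xfrm monitor argument parser.

```c
#include <string.h>

enum monitor_mode { MODE_OTHER, MODE_ALL, MODE_ALL_NSID };

/* Hypothetical "soft" match: true when arg is a non-empty prefix of keyword. */
static int soft_match(const char *arg, const char *keyword)
{
	size_t len = strlen(arg);

	return len > 0 && len <= strlen(keyword) &&
	       memcmp(arg, keyword, len) == 0;
}

/* Old order: the prefix match for "all-nsid" swallows a plain "all". */
enum monitor_mode classify_old(const char *arg)
{
	if (soft_match(arg, "all-nsid"))
		return MODE_ALL_NSID;
	if (strcmp(arg, "all") == 0)
		return MODE_ALL;
	return MODE_OTHER;
}

/* New order: the precise match for "all" is tried first. */
enum monitor_mode classify_new(const char *arg)
{
	if (strcmp(arg, "all") == 0)
		return MODE_ALL;
	if (soft_match(arg, "all-nsid"))
		return MODE_ALL_NSID;
	return MODE_OTHER;
}
```

In this model abbreviations such as "all-n" still select 'all-nsid' after
the reordering; only the exact string "all" changes meaning.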
Signed-off-by: Nathan Harold <nharold@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
A recent commit changed rtnl_talk_* to return the response message in
allocated memory, so callers need to free it. The change to name_is_vrf
did not save the device index, which is read through a pointer to a struct
inside the now allocated and freed memory, resulting in garbage being
returned in some cases.
Fix by using a stack variable to save the return value and only set
it to ifi->ifi_index after all checks are done and before the answer
buffer is freed.
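The pattern of the fix, sketched with made-up types: struct link_answer and
query_link() below are hypothetical stand-ins for the netlink answer buffer
and the rtnl_talk call, and the index value 7 is arbitrary.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parsed part of the answer buffer. */
struct link_answer {
	int ifindex;
	int is_vrf;
};

/* Hypothetical query: returns a heap-allocated answer the caller must free. */
static struct link_answer *query_link(const char *name)
{
	struct link_answer *answer = malloc(sizeof(*answer));

	if (!answer)
		return NULL;
	answer->ifindex = 7;	/* arbitrary index for the example */
	answer->is_vrf = strcmp(name, "vrf-red") == 0;
	return answer;
}

/* Returns the device index if `name` is a VRF, 0 otherwise. */
int vrf_index_by_name(const char *name)
{
	struct link_answer *answer = query_link(name);
	int ret = 0;	/* stack copy that outlives the answer buffer */

	if (!answer)
		return 0;

	/* Copy the value out only after all checks, and before the free. */
	if (answer->is_vrf)
		ret = answer->ifindex;

	free(answer);
	return ret;	/* never dereferences freed memory */
}
```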
Fixes: 86bf43c7c2 ("lib/libnetlink: update rtnl_talk to support malloc buff at run time")
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
RTA_CACHEINFO can be sent for non-cloned routes. If the attribute is
present, print it. This allows route dumps to print, for example, the
expires times that can exist on FIB entries.
Signed-off-by: David Ahern <dsahern@gmail.com>
The ip command would always look up the network device index
even when not necessary. This slows down operations like creating
lots of VLANs.
David reported the original issue; this is an alternative patch
that solves it in a slightly more general way.
Using iproute2 to create a bridge and add 4094 vlans to it can take from
2 to 3 *minutes*. The reason is the extraneous call to ll_name_to_index.
ll_name_to_index results in an ioctl(SIOCGIFINDEX) call which in turn
invokes dev_load. If the index does not exist, which it won't when
creating a new link, dev_load calls modprobe twice -- once for
netdev-NAME and again for NAME. This is unnecessary overhead for each
link create.
When ip link is invoked for a new device, there is no reason to
call ll_name_to_index for the new device. With this patch, creating
a bridge and adding 4094 vlans takes less than 3 *seconds*.
old:
# time ip -batch ip-vlan.batch
real 3m13.727s
user 0m0.076s
sys 0m1.959s
new:
# time ip -batch ip-vlan.batch
real 0m3.222s
user 0m0.044s
sys 0m1.777s
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
rta_expires is a signed int; print it as one.
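A small illustration of the signed conversion. format_expires() is a
hypothetical helper and the output format is only an example, not
necessarily the exact string ip prints.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats an expires value as a signed integer (%d rather than %u), so a
 * negative value is shown as negative instead of a huge unsigned number. */
int format_expires(char *buf, size_t len, int expires)
{
	return snprintf(buf, len, "expires %dsec", expires);
}
```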
Fixes: 663c3cb231 ("iproute: implement JSON and color output")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Currently NETNS_RUN_DIR is hardcoded and refers to /var/run/netns.
However, some systems (e.g. Android) don't have /var, so attempts
to create network namespaces on these systems fail. This change makes
NETNS_RUN_DIR configurable at build time by allowing it to be passed
as an environment variable to the make command.
This change also makes the /etc/netns directory configurable through
the NETNS_ETC_DIR environment variable.
For example: ./configure && NETNS_RUN_DIR=/mnt/vendor/netns make
Tested: verified that iproute2 with configuration mentioned above
creates namespaces in /mnt/vendor/netns
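On the C side, a build-time override of this kind is commonly expressed as a
guarded default. The sketch below assumes the build system forwards the
variable to the compiler as a -DNETNS_RUN_DIR=... definition; that plumbing
and the netns_path() helper are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Fall back to the historical location unless the build overrides it. */
#ifndef NETNS_RUN_DIR
#define NETNS_RUN_DIR "/var/run/netns"
#endif

/* Hypothetical helper: builds the path of a named namespace. */
int netns_path(char *buf, size_t len, const char *name)
{
	return snprintf(buf, len, "%s/%s", NETNS_RUN_DIR, name);
}
```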
Signed-off-by: Pavel Maltsev <pavelm@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Steve Wise says:
====================
This series enhances the iproute2 rdma tool to include displaying
driver-specific resource attributes. It is the user-space part of the
kernel driver resource tracking series that has been accepted for merging
into linux-4.18 [1]
If there are no additional review comments, it can now be merged, I think.
Changes since v2:
- resync rdma_netlink.h to fix uapi break
Changes since v1:
- commit log editorial fixes
- cite kernel commits that updated rdma_netlink.h in the
iproute2 commit syncing this header
- reorder stack definitions ala "reverse christmas tree"
- correctly handle unknown driver attributes when printing
Changes since v0/rfc:
- changed "provider" to "driver" based on kernel side changes
- updated man pages
- removed "RFC" tag
Thanks,
Steve.
[1] https://www.spinics.net/lists/linux-rdma/msg64199.html
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
Update the man pages for the resource attributes as well
as the driver-specific attributes.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This enhancement allows printing rdma device-specific state, if provided
by the kernel. This is done in a generic manner, so rdma tool doesn't
need to know about the details of every type of rdma device.
Driver attributes for an rdma resource are in the form of <key,
[print_type], value> tuples, where the key is a string and the value can
be any supported driver attribute. The print_type attribute, if present,
provides a print format to use instead of the standard print format for
the type. For example, the default print type for a PROVIDER_S32 value is
"%d ", but "0x%x " if the print_type of PRINT_TYPE_HEX is included in the
tuple.
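A rough sketch of how the print_type selects the format for a signed 32-bit
value. The enum values and format_driver_s32() are hypothetical; only the
"%d " and "0x%x " formats and the PRINT_TYPE_HEX name come from the
description above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical print types; a tuple without a print_type uses UNSPEC. */
enum print_type { PRINT_TYPE_UNSPEC, PRINT_TYPE_HEX };

/* Formats one <key, [print_type], value> tuple holding an s32 value. */
int format_driver_s32(char *buf, size_t len, const char *key,
		      enum print_type type, int32_t value)
{
	if (type == PRINT_TYPE_HEX)
		return snprintf(buf, len, "%s 0x%x ", key, (unsigned int)value);

	/* Default format for the type. */
	return snprintf(buf, len, "%s %d ", key, (int)value);
}
```

Because the format is chosen from the tuple itself, the tool does not need
per-driver knowledge of which keys are best shown in hex.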
Driver resources are only printed when the -dd flag is present.
If -p is present, then the output is formatted to not exceed 80 columns,
otherwise it is printed as a single row to be grep/awk friendly.
Example output:
# rdma resource show qp lqpn 1028 -dd -p
link cxgb4_0/- lqpn 1028 rqpn 0 type RC state RTS rq-psn 0 sq-psn 0 path-mig-state MIGRATED pid 0 comm [nvme_rdma]
sqid 1028 flushed 0 memsize 123968 cidx 85 pidx 85 wq_pidx 106 flush_cidx 85 in_use 0
size 386 flags 0x0 rqid 1029 memsize 16768 cidx 43 pidx 41 wq_pidx 171 msn 44 rqt_hwaddr 0x2a8a5d00
rqt_size 256 in_use 128 size 130
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
We make it easier for users to correlate between 128-bit node
identities and 32-bit node hash number by extending the 'node list'
command to also show the hash number.
We also improve the 'nametable show' command to show the node identity
instead of the node hash number. Since the former potentially is much
longer than the latter, we make room for it by eliminating the (to the
user) irrelevant publication key. We also reorder some of the columns so
that the node id comes last, since this looks nicer and is more logical.
Signed-off-by: David Ahern <dsahern@gmail.com>
In order to make TDC tests match the output patterns, the missing space
character must be added in the mode output string.
Fixes: 8744c5d338 ("tc: jsonify ife action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Currently there is no way to log offloading errors if the rule is not
explicitly marked as skip_sw, making it hard for other applications such
as Open vSwitch to log why a given rule could not be offloaded.
This patch adds support for signaling the kernel that more verbose
logging is wanted, which now will include such messages.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Allowing 0% is sometimes useful, for example for netem loss and drop,
or perhaps for dropping all traffic in an HTB bin.
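A minimal sketch of a percentage parser that accepts 0%. The name
parse_percent_sketch() and the exact validation rules are assumptions for
illustration and may differ from the tc implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Parses "N" or "N%" with 0 <= N <= 100 into a fraction in [0, 1].
 * Returns 0 on success and -1 on malformed or out-of-range input. */
int parse_percent_sketch(double *val, const char *str)
{
	char *end;
	double per = strtod(str, &end);

	if (end == str)
		return -1;	/* no number was parsed */
	if (*end != '\0' && strcmp(end, "%") != 0)
		return -1;	/* trailing garbage */
	if (!(per >= 0.0 && per <= 100.0))
		return -1;	/* out of range (also rejects NaN) */

	*val = per / 100.0;	/* zero is a valid result */
	return 0;
}
```

The important point is that a parsed value of zero is not used as the error
signal; failure is detected from the end pointer and the range check.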
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199745
Reported-by: stuartmarsden@gmail.com
Fixes: 927e3cfb52 ("tc: B.W limits can now be specified in %.")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
As the kernel code says, limit is actually the amount of packets it can
hold queued at a time, as per:
static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
struct sk_buff **to_free)
{
...
if (unlikely(sch->q.qlen >= sch->limit))
return qdisc_drop_all(skb, sch, to_free);
So let's fix the description of the field in the man page.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
As the kernel code says, limit is actually the amount of packets it can
hold queued at a time, as per:
static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
struct sk_buff **to_free)
{
...
if (unlikely(sch->q.qlen >= sch->limit))
return qdisc_drop_all(skb, sch, to_free);
So let's fix the description of the field in the man page.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Users have reported a regression due to ip now dropping capabilities
unconditionally.
zerotier-one VPN and VirtualBox use ambient capabilities in their
binary and then fork out to ip to set routes and links, and this
does not work anymore.
As a workaround, do not drop caps if CAP_NET_ADMIN (the most common
capability used by ip) is set with the INHERITABLE flag.
Users that want ip vrf exec to work do not need to set INHERITABLE,
which will then only be set when the calling program had the privileges
to give itself the ambient capability.
Fixes: ba2fc55b99 ("Drop capabilities if not running ip exec vrf with libcap")
Signed-off-by: Luca Boccassi <bluca@debian.org>
In this commit we introduce the ability to set and get
MTU for UDP media and bearer.
To set and get properties such as tolerance, window and priority,
we already do:
$ tipc media set PROPERTY media MEDIA
$ tipc media get PROPERTY media MEDIA
$ tipc bearer set OPTION media MEDIA ARGS
$ tipc bearer get [OPTION] media MEDIA ARGS
The same has been extended to MTU, except that only media type UDP
is supported.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Ss was using slabinfo to try to intuit TCP statistics.
The slabinfo format has changed several times since 2.4, and all these
statistics are broken by renames and slab merging. In addition, slabinfo
does not exist at all if the kernel is compiled with the SLUB option.
Rather than trying to fix the kernel, just trim away the statistics that
are no longer valid.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>