Currently new json object opens (and delete_json_obj closes) the object as
an array, what adds prints for the matching bracket '[' ']' at the
start/end of the object. This patch adds new_json_obj_plain() and the
matching delete_json_obj_plain() to enable opening and closing json object,
not as array and leave it to the using function to decide which type of
object to open/close as the main object.
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Until now print_#type functions supported printing constant names and
unknown (variable) values only.
Add functions to allow printing when the name is also sent to the
function as a variable.
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
fread(3) returns size_t data type which is unsigned, thus check
`if (fread(...) < 0)' is always false. To check if fread(3) has
failed, user should check error indicator with ferror(3).
This commit also changes read logic a little bit by being less
forgiving for errors. Previous logic was checking if fread(3)
read *at least* required ammount of data, now code checks if
fread(3) read *exactly* expected ammount of data. This makes
sense because code parses very specific binary file, and reading
even 1 less/more byte than expected, will later corrupt data anyway.
Signed-off-by: Michał Łyszczek <michal.lyszczek@bofc.pl>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Extend ll_name_to_index() to get the index of a netdevice using
alternative interface name. Allow alternative long names to pass checks
in couple of ip link/addr commands.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Implement addition/deletion of lists of properties, currently
alternative ifnames. Also extent the ip link show command to list them.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Alternative names are related to the "parent name". That means,
whenever ll_remember_index() is called to add/delete/update and it founds
the "parent name" im object by ifindex, processes related
alternative name im objects too. Put them in a list which holds the
relationship with the parent.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
If two processes attempt to invoke bpf_map_attach() at the same time,
then they will both create maps, then the first will successfully pin
the map to the filesystem and the second will not pin the map, but will
continue operating with a reference to its own copy of the map. As a
result, the sharing of the same map will be broken from the two programs
that were concurrently loaded via loaders using this library.
Fix this by adding a retry in the case where the pinning fails because
the map already exists on the filesystem. In that case, re-attempt
opening a fd to the map on the filesystem as it shows that another
program already created and pinned a map at that location.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This reduces stack usage, as asprintf allocates memory on the heap.
This indirectly fixes a snprintf truncation warning (from gcc v9.2.1):
bpf.c: In function ‘bpf_get_work_dir’:
bpf.c:784:49: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
784 | snprintf(bpf_wrk_dir, sizeof(bpf_wrk_dir), "%s/", mnt);
| ^
bpf.c:784:2: note: ‘snprintf’ output between 2 and 4097 bytes into a destination of size 4096
784 | snprintf(bpf_wrk_dir, sizeof(bpf_wrk_dir), "%s/", mnt);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: e42256699c ("bpf: make tc's bpf loader generic and move into lib")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
get_s64() uses internally strtoll() to parse the value out of a given
string. strtoll() returns a long long. However, the intermediate variable is
long only which might be 32 bit on some systems. So, fix it.
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Fixes: fcc16c22 ("provide common json output formatter")
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
iproute has an utility function which checks if a string is a prefix for
another one, to allow use of abbreviated commands, e.g. 'addr' or 'a'
instead of 'address'.
This routine unfortunately considers an empty string as prefix
of any pattern, leading to undefined behaviour when an empty
argument is passed to ip:
# ip ''
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
# tc ''
qdisc noqueue 0: dev lo root refcnt 2
# ip address add 192.0.2.0/24 '' 198.51.100.1 dev dummy0
# ip addr show dev dummy0
6: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 02:9d:5e:e9:3f:c0 brd ff:ff:ff:ff:ff:ff
inet 192.0.2.0/24 brd 198.51.100.1 scope global dummy0
valid_lft forever preferred_lft forever
Rewrite matches() so it takes care of an empty input, and doesn't
scan the input strings three times: the actual implementation
does 2 strlen and a memcpy to accomplish the same task.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Update the llproto_names array to allow users to reference the mpls
protocol ids with the names 'mpls_uc' for unicast MPLS and 'mpls_mc' for
multicast.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The netns_{save,restore} functions are only used in ipnetns.c now, since
the restore is not needed anymore after the netns exec command.
Move them in ipnetns.c, and make them static.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
'ip netns exec' changes the current netns just before executing a child
process, and restores it after forking. This is needed if we're running
in batch or do_all mode.
Some cleanups must be done both in the parent and in the child: the
parent must restore the previous netns, while the child must reset any
VRF association.
Unfortunately, if do_all is set, the VRF are not reset in the child, and
the spawned processes are started with the wrong VRF context. This can
be triggered with this script:
# ip -b - <<-'EOF'
link add type vrf table 100
link set vrf0 up
link add type dummy
link set dummy0 vrf vrf0 up
netns add ns1
EOF
# ip -all -b - <<-'EOF'
vrf exec vrf0 true
netns exec setsid -f sleep 1h
EOF
# ip vrf pids vrf0
314 sleep
# ps 314
PID TTY STAT TIME COMMAND
314 ? Ss 0:00 sleep 1h
Refactor cmd_exec() and pass to it a function pointer which is called in
the child before the final exec. In the netns exec case the function just
resets the VRF and switches netns.
Doing it in the child is less error prone and safer, because the parent
environment is always kept unaltered.
After this refactor some utility functions became unused, so remove them.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add a new parameter '-Numeric' to show the number of protocol, scope,
dsfield, etc directly instead of converting it to human readable name.
Do the same on tc and ss.
This patch is based on David Ahern's previous patch.
Suggested-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Devlink commands which implements the dumpit callback may return error.
The netlink function netlink_dump() sends the errno value as the payload
of the message, while answering user space with NLMSG_DONE.
To enable receiving errno value for dumpit commands we have to check for
it in the message. If it is a negative value then the dump returned an
error so we should set errno accordingly and check for ext_ack in case
it was set.
Fixes: 049c58539f ("devlink: mnlg: Add support for extended ack")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
groups > 31 have to be joined using the setsockopt. Since the nexthop
group is 32, add a helper to allow 'ip monitor' to listen for nexthop
messages.
Signed-off-by: David Ahern <dsahern@gmail.com>
When creating a new netns or executing a program into an existing one,
the unshare() or setns() calls will change the current netns.
In batch mode, this can run commands on the wrong interfaces, as the
ifindex value is meaningful only in the current netns. For example, this
command fails because veth-c doesn't exists in the init netns:
# ip -b - <<-'EOF'
netns add client
link add name veth-c type veth peer veth-s netns client
addr add 192.168.2.1/24 dev veth-c
EOF
Cannot find device "veth-c"
Command failed -:7
But if there are two devices with the same name in the init and new netns,
ip will build a wrong ll_map with indexes belonging to the new netns,
and will execute actions in the init netns using this wrong mapping.
This script will flush all eth0 addresses and bring it down, as it has
the same ifindex of veth0 in the new netns:
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.76/24 brd 192.168.122.255 scope global dynamic eth0
valid_lft 3598sec preferred_lft 3598sec
# ip -b - <<-'EOF'
netns add client
link add name veth0 type veth peer name veth1
link add name veth-ns type veth peer name veth0 netns client
link set veth0 down
address flush veth0
EOF
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
3: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether c2:db:d0:34:13:4a brd ff:ff:ff:ff:ff:ff
4: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether ca:9d:6b:5f:5f:8f brd ff:ff:ff:ff:ff:ff
5: veth-ns@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 32:ef:22:df:51:0a brd ff:ff:ff:ff:ff:ff link-netns client
The same issue can be triggered by the netns exec subcommand with a
sligthy different script:
# ip netns add client
# ip -b - <<-'EOF'
netns exec client true
link add name veth0 type veth peer name veth1
link add name veth-ns type veth peer name veth0 netns client
link set veth0 down
address flush veth0
EOF
Fix this by adding two netns_{save,reset} functions, which are used
to get a file descriptor for the init netns, and restore it after
each batch command.
netns_save() is called before the unshare() or setns(),
while netns_restore() is called after each command.
Fixes: 0dc34c7713 ("iproute2: Add processless network namespace support")
Reviewed-and-tested-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Before the patch:
$ ip netns add foo
$ ip link add name veth1 address 2a:a5:5c:b9:52:89 type veth peer name veth2 address 2a:a5:5c:b9:53:90 netns foo
RTNETLINK answers: No such device
RTNETLINK answers: No such device
But the command was successful. This may break script. Let's remove those
error messages.
Fixes: 55870dfe7f ("Improve batch and dump times by caching link lookups")
Reported-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
For a NETROM "ip link show dev nr0" will show
4: nr0: <NOARP,UP,LOWER_UP> mtu 236 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/generic 88:98:6a:a4:84:40:0a brd 00:00:00:00:00:00:00
But rather link/netrom is expected to be displayed.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ip route uses ll_name_to_index and ll_index_to_name to convert between
device names and indices. At the moment both use for the ioctl based glibc
functions if_nametoindex and if_indextoname and does not cache the result.
When using a batch file or dumping large number of routes this means the
same device lookups can be done repeatedly adding unnecessary overhead
(socket + ioctl + close for each device lookup).
Add a new function, ll_link_get, to send a netlink based RTM_GETLINK. If
successful, cache the result in idx_head and name_head so future lookups
can re-use the entry. Update ll_name_to_index and ll_index_to_name to use
ll_link_get and only fallback to the glibc functions if it fails.
With this change the time to install 720,022 routes with 2 ecmp nexthops
where the nexthop device is given is reduced from 31.4 seconds to 19.2
seconds. A dump of those routes drops from 13.3 to 2.8 seconds.
Signed-off-by: David Ahern <dsahern@gmail.com>
In the past, we tried to increase the buffer size up to 32 KB in order
to reduce number of syscalls per dump.
Commit 2d34851cd3 ("lib/libnetlink: re malloc buff if size is not enough")
brought the size back to 4KB because the kernel can not know the application
is ready to receive bigger requests.
See kernel commits 9063e21fb026 ("netlink: autosize skb lengthes") and
d35c99ff77ec ("netlink: do not enter direct reclaim from netlink_dump()")
for more details.
Fixes: 2d34851cd3 ("lib/libnetlink: re malloc buff if size is not enough")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
in this way, a useless cast to unsigned int is avoided in bpf_print_ops()
and print_tunnel().
Tested with:
# ./tdc.py -c bpf
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Cc: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The issue is discovered for bpf selftest test_skb_cgroup.sh.
Currently we have,
$ ./test_skb_cgroup_id.sh
Wait for testing link-local IP to become available ... OK
Object has unknown BTF type: 13!
[PASS]
In the above the BTF type 13 refers to BTF kind
BTF_KIND_FUNC_PROTO.
This patch added support of BTF_KIND_FUNC_PROTO and
BTF_KIND_FUNC during type parsing.
With this patch, I got
$ ./test_skb_cgroup_id.sh
Wait for testing link-local IP to become available ... OK
[PASS]
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
While iproute2 correctly uses ifinfomsg struct as the ancillary header
when requesting an FDB dump on old kernels, it sets the message type to
RTM_GETLINK. This results in wrong reply being returned.
Fix this by using RTM_GETNEIGH instead.
Before:
$ bridge fdb show brport dummy0
Not RTM_NEWNEIGH: 00000158 00000010 00000002
After:
$ bridge fdb show brport dummy0
2a:0b:41:1c:92:d3 vlan 1 master br0 permanent
2a:0b:41:1c:92:d3 master br0 permanent
33:33:00:00:00:01 self permanent
01:00:5e:00:00:01 self permanent
Fixes: 05880354c2 ("bridge: fdb: Fix filtering with strict checking disabled")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: LiLiang <liali@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Without this fix, the VF info can't be showed using command
"ip link".
146: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 24:8a:07:ad:78:52 brd ff:ff:ff:ff:ff:ff
vf 0 MAC 02:25:d0:12:01:01, spoof checking off, link-state auto, trust off, query_rss off
vf 1 MAC 02:25:d0:12:01:02, spoof checking off, link-state auto, trust off, query_rss off
Fixes: d97b16b2c9 ("libnetlink: linkdump_req: Only AF_UNSPEC family expects an ext_filter_mask")
Signed-off-by: Chris Mi <chrism@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The bridge command 'vlan show' calls rtnl_linkdump_req_filter for
family AF_BRIDGE. Update rtnl_linkdump_req_filter to send the filter
for that family as well.
Fixes: d97b16b2c9 ("libnetlink: linkdump_req: Only AF_UNSPEC family expects an ext_filter_mask")
Reported-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Add RTNL_HANDLE_F_STRICT_CHK flag and set in rth flags to let know
commands know if the kernel supports strict checking.
Extracted from patch from Ido to fix filtering with strict checking
enabled.
Cc: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add filter function to rtnl_neighdump_req and a buffer to the
request for the filter functions to append attributes.
Signed-off-by: David Ahern <dsahern@gmail.com>
iproute2 has been updated for the new strict policy in the kernel. Add a
helper to call setsockopt to enable the feature. Add a call to ip.c and
bridge.c
The setsockopt fails on older kernels and the error can be safely ignored
- any new fields or attributes are ignored by the older kernel.
Signed-off-by: David Ahern <dsahern@gmail.com>
Add a filter function to rtnl_addrdump_req to set device index in the
address dump request if the user is filtering addresses by device. In
addition, add a new ipaddr_link_get to do a single RTM_GETLINK request
instead of a device dump yet still store the data in the linfo list.
Signed-off-by: David Ahern <dsahern@gmail.com>
Add a filter option to rtnl_routedump_req and use it to set rtm_flags
removing the need for rtnl_rtcache_request for dump requests.
Signed-off-by: David Ahern <dsahern@gmail.com>
Only AF_UNSPEC handled by rtnl_dump_ifinfo expects an ext_filter_mask
on a dump request. Update the linkdump request functions to only set
and send ext_filter_mask for AF_UNSPEC.
Signed-off-by: David Ahern <dsahern@gmail.com>
Change nlmsg_len from sizeof(req) to use NLMSG_LENGTH on the header.
2 of the inner headers are not 4-byte aligned, so add a 0-length buf
after the header with the __aligned(NLMSG_ALIGNTO) to ensure the size
of the request is large enough. Use NLMSG_ALIGN in NLMSG_LENGTH to set
nlmsg_len.
Signed-off-by: David Ahern <dsahern@gmail.com>
Print any extack message that has been appended to a NLMSG_DONE message.
To avoid duplication, move the existing print code to a new helper.
Signed-off-by: David Ahern <dsahern@gmail.com>
DECnet belongs in the history museum of dead protocols along
with Appletalk and IPX.
Linux support has outlived its natural life and the time has
come to remove it from iproute2. Dead code is a source
of bugs and exploits.
If anyone actually has DECnet running on some old distribution
they can just keep to the old version of iproute2.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>