When creating a new netns or executing a program into an existing one,
the unshare() or setns() calls will change the current netns.
In batch mode, this can run commands on the wrong interfaces, as the
ifindex value is meaningful only in the current netns. For example, this
command fails because veth-c doesn't exist in the init netns:
# ip -b - <<-'EOF'
netns add client
link add name veth-c type veth peer veth-s netns client
addr add 192.168.2.1/24 dev veth-c
EOF
Cannot find device "veth-c"
Command failed -:7
But if there are two devices with the same name in the init and new netns,
ip will build a wrong ll_map with indexes belonging to the new netns,
and will execute actions in the init netns using this wrong mapping.
This script will flush all eth0 addresses and bring it down, as it has
the same ifindex as veth0 in the new netns:
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.76/24 brd 192.168.122.255 scope global dynamic eth0
valid_lft 3598sec preferred_lft 3598sec
# ip -b - <<-'EOF'
netns add client
link add name veth0 type veth peer name veth1
link add name veth-ns type veth peer name veth0 netns client
link set veth0 down
address flush veth0
EOF
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
3: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether c2:db:d0:34:13:4a brd ff:ff:ff:ff:ff:ff
4: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether ca:9d:6b:5f:5f:8f brd ff:ff:ff:ff:ff:ff
5: veth-ns@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 32:ef:22:df:51:0a brd ff:ff:ff:ff:ff:ff link-netns client
The same issue can be triggered by the netns exec subcommand with a
slightly different script:
# ip netns add client
# ip -b - <<-'EOF'
netns exec client true
link add name veth0 type veth peer name veth1
link add name veth-ns type veth peer name veth0 netns client
link set veth0 down
address flush veth0
EOF
Fix this by adding two netns_{save,restore} functions, which are used
to get a file descriptor for the init netns and to switch back to it
after each batch command.
netns_save() is called before the unshare() or setns(),
while netns_restore() is called after each command.
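
A minimal sketch of the save/restore idea, not the code from this
patch: the sketch_* names and the static fd are illustrative
assumptions. The save step opens /proc/self/ns/net before the namespace
changes, and the restore step hands that fd back to setns():

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for setns() and CLONE_NEWNET */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/* Hypothetical state: fd of the netns we started in, or -1 if unsaved. */
static int saved_netns_fd = -1;

/* Remember the current netns before unshare()/setns() changes it. */
static int sketch_netns_save(void)
{
	if (saved_netns_fd >= 0)
		return 0;	/* already saved, keep the original one */

	saved_netns_fd = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
	return saved_netns_fd < 0 ? -1 : 0;
}

/* Switch back to the saved netns; a no-op if nothing was saved. */
static int sketch_netns_restore(void)
{
	int ret;

	if (saved_netns_fd < 0)
		return 0;

	ret = setns(saved_netns_fd, CLONE_NEWNET);
	close(saved_netns_fd);
	saved_netns_fd = -1;
	return ret;
}
```

In a batch loop the restore call would run after every command, so a
command that never changed the netns only pays for one comparison.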
Fixes: 0dc34c7713 ("iproute2: Add processless network namespace support")
Reviewed-and-tested-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Import asm-generic/sockios.h to fix the compile errors caused by the
movement of the timestamp macros.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Update kernel headers to commit
a734d1f4c2fc ("net: openvswitch: return an error instead of doing BUG_ON()")
Signed-off-by: David Ahern <dsahern@gmail.com>
These warnings:
../include/uapi/linux/sockios.h:42:0: warning: "SIOCGSTAMP" redefined
../include/uapi/linux/sockios.h:43:0: warning: "SIOCGSTAMPNS" redefined
are from kernel commit 0768e17073dc5 ("net: socket: implement 64-bit
timestamps"). This commit moved the definitions of SIOCGSTAMP and
SIOCGSTAMPNS from include/asm-generic/sockios.h to
include/uapi/linux/sockios.h. Older OSes already define them in
/usr/include/asm-generic/sockios.h, resulting in ugly compile warnings now:
In file included from ll_types.c:24:0:
../include/uapi/linux/sockios.h:42:0: warning: "SIOCGSTAMP" redefined
#define SIOCGSTAMP SIOCGSTAMP_OLD
In file included from /usr/include/x86_64-linux-gnu/asm/sockios.h:1:0,
from /usr/include/asm-generic/socket.h:5,
from /usr/include/x86_64-linux-gnu/asm/socket.h:1,
from /usr/include/x86_64-linux-gnu/bits/socket.h:368,
from /usr/include/x86_64-linux-gnu/sys/socket.h:38,
from ll_types.c:17:
/usr/include/asm-generic/sockios.h:11:0: note: this is the location of the previous definition
#define SIOCGSTAMP 0x8906 /* Get stamp (timeval) */
so wrap them in #ifndef.
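
A sketch of the guard pattern, assuming the numeric values 0x8906 and
0x8907 for the _OLD macros; the exact lines changed in the header may
differ. Each definition is skipped when a system header included
earlier has already provided it:

```c
#include <assert.h>
#include <sys/socket.h>	/* may pull in the system's asm-generic/sockios.h */

/* Fallback values, used only if no system header defined them. */
#ifndef SIOCGSTAMP_OLD
#define SIOCGSTAMP_OLD		0x8906	/* Get stamp (timeval) */
#endif
#ifndef SIOCGSTAMPNS_OLD
#define SIOCGSTAMPNS_OLD	0x8907	/* Get stamp (timespec) */
#endif

/* Define the macros only if an older system header has not done so. */
#ifndef SIOCGSTAMP
#define SIOCGSTAMP		SIOCGSTAMP_OLD
#endif
#ifndef SIOCGSTAMPNS
#define SIOCGSTAMPNS		SIOCGSTAMPNS_OLD
#endif

/* Compile-time check that both paths end up with the same numbers. */
static_assert(SIOCGSTAMP == 0x8906, "unexpected SIOCGSTAMP value");
static_assert(SIOCGSTAMPNS == 0x8907, "unexpected SIOCGSTAMPNS value");
```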
Signed-off-by: David Ahern <dsahern@gmail.com>
Update kernel headers to commit
148f025d41a8 ("Merge branch 'hns3-next'")
Note, these warnings:
../include/uapi/linux/sockios.h:42:0: warning: "SIOCGSTAMP" redefined
../include/uapi/linux/sockios.h:43:0: warning: "SIOCGSTAMPNS" redefined
are due to kernel commit
0768e17073dc5 ("net: socket: implement 64-bit timestamps")
which moved the definitions from include/asm-generic/sockios.h
to include/uapi/linux/sockios.h
Signed-off-by: David Ahern <dsahern@gmail.com>
Update kernel headers to commit:
bfbae2eafe05 ("Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue")
Signed-off-by: David Ahern <dsahern@gmail.com>
While iproute2 correctly uses ifinfomsg struct as the ancillary header
when requesting an FDB dump on old kernels, it sets the message type to
RTM_GETLINK. This results in the wrong reply being returned.
Fix this by using RTM_GETNEIGH instead.
Before:
$ bridge fdb show brport dummy0
Not RTM_NEWNEIGH: 00000158 00000010 00000002
After:
$ bridge fdb show brport dummy0
2a:0b:41:1c:92:d3 vlan 1 master br0 permanent
2a:0b:41:1c:92:d3 master br0 permanent
33:33:00:00:00:01 self permanent
01:00:5e:00:00:01 self permanent
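
For illustration, a sketch of the request header for the old-kernel
path. The struct and function names are hypothetical and the real
request also carries attributes, but it shows the combination the fix
relies on: an ifinfomsg ancillary header sent with RTM_GETNEIGH:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: netlink header plus ifinfomsg. */
struct fdb_dump_req {
	struct nlmsghdr  n;
	struct ifinfomsg ifm;
};

/* Fill in an FDB dump request; ifindex 0 means "no device filter". */
static void fdb_dump_req_init(struct fdb_dump_req *req, int ifindex)
{
	assert(req != NULL);

	memset(req, 0, sizeof(*req));
	req->n.nlmsg_len    = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	req->n.nlmsg_type   = RTM_GETNEIGH;	/* not RTM_GETLINK */
	req->n.nlmsg_flags  = NLM_F_REQUEST | NLM_F_DUMP;
	req->ifm.ifi_family = PF_BRIDGE;
	req->ifm.ifi_index  = ifindex;
}
```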
Fixes: 05880354c2 ("bridge: fdb: Fix filtering with strict checking disabled")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: LiLiang <liali@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
AF_XDP is an address family that is optimized for high performance
packet processing.
This patch adds AF_XDP support to ss(8) so that sockets can be queried
and monitored.
Example:
$ sudo ss --xdp -e -p -m
Recv-Q Send-Q Local Address:Port Peer Address:Port
0 0 enp134s0f0:q20 *
users:(("xdpsock",pid=17787,fd=3)) ino:39424 sk:4
rx(entries:2048)
tx(entries:2048)
umem(id:1,size:8388608,num_pages:2048,chunk_size:2048,headroom:0,ifindex:7,
qid:20,zc:0,refs:1)
fr(entries:2048)
cr(entries:2048) skmem:(r0,rb212992,t0,tb212992,f0,w0,o0,bl0,d0)
0 0 enp24s0f0:q0 *
users:(("xdpsock",pid=17780,fd=3)) ino:37384 sk:5
rx(entries:2048)
tx(entries:2048)
umem(id:0,size:8388608,num_pages:2048,chunk_size:2048,headroom:0,ifindex:6,
qid:0,zc:1,refs:1)
fr(entries:2048)
cr(entries:2048) skmem:(r0,rb212992,t0,tb212992,f0,w0,o0,bl0,d0)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Update kernel headers to commit:
c829f5f52db9 ("cxgb4: cxgb4_tc_u32: use struct_size() in kvzalloc()")
and import xdp_diag.h for the next patch.
Signed-off-by: David Ahern <dsahern@gmail.com>
Add the RTNL_HANDLE_F_STRICT_CHK flag and set it in the rth flags to let
commands know whether the kernel supports strict checking.
Extracted from patch from Ido to fix filtering with strict checking
enabled.
Cc: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add filter function to rtnl_neighdump_req and a buffer to the
request for the filter functions to append attributes.
Signed-off-by: David Ahern <dsahern@gmail.com>
iproute2 has been updated for the new strict policy in the kernel. Add a
helper to call setsockopt to enable the feature. Add a call to it in ip.c
and bridge.c.
The setsockopt fails on older kernels and the error can be safely ignored
- any new fields or attributes are ignored by the older kernel.
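
A sketch of such a helper, with a hypothetical name and return
convention. The fallback defines for SOL_NETLINK and
NETLINK_GET_STRICT_CHK are assumptions for building against older
headers:

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_GET_STRICT_CHK
#define NETLINK_GET_STRICT_CHK 12
#endif

/*
 * Hypothetical helper: ask the kernel for strict checking on a netlink
 * socket. Returns 1 if the option was accepted, 0 if the kernel refused
 * it (for example because it predates the option). A refusal is not
 * treated as an error.
 */
static int try_enable_strict_chk(int nl_fd)
{
	int one = 1;

	assert(nl_fd >= 0);

	if (setsockopt(nl_fd, SOL_NETLINK, NETLINK_GET_STRICT_CHK,
		       &one, sizeof(one)) < 0)
		return 0;	/* older kernel: safe to ignore */

	return 1;
}
```

The return value can be stored so that later commands know whether
kernel-side filtering is available.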
Signed-off-by: David Ahern <dsahern@gmail.com>
Add a filter function to rtnl_addrdump_req to set device index in the
address dump request if the user is filtering addresses by device. In
addition, add a new ipaddr_link_get to do a single RTM_GETLINK request
instead of a device dump yet still store the data in the linfo list.
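
The filter-callback pattern can be sketched as follows. The typedef,
struct and function names here are illustrative assumptions and do not
claim to match the library's signatures. The callback gets the request
before it is sent and sets the device index in the ifaddrmsg header:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical callback type: may edit the request before it is sent. */
typedef int (*dump_filter_fn)(struct nlmsghdr *nlh, int reqlen, void *arg);

/* Hypothetical request layout with room for appended attributes. */
struct addr_dump_req {
	struct nlmsghdr  nlh;
	struct ifaddrmsg ifm;
	char             buf[128];
};

/* Example filter: restrict the dump to one device index. */
static int addr_filter_by_dev(struct nlmsghdr *nlh, int reqlen, void *arg)
{
	struct ifaddrmsg *ifm = NLMSG_DATA(nlh);

	(void)reqlen;	/* unused here; needed when appending attributes */
	ifm->ifa_index = *(unsigned int *)arg;
	return 0;
}

/* Build an address dump request and let the optional filter adjust it. */
static int addr_dump_req_build(struct addr_dump_req *req, int family,
			       dump_filter_fn filter, void *arg)
{
	assert(req != NULL);

	memset(req, 0, sizeof(*req));
	req->nlh.nlmsg_len   = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
	req->nlh.nlmsg_type  = RTM_GETADDR;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req->ifm.ifa_family  = family;

	return filter ? filter(&req->nlh, sizeof(*req), arg) : 0;
}
```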
Signed-off-by: David Ahern <dsahern@gmail.com>
Add a filter option to rtnl_routedump_req and use it to set rtm_flags
removing the need for rtnl_rtcache_request for dump requests.
Signed-off-by: David Ahern <dsahern@gmail.com>
ip l add dev tun type gretap external
ip r a 10.0.0.1 encap ip dst 192.168.152.171 id 1000 dev gretap
With gretap, for example, when the command sets the id but does not set
the TUNNEL_KEY flag, there is no key field in the sent packet.
In the lwtunnel case, some TUNNEL_FLAGS should be settable by
userspace.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
NETLINK_DUMP_STRICT_CHK can be used for all GET requests,
dumps as well as doit handlers. Replace the DUMP in the
name with GET to make that clearer.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
While most distributions long ago switched to the iproute2 suite
of utilities, which allow class-e (240.0.0.0/4) address assignment,
distributions relying on busybox, toybox and other forms of
ifconfig cannot assign class-e addresses without this kernel patch.
While classful addressing has been obsolete for 2 decades, and a survey
of all the open source code in the world shows the IN_whatever macros are
also obsolete... rather than remove classful handling from this ioctl
entirely, this patch merely enables class-e assignment, sanely.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Update kernel headers to commit
055722716c39 ("tipc: fix uninitialized value for broadcast retransmission")
Signed-off-by: David Ahern <dsahern@gmail.com>
DECnet belongs in the history museum of dead protocols along
with AppleTalk and IPX.
Linux support has outlived its natural life and the time has
come to remove it from iproute2. Dead code is a source
of bugs and exploits.
If anyone actually has DECnet running on some old distribution
they can just keep to the old version of iproute2.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>