mirror_iproute2

mirror of https://git.proxmox.com/git/mirror_iproute2 synced 2025-08-25 01:11:15 +00:00

Author	SHA1	Message	Date
Andrea Claudi	68c46872ce	ip address: do not set mngtmpaddr option for IPv4 addresses 'mngtmpaddr' option make the kernel manage temporary addresses created from the specified one as template on behalf of Privacy Extensions (RFC3041). This option should be available only for IPv6 addresses, as correctly stated in the manpage. However it is possible to set mngtmpaddr on IPv4 addresses, too: $ ip link add dummy0 type dummy $ ip -4 addr add 192.168.1.1 dev dummy0 mngtmpaddr $ ip a 1: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 1a:6d:c6:96:ca:f8 brd ff:ff:ff:ff:ff:ff inet 192.168.1.1/32 scope global mngtmpaddr dummy0 valid_lft forever preferred_lft forever Fix this adding a check on the protocol family before setting IFA_F_MANAGETEMPADDR flag. Fixes: `5b7e21c417` ("add support for IFA_F_MANAGETEMPADDR") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-28 15:18:28 -07:00
Andrea Claudi	e4448b6c7d	ip address: do not set home option for IPv4 addresses 'home' option designates a IPv6 address as "home address" as defined in RFC 6275. This option should be available only for IPv6 addresses, as correctly stated in the manpage. However it is possible to set home on IPv4 addresses, too: $ ip link add dummy0 type dummy $ ip -4 addr add 192.168.1.1 dev dummy0 home $ ip a 1: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 1a:6d:c6:96:ca:f8 brd ff:ff:ff:ff:ff:ff inet 192.168.1.1/32 scope global home dummy0 valid_lft forever preferred_lft forever Fix this adding a check on the protocol family before setting IFA_F_HOMEADDRESS flag. Fixes: `bac735c53a` ("enabled to manipulate the flags of IFA_F_HOMEADDRESS or IFA_F_NODAD from ip.") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-28 15:18:28 -07:00
Andrea Claudi	8ae99cc46d	ip address: do not set nodad option for IPv4 addresses Duplicate Address Detection (RFC 4862) is available only for IPv6 addresses. As a consequence, 'nodad' option, turning it off, should be available only for IPv6, and is defined like that in the man page. However it is possible to set nodad on IPv4 addresses, too: $ ip link add dummy0 type dummy $ ip -4 addr add 192.168.1.1 dev dummy0 nodad $ ip a 1: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 1a:6d:c6:96:ca:f8 brd ff:ff:ff:ff:ff:ff inet 192.168.1.1/32 scope global nodad dummy0 valid_lft forever preferred_lft forever Fix this adding a check on the protocol family before setting IFA_F_NODAD flag. Fixes: `bac735c53a` ("enabled to manipulate the flags of IFA_F_HOMEADDRESS or IFA_F_NODAD from ip.") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-28 15:18:28 -07:00
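
The three 'ip address' fixes above share one pattern: an IPv6-only flag is set only when the address family is AF_INET6. A minimal Python sketch of that check follows. The flag values come from the kernel header linux/if_addr.h; the function name `apply_addr_options` and the warning text are hypothetical and are not the iproute2 implementation.

```python
import socket

# Flag values as defined in the kernel uapi header linux/if_addr.h.
IFA_F_NODAD = 0x02
IFA_F_HOMEADDRESS = 0x10
IFA_F_MANAGETEMPADDR = 0x100

# Options that the commit messages describe as valid for IPv6 only.
IPV6_ONLY_OPTIONS = {
    "nodad": IFA_F_NODAD,
    "home": IFA_F_HOMEADDRESS,
    "mngtmpaddr": IFA_F_MANAGETEMPADDR,
}


def apply_addr_options(family, options):
    """Return (flags, warnings) for the given address family.

    Hypothetical helper: the flag is set only for AF_INET6, otherwise
    the option is skipped and a warning is recorded (an assumption about
    how a caller might report it).
    """
    flags = 0
    warnings = []
    for opt in options:
        if opt not in IPV6_ONLY_OPTIONS:
            raise ValueError(f"unknown option: {opt}")
        if family == socket.AF_INET6:
            flags |= IPV6_ONLY_OPTIONS[opt]
        else:
            warnings.append(f"{opt} can be set only for IPv6 addresses")
    return flags, warnings
```

With this check in place, an IPv4 request such as 'ip -4 addr add 192.168.1.1 dev dummy0 nodad' would end up with no IPv6-only flag set.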
Stefano Brivio	b5cf263670	iproute: Set flags and attributes on dump to get IPv6 cached routes to be flushed With a current (5.1) kernel version, IPv6 exception routes can't be listed (ip -6 route list cache) or flushed (ip -6 route flush cache). Kernel support for this is being added back. Relevant net-next commits: 564c91f7e563 fib_frontend, ip6_fib: Select routes or exceptions dump from RTM_F_CLONED ef11209d4219 Revert "net/ipv6: Bail early if user only wants cloned entries" 3401bfb1638e ipv6/route: Don't match on fc_nh_id if not set in ip6_route_del() bf9a8a061ddc ipv6/route: Change return code of rt6_dump_route() for partial node dumps 1e47b4837f3b ipv6: Dump route exceptions if requested 40cb35d5dc04 ip6_fib: Don't discard nodes with valid routing information in fib6_locate_1() However, to allow the kernel to filter routes based on the RTM_F_CLONED flag, we need to make sure this flag is always passed when we want cached routes to be dumped, and we can also pass table and output interface attributes to have the kernel filtering on them, if requested by the user. Use the existing iproute_dump_filter() as a filter for the dump request in iproute_flush(). This way, 'ip -6 route flush cache' works again. v2: Instead of creating a separate 'filter' function dealing with RTM_F_CACHED only, use the existing iproute_dump_filter() and get table and oif kernel filtering for free. Suggested by David Ahern. Fixes: `aba5acdfdb` ("(Logical change 1.3)") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-26 14:27:00 -07:00
Hangbin Liu	5a403866f3	ip/iptoken: fix dump error when ipv6 disabled When we disable IPv6 from the start up (ipv6.disable=1), there will be no IPv6 route info in the dump message. If we return -1 when ifi->ifi_family != AF_INET6, we will get error like $ ip token list Dump terminated which will make user feel confused. There is no need to return -1 if the dump message not match. Return 0 is enough. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-26 14:23:12 -07:00
Stephen Hemminger	f799505372	devlink: replace print macros with functions Using functions is safer, and printing is not performance critical. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-26 09:18:18 -07:00
Eyal Birger	bfa757e02f	tc: adjust xtables_match and xtables_target to changes in recent iptables iptables commit 933400b37d09 ("nft: xtables: add the infrastructure to translate from iptables to nft") added an additional member to struct xtables_match and struct xtables_target. This change is available for libxtables12 and up. Add these members conditionally to support both newer and older versions. Fixes: `dd29621578` ("tc: add em_ipt ematch for calling xtables matches from tc matching context") Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-24 16:12:17 -07:00
David Ahern	f7eef91897	Merge branch 'master' into next Conflicts: include/uapi/linux/snmp.h Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-21 15:59:24 -07:00
Jakub Kicinski	b3cf1167e7	tc: q_netem: JSON-ify the output Add JSON output support to q_netem. The normal output is untouched. In JSON output always use seconds as the base of time units, and non-percentage numbers (0.01 instead of 1%). Try to always report the fields, even if they are zero. All this should make the output more machine-friendly. v2: less macroes Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-21 15:51:35 -07:00
Nicolas Dichtel	6d77d9c6ae	ip monitor: display interfaces from all groups Only interface from group 0 were displayed. ip monitor calls ipaddr_reset_filter() and there is no reason to not reset the filter group in this function. Fixes: `c4fdf75d3d` ("ip link: fix display of interface groups") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-21 12:59:50 -07:00
Matteo Croce	b2e2922373	netns: make netns_{save,restore} static The netns_{save,restore} functions are only used in ipnetns.c now, since the restore is not needed anymore after the netns exec command. Move them in ipnetns.c, and make them static. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-20 14:30:41 -07:00
Matteo Croce	d81d4ba15d	ip vrf: use hook to change VRF in the child On vrf exec, reset the VRF associations in the child process, via the new hook added to cmd_exec(). In this way, the parent doesn't have to reset the VRF associations before spawning other processes. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-20 14:30:41 -07:00
Matteo Croce	903818fbf9	netns: switch netns in the child when executing commands 'ip netns exec' changes the current netns just before executing a child process, and restores it after forking. This is needed if we're running in batch or do_all mode. Some cleanups must be done both in the parent and in the child: the parent must restore the previous netns, while the child must reset any VRF association. Unfortunately, if do_all is set, the VRF are not reset in the child, and the spawned processes are started with the wrong VRF context. This can be triggered with this script: # ip -b - <<-'EOF' link add type vrf table 100 link set vrf0 up link add type dummy link set dummy0 vrf vrf0 up netns add ns1 EOF # ip -all -b - <<-'EOF' vrf exec vrf0 true netns exec setsid -f sleep 1h EOF # ip vrf pids vrf0 314 sleep # ps 314 PID TTY STAT TIME COMMAND 314 ? Ss 0:00 sleep 1h Refactor cmd_exec() and pass to it a function pointer which is called in the child before the final exec. In the netns exec case the function just resets the VRF and switches netns. Doing it in the child is less error prone and safer, because the parent environment is always kept unaltered. After this refactor some utility functions became unused, so remove them. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-20 14:30:41 -07:00
Pete Morici	b16f525323	Add support for configuring MACsec gcm-aes-256 cipher type. Signed-off-by: Pete Morici <pmorici@dev295.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-18 09:55:51 -07:00
Andrea Claudi	8063feebba	Makefile: use make -C make provides a handy -C option to change directory before reading the makefiles or doing anything else. Use that instead of the "cd dir && make && cd .." pattern, thus simplifying sintax for some makefiles. Changes from v1: - Drop an obviously wrong leftover on testsuite/iproute2/Makefile Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-18 09:52:58 -07:00
Stephen Hemminger	77a380379f	uapi: update headers and add if_link.h and if_infiniband.h Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-18 09:48:21 -07:00
Michael Forney	578cadcc68	ipmroute: Prevent overlapping storage of `filter` global This variable has the same name as `struct xfrm_filter filter` in ip/ipxfrm.c, but overrides that definition since `struct rtfilter` is larger. This is visible when built with -Wl,--warn-common in LDFLAGS: /usr/bin/ld: ipxfrm.o: warning: common of `filter' overridden by larger common from ipmroute.o Signed-off-by: Michael Forney <mforney@mforney.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-18 09:43:29 -07:00
Hangbin Liu	ca697cee4c	ip: add a new parameter -Numeric Add a new parameter '-Numeric' to show the number of protocol, scope, dsfield, etc directly instead of converting it to human readable name. Do the same on tc and ss. This patch is based on David Ahern's previous patch. Suggested-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-18 08:37:47 -07:00
David Ahern	e92d221022	Merge branch 'master' into next Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-14 07:29:40 -07:00
David Ahern	82cdb4d445	tools: Fix include path for generate_nlmsg Compile of tools directory fails with: make -C tools CC generate_nlmsg ../../lib/libnetlink.c:28:27: fatal error: linux/nexthop.h: No such file or directory #include <linux/nexthop.h> ^ compilation terminated. Add local uapi to build path. Fixes: `74829ca7dd` ("libnetlink: Add helper to create nexthop dump request") Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-14 06:50:55 -07:00
Andrea Claudi	41bf0c69c0	Makefile: use make -C to change directory make provides a handy -C option to change directory before reading the makefiles or doing anything else. Use that instead of the "cd dir && make && cd .." pattern, thus simplifying sintax for some makefiles. Changes from v1: - Drop an obviously wrong leftover in testsuite/iproute2/Makefile Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Reviewed-and-tested-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-14 06:44:39 -07:00
Stephen Hemminger	b0a09ace39	testsuite: intent if/else in Makefile Indent both arms of if/else equally. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-12 08:48:33 -07:00
Moshe Shemesh	c934da8aaa	devlink: mnlg: Catch returned error value of dumpit commands Devlink commands which implements the dumpit callback may return error. The netlink function netlink_dump() sends the errno value as the payload of the message, while answering user space with NLMSG_DONE. To enable receiving errno value for dumpit commands we have to check for it in the message. If it is a negative value then the dump returned an error so we should set errno accordingly and check for ext_ack in case it was set. Fixes: `049c58539f` ("devlink: mnlg: Add support for extended ack") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-12 08:43:14 -07:00
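
The devlink fix above relies on the dump status travelling as the payload of the NLMSG_DONE message. A minimal Python sketch of reading that payload follows. It assumes the standard 16-byte nlmsghdr layout and a native-endian 4-byte integer payload; `dump_done_errno` is a hypothetical helper, not the mnlg code, and it ignores extended ack attributes.

```python
import struct

NLMSG_DONE = 3  # message type from linux/netlink.h

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
NLMSG_HDR = struct.Struct("=IHHII")


def dump_done_errno(msg):
    """Return the errno carried by an NLMSG_DONE message.

    Returns None if the message is not NLMSG_DONE, 0 if the dump
    finished without error, or a positive errno value if the payload
    holds a negative status.
    """
    length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(msg, 0)
    if msg_type != NLMSG_DONE:
        return None
    if length < NLMSG_HDR.size + 4:
        return 0  # no status payload present
    (status,) = struct.unpack_from("=i", msg, NLMSG_HDR.size)
    return -status if status < 0 else 0
```

A caller would copy a nonzero result into errno and then look for an extended ack, as the commit message describes.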
David Ahern	2357abbbfa	Merge branch 'nexthop-objects' into next David Ahern says: ==================== This set adds support for nexthop objects to the ip command. The syntax for nexthop objects is identical to the current 'ip route .. nexthop ...' syntax making it easy to convert existing use cases. v2 - Fixed header use in rtnl_nexthopdump_req as noted by roopa - made rth_del static per Stephen's request and fixed coding style - removed print_nh_gateway and exported print_rta_gateway to reuse the iproute.c code (keeps consistency in output) - added examples to commit message - fixed monitor use when specific groups requested - fixed usage in 'ip nexthop' - added manpage ==================== Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:32:07 -07:00
David Ahern	e7cd93e7af	ipmonitor: Add nexthop option to monitor Add capability to ip-monitor to listen and dump nexthop messages. Since the nexthop group = 32 which exceeds the max groups bit field, 2 separate flags are needed - one that defaults on to indicate nexthop group is joined by default and a second that indicates a specific selection by the user (e.g, ip mon nexthop route). Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-11 10:31:30 -07:00
David Ahern	12387e2c14	ip route: Add option to use nexthop objects Add nhid option for routes to use nexthop objects by id. Example: $ ip nexthop add id 1 via 10.99.1.2 dev veth1 $ ip route add 10.100.1.0/24 nhid 1 $ ip route ls ... 10.100.1.0/24 nhid 1 via 10.99.1.2 dev veth1 Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:31:28 -07:00
David Ahern	42cce67e71	ip: Add man page for nexthop command Document 'ip nexthop' options in a man page with a few examples. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:31:06 -07:00
David Ahern	63df8e8543	Add support for nexthop objects Add nexthop subcommand to ip. Implement basic commands for creating, deleting and dumping nexthop objects. Syntax follows 'nexthop' syntax from existing 'ip route' command. Examples: 1. Single path $ ip nexthop add id 1 via 10.99.1.2 dev veth1 $ ip nexthop ls id 1 via 10.99.1.2 src 10.99.1.1 dev veth1 scope link 2. ECMP $ ip nexthop add id 2 via 10.99.3.2 dev veth3 $ ip nexthop add id 1001 group 1/2 --> creates a nexthop group with 2 component nexthops: id 1 and id 2 both the same weight $ ip nexthop ls id 1 via 10.99.1.2 src 10.99.1.1 dev veth1 scope link id 2 via 10.99.3.2 src 10.99.3.1 dev veth3 scope link id 1001 group 1/2 3. Weighted multipath $ ip nexthop add id 1002 group 1,10/2,20 --> creates a nexthop group with 2 component nexthops: id 1 with a weight of 10 and id 2 with a weight of 20 $ ip nexthop ls id 1 via 10.99.1.2 src 10.99.1.1 dev veth1 scope link id 2 via 10.99.3.2 src 10.99.3.1 dev veth3 scope link id 1001 group 1/2 id 1002 group 1,10/2,20 Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:58 -07:00
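
The group syntax in the examples above ('1/2' and '1,10/2,20') separates members with '/' and takes an optional weight after a comma. A minimal Python sketch of parsing it follows. `parse_nexthop_group` is a hypothetical helper, and the default weight of 1 is an assumption; the commit message only states that members without a weight share the same weight.

```python
def parse_nexthop_group(spec, default_weight=1):
    """Parse 'id[,weight]/id[,weight]/...' into (id, weight) tuples.

    default_weight is an illustrative assumption for members that
    omit the weight.
    """
    members = []
    for part in spec.split("/"):
        fields = part.split(",")
        if len(fields) > 2 or not fields[0]:
            raise ValueError(f"invalid group member: {part!r}")
        nh_id = int(fields[0])
        weight = int(fields[1]) if len(fields) == 2 else default_weight
        members.append((nh_id, weight))
    return members
```

For the weighted multipath example, parse_nexthop_group("1,10/2,20") yields id 1 with weight 10 and id 2 with weight 20.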
David Ahern	48a1e96d90	ip route: Export print_rt_flags, print_rta_if and print_rta_gateway Export print_rt_flags and print_rta_if for use by the nexthop command. Change print_rta_gateway to take the family versus rtmsg struct and export for use by the nexthop command. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:55 -07:00
David Ahern	74829ca7dd	libnetlink: Add helper to create nexthop dump request Add rtnl_nexthopdump_req to initiate a dump request of nexthop objects. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:53 -07:00
David Ahern	10631938f1	uapi: Import nexthop object API Add nexthop.h from kernel with the uapi for nexthop objects. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:50 -07:00
David Ahern	9860becfe3	libnetlink: Add helper to add a group via setsockopt groups > 31 have to be joined using the setsockopt. Since the nexthop group is 32, add a helper to allow 'ip monitor' to listen for nexthop messages. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:48 -07:00
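
The split described above, a bitmask for low-numbered groups and setsockopt for groups above 31, can be sketched in Python as follows. The group numbers come from linux/rtnetlink.h; `split_groups` is a hypothetical helper that only decides which mechanism each group needs and does not open a socket.

```python
# Multicast group numbers from linux/rtnetlink.h.
RTNLGRP_LINK = 1
RTNLGRP_IPV4_ROUTE = 7
RTNLGRP_NEXTHOP = 32


def split_groups(groups):
    """Return (bitmask, setsockopt_groups) for a list of group numbers.

    Groups 1..31 go into the bitmask used at bind time (bit n-1 for
    group n). Groups above 31 are returned separately; a real program
    would join each of them with
    setsockopt(SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, group).
    """
    bitmask = 0
    via_setsockopt = []
    for group in groups:
        if group < 1:
            raise ValueError(f"invalid group: {group}")
        if group > 31:
            via_setsockopt.append(group)
        else:
            bitmask |= 1 << (group - 1)
    return bitmask, via_setsockopt
```

Keeping the nexthop group out of the bitmask is why 'ip monitor' needs separate flags to track it, as the ipmonitor commit above notes.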
David Ahern	7392401027	lwtunnel: Pass encap and encap_type attributes to lwt_parse_encap lwt_parse_encap currently assumes the encap attribute is RTA_ENCAP and the type is RTA_ENCAP_TYPE. Change lwt_parse_encap to take these as input arguments for reuse by nexthop code which has the attributes as NHA_ENCAP and NHA_ENCAP_TYPE. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:46 -07:00
David Ahern	2360b8cb21	libnetlink: Set NLA_F_NESTED in rta_nest Kernel now requires NLA_F_NESTED to be set on new nested attributes. Set NLA_F_NESTED in rta_nest. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-11 10:30:39 -07:00
Mahesh Bandewar	ba126dcad2	ip6tunnel: fix 'ip -6 {show|change} dev <name>' cmds Inclusion of 'dev' is allowed by the syntax but not handled correctly by the command. It produces no output for show command and falsely successful for change command but does not make any changes. can be verified with the following steps # ip -6 tunnel add ip6tnl1 mode ip6gre local fd::1 remote fd::2 tos inherit ttl 127 encaplimit none # ip -6 tunnel show ip6tnl1 <correct output> # ip -6 tunnel show dev ip6tnl1 <no output but correct output after this change> # ip -6 tunnel change dev ip6tnl1 local 2001:1234::1 remote 2001:1234::2 encaplimit none ttl 127 tos inherit allow-localremote # echo $? 0 # ip -6 tunnel show ip6tnl1 <no changes applied, but changes are correctly applied after this change> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-10 10:43:09 -07:00
Matteo Croce	80a931d41c	ip: reset netns after each command in batch mode When creating a new netns or executing a program into an existing one, the unshare() or setns() calls will change the current netns. In batch mode, this can run commands on the wrong interfaces, as the ifindex value is meaningful only in the current netns. For example, this command fails because veth-c doesn't exists in the init netns: # ip -b - <<-'EOF' netns add client link add name veth-c type veth peer veth-s netns client addr add 192.168.2.1/24 dev veth-c EOF Cannot find device "veth-c" Command failed -:7 But if there are two devices with the same name in the init and new netns, ip will build a wrong ll_map with indexes belonging to the new netns, and will execute actions in the init netns using this wrong mapping. This script will flush all eth0 addresses and bring it down, as it has the same ifindex of veth0 in the new netns: # ip addr 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff inet 192.168.122.76/24 brd 192.168.122.255 scope global dynamic eth0 valid_lft 3598sec preferred_lft 3598sec # ip -b - <<-'EOF' netns add client link add name veth0 type veth peer name veth1 link add name veth-ns type veth peer name veth0 netns client link set veth0 down address flush veth0 EOF # ip addr 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff 3: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 
1500 qdisc noop state DOWN group default qlen 1000 link/ether c2:db:d0:34:13:4a brd ff:ff:ff:ff:ff:ff 4: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether ca:9d:6b:5f:5f:8f brd ff:ff:ff:ff:ff:ff 5: veth-ns@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 32:ef:22:df:51:0a brd ff:ff:ff:ff:ff:ff link-netns client The same issue can be triggered by the netns exec subcommand with a sligthy different script: # ip netns add client # ip -b - <<-'EOF' netns exec client true link add name veth0 type veth peer name veth1 link add name veth-ns type veth peer name veth0 netns client link set veth0 down address flush veth0 EOF Fix this by adding two netns_{save,reset} functions, which are used to get a file descriptor for the init netns, and restore it after each batch command. netns_save() is called before the unshare() or setns(), while netns_restore() is called after each command. Fixes: `0dc34c7713` ("iproute2: Add processless network namespace support") Reviewed-and-tested-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-10 10:42:14 -07:00
David Ahern	9a4f0ba478	Merge branch 'master' into next Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-10 10:32:07 -07:00
Kevin Darbyshire-Bryant	d7f2bccd0f	tc: add support for action act_ctinfo ctinfo is a tc action restoring data stored in conntrack marks to various fields. At present it has two independent modes of operation, restoration of DSCP into IPv4/v6 diffserv and restoration of conntrack marks into packet skb marks. It understands a number of parameters specific to this action in additional to the usual action syntax. Each operating mode is independent of the other so all options are optional, however not specifying at least one mode is a bit pointless. Usage: ... ctinfo [dscp mask [statemask]] [cpmark [mask]] [zone ZONE] [CONTROL] [index <INDEX>] DSCP mode dscp enables copying of a DSCP stored in the conntrack mark into the ipv4/v6 diffserv field. The mask is a 32bit field and specifies where in the conntrack mark the DSCP value is located. It must be 6 contiguous bits long. eg. 0xfc000000 would restore the DSCP from the upper 6 bits of the conntrack mark. The DSCP copying may be optionally controlled by a statemask. The statemask is a 32bit field, usually with a single bit set and must not overlap the dscp mask. The DSCP restore operation will only take place if the corresponding bit/s in conntrack mark ANDed with the statemask yield a non zero result. eg. dscp 0xfc000000 0x01000000 would retrieve the DSCP from the top 6 bits, whilst using bit 25 as a flag to do so. Bit 26 is unused in this example. CPMARK mode cpmark enables copying of the conntrack mark to the packet skb mark. In this mode it is completely equivalent to the existing act_connmark action. Additional functionality is provided by the optional mask parameter, whereby the stored conntrack mark is logically ANDed with the cpmark mask before being stored into skb mark. This allows shared usage of the conntrack mark between applications. eg. cpmark 0x00ffffff would restore only the lower 24 bits of the conntrack mark, thus may be useful in the event that the upper 8 bits are used by the DSCP function. Usage: ... 
ctinfo [dscp mask [statemask]] [cpmark [mask]] [zone ZONE] [CONTROL] [index <INDEX>] where : dscp MASK is the bitmask to restore DSCP STATEMASK is the bitmask to determine conditional restoring cpmark MASK mask applied to restored packet mark ZONE is the conntrack zone CONTROL := reclassify | pipe | drop | continue | ok | goto chain <CHAIN_INDEX> Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-10 10:24:38 -07:00
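
The mask arithmetic described in the ctinfo commit message can be sketched in Python as follows. This models only the rules stated in the message (a 6-bit contiguous dscp mask, a non-overlapping statemask, and an ANDed cpmark mask). The function names are hypothetical and this is not the kernel or tc code.

```python
def is_six_contiguous_bits(mask):
    """True if the 32-bit mask is exactly 6 adjacent set bits."""
    mask &= 0xFFFFFFFF
    if mask == 0:
        return False
    shift = (mask & -mask).bit_length() - 1  # index of lowest set bit
    return (mask >> shift) == 0x3F


def restore_dscp(ct_mark, dscp_mask, state_mask=0):
    """Return the DSCP stored in ct_mark, or None if the state bit is clear."""
    if not is_six_contiguous_bits(dscp_mask):
        raise ValueError("dscp mask must be 6 contiguous bits")
    if dscp_mask & state_mask:
        raise ValueError("statemask must not overlap the dscp mask")
    if state_mask and not (ct_mark & state_mask):
        return None  # conditional restore not enabled for this mark
    shift = (dscp_mask & -dscp_mask).bit_length() - 1
    return (ct_mark & dscp_mask) >> shift


def restore_mark(ct_mark, cpmark_mask=0xFFFFFFFF):
    """Return the part of the conntrack mark copied into the skb mark."""
    return ct_mark & cpmark_mask
```

With 'dscp 0xfc000000 0x01000000', a conntrack mark of 0xb9000000 yields DSCP 46, while 0xb8000000 yields no restore because the state bit is clear.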
David Ahern	ed624243da	uapi: Import tc_ctinfo uapi Add tc_ctinfo.h uapi file from kernel. Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-10 10:23:32 -07:00
David Ahern	b2f8eb7f8a	Update kernel headers Update kernel headers to commit: ad3a9ee0b623 ("ocelot: remove unused variable 'rc' in vcap_cmd()") Signed-off-by: David Ahern <dsahern@gmail.com>	2019-06-10 09:39:08 -07:00
Davide Caratti	0ee4d17954	tc: simple: don't hardcode the control action the following TDC test case: b776 - Replace simple action with invalid goto chain control checks if the kernel correctly validates the 'goto chain' control action, when it is specified in 'act_simple' rules. The test systematically fails because the control action is hardcoded in parse_simple(), i.e. it is not parsed by command line arguments, so its value is constantly TC_ACT_PIPE. Because of that, the following command: # tc action add action simple sdata "test" drop index 7 installs an 'act_simple' rule that never drops packets, and whose 'index' is the first IDR available, plus an 'act_gact' rule with 'index' equal to 7, that drops packets. Use parse_action_control_dflt(), like we did on many other TC actions, to make the control action configurable also with 'act_simple'. The expected results of test b776 are summarized below: iproute2 v kernel->| 5.1-rc2 (and previous) | 5.1-rc3 (and subsequent) ------------------+-------------------------+------------------------- 5.1.0 | FAIL (bad IDR) | FAIL (bad IDR) 5.1.0(patched) | FAIL (no rule/bad sdata)| PASS Changes since v1: - reword commit message, thanks Stephen Hemminger Fixes: `087f46ee4e` ("tc: introduce simple action") CC: Andrea Claudi <aclaudi@redhat.com> CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-06 14:43:08 -07:00
Roman Mashak	fa49588973	tc: Fix binding of gact action by index. The following operation fails: % sudo tc actions add action pipe index 1 % sudo tc filter add dev lo parent ffff: \ protocol ip pref 10 u32 match ip src 127.0.0.2 \ flowid 1:10 action gact index 1 Bad action type index Usage: ... gact <ACTION> [RAND] [INDEX] Where: ACTION := reclassify | drop | continue | pass | pipe | goto chain <CHAIN_INDEX> | jump <JUMP_COUNT> RAND := random <RANDTYPE> <ACTION> <VAL> RANDTYPE := netrand | determ VAL : = value not exceeding 10000 JUMP_COUNT := Absolute jump from start of action list INDEX := index value used However, passing a control action of gact rule during filter binding works: % sudo tc filter add dev lo parent ffff: \ protocol ip pref 10 u32 match ip src 127.0.0.2 \ flowid 1:10 action gact pipe index 1 Binding by reference, i.e. by index, has to consistently work with any tc action. Since tc is sensitive to the order of keywords passed on the command line, we can teach gact to skip parsing arguments as soon as it sees 'gact' followed by 'index' keyword. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-06 14:41:31 -07:00
Parav Pandit	2cc10ce81d	devlink: Increase bus, device buffer size to 64 bytes Device name on mdev bus is 36 characters long which follow standard uuid RFC 4122. This is probably the longest name that a kernel will return for a device. Hence increase the buffer size to 64 bytes. Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-06 14:41:17 -07:00
Davide Caratti	4ae441e3d1	man: tc-skbedit.8: document 'inheritdsfield' while at it, fix missing square bracket near 'ptype' and a typo in the action description (it's -> its). Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-06-04 09:39:53 -07:00
David Ahern	339b14ab5e	Merge branch 'rdma-net-namespace' into next Parav Pandit says: ==================== RDMA subsystem can be running in either of the modes. (a) Sharing RDMA devices among multiple net namespaces or (b) Exclusive mode where RDMA device is bound to single net namespace This patch series adds (1) query command to query rdma subsystem sharing mode (2) set command to change rdma subsystem sharing mode (3) assign rdma device to a net namespace rdma tool examples: (a) Query current rdma subsys net namespace sharing mode $ rdma sys show netns shared (b) Change rdma subsys mode to exclusive mode $ rdma sys set netns exclusive $ rdma sys show netns exclusive (c) Assign rdma device to a specific newly created net namespace $ ip netns add foo $ rdma dev set mlx5_1 netns foo ==================== Signed-off-by: David Ahern <dsahern@gmail.com>	2019-05-31 15:10:55 -07:00
Parav Pandit	c2ffce5d39	rdma: Add man page for rdma dev set netns command Add man page to describe additional set netns command for rdma device. Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-05-31 15:10:33 -07:00
Parav Pandit	d17a0248a2	rdma: Add an option to set net namespace of rdma device Enrich rdmatool with an option to set network namespace of RDMA device. After successful execution of it, rdma device will be accessible only in assigned network namespace. rdma tool command examples and output. First set netns mode to exclusive. $ rdma system set netns exclusive Now create network namespace and assign RDMA device to this network namespace. $ ip netns add foo $ rdma dev set mlx5_1 netns foo Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-05-31 15:10:32 -07:00
Parav Pandit	e861272015	rdma: Add man pages for rdma system commands Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-05-31 15:10:31 -07:00
Parav Pandit	c4572a465b	rdma: Add an option to query,set net namespace sharing sys parameter Enrich rdmatool with an option to query rdma subsystem parameter whether rdma devices are shared among multiple network namespaces or exclusive to single network namespace. rdma tool command examples and output. $ rdma system show netns shared $ rdma system set netns exclusive $ rdma system show netns exclusive Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2019-05-31 15:10:29 -07:00
Nicolas Dichtel	c442234858	iplink: don't try to get ll addr len when creating an iface It will obviously fail. This is a follow up of the commit `757837230a` ("lib: suppress error msg when filling the cache"). Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-05-30 11:03:20 -07:00

5401 Commits