mirror_iproute2

mirror of https://git.proxmox.com/git/mirror_iproute2 synced 2025-10-11 00:24:34 +00:00

Author	SHA1	Message	Date
Daniel Borkmann	91d88eeb10	{f,m}_bpf: allow updates on program arrays Since we have all infrastructure in place now, allow atomic live updates on program arrays. This can be very useful e.g. in case programs that are being tail-called need to be replaced, f.e. when classifier functionality needs to be changed, new protocols added/removed during runtime, etc. Thus, provide a way for in-place code updates, minimal example: Given is an object file cls.o that contains the entry point in section 'classifier', has a globally pinned program array 'jmp' with 2 slots and id of 0, and two tail called programs under section '0/0' (prog array key 0) and '0/1' (prog array key 1), the section encoding for the loader is <id/key>. Adding the filter loads everything into cls_bpf: tc filter add dev foo parent ffff: bpf da obj cls.o Now, the program under section '0/1' needs to be replaced with an updated version that resides in the same section (also full path to tc's subfolder of the mount point can be passed, e.g. /sys/fs/bpf/tc/globals/jmp): tc exec bpf graft m:globals/jmp obj cls.o sec 0/1 In case the program resides under a different section 'foo', it can also be injected into the program array like: tc exec bpf graft m:globals/jmp key 1 obj cls.o sec foo If the new tail called classifier program is already available as a pinned object somewhere (here: /sys/fs/bpf/tc/progs/parser), it can be injected into the prog array like: tc exec bpf graft m:globals/jmp key 1 fd m:progs/parser In the kernel, the program on key 1 is being atomically replaced and the old one's refcount dropped. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>	2015-11-29 11:55:16 -08:00
Daniel Borkmann	f6793eec46	{f, m}_bpf: allow for user-defined object pinnings The recently introduced object pinning can be further extended in order to allow sharing maps beyond tc namespace. F.e. maps that are being pinned from tracing side, can be accessed through this facility as well. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>	2015-11-29 11:55:16 -08:00
Daniel Borkmann	9e607f2e72	{f, m}_bpf: check map attributes when fetching as pinned Make use of the new show_fdinfo() facility and verify that when a pinned map is being fetched that its basic attributes are the same as the map we declared from the ELF file. I.e. when placed into the globalns, collisions could occur. In such a case warn the user and bail out. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>	2015-11-29 11:55:16 -08:00
Daniel Borkmann	910b543dcc	{f,m}_bpf: make tail calls working Now that we have the possibility of sharing maps, it's time we get the ELF loader fully working with regards to tail calls. Since program array maps are pinned, we can keep them finally alive. I've noticed two bugs that are being fixed in bpf_fill_prog_arrays() with this patch. Example code comes as follow-up. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>	2015-11-29 11:55:16 -08:00
Stephen Hemminger	fece33c195	Merge branch 'master' into net-next	2015-11-29 11:53:43 -08:00
Tom Herbert	35f59d862f	vxlan: Add support for remote checksum offload This patch adds support for remote checksum offload to VXLAN. This patch adds remcsumtx and remcsumrx to ip vxlan configuration to enable remote checksum offload for transmit and receive on the VXLAN tunnel. https://tools.ietf.org/html/draft-herbert-vxlan-rco-00 Example: ip link add name vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0 \ udpcsum remcsumtx remcsumrx Testing: Ran single netperf over mlnx4 to illustrate the effect: - Without RCO (UDP csum set to zero) 4335.99 Mbps - With RCO enabled 7661.81 Mbps Signed-off-by: Tom Herbert <tom@herbertland.com>	2015-11-29 11:53:02 -08:00
Phil Sutter	61170fd88d	get rid of unnecessary fgets() buffer size limitation fgets() will read at most size-1 bytes into the buffer and add a terminating null-char at the end. Therefore it is not necessary to pass a reduced buffer size when calling it. This change was generated using the following semantic patch: @@ identifier buf, fp; @@ - fgets(buf, sizeof(buf) - 1, fp) + fgets(buf, sizeof(buf), fp) Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:48:24 -08:00
Phil Sutter	d572ed4d0a	get rid of remaining -Wunused-result warnings Although not fundamentally necessary to check return codes in these spots, preventing the warnings will put new ones into focus. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:48:24 -08:00
Phil Sutter	c29d37925a	ss: review is_ephemeral() No need to keep static port boundaries global, they are not used directly. Keeping them local also allows to safely reduce their names to the minimum. Assign hardcoded fallback values also if fscanf() fails. Get rid of unnecessary braces around return parameter. Instead of more or less duplicating is_ephemeral() in run_ssfilter(), simply call the function instead. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:48:24 -08:00
Phil Sutter	596307ea3d	ss: reduce max indentation level in init_service_resolver() Exit early or continue on error instead of putting conditional into conditional to make reading the code a bit easier. Also, the call to memcpy() can be skipped by initialising prog with the desired prefix. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:48:24 -08:00
Phil Sutter	db3ef44c54	lnstat: review lnstat_update() Instead of calling rewind() and fgets() before every call to scan_lines(), move them into scan_lines() itself. This should also fix compat mode, as before the second call to scan_lines() the first line was skipped unconditionally. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:48:24 -08:00
Phil Sutter	fc31817d1f	bridge.8: minor formatting cleanup - Replace commas at end of subsection with dots. - Replace double whitespace by single one. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:47:29 -08:00
Phil Sutter	ea6cbab792	iproute: restrict hoplimit values to be in range [0; 255] Technically, the range of possible hoplimit values is defined by IPv4 and IPv6 header formats. Both define the field to be eight bits in size, which leads to a value range of [0;255]. Setting a packet's hoplimit field to 0 though makes not much sense, as the next hop would immediately drop the packet. Therefore Linux uses 0 as a special value indicating to use the system's default hoplimit (configurable via sysctl). In iproute, setting the hoplimit of a route to 0 is equivalent to omitting the hoplimit parameter altogether, so it is actually not necessary to allow that value to be specified, but keep it anyway for backwards compatibility. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:47:29 -08:00
Phil Sutter	d81f54d599	iptoken: simplify iptoken_list a bit Since it uses only a single filter, rtnl_dump_filter() can be used. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:47:29 -08:00
Phil Sutter	906dfe4887	ipaddress: drop unnecessary check in ipaddr_list_flush_or_save() Right after ipaddr_reset_filter(), filter.family is always AF_UNSPEC. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:47:29 -08:00
Phil Sutter	d25ec03e1d	ipaddress: fix ipaddr_flush for Linux >= 3.1 Linux version 3.1 introduced a consistency check for netlink dumps in commit 670dc28 ("netlink: advertise incomplete dumps"). This bites iproute2 when flushing more addresses than can fit into a single RTM_GETADDR response. To silence the spurious error message "Dump was interrupted and may be inconsistent.", advise rtnl_dump_filter_l() to not care about NLM_F_DUMP_INTR. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:47:29 -08:00
Phil Sutter	8e72880f6b	libnetlink: introduce nc_flags Allow for a filter to ignore certain nlmsg_flags. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:47:29 -08:00
Phil Sutter	c6995c4802	ipaddress: simplify ipaddr_flush() Since it's no longer relevant whether an IP address is primary or secondary when flushing, ipaddr_flush() can be simplified a bit. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-29 11:47:29 -08:00
Stephen Hemminger	68ef507249	rt_names: style cleanup Cleanup all checkpatch complaints about whitespace in rt_names.	2015-11-29 11:41:23 -08:00
David Ahern	13ada95da4	Add support for rt_tables.d Add support for reading table id/name mappings from rt_tables.d directory. Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>	2015-11-29 11:29:31 -08:00
John W. Linville	906ac5437a	geneve: add support for IPv6 link partners Signed-off-by: John W. Linville <linville@tuxdriver.com>	2015-11-23 16:23:11 -08:00
John W. Linville	6581df5ef3	geneve: add support for IPv6 link partners Signed-off-by: John W. Linville <linville@tuxdriver.com>	2015-11-23 16:21:55 -08:00
Daniel Borkmann	32e93fb7f6	{f,m}_bpf: allow for sharing maps This larger work addresses one of the bigger remaining issues on tc's eBPF frontend, that is, to allow for persistent file descriptors. Whenever tc parses the ELF object, extracts and loads maps into the kernel, these file descriptors will be out of reach after the tc instance exits. Meaning, for simple (unnested) programs which contain one or multiple maps, the kernel holds a reference, and they will live on inside the kernel until the program holding them is unloaded, but they will be out of reach for user space, even worse with (also multiple nested) tail calls. For this issue, we introduced the concept of an agent that can receive the set of file descriptors from the tc instance creating them, in order to be able to further inspect/update map data for a specific use case. However, while that is more tied towards specific applications, it still doesn't easily allow for sharing maps across multiple tc instances and would require a daemon to be running in the background. F.e. when a map should be shared by two eBPF programs, one attached to ingress, one to egress, this currently doesn't work with the tc frontend. This work solves exactly that, i.e. if requested, maps can now be _arbitrarily_ shared between object files (PIN_GLOBAL_NS) or within a single object (but various program sections, PIN_OBJECT_NS) without "losing" the file descriptor set. To make that happen, we use eBPF object pinning introduced in kernel commit b2197755b263 ("bpf: add support for persistent maps/progs") for exactly this purpose. 
The shipped examples/bpf/bpf_shared.c code from this patch can be easily applied, for instance, as: - classifier-classifier shared: tc filter add dev foo parent 1: bpf obj shared.o sec egress tc filter add dev foo parent ffff: bpf obj shared.o sec ingress - classifier-action shared (here: late binding to a dummy classifier): tc actions add action bpf obj shared.o sec egress pass index 42 tc filter add dev foo parent ffff: bpf obj shared.o sec ingress tc filter add dev foo parent 1: bpf bytecode '1,6 0 0 4294967295,' \ action bpf index 42 The toy example increments a shared counter on egress and dumps its value on ingress (if no sharing (PIN_NONE) would have been chosen, map value is 0, of course, due to the two map instances being created): [...] <idle>-0 [002] ..s. 38264.788234: : map val: 4 <idle>-0 [002] ..s. 38264.788919: : map val: 4 <idle>-0 [002] ..s. 38264.789599: : map val: 5 [...] ... thus if both sections reference the pinned map(s) in question, tc will take care of fetching the appropriate file descriptor. The patch has been tested extensively on both, classifier and action sides. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2015-11-23 16:10:44 -08:00
Neil Horman	e149d4e843	iproute2: Ignore EADDRNOTAVAIL errors during address flush operation I found recently that, if I disabled address promotion in the kernel, that ip addr flush dev <dev> would fail with an EADDRNOTAVAIL errno (though the flush operation would in fact flush all addresses from an interface properly). What's happening is that, if I add a primary and multiple secondary addresses to an interface, the flush operation first enumerates them all with a GETADDR | DUMP operation, then sends a delete request for each address. But the kernel, having promotion disabled, deletes all secondary addresses when the primary is removed. That means that several delete requests may still be pending in the netlink request for addresses that have been removed on our behalf, resulting in EADDRNOTAVAIL return codes. It seems the simplest thing to do is to understand that EADDRNOTAVAIL isn't a fatal outcome on a flush operation, as it just indicates that an address which you want to remove is already removed, so it can safely be ignored. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>	2015-11-23 15:59:08 -08:00
Phil Sutter	6e2e2cf03a	bridge.8: document fdb replace command Despite commit 45a82e5 ("iproute vxlan add support for fdb replace command"), the 'fdb replace' command was not mentioned in bridge.8. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:58:07 -08:00
Phil Sutter	fdb347f7fd	lnstat: fix header displaying mechanism The algorithm depends on the loop counter ('i') to increment by one in each iteration. Though if running endlessly (count==0), the counter was not incremented at all. Also change formatting of the header printing conditional a bit so it's hopefully easier to read. Fixes: `e7e2913` ("lnstat: run indefinitely by default") Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:54:05 -08:00
Phil Sutter	869fcabecc	lnstat: describe -s option in help output Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:54:05 -08:00
Stephen Hemminger	0198930b55	update kernel headers to 4.4-rc1 Post merge window changes	2015-11-23 15:53:04 -08:00
Phil Sutter	f7b49a3fc7	ip_common.h header cleanup - Drop 'extern' keyword from all function prototypes. - Make line breaking of print_* functions consistent. - Make print_ntable() and ipntable_reset_filter() static and remove their declaration. - Drop declaration of non-existent ipaddr_list() and iproute_monitor(). Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:44:03 -08:00
Stephen Hemminger	23d6c997d9	misc: remove extra blank line	2015-11-23 15:42:34 -08:00
Stephen Hemminger	5699275b42	man8: scrub trailing whitespace Remove extraneous whitespace	2015-11-23 15:41:37 -08:00
Ville Skyttä	ac0817ef66	man: Spelling fixes Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>	2015-11-23 15:39:25 -08:00
Ville Skyttä	85e3c87c82	man: Syntax and warning fixes Fix syntax issues and warnings highlighted by `man --warnings=w' from man-db 2.7.1. Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>	2015-11-23 15:39:25 -08:00
Phil Sutter	04ce8d3eda	ip{,6}tunnel: put spaces around non-unary operators Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	f53ecee818	iptunnel: sanitize copying tunnel name Since p->name is only IFNAMSIZ bytes, do not copy more than IFNAMSIZ - 1 bytes into it so there remains at least a single null byte in the end. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	c957821b18	iptunnel: share common code when determining the default interface name Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	0dd4d2b37f	iptunnel: simplify parsing TTL, allow 'hlim' as identifier Instead of parsing an unsigned integer and checking boundaries, simply parse u8. This and the added ttl alias 'hlim' provide consistency with ip6tunnel. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	2520598a1a	iptunnel: share common code when setting tunnel mode Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	7894ce7722	ip6tunnel: fix coding style: no newline between brace and else Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	9af72f819e	ip6tunnel: print local/remote addresses like iptunnel does This makes output consistent with iptunnel, also supporting reverse DNS lookup for remote address if requested. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	c4527d7ba3	ip{,6}tunnel: align do_tunnels_list() a bit In iptunnel, declare loop variables inside the loop as done in ip6tunnel. Fix and simplify goto logic in ip6tunnel: - Failure to read over header lines would have left fp opened. - By returning directly upon fopen() failure, fp can be closed unconditionally in the end. Use the same goto logic in iptunnel, as well. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	4b3cb96281	iptunnel: use ll_name_to_index() for physical interface lookup Although the cache is only initialized in do_show(), this way it is at least consistent with ip6tunnel. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	6ddb1e8c90	ip{, 6}tunnel: unify behaviour if physical device is not found Make ip6tunnel print an error message as well. While there, get rid of unnecessary line breaking. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	a7ed1520ee	ip/tunnel: introduce tnl_parse_key() Instead of duplicating the same code six times (key, ikey and okey in iptunnel and ip6tunnel), have a common parsing routine. This has the added benefit of having the same verbose error message in ip6tunnel as well as iptunnel. I'm not sure if parsing an IPv4 address as key makes sense for ip6tunnel, but the code was there before so this patch at least doesn't make it worse. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Phil Sutter	8de592d05c	ip{, 6}tunnel: get rid of extraneous whitespace when printing Put whitespace in the beginning of optional parts, not as suffix anywhere. Also drop double whitespaces in between words. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-23 15:26:37 -08:00
Aaro Koskinen	caf8875b3c	misc/Makefile: use PKG_CONFIG Use PKG_CONFIG from Config - it works better when cross-compiling. Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>	2015-11-23 15:25:50 -08:00
Stephen Hemminger	115b4d8873	Merge branch 'master' into net-next	2015-11-03 16:38:15 -08:00
Stephen Hemminger	6720eceff7	v4.3.0	2015-11-03 16:34:46 -08:00
Stephen Hemminger	1e5aa99024	Merge branch 'master' into net-next	2015-11-03 16:31:57 -08:00
Phil Sutter	b5bb1820e8	lib/utils: improve error messages of get_addr() and get_prefix() Instead of statically complaining about illegal inet address, use get_family() to get the address family right. Based on a patch by Hangbin Liu to print "inet6" for AF_INET6 made more generic by me. Signed-off-by: Phil Sutter <phil@nwl.cc>	2015-11-03 16:28:36 -08:00
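
Commit 61170fd88d above relies on the fact that fgets() stores at most size - 1 characters and appends the terminating null byte itself, so the full buffer size can be passed. The following is only a minimal sketch of that behaviour and is not taken from iproute2; the helper name read_first_line() and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of iproute2): read the first line of the
 * file at 'path' into 'buf', which is 'size' bytes large.
 * Returns the number of characters stored, or -1 on error or empty file.
 */
static int read_first_line(const char *path, char *buf, int size)
{
	FILE *fp;
	char *res;

	assert(buf != NULL && size > 0);

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	/*
	 * Pass the full buffer size: fgets() stores at most size - 1
	 * characters and always adds the terminating '\0' on success,
	 * so "size - 1" would only waste one byte of the buffer.
	 */
	res = fgets(buf, size, fp);
	fclose(fp);

	if (!res)
		return -1;
	return (int)strlen(buf);
}
```

With an 8-byte buffer and a longer input line, the helper stores 7 characters plus the terminator; a caller would typically invoke it as read_first_line(path, buf, sizeof(buf)).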

2532 Commits