mirror_iproute2

mirror of https://git.proxmox.com/git/mirror_iproute2 synced 2025-08-16 19:37:30 +00:00

Author	SHA1	Message	Date
Petr Machata	a0a4b6618c	lib: sprint_size(): Uncrustify the code a bit Ideally this and the rate printing would both be converted to a common helper, but unfortunately the two format differently and this would break tests and scripts out there. So just make the code look less like a wad of hay. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:30:36 +00:00
Petr Machata	adbe5de966	lib: Move sprint_size() from tc here, add print_size() When displaying sizes of various sorts, tc commonly uses the function sprint_size() to format the size into a buffer as a human-readable string. This string is then displayed either using print_string(), or in some code even fprintf(). As a result, a typical sequence of code when formatting a size is something like the following: SPRINT_BUF(b); print_uint(PRINT_JSON, "foo", NULL, foo); print_string(PRINT_FP, NULL, "foo %s ", sprint_size(foo, b)); For a concept as broadly useful as size, it would be better to have a dedicated function in json_print. To that end, move sprint_size() from tc_util to json_print. Add helpers print_size() and print_color_size() that wrap around sprint_size() and provide the JSON dispatch as appropriate. Since print_size() should be the preferred interface, convert the vast majority of uses of sprint_size() to print_size(). Two notable exceptions are: - q_tbf, which does not show the size as such, but uses the string "$human_readable_size/$cell_size" even in JSON. There is simply no way to have print_size() emit the same text, because print_size() in JSON mode should of course just use the raw number, without human-readable frills. - q_cake, which relies on the existence of sprint_size() in its macro-based formatting helpers. There might be ways to convert this particular case, but given q_tbf simply cannot be converted, leave it as is. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:30:25 +00:00
Petr Machata	60265cc226	lib: Move print_rate() from tc here; modernize The functions print_rate() and sprint_rate() are useful for formatting rate-like values. The DCB tool would find these useful in the maxrate subtool. However, the current interface to these functions uses a global variable use_iec as a flag indicating whether 1024- or 1000-based powers should be used when formatting the rate value. For general use, a global variable is not a great way of passing arguments to a function. Besides, it is unlike most other printing functions in that it deals in buffers and ignores JSON. Therefore make the interface to print_rate() explicit by converting use_iec to an ordinary parameter. Since the interface changes anyway, convert it to follow the pattern of other json_print functions (except for the now-explicit use_iec parameter). Move to json_print.c. Add a wrapper to tc, so that all the call sites do not need to repeat the use_iec global variable argument, and convert all call sites. In q_cake.c, the conversion is not straightforward due to usage of a macro that is shared across numerous data types. Simply hand-roll the corresponding code, which seems better than making an extra helper for one call site. Drop sprint_rate() now that everybody just uses print_rate(). Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:30:15 +00:00
Petr Machata	cdd9425315	Move the use_iec declaration to the tools The tools "ip" and "tc" use a flag "use_iec", which indicates whether, when formatting rate values, the prefixes "K", "M", etc. should refer to powers of 1024, or powers of 1000. The flag is currently kept as a global variable in "ip" and "tc", but is nonetheless declared in util.h. Instead, move the declaration to tool-specific headers ip/ip_common.h and tc/tc_common.h. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:28:43 +00:00
Paolo Lungaroni	69629b4e43	seg6: add support for vrftable attribute in SRv6 End.DT4/DT6 behaviors We introduce the "vrftable" attribute for supporting the SRv6 End.DT4 and End.DT6 behaviors in iproute2. The "vrftable" attribute indicates the routing table associated with the VRF device used by SRv6 End.DT4/DT6 for routing IPv4/IPv6 packets. The SRv6 End.DT4/DT6 is used to implement IPv4/IPv6 L3 VPNs based on Segment Routing over IPv6 networks in multi-tenant environments. It decapsulates the received packets and it performs the IPv4/IPv6 routing lookup in the routing table of the tenant. The SRv6 End.DT4/DT6 leverages a VRF device in order to force the routing lookup into the associated routing table using the "vrftable" attribute. Some examples: $ ip -6 route add 2001:db8::1 encap seg6local action End.DT4 vrftable 100 dev eth0 $ ip -6 route add 2001:db8::2 encap seg6local action End.DT6 vrftable 200 dev eth0 Standard Output: $ ip -6 route show 2001:db8::1 2001:db8::1 encap seg6local action End.DT4 vrftable 100 dev eth0 metric 1024 pref medium JSON Output: $ ip -6 -j -p route show 2001:db8::2 [ { "dst": "2001:db8::2", "encap": "seg6local", "action": "End.DT6", "vrftable": 200, "dev": "eth0", "metric": 1024, "flags": [ ], "pref": "medium" } ] v2: - no changes made: resubmit after pulling out this patch from the kernel patchset. v1: - mixing this patch with the kernel patchset confused patchwork. Signed-off-by: Paolo Lungaroni <paolo.lungaroni@cnit.it> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:27:42 +00:00
David Ahern	cfad32569f	Update kernel headers Update kernel headers to commit: afae3cc2da10 ("net: atheros: simplify the return expression of atl2_phy_setup_autoneg_adv()") Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-09 02:25:34 +00:00
David Ahern	8065d28218	Merge branch 'main' into next Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-04 16:25:12 +00:00
David Ahern	b3c4a55064	Only compile mnl_utils when HAVE_MNL is defined New lib/mnl_utils.c fails to compile if libmnl is not installed: mnl_utils.c:9:10: fatal error: libmnl/libmnl.h: No such file or directory 9 | #include <libmnl/libmnl.h> Make it dependent on HAVE_MNL. Fixes: `72858c7b77` ("lib: Extract from devlink/mnlg a helper, mnlu_socket_open()") Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-04 16:19:05 +00:00
Stephen Hemminger	2e80ae89ca	Merge branch 'gcc-10' into main	2020-12-03 08:33:06 -08:00
Luca Boccassi	755b1c584e	tc/mqprio: json-ify output As reported by a Debian user, mqprio output in json mode is invalid: { "kind": "mqprio", "handle": "8021:", "dev": "enp1s0f0", "root": true, "options": { tc 2 map 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 queues:(0:3) (4:7) mode:channel shaper:dcb} } json-ify it, while trying to maintain the same formatting for standard output. New output: { "kind": "mqprio", "handle": "8001:", "root": true, "options": { "tc": 2, "map": [ 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], "queues": [ [ 0, 3 ], [ 4, 7 ] ], "mode": "channel", "shaper": "dcb" } } https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972784 Reported-by: Roméo GINON <romeo.ginon@ilexia.com> Signed-off-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-12-03 08:32:42 -08:00
Luca Boccassi	975c4944e8	ip/netns: use flock when setting up /run/netns If multiple ip processes are run at the same time to set up separate network namespaces, and it is the first time so /run/netns has to be set up first, and they end up doing it at the same time, the processes might enter a recursive loop creating thousands of mount points, which might crash the system depending on resources available. Try to take a flock on /run/netns before doing the mount() dance, to ensure this cannot happen. But do not try too hard, and if it fails continue after printing a warning, to avoid introducing regressions. First reported on Debian: https://bugs.debian.org/949235 To reproduce (WARNING: run in a VM to avoid system lockups): for i in {0..9} do strace -e trace=mount -e inject=mount:delay_exit=1000000 ip \ netns add "testnetns$i" 2>&1 | tee "$i.log" & done wait The strace is to ensure the problem always reproduces, to add an artificial synchronization point after the first mount(). Reported-by: Etienne Dechamps <etienne@edechamps.fr> Signed-off-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-12-03 08:31:23 -08:00
Vlad Buslov	ea130da81e	tc: implement support for action terse dump Implement support for action terse dump using new TCA_ACT_FLAG_TERSE_DUMP value of TCA_ROOT_FLAGS tlv. Set the flag when user requested it with following example CLI (-br for 'brief'): $ tc -s -br actions ls action tunnel_key total acts 2 action order 0: tunnel_key index 1 Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 action order 1: tunnel_key index 2 Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 In terse mode dump only outputs essential data needed to identify the action (kind, index) and stats, if requested by the user. Signed-off-by: Vlad Buslov <vlad@buslov.dev> Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-03 03:51:06 +00:00
Vlad Buslov	00fffb2d79	tc: use TCA_ACT_ prefix for action flags Use TCA_ACT_FLAG_LARGE_DUMP_ON alias according to new preferred naming for action flags. Signed-off-by: Vlad Buslov <vlad@buslov.dev> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-03 03:49:14 +00:00
David Ahern	23683dec32	Update kernel headers Update kernel headers to commit: cec85994c6b4 ("bareudp: constify device_type declaration") Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-03 03:47:07 +00:00
Sergey Ryazanov	d7190d4ced	ip: add IP_LIB_DIR environment variable Do not hardcode /usr/lib/ip as a path and allow libraries path configuration in run-time. Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-12-02 16:37:07 +00:00
Stephen Hemminger	fb054cb336	uapi: update devlink.h Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-11-29 21:17:22 -08:00
Stephen Hemminger	c95d63e4fb	uapi: update devlink.h Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-11-29 21:16:50 -08:00
Stephen Hemminger	cae2e9291a	f_u32: fix compiler gcc-10 compiler warning With gcc-10 it complains about array subscript error. f_u32.c: In function ‘u32_parse_opt’: f_u32.c:1113:24: warning: array subscript 0 is outside the bounds of an interior zero-length array ‘struct tc_u32_key[0]’ [-Wzero-length-bounds] 1113 | hash = sel2.sel.keys[0].val & sel2.sel.keys[0].mask; | ~~~~~~~~~~~~~^~~ In file included from tc_util.h:11, from f_u32.c:26: ../include/uapi/linux/pkt_cls.h:253:20: note: while referencing ‘keys’ 253 | struct tc_u32_key keys[0]; | This is because the keys are actually allocated in the second element of the parent structure. Simplest way to address the warning is to assign directly to the keys in the containing structure. This has always been in iproute2 (pre-git) so no Fixes. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-11-29 16:20:33 -08:00
Stephen Hemminger	c014983921	misc: fix compiler warning in ifstat and nstat The code here was doing strncpy() in a way that causes gcc 10 warning about possible string overflow. Just use strlcpy() which will null terminate and bound the string as expected. This has existed since start of git era so no Fixes tag. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-11-29 16:20:31 -08:00
Stephen Hemminger	2319db9052	tc: fix compiler warnings in ip6 pedit Gcc-10 complains about referencing a zero size array. This occurs because the array of keys is actually in the following structure which is part of the overall selector. The original code was safe, but better to just use the key array directly. Fixes: `2d9a8dc439` ("tc: p_ip6: Support pedit of IPv6 dsfield") Cc: petrm@mellanox.com Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-11-29 16:20:23 -08:00
Stephen Hemminger	5bdc4e9151	bridge: fix string length warning Gcc-10 complains about possible string length overflow. This can't happen because the Ethernet address format is always limited to 18 characters or fewer. Just resize the temp buffer. Fixes: `70dfb0b883` ("iplink: bridge: export bridge_id and designated_root") Cc: nikolay@cumulusnetworks.com Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-11-29 16:20:16 -08:00
Stephen Hemminger	f817699939	devlink: fix uninitialized warning GCC-10 complains about uninitialized variable. devlink.c: In function ‘cmd_dev’: devlink.c:2803:12: warning: ‘val_u32’ may be used uninitialized in this function [-Wmaybe-uninitialized] 2803 | val_u16 = val_u32; | ~~~~~~~~^~~~~~~~~ devlink.c:2747:11: note: ‘val_u32’ was declared here 2747 | uint32_t val_u32; | ^~~~~~~ This is a false positive because it can't figure out the control flow when the parse returns error. Fixes: `2557dca2b0` ("devlink: Add string to uint{8,16,32} conversion for generic parameters") Cc: shalomt@mellanox.com Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-11-29 16:19:36 -08:00
Vladimir Oltean	c29f65db34	bridge: add support for L2 multicast groups Extend the 'bridge mdb' command for the following syntax: bridge mdb add dev br0 port swp0 grp 01:02:03:04:05:06 permanent Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-29 20:54:02 +00:00
Luca Boccassi	f5c1246e6a	Add dcb/.gitignore Signed-off-by: Luca Boccassi <bluca@debian.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-29 20:39:47 +00:00
David Ahern	f98ce50046	Merge branch 'libbpf' into next Hangbin Liu says: ==================== This series converts iproute2 to use libbpf for loading and attaching BPF programs when it is available. This means that iproute2 will correctly process BTF information and support the new-style BTF-defined maps, while keeping compatibility with the old internal map definition syntax. This is achieved by checking for libbpf at './configure' time, and using it if available. By default the system libbpf will be used, but static linking against a custom libbpf version can be achieved by passing LIBBPF_DIR to configure. LIBBPF_FORCE can be set to on to force configure to abort if no suitable libbpf is found (useful for automatic packaging that wants to enforce the dependency), or set to off to disable the libbpf check and build iproute2 with legacy bpf. The old iproute2 bpf code is kept and will be used if no suitable libbpf is available. When using libbpf, wrapper code ensures that iproute2 will still understand the old map definition format, including populating map-in-map and tail call maps before load. The examples in bpf/examples are kept, and a separate set of examples is added with BTF-based map definitions for those examples where this is possible (libbpf doesn't currently support declaratively populating tail call maps). Finally, thanks a lot to Toke for his help on this patch set. v6: a) print runtime libbpf version in ip -V and tc -V v5: a) Fix LIBBPF_DIR typo and description, use libbpf DESTDIR as LIBBPF_DIR dest. b) Fix bpf_prog_load_dev typo. c) rebase to latest iproute2-next. v4: a) Make variable LIBBPF_FORCE able to control whether to build iproute2 with libbpf or not. b) Add new file bpf_glue.c for libbpf/legacy mixed bpf calls. c) Fix some build issues and shell compatibility error. v3: a) Update configure to check function bpf_program__section_name() separately b) Add a new function get_bpf_program__section_name() to choose whether to use bpf_program__title() or not. 
c) Test build the patch on Fedora 33 with libbpf-0.1.0-1.fc33 and libbpf-devel-0.1.0-1.fc33 v2: a) Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead. b) Add ipvrf with libbpf support. Here are the test results with patched iproute2: == Show libbpf version $ ip -V ip utility, iproute2-5.9.0, libbpf 0.1.0 $ tc -V tc utility, iproute2-5.9.0, libbpf 0.1.0 == setup env $ clang -O2 -Wall -g -target bpf -c bpf_graft.c -o btf_graft.o $ clang -O2 -Wall -g -target bpf -c bpf_map_in_map.c -o btf_map_in_map.o $ clang -O2 -Wall -g -target bpf -c bpf_shared.c -o btf_shared.o $ clang -O2 -Wall -g -target bpf -c legacy/bpf_cyclic.c -o bpf_cyclic.o $ clang -O2 -Wall -g -target bpf -c legacy/bpf_graft.c -o bpf_graft.o $ clang -O2 -Wall -g -target bpf -c legacy/bpf_map_in_map.c -o bpf_map_in_map.o $ clang -O2 -Wall -g -target bpf -c legacy/bpf_shared.c -o bpf_shared.o $ clang -O2 -Wall -g -target bpf -c legacy/bpf_tailcall.c -o bpf_tailcall.o $ rm -rf /sys/fs/bpf/xdp/globals $ /root/iproute2/ip/ip link add type veth $ /root/iproute2/ip/ip link set veth0 up $ /root/iproute2/ip/ip link set veth1 up == Load objs $ /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 4 tag 3056d2382e53f27c jited $ ls /sys/fs/bpf/xdp/globals jmp_tc $ bpftool map show 1: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 4: xdp name cls_aaa tag 3056d2382e53f27c gpl loaded_at 2020-10-22T08:04:21-0400 uid 0 xlated 80B jited 71B memlock 4096B btf_id 5 $ /root/iproute2/ip/ip link set veth0 xdp off $ /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group 
default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 8 tag 4420e72b2a601ed7 jited $ ls /sys/fs/bpf/xdp/globals jmp_tc map_inner map_outer $ bpftool map show 1: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 2: array name map_inner flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 3: array_of_maps name map_outer flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 8: xdp name imain tag 4420e72b2a601ed7 gpl loaded_at 2020-10-22T08:04:23-0400 uid 0 xlated 336B jited 193B memlock 4096B map_ids 3 btf_id 10 $ /root/iproute2/ip/ip link set veth0 xdp off $ /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 12 tag 9cbab549c3af3eab jited $ ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef: map_sh /sys/fs/bpf/xdp/globals: jmp_tc map_inner map_outer $ bpftool map show 1: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 2: array name map_inner flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 3: array_of_maps name map_outer flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 4: array name map_sh flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 12: xdp name imain tag 9cbab549c3af3eab gpl loaded_at 2020-10-22T08:04:25-0400 uid 0 xlated 224B jited 139B memlock 4096B map_ids 4 btf_id 15 $ /root/iproute2/ip/ip link set veth0 xdp off == Load objs again to make sure maps could be reused $ /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 
link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 16 tag 3056d2382e53f27c jited $ ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef: map_sh /sys/fs/bpf/xdp/globals: jmp_tc map_inner map_outer $ bpftool map show 1: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 2: array name map_inner flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 3: array_of_maps name map_outer flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 4: array name map_sh flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 16: xdp name cls_aaa tag 3056d2382e53f27c gpl loaded_at 2020-10-22T08:04:27-0400 uid 0 xlated 80B jited 71B memlock 4096B btf_id 20 $ /root/iproute2/ip/ip link set veth0 xdp off $ /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 20 tag 4420e72b2a601ed7 jited $ ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef: map_sh /sys/fs/bpf/xdp/globals: jmp_tc map_inner map_outer $ bpftool map show [236/4518] 1: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 2: array name map_inner flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 3: array_of_maps name map_outer flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 4: array name map_sh flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 20: xdp name imain tag 4420e72b2a601ed7 gpl loaded_at 2020-10-22T08:04:29-0400 uid 0 xlated 336B jited 193B memlock 4096B map_ids 3 btf_id 25 $ /root/iproute2/ip/ip link set veth0 xdp off $ /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress $ 
/root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 24 tag 9cbab549c3af3eab jited $ ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef: map_sh /sys/fs/bpf/xdp/globals: jmp_tc map_inner map_outer $ bpftool map show 1: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 2: array name map_inner flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 3: array_of_maps name map_outer flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 4: array name map_sh flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 24: xdp name imain tag 9cbab549c3af3eab gpl loaded_at 2020-10-22T08:04:31-0400 uid 0 xlated 224B jited 139B memlock 4096B map_ids 4 btf_id 30 $ /root/iproute2/ip/ip link set veth0 xdp off $ rm -rf /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals == Testing if we can load new-style objects (using xdp-filter as an example) $ /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_all.o sec xdp_filter $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 28 tag e29eeda1489a6520 jited $ ls /sys/fs/bpf/xdp/globals filter_ethernet filter_ipv4 filter_ipv6 filter_ports xdp_stats_map $ bpftool map show 5: percpu_array name xdp_stats_map flags 0x0 key 4B value 16B max_entries 5 memlock 4096B btf_id 35 6: percpu_array name filter_ports flags 0x0 key 4B value 8B max_entries 65536 memlock 1576960B btf_id 35 7: percpu_hash name filter_ipv4 flags 0x0 key 4B value 8B max_entries 10000 memlock 1064960B btf_id 35 8: percpu_hash name filter_ipv6 flags 0x0 key 16B value 8B 
max_entries 10000 memlock 1142784B btf_id 35 9: percpu_hash name filter_ethernet flags 0x0 key 6B value 8B max_entries 10000 memlock 1064960B btf_id 35 $ bpftool prog show 28: xdp name xdpfilt_alw_all tag e29eeda1489a6520 gpl loaded_at 2020-10-22T08:04:33-0400 uid 0 xlated 2408B jited 1405B memlock 4096B map_ids 9,5,7,8,6 btf_id 35 $ /root/iproute2/ip/ip link set veth0 xdp off $ /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_ip.o sec xdp_filter $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 32 tag 2f2b9dbfb786a5a2 jited $ ls /sys/fs/bpf/xdp/globals filter_ethernet filter_ipv4 filter_ipv6 filter_ports xdp_stats_map $ bpftool map show 5: percpu_array name xdp_stats_map flags 0x0 key 4B value 16B max_entries 5 memlock 4096B btf_id 35 6: percpu_array name filter_ports flags 0x0 key 4B value 8B max_entries 65536 memlock 1576960B btf_id 35 7: percpu_hash name filter_ipv4 flags 0x0 key 4B value 8B max_entries 10000 memlock 1064960B btf_id 35 8: percpu_hash name filter_ipv6 flags 0x0 key 16B value 8B max_entries 10000 memlock 1142784B btf_id 35 9: percpu_hash name filter_ethernet flags 0x0 key 6B value 8B max_entries 10000 memlock 1064960B btf_id 35 $ bpftool prog show 32: xdp name xdpfilt_alw_ip tag 2f2b9dbfb786a5a2 gpl loaded_at 2020-10-22T08:04:35-0400 uid 0 xlated 1336B jited 778B memlock 4096B map_ids 7,8,5 btf_id 40 $ /root/iproute2/ip/ip link set veth0 xdp off $ /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_tcp.o sec xdp_filter $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 36 tag 18c1bb25084030bc jited $ ls /sys/fs/bpf/xdp/globals filter_ethernet filter_ipv4 
filter_ipv6 filter_ports xdp_stats_map $ bpftool map show 5: percpu_array name xdp_stats_map flags 0x0 key 4B value 16B max_entries 5 memlock 4096B btf_id 35 6: percpu_array name filter_ports flags 0x0 key 4B value 8B max_entries 65536 memlock 1576960B btf_id 35 7: percpu_hash name filter_ipv4 flags 0x0 key 4B value 8B max_entries 10000 memlock 1064960B btf_id 35 8: percpu_hash name filter_ipv6 flags 0x0 key 16B value 8B max_entries 10000 memlock 1142784B btf_id 35 9: percpu_hash name filter_ethernet flags 0x0 key 6B value 8B max_entries 10000 memlock 1064960B btf_id 35 $ bpftool prog show 36: xdp name xdpfilt_alw_tcp tag 18c1bb25084030bc gpl loaded_at 2020-10-22T08:04:37-0400 uid 0 xlated 1128B jited 690B memlock 4096B map_ids 6,5 btf_id 45 $ /root/iproute2/ip/ip link set veth0 xdp off $ rm -rf /sys/fs/bpf/xdp/globals == Load new btf defined maps $ /root/iproute2/ip/ip link set veth0 xdp obj btf_graft.o sec aaa $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 40 tag 3056d2382e53f27c jited $ ls /sys/fs/bpf/xdp/globals jmp_tc $ bpftool map show 10: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 40: xdp name cls_aaa tag 3056d2382e53f27c gpl loaded_at 2020-10-22T08:04:39-0400 uid 0 xlated 80B jited 71B memlock 4096B btf_id 50 $ /root/iproute2/ip/ip link set veth0 xdp off $ /root/iproute2/ip/ip link set veth0 xdp obj btf_map_in_map.o sec ingress $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 44 tag 4420e72b2a601ed7 jited $ ls /sys/fs/bpf/xdp/globals jmp_tc map_outer $ bpftool map show 10: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 11: 
array name map_inner flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 13: array_of_maps name map_outer flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 44: xdp name imain tag 4420e72b2a601ed7 gpl loaded_at 2020-10-22T08:04:41-0400 uid 0 xlated 336B jited 193B memlock 4096B map_ids 13 btf_id 55 $ /root/iproute2/ip/ip link set veth0 xdp off $ /root/iproute2/ip/ip link set veth0 xdp obj btf_shared.o sec ingress $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff prog/xdp id 48 tag 9cbab549c3af3eab jited $ ls /sys/fs/bpf/xdp/globals jmp_tc map_outer map_sh $ bpftool map show 10: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 11: array name map_inner flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 13: array_of_maps name map_outer flags 0x0 key 4B value 4B max_entries 1 memlock 4096B 14: array name map_sh flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 48: xdp name imain tag 9cbab549c3af3eab gpl loaded_at 2020-10-22T08:04:43-0400 uid 0 xlated 224B jited 139B memlock 4096B map_ids 14 btf_id 60 $ /root/iproute2/ip/ip link set veth0 xdp off $ rm -rf /sys/fs/bpf/xdp/globals == Test load objs by tc $ /root/iproute2/tc/tc qdisc add dev veth0 ingress $ /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_cyclic.o sec 0xabccba/0 $ /root/iproute2/tc/tc filter add dev veth0 parent ffff: bpf obj bpf_graft.o $ /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/0 $ /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/1 $ /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 43/0 $ /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec classifier $ /root/iproute2/ip/ip link show veth0 5: veth0@veth1: 
<BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff $ ls /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f /sys/fs/bpf/xdp/globals /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d: jmp_tc /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f: jmp_ex jmp_tc map_sh /sys/fs/bpf/xdp/globals: jmp_tc $ bpftool map show 15: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B owner_prog_type sched_cls owner jited 16: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 1 memlock 4096B owner_prog_type sched_cls owner jited 17: prog_array name jmp_ex flags 0x0 key 4B value 4B max_entries 1 memlock 4096B owner_prog_type sched_cls owner jited 18: prog_array name jmp_tc flags 0x0 key 4B value 4B max_entries 2 memlock 4096B owner_prog_type sched_cls owner jited 19: array name map_sh flags 0x0 key 4B value 4B max_entries 1 memlock 4096B $ bpftool prog show 52: sched_cls name cls_loop tag 3e98a40b04099d36 gpl loaded_at 2020-10-22T08:04:45-0400 uid 0 xlated 168B jited 133B memlock 4096B map_ids 15 btf_id 65 56: sched_cls name cls_entry tag 0fbb4d9310a6ee26 gpl loaded_at 2020-10-22T08:04:45-0400 uid 0 xlated 144B jited 121B memlock 4096B map_ids 16 btf_id 70 60: sched_cls name cls_case1 tag e06a3bd62293d65d gpl loaded_at 2020-10-22T08:04:45-0400 uid 0 xlated 328B jited 216B memlock 4096B map_ids 19,17 btf_id 75 66: sched_cls name cls_case1 tag e06a3bd62293d65d gpl loaded_at 2020-10-22T08:04:45-0400 uid 0 xlated 328B jited 216B memlock 4096B map_ids 19,17 btf_id 80 72: sched_cls name cls_case1 tag e06a3bd62293d65d gpl loaded_at 2020-10-22T08:04:45-0400 uid 0 xlated 328B jited 216B memlock 4096B map_ids 19,17 btf_id 85 78: sched_cls name cls_case1 tag e06a3bd62293d65d gpl loaded_at 2020-10-22T08:04:45-0400 uid 0 xlated 328B jited 216B memlock 4096B map_ids 19,17 btf_id 90 79: 
sched_cls name cls_case2 tag ee218ff893dca823 gpl loaded_at 2020-10-22T08:04:45-0400 uid 0 xlated 336B jited 218B memlock 4096B map_ids 19,18 btf_id 90 80: sched_cls name cls_exit tag e78a58140deed387 gpl loaded_at 2020-10-22T08:04:45-0400 uid 0 xlated 288B jited 177B memlock 4096B map_ids 19 btf_id 90 I also ran the following upstream kselftests with the patched iproute2, and all passed. test_lwt_ip_encap.sh test_xdp_redirect.sh test_tc_redirect.sh test_xdp_meta.sh test_xdp_veth.sh test_xdp_vlan.sh ==================== Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 22:24:15 -07:00
Hangbin Liu	71c7c1fb4f	examples/bpf: add bpf examples with BTF defined maps Users should try to use the new BTF-defined maps instead of struct bpf_elf_map defined maps. The tail call examples are not added yet as libbpf doesn't currently support declaratively populating tail call maps. Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Hangbin Liu <haliu@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 22:14:08 -07:00
Hangbin Liu	1ac8285a69	examples/bpf: move struct bpf_elf_map defined maps to legacy folder Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Hangbin Liu <haliu@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 22:14:06 -07:00
Hangbin Liu	6d61a2b557	lib: add libbpf support This patch converts iproute2 to use libbpf for loading and attaching BPF programs when it is available, building on the work started by Toke's implementation[1]. With libbpf, iproute2 can correctly process BTF information and support the new-style BTF-defined maps, while keeping compatibility with the old internal map definition syntax. The old iproute2 bpf code is kept and will be used if no suitable libbpf is available. When using libbpf, wrapper code in bpf_legacy.c ensures that iproute2 will still understand the old map definition format, including populating map-in-map and tail call maps before load. In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the legacy bytes. When handling the legacy maps, for map-in-maps, we create them manually and re-use the fd as they are associated with id/inner_id. For pin maps, we only set the pin path and let libbpf load handle it. For tail calls, we find it first and update the element after prog load. Other maps/progs will be loaded by libbpf directly. [1] https://lore.kernel.org/bpf/20190820114706.18546-1-toke@redhat.com/ Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Hangbin Liu <haliu@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 22:14:05 -07:00
Hangbin Liu	dc800a4ed4	lib: make ipvrf able to use libbpf and fix function name conflicts There are direct calls in libbpf for bpf program load/attach. So we can just use two wrapper functions for ipvrf and convert them to support libbpf. Function bpf_prog_load() is removed as it conflicts with the libbpf function name. bpf.c is moved to bpf_legacy.c for later main libbpf support in iproute2. Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Hangbin Liu <haliu@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 22:14:04 -07:00
Hangbin Liu	503e9229b0	iproute2: add check_libbpf() and get_libbpf_version() This patch aims to add basic checking functions for later iproute2 libbpf support. First we add check_libbpf() in configure to see if we have bpf library support. By default the system libbpf will be used, but static linking against a custom libbpf version can be achieved by passing libbpf DESTDIR to variable LIBBPF_DIR for configure. Another variable LIBBPF_FORCE is used to control whether to build iproute2 with libbpf. If set to on, then force to build with libbpf and exit if not available. If set to off, then force to not build with libbpf. When dynamically linking against libbpf, we can't be sure that the version we discovered at compile time is actually the one we are using at runtime. This can lead to hard-to-debug errors. So we add a new file lib/bpf_glue.c and a helper function get_libbpf_version() to get the correct libbpf version at runtime. Signed-off-by: Hangbin Liu <haliu@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 22:14:02 -07:00
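The commit message does not say how the runtime version is discovered. As a rough sketch of one possible approach (an assumption for illustration, not necessarily what lib/bpf_glue.c does), a process on Linux could scan its own /proc/self/maps for the mapped shared object and read the version from the file name. The function name sketch_libbpf_version_from_maps_line() and the "libbpf.so.X.Y.Z" naming pattern are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of iproute2: given one line of
 * /proc/self/maps, copy the "X.Y.Z" suffix of a "libbpf.so.X.Y.Z"
 * path into out. Returns 0 on success, -1 if the line does not name
 * such a library or the version does not fit in out.
 */
int sketch_libbpf_version_from_maps_line(const char *line, char *out,
					 size_t len)
{
	static const char prefix[] = "libbpf.so.";
	const char *p = strstr(line, prefix);
	size_t n;

	if (!p)
		return -1;

	p += strlen(prefix);
	/* The version is the run of digits and dots after the prefix. */
	n = strspn(p, "0123456789.");
	if (n == 0 || n >= len)
		return -1;

	memcpy(out, p, n);
	out[n] = '\0';
	return 0;
}
```

A caller would read /proc/self/maps line by line, stop at the first line for which the helper returns 0, and fall back to the compile-time version otherwise.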
David Ahern	ee5d4b24e3	Merge branch 'main' into next Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 22:04:48 -07:00
Roi Dayan	ed40b7e2ae	tc flower: fix parsing vlan_id and vlan_prio When protocol is vlan then eth_type is set to the vlan eth type. So when parsing vlan_id and vlan_prio, we need to check that tc_proto is vlan, not eth_type. Fixes: 4c551369e0 ("tc flower: use right ethertype in icmp/arp parsing") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 21:45:20 -07:00
Petr Machata	ca5ec9a17a	ip: iptuntap: Convert to use print_on_off() Instead of rolling a custom on-off printer, use the one added to utils.c. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 21:43:41 -07:00
Petr Machata	66e574c4c5	ip: ipnetconf: Convert to use print_on_off() Instead of rolling a custom on-off printer, use the one added to utils.c. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 21:43:34 -07:00
Petr Machata	07d82b4a79	ip: iplink_bridge_slave: Convert to use print_on_off() Instead of rolling a custom on-off printer, use the one added to utils.c. Note that _print_onoff() has an extra parameter for a JSON-specific flag name. However that argument is not used, and never was. Therefore when moving over to print_on_off(), drop this argument. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 21:43:30 -07:00
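A minimal sketch of what a shared on-off printer can look like, assuming (as the dcb-tool cover letter in this log describes) that plain-text output shows the strings "on" and "off" while JSON output carries a boolean. The function name sketch_format_on_off() and its buffer-based interface are hypothetical; the real print_on_off() in utils.c goes through iproute2's json_print layer and may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for an on-off printer: format one flag either
 * as a JSON member with a boolean value or as "key on " / "key off "
 * for plain-text output. Returns what snprintf() returns.
 */
int sketch_format_on_off(char *buf, size_t len, bool json,
			 const char *key, bool value)
{
	if (json)
		return snprintf(buf, len, "\"%s\":%s", key,
				value ? "true" : "false");

	return snprintf(buf, len, "%s %s ", key, value ? "on" : "off");
}
```

Keeping the JSON/text decision in one helper is what lets callers drop per-tool arguments such as the unused JSON flag name mentioned above.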
Petr Machata	3e0d2a73ba	ip: iplink_bridge_slave: Port over to parse_on_off() Invoke parse_on_off() from bridge_slave_parse_on_off() instead of hand-rolling one. Exit on failure, because the invarg that was invoked here before would. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 21:43:27 -07:00
Petr Machata	5f685d064b	ip: iplink: Convert to use parse_on_off() Invoke parse_on_off() instead of rolling a custom function. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 21:43:23 -07:00
Petr Machata	94d12fd796	bridge: link: Convert to use print_on_off() Instead of rolling a custom on-off printer, use the one added to utils.c. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 21:43:19 -07:00
Petr Machata	9262ccc3ed	bridge: link: Port over to parse_on_off() Convert bridge/link.c from a custom on_off parser to the new global one. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-24 21:43:14 -07:00
David Ahern	e1ae6efbb8	Merge branch 'nexthop-flags' into next

Ido Schimmel says:

====================
From: Ido Schimmel <idosch@nvidia.com>

Patch #1 prints the recently added 'RTNH_F_TRAP' flag.

Patch #2 makes sure that nexthop flags are always printed for nexthop objects, even when the nexthop does not have a device, such as a blackhole nexthop or a group.

Example output with netdevsim:

    $ ip nexthop
    id 1 via 192.0.2.2 dev eth0 scope link trap
    id 2 blackhole trap
    id 3 group 2 trap

Example output with mlxsw:

    $ ip nexthop
    id 1 via 192.0.2.2 dev swp3 scope link offload
    id 2 blackhole offload
    id 3 group 2 offload

Tested with fib_nexthops.sh that uses "ip nexthop" output:

    Tests passed: 164
    Tests failed: 0
====================

Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-22 12:46:30 -07:00
Ido Schimmel	0788678991	nexthop: Always print nexthop flags Currently, the nexthop flags are only printed when the nexthop has a nexthop device. The offload / trap indication is therefore not printed for nexthop groups. Instead, always print the nexthop flags, regardless if the nexthop has a nexthop device or not. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-22 12:43:56 -07:00
Ido Schimmel	3de35f41be	ip route: Print "trap" nexthop indication The kernel can now signal that a nexthop is trapping packets instead of forwarding them. Print the flag to help users understand the offload state of each nexthop. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-22 12:42:20 -07:00
David Ahern	db8b149b16	Update kernel headers Update kernel headers to commit: f9e425e99b07 ("octeontx2-af: Add support for RSS hashing based on Transport protocol field") Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-22 12:41:23 -07:00
Stephen Hemminger	7a49ff9d79	bridge: report correct version Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-11-15 08:58:52 -08:00
Zahari Doychev	4c551369e0	tc flower: use right ethertype in icmp/arp parsing Currently the icmp and arp parsing functions are called with incorrect ethtype in case of vlan or cvlan filter options. In this case either cvlan_ethtype or vlan_ethtype has to be used. The ethtype is now updated each time a vlan ethtype is matched during parsing. Signed-off-by: Zahari Doychev <zahari.doychev@linux.com> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-13 20:07:38 -07:00
David Ahern	1ed00380b0	Merge branch 'dcb-tool' into next

Petr Machata says:

====================
The Linux DCB interface allows configuration of a broad range of hardware-specific attributes, such as TC scheduling, flow control, per-port buffer configuration, TC rate, etc.

Currently a common libre tool for configuration of DCB is OpenLLDP. This suite contains a daemon that uses Linux DCB interface to configure HW according to the DCB TLVs exchanged over an interface. The daemon can also be controlled by a client, through which the user can adjust and view the configuration. The downside of using OpenLLDP is that it is somewhat heavyweight and difficult to use in scripts, and does not support extensions such as buffer and rate commands.

For access to many HW features, one would be perfectly fine with a fire-and-forget tool along the lines of "ip" or "tc". For scripting in particular, this would be ideal. This author is aware of one such tool, mlnx_qos from Mellanox OFED scripts collection[1]. The downside here is that the tool is very verbose, the command line language is awkward to use, it is not packaged in Linux distros, and generally has the appearance of a very vendor-specific tool, despite not being one.

This patchset addresses the above issues by providing a seed of a clean, well-documented, easily usable, extensible fire-and-forget tool for DCB configuration:

    # dcb ets set dev eni1np1 \
        tc-tsa all:strict 0:ets 1:ets 2:ets \
        tc-bw all:0 0:33 1:33 2:34

    # dcb ets show dev eni1np1 tc-tsa tc-bw
    tc-tsa 0:ets 1:ets 2:ets 3:strict 4:strict 5:strict 6:strict 7:strict
    tc-bw 0:33 1:33 2:34 3:0 4:0 5:0 6:0 7:0

    # dcb ets set dev eni1np1 tc-bw 1:30 2:37

    # dcb -j ets show dev eni1np1 | jq '.tc_bw[2]'
    37

The patchset proceeds as follows:

- Many tools in iproute2 have an option to work in batch mode, where the commands to run are given in a file. The code to handle batching is largely the same independent of the tool in question. In patch #1, add a helper to handle the batching, and migrate individual tools to use it.

- A number of configuration options come in a form of an on-off switch. This in turn can be considered a special case of parsing one of a given set of strings. In patch #2, extract helpers to parse one of a number of strings, on top of which build an on-off parser.

  Currently each tool open-codes the logic to parse the on-off toggle. A future patch set will migrate instances of this code over to the new helpers.

- The on/off toggles from previous list item sometimes need to be dumped. While in the FP output, one typically wishes to maintain consistency with the command line and show actual strings, "on" and "off", in JSON output one would rather use booleans. This logic is somewhat annoying to have to open-code time and again. Therefore in patch #3, add a helper to do just that.

- The DCB tool is built on top of libmnl. Several routines will be basically the same in DCB as they are currently in devlink. In patches #4-#6, extract them to a new module, mnl_utils, for easy reuse.

- Much of DCB is built around arrays. A syntax similar to the iplink_vlan's ingress-qos-map / egress-qos-map is very handy for describing changes done to such arrays. Therefore in patch #7, extract a helper, parse_mapping(), which manages parsing of key-value arrays. In patch #8, fix a buglet in the helper, and in patch #9, extend it to allow setting of all array elements in one go.

- In patch #10, add a skeleton of "dcb", which contains common helpers and dispatches to subtools for handling of individual objects. The skeleton is empty as of this patch. In patch #11, add "dcb_ets", a module for handling of specifically DCB ETS objects.

The intention is to gradually add handlers for at least PFC, APP, peer configuration, buffers and rates.

[1] https://github.com/Mellanox/mlnx-tools/tree/master/ofed_scripts
====================

Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-13 19:48:52 -07:00
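The layering described for patch #2, an on-off parser built on a "one of these strings" parser, can be sketched roughly as follows. The names sketch_parse_one_of() and sketch_parse_on_off() are hypothetical, exact string comparison is a simplifying assumption, and the error value -1 is chosen only for illustration; the helpers actually added to iproute2 may differ in signature and matching rules.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the index of val within list. On a
 * mismatch, report the accepted values on stderr, set *p_err to -1
 * and return 0. On success *p_err is set to 0.
 */
int sketch_parse_one_of(const char *msg, const char *val,
			const char *const *list, size_t len, int *p_err)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (strcmp(val, list[i]) == 0) {
			*p_err = 0;
			return (int)i;
		}
	}

	fprintf(stderr, "Error: argument of \"%s\" must be one of:", msg);
	for (i = 0; i < len; i++)
		fprintf(stderr, " \"%s\"", list[i]);
	fprintf(stderr, ", not \"%s\"\n", val);

	*p_err = -1;
	return 0;
}

/* On-off parsing is the special case of a two-element list. */
bool sketch_parse_on_off(const char *msg, const char *val, int *p_err)
{
	static const char *const values[] = { "off", "on" };

	/* Index 0 is "off", index 1 is "on". */
	return sketch_parse_one_of(msg, val, values,
				   sizeof(values) / sizeof(values[0]),
				   p_err) != 0;
}
```

Because the error is reported through p_err rather than by exiting, each caller can decide whether a bad value is fatal.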
Petr Machata	ef15b07601	dcb: Add a subtool for the DCB ETS object

ETS, for "Enhanced Transmission Selection", is a set of configurations that permit configuration of mapping of priorities to traffic classes, traffic selection algorithm to use per traffic class, bandwidth allocation, etc.

Add a dcb subtool to allow showing and tweaking of individual ETS configuration options. For example:

    # dcb ets show dev eni1np1
    willing on ets_cap 8 cbs off
    tc-bw 0:0 1:0 2:0 3:0 4:100 5:0 6:0 7:0
    pg-bw 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0
    tc-tsa 0:strict 1:strict 2:strict 3:strict 4:ets 5:strict 6:strict 7:strict
    prio-tc 0:1 1:3 2:5 3:0 4:0 5:0 6:0 7:0
    reco-tc-bw 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0
    reco-tc-tsa 0:strict 1:strict 2:strict 3:strict 4:strict 5:strict 6:strict 7:strict
    reco-prio-tc 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0

Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-13 19:43:19 -07:00
Petr Machata	67033d1c1c	Add skeleton of a new tool, dcb The Linux DCB interface allows configuration of a broad range of hardware-specific attributes, such as TC scheduling, flow control, per-port buffer configuration, TC rate, etc. Add a new tool to show that configuration and tweak it. DCB allows configuration of several objects, and possibly could expand to pre-standard CEE interfaces. Therefore the tool itself is a lean shell that dispatches to subtools each dedicated to one of the objects. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-13 19:43:19 -07:00
Petr Machata	66a2d71487	lib: parse_mapping: Recognize a keyword "all" The DCB tool will have to provide an interface to a number of fixed-size arrays. Unlike the egress- and ingress-qos-map, it makes good sense to have an interface to set all members to the same value. For example to set strict priority on all TCs besides select few, or to reset allocated bandwidth to all zeroes, again besides several explicitly-given ones. To support this usage, extend the parse_mapping() with a boolean that determines whether this special use is supported. If "all" is given and recognized, mapping_cb is called with the key of -1. Have iplink_vlan pass false for allow_all. Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-13 19:43:15 -07:00
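A simplified sketch of the "all" handling described above, assuming tokens of the form key:value and a callback that treats key (uint32_t)-1 as "every element". The names sketch_parse_mapping(), sketch_set_bw() and SKETCH_NUM_TC are hypothetical, and this version copies the key instead of editing the argument in place; the real parse_mapping() may differ in signature and details.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NUM_TC 8	/* hypothetical fixed array size */

/* Hypothetical callback: key UINT32_MAX (i.e. -1) sets every element. */
int sketch_set_bw(uint32_t key, const char *value, void *data)
{
	unsigned int *bw = data;
	char *end;
	unsigned long v = strtoul(value, &end, 10);
	size_t i;

	if (*value == '\0' || *end != '\0' || v > 100)
		return -1;

	if (key == UINT32_MAX) {
		for (i = 0; i < SKETCH_NUM_TC; i++)
			bw[i] = (unsigned int)v;
		return 0;
	}
	if (key >= SKETCH_NUM_TC)
		return -1;

	bw[key] = (unsigned int)v;
	return 0;
}

/*
 * Consume "key:value" tokens until one has no colon (not an error) or
 * fails to parse (error). argc/argv are advanced past every token that
 * was handled, so on error *argvp points at the offending token.
 */
int sketch_parse_mapping(int *argcp, char ***argvp, bool allow_all,
			 int (*cb)(uint32_t, const char *, void *),
			 void *data)
{
	int argc = *argcp;
	char **argv = *argvp;
	int ret = 0;

	while (argc > 0) {
		const char *tok = *argv;
		const char *colon = strchr(tok, ':');
		char keybuf[16];
		uint32_t key;
		size_t klen;

		if (!colon)
			break;

		klen = (size_t)(colon - tok);
		if (klen == 0 || klen >= sizeof(keybuf)) {
			ret = -1;
			break;
		}
		memcpy(keybuf, tok, klen);
		keybuf[klen] = '\0';

		if (allow_all && strcmp(keybuf, "all") == 0) {
			key = UINT32_MAX;
		} else {
			char *end;
			unsigned long k = strtoul(keybuf, &end, 10);

			if (*end != '\0' || k >= UINT32_MAX) {
				ret = -1;
				break;
			}
			key = (uint32_t)k;
		}

		if (cb(key, colon + 1, data)) {
			ret = -1;
			break;
		}
		argc--;
		argv++;
	}

	*argcp = argc;
	*argvp = argv;
	return ret;
}
```

With allow_all set, the tokens all:0 0:33 1:33 2:34 first zero the whole array and then override three entries, which is the usage the message describes. With allow_all unset, "all" is rejected like any other non-numeric key.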
Petr Machata	bc3523ae70	lib: parse_mapping: Update argc, argv on error

Currently argc and argv are not updated unless parsing of all of the mapping was successful. However in that case, "ip link" will point at the wrong argument when complaining:

    # ip link add name eth0.100 link eth0 type vlan id 100 egress 1:1 2:foo
    Error: argument "1" is wrong: invalid egress-qos-map

Update argc and argv even in the case of parsing error, so that the right element is indicated.

Signed-off-by: Petr Machata <me@pmachata.org> Signed-off-by: David Ahern <dsahern@gmail.com>	2020-11-13 19:43:15 -07:00

5401 Commits