mirror_ubuntu-kernels

mirror of https://git.proxmox.com/git/mirror_ubuntu-kernels.git synced 2025-11-11 23:11:01 +00:00

Author	SHA1	Message	Date
Andrii Nakryiko	2a6a9bf261	libbpf: Don't call libc APIs with NULL pointers Sanitizer complains about qsort(), bsearch(), and memcpy() being called with NULL pointer. This can only happen when the associated number of elements is zero, so no harm should be done. But still prevent this from happening to keep sanitizer runs clean from extra noise. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124002325.1737739-5-andrii@kernel.org	2021-11-26 00:15:02 +01:00
Andrii Nakryiko	401891a9de	libbpf: Fix potential misaligned memory access in btf_ext__new() Perform a memory copy before we do the sanity checks of btf_ext_hdr. This prevents misaligned memory access if raw btf_ext data is not 4-byte aligned ([0]). While at it, also add missing const qualifier. [0] Closes: https://github.com/libbpf/libbpf/issues/391 Fixes: `2993e0515b` ("tools/bpf: add support to read .BTF.ext sections") Reported-by: Evgeny Vereshchagin <evvers@ya.ru> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124002325.1737739-3-andrii@kernel.org	2021-11-26 00:14:06 +01:00
Andrii Nakryiko	99a12a32fe	libbpf: Prevent deprecation warnings in xsk.c xsk.c is using own APIs that are marked for deprecation internally. Given xsk.c and xsk.h will be gone in libbpf 1.0, there is no reason to do public vs internal function split just to avoid deprecation warnings. So just add a pragma to silence deprecation warnings (until the code is removed completely). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124193233.3115996-4-andrii@kernel.org	2021-11-25 23:35:46 +01:00
Andrii Nakryiko	a9606f405f	libbpf: Use bpf_map_create() consistently internally Remove all the remaining uses of to-be-deprecated bpf_create_map*() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124193233.3115996-3-andrii@kernel.org	2021-11-25 23:35:46 +01:00
Andrii Nakryiko	992c422541	libbpf: Unify low-level map creation APIs w/ new bpf_map_create() Mark the entire zoo of low-level map creation APIs for deprecation in libbpf 0.7 ([0]) and introduce a new bpf_map_create() API that is OPTS-based (and thus future-proof) and matches the BPF_MAP_CREATE command name. While at it, ensure that gen_loader sends map_extra field. Also remove now unneeded btf_key_type_id/btf_value_type_id logic that libbpf is doing anyways. [0] Closes: https://github.com/libbpf/libbpf/issues/282 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124193233.3115996-2-andrii@kernel.org	2021-11-25 23:35:46 +01:00
Andrii Nakryiko	16e0c35c6f	libbpf: Load global data maps lazily on legacy kernels Load global data maps lazily, if kernel is too old to support global data. Make sure that programs are still correct by detecting if any of the to-be-loaded programs have relocation against any of such maps. This allows to solve the issue ([0]) with bpf_printk() and Clang generating unnecessary and unreferenced .rodata.strX.Y sections, but it also goes further along the CO-RE lines, allowing to have a BPF object in which some code can work on very old kernels and relies only on BPF maps explicitly, while other BPF programs might enjoy global variable support. If such programs are correctly set to not load at runtime on old kernels, bpf_object will load and function correctly now. [0] https://lore.kernel.org/bpf/CAK-59YFPU3qO+_pXWOH+c1LSA=8WA1yabJZfREjOEXNHAqgXNg@mail.gmail.com/ Fixes: `aed659170a` ("libbpf: Support multiple .rodata.* and .data.* BPF maps") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211123200105.387855-1-andrii@kernel.org	2021-11-25 23:05:23 +01:00
Florent Revest	8cccee9e91	libbpf: Change bpf_program__set_extra_flags to bpf_program__set_flags bpf_program__set_extra_flags has just been introduced so we can still change it without breaking users. This new interface is a bit more flexible (for example if someone wants to clear a flag). Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211119180035.1396139-1-revest@chromium.org	2021-11-19 14:21:37 -08:00
Andrii Nakryiko	efdd3eb801	libbpf: Accommodate DWARF/compiler bug with duplicated structs According to [0], compilers sometimes might produce duplicate DWARF definitions for exactly the same struct/union within the same compilation unit (CU). We've had similar issues with identical arrays and handled them with a similar workaround in `6b6e6b1d09` ("libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays"). Do the same for struct/union by ensuring that two structs/unions are exactly the same, down to the integer values of field referenced type IDs. Solving this more generically (allowing referenced types to be equivalent, but using different type IDs, all within a single CU) requires a huge complexity increase to handle many-to-many mappings between canonidal and candidate type graphs. Before we invest in that, let's see if this approach handles all the instances of this issue in practice. Thankfully it's pretty rare, it seems. [0] https://lore.kernel.org/bpf/YXr2NFlJTAhHdZqq@krava/ Reported-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211117194114.347675-1-andrii@kernel.org	2021-11-19 16:59:16 +01:00
Andrii Nakryiko	7615209f42	libbpf: Add runtime APIs to query libbpf version Libbpf provided LIBBPF_MAJOR_VERSION and LIBBPF_MINOR_VERSION macros to check libbpf version at compilation time. This doesn't cover all the needs, though, because version of libbpf that application is compiled against doesn't necessarily match the version of libbpf at runtime, especially if libbpf is used as a shared library. Add libbpf_major_version() and libbpf_minor_version() returning major and minor versions, respectively, as integers. Also add a convenience libbpf_version_string() for various tooling using libbpf to print out libbpf version in a human-readable form. Currently it will return "v0.6", but in the future it can contains some extra information, so the format itself is not part of a stable API and shouldn't be relied upon. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20211118174054.2699477-1-andrii@kernel.org	2021-11-19 16:25:57 +01:00
Jakub Kicinski	50fc24944a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-11-18 13:13:16 -08:00
Jakub Kicinski	f083ec3160	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2021-11-16 We've added 12 non-merge commits during the last 5 day(s) which contain a total of 23 files changed, 573 insertions(+), 73 deletions(-). The main changes are: 1) Fix pruning regression where verifier went overly conservative rejecting previsouly accepted programs, from Alexei Starovoitov and Lorenz Bauer. 2) Fix verifier TOCTOU bug when using read-only map's values as constant scalars during verification, from Daniel Borkmann. 3) Fix a crash due to a double free in XSK's buffer pool, from Magnus Karlsson. 4) Fix libbpf regression when cross-building runqslower, from Jean-Philippe Brucker. 5) Forbid use of bpf_ktime_get_coarse_ns() and bpf_timer_() helpers in tracing programs due to deadlock possibilities, from Dmitrii Banshchikov. 6) Fix checksum validation in sockmap's udp_read_sock() callback, from Cong Wang. 7) Various BPF sample fixes such as XDP stats in xdp_sample_user, from Alexander Lobakin. 8) Fix libbpf gen_loader error handling wrt fd cleanup, from Kumar Kartikeya Dwivedi. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: udp: Validate checksum in udp_read_sock() bpf: Fix toctou on read-only map's constant scalar tracking samples/bpf: Fix build error due to -isystem removal selftests/bpf: Add tests for restricted helpers bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs libbpf: Perform map fd cleanup for gen_loader in case of error samples/bpf: Fix incorrect use of strlen in xdp_redirect_cpu tools/runqslower: Fix cross-build samples/bpf: Fix summary per-sec stats in xdp_sample_user selftests/bpf: Check map in map pruning bpf: Fix inner map state pruning regression. xsk: Fix crash on double free in buffer pool ==================== Link: https://lore.kernel.org/r/20211116141134.6490-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-11-16 16:53:48 -08:00
Yonghong Song	69a055d546	libbpf: Fix a couple of missed btf_type_tag handling in btf.c Commit `2dc1e488e5` ("libbpf: Support BTF_KIND_TYPE_TAG") added the BTF_KIND_TYPE_TAG support. But to test vmlinux build with ... #define __user __attribute__((btf_type_tag("user"))) ... I needed to sync libbpf repo and manually copy libbpf sources to pahole. To simplify process, I used BTF_KIND_RESTRICT to simulate the BTF_KIND_TYPE_TAG with vmlinux build as "restrict" modifier is barely used in kernel. But this approach missed one case in dedup with structures where BTF_KIND_RESTRICT is handled and BTF_KIND_TYPE_TAG is not handled in btf_dedup_is_equiv(), and this will result in a pahole dedup failure. This patch fixed this issue and a selftest is added in the subsequent patch to test this scenario. The other missed handling is in btf__resolve_size(). Currently the compiler always emit like PTR->TYPE_TAG->... so in practice we don't hit the missing BTF_KIND_TYPE_TAG handling issue with compiler generated code. But lets add case BTF_KIND_TYPE_TAG in the switch statement to be future proof. Fixes: `2dc1e488e5` ("libbpf: Support BTF_KIND_TYPE_TAG") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211115163937.3922235-1-yhs@fb.com	2021-11-16 13:10:52 +01:00
Jakub Kicinski	a5bdc36354	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2021-11-15 We've added 72 non-merge commits during the last 13 day(s) which contain a total of 171 files changed, 2728 insertions(+), 1143 deletions(-). The main changes are: 1) Add btf_type_tag attributes to bring kernel annotations like __user/__rcu to BTF such that BPF verifier will be able to detect misuse, from Yonghong Song. 2) Big batch of libbpf improvements including various fixes, future proofing APIs, and adding a unified, OPTS-based bpf_prog_load() low-level API, from Andrii Nakryiko. 3) Add ingress_ifindex to BPF_SK_LOOKUP program type for selectively applying the programmable socket lookup logic to packets from a given netdev, from Mark Pashmfouroush. 4) Remove the 128M upper JIT limit for BPF programs on arm64 and add selftest to ensure exception handling still works, from Russell King and Alan Maguire. 5) Add a new bpf_find_vma() helper for tracing to map an address to the backing file such as shared library, from Song Liu. 6) Batch of various misc fixes to bpftool, fixing a memory leak in BPF program dump, updating documentation and bash-completion among others, from Quentin Monnet. 7) Deprecate libbpf bpf_program__get_prog_info_linear() API and migrate its users as the API is heavily tailored around perf and is non-generic, from Dave Marchevsky. 8) Enable libbpf's strict mode by default in bpftool and add a --legacy option as an opt-out for more relaxed BPF program requirements, from Stanislav Fomichev. 9) Fix bpftool to use libbpf_get_error() to check for errors, from Hengqi Chen. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits) bpftool: Use libbpf_get_error() to check error bpftool: Fix mixed indentation in documentation bpftool: Update the lists of names for maps and prog-attach types bpftool: Fix indent in option lists in the documentation bpftool: Remove inclusion of utilities.mak from Makefiles bpftool: Fix memory leak in prog_dump() selftests/bpf: Fix a tautological-constant-out-of-range-compare compiler warning selftests/bpf: Fix an unused-but-set-variable compiler warning bpf: Introduce btf_tracing_ids bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs bpftool: Enable libbpf's strict mode by default docs/bpf: Update documentation for BTF_KIND_TYPE_TAG support selftests/bpf: Clarify llvm dependency with btf_tag selftest selftests/bpf: Add a C test for btf_type_tag selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests selftests/bpf: Test libbpf API function btf__add_type_tag() bpftool: Support BTF_KIND_TYPE_TAG libbpf: Support BTF_KIND_TYPE_TAG ... ==================== Link: https://lore.kernel.org/r/20211115162008.25916-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-11-15 08:49:23 -08:00
Kumar Kartikeya Dwivedi	ba05fd36b8	libbpf: Perform map fd cleanup for gen_loader in case of error Alexei reported a fd leak issue in gen loader (when invoked from bpftool) [0]. When adding ksym support, map fd allocation was moved from stack to loader map, however I missed closing these fds (relevant when cleanup label is jumped to on error). For the success case, the allocated fd is returned in loader ctx, hence this problem is not noticed. Make three changes, first MAX_USED_MAPS in MAX_FD_ARRAY_SZ instead of MAX_USED_PROGS, the braino was not a problem until now for this case as we didn't try to close map fds (otherwise use of it would have tried closing 32 additional fds in ksym btf fd range). Then, do a cleanup for all nr_maps fds in cleanup label code, so that in case of error all temporary map fds from bpf_gen__map_create are closed. Then, adjust the cleanup label to only generate code for the required number of program and map fds. To trim code for remaining program fds, lay out prog_fd array in stack in the end, so that we can directly skip the remaining instances. Still stack size remains same, since changing that would require changes in a lot of places (including adjustment of stack_off macro), so nr_progs_sz variable is only used to track required number of iterations (and jump over cleanup size calculated from that), stack offset calculation remains unaffected. The difference for test_ksyms_module.o is as follows: libbpf: //prog cleanup iterations: before = 34, after = 5 libbpf: //maps cleanup iterations: before = 64, after = 2 Also, move allocation of gen->fd_array offset to bpf_gen__init. Since offset can now be 0, and we already continue even if add_data returns 0 in case of failure, we do not need to distinguish between 0 offset and failure case 0, as we rely on bpf_gen__finish to check errors. We can also skip check for gen->fd_array in add_*_fd functions, since bpf_gen__init will take care of it. [0]: https://lore.kernel.org/bpf/CAADnVQJ6jSitKSNKyxOrUzwY2qDRX0sPkJ=VLGHuCLVJ=qOt9g@mail.gmail.com Fixes: `18f4fccbf3` ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211112232022.899074-1-memxor@gmail.com	2021-11-12 17:23:46 -08:00
Sasha Levin	7246f4dcac	tools/lib/lockdep: drop liblockdep TL;DR: While a tool like liblockdep is useful, it probably doesn't belong within the kernel tree. liblockdep attempts to reuse kernel code both directly (by directly building the kernel's lockdep code) as well as indirectly (by using sanitized headers). This makes liblockdep an integral part of the kernel. It also makes liblockdep quite unique: while other userspace code might use sanitized headers, it generally doesn't attempt to use kernel code directly which means that changes on the kernel side of things don't affect (and break) it directly. All our workflows and tooling around liblockdep don't support this uniqueness. Changes that go into the kernel code aren't validated to not break in-tree userspace code. liblockdep ended up being very fragile, breaking over and over, to the point that living in the same tree as the lockdep code lost most of it's value. liblockdep should continue living in an external tree, syncing with the kernel often, in a controllable way. Signed-off-by: Sasha Levin <sashal@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-11-12 11:07:17 -08:00
Yonghong Song	2dc1e488e5	libbpf: Support BTF_KIND_TYPE_TAG Add libbpf support for BTF_KIND_TYPE_TAG. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012614.1505315-1-yhs@fb.com	2021-11-11 17:41:11 -08:00
Andrii Nakryiko	4178893465	libbpf: Make perf_buffer__new() use OPTS-based interface Add new variants of perf_buffer__new() and perf_buffer__new_raw() that use OPTS-based options for future extensibility ([0]). Given all the currently used API names are best fits, re-use them and use ___libbpf_override() approach and symbol versioning to preserve ABI and source code compatibility. struct perf_buffer_opts and struct perf_buffer_raw_opts are kept as well, but they are restructured such that they are OPTS-based when used with new APIs. For struct perf_buffer_raw_opts we keep few fields intact, so we have to also preserve the memory location of them both when used as OPTS and for legacy API variants. This is achieved with anonymous padding for OPTS "incarnation" of the struct. These pads can be eventually used for new options. [0] Closes: https://github.com/libbpf/libbpf/issues/311 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-6-andrii@kernel.org	2021-11-11 16:54:05 -08:00
Andrii Nakryiko	6084f5dc92	libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof Change btf_dump__new() and corresponding struct btf_dump_ops structure to be extensible by using OPTS "framework" ([0]). Given we don't change the names, we use a similar approach as with bpf_prog_load(), but this time we ended up with two APIs with the same name and same number of arguments, so overloading based on number of arguments with ___libbpf_override() doesn't work. Instead, use "overloading" based on types. In this particular case, print callback has to be specified, so we detect which argument is a callback. If it's 4th (last) argument, old implementation of API is used by user code. If not, it must be 2nd, and thus new implementation is selected. The rest is handled by the same symbol versioning approach. btf_ext argument is dropped as it was never used and isn't necessary either. If in the future we'll need btf_ext, that will be added into OPTS-based struct btf_dump_opts. struct btf_dump_opts is reused for both old API and new APIs. ctx field is marked deprecated in v0.7+ and it's put at the same memory location as OPTS's sz field. Any user of new-style btf_dump__new() will have to set sz field and doesn't/shouldn't use ctx, as ctx is now passed along the callback as mandatory input argument, following the other APIs in libbpf that accept callbacks consistently. Again, this is quite ugly in implementation, but is done in the name of backwards compatibility and uniform and extensible future APIs (at the same time, sigh). And it will be gone in libbpf 1.0. [0] Closes: https://github.com/libbpf/libbpf/issues/283 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-5-andrii@kernel.org	2021-11-11 16:54:05 -08:00
Andrii Nakryiko	957d350a8b	libbpf: Turn btf_dedup_opts into OPTS-based struct btf__dedup() and struct btf_dedup_opts were added before we figured out OPTS mechanism. As such, btf_dedup_opts is non-extensible without breaking an ABI and potentially crashing user application. Unfortunately, btf__dedup() and btf_dedup_opts are short and succinct names that would be great to preserve and use going forward. So we use ___libbpf_override() macro approach, used previously for bpf_prog_load() API, to define a new btf__dedup() variant that accepts only struct btf * and struct btf_dedup_opts * arguments, and rename the old btf__dedup() implementation into btf__dedup_deprecated(). This keeps both source and binary compatibility with old and new applications. The biggest problem was struct btf_dedup_opts, which wasn't OPTS-based, and as such doesn't have `size_t sz;` as a first field. But btf__dedup() is a pretty rarely used API and I believe that the only currently known users (besides selftests) are libbpf's own bpf_linker and pahole. Neither use case actually uses options and just passes NULL. So instead of doing extra hacks, just rewrite struct btf_dedup_opts into OPTS-based one, move btf_ext argument into those opts (only bpf_linker needs to dedup btf_ext, so it's not a typical thing to specify), and drop never used `dont_resolve_fwds` option (it was never used anywhere, AFAIK, it makes BTF dedup much less useful and efficient). Just in case, for old implementation, btf__dedup_deprecated(), detect non-NULL options and error out with helpful message, to help users migrate, if there are any user playing with btf__dedup(). The last remaining piece is dedup_table_size, which is another anachronism from very early days of BTF dedup. Since then it has been reduced to the only valid value, 1, to request forced hash collisions. This is only used during testing. So instead introduce a bool flag to force collisions explicitly. This patch also adapts selftests to new btf__dedup() and btf_dedup_opts use to avoid selftests breakage. [0] Closes: https://github.com/libbpf/libbpf/issues/281 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-4-andrii@kernel.org	2021-11-11 16:54:05 -08:00
Andrii Nakryiko	a6ca715831	libbpf: Add ability to get/set per-program load flags Add bpf_program__flags() API to retrieve prog_flags that will be (or were) supplied to BPF_PROG_LOAD command. Also add bpf_program__set_extra_flags() API to allow to set extra flags, in addition to those determined by program's SEC() definition. Such flags are logically OR'ed with libbpf-derived flags. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111051758.92283-2-andrii@kernel.org	2021-11-11 16:44:26 -08:00
Linus Torvalds	f54ca91fe6	Networking fixes for 5.16-rc1, including fixes from bpf, can and netfilter. Current release - regressions: - bpf: do not reject when the stack read size is different from the tracked scalar size - net: fix premature exit from NAPI state polling in napi_disable() - riscv, bpf: fix RV32 broken build, and silence RV64 warning Current release - new code bugs: - net: fix possible NULL deref in sock_reserve_memory - amt: fix error return code in amt_init(); fix stopping the workqueue - ax88796c: use the correct ioctl callback Previous releases - always broken: - bpf: stop caching subprog index in the bpf_pseudo_func insn - security: fixups for the security hooks in sctp - nfc: add necessary privilege flags in netlink layer, limit operations to admin only - vsock: prevent unnecessary refcnt inc for non-blocking connect - net/smc: fix sk_refcnt underflow on link down and fallback - nfnetlink_queue: fix OOB when mac header was cleared - can: j1939: ignore invalid messages per standard - bpf, sockmap: - fix race in ingress receive verdict with redirect to self - fix incorrect sk_skb data_end access when src_reg = dst_reg - strparser, and tls are reusing qdisc_skb_cb and colliding - ethtool: fix ethtool msg len calculation for pause stats - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries to access an unregistering real_dev - udp6: make encap_rcv() bump the v6 not v4 stats - drv: prestera: add explicit padding to fix m68k build - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge - drv: mvpp2: fix wrong SerDes reconfiguration order Misc & small latecomers: - ipvs: auto-load ipvs on genl access - mctp: sanity check the struct sockaddr_mctp padding fields - libfs: support RENAME_EXCHANGE in simple_rename() - avoid double accounting for pure zerocopy skbs Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGNQdwACgkQMUZtbf5S IrsiMQ//f66lTJ8PJ5Qj70hX9dC897olx7uGHB9eiKoyOcJI459hFlfXwRU2T4Tf fPNwPNUQ9Mynw9tX/jWEi+7zd6r6TSHGXK49U9/rIbQ95QjKY4LHowIE63x+vPl2 5Cpf+80zXC3DUX1fijgyG1ujnU3kBaqopTxDLmlsHw2PGkwT5Ox1DUwkhc370eEL xlpq3PYGWA8/AQNyhSVBkG/UmoLaq0jYNP5yVcOj4jGjgcgLe1SLrqczENr35QHZ cRkuBsFBMBZF7wSX2f9qQIB/+b1pcLlD9IO+K3S7Ruq+rUd7qfL/tmwNxEh0axYK AyIun1Bxcy7QJGjtpGAz+Ku7jS9T3HxzyxhqilQo3co8jAW0WJ1YwHl+XPgQXyjV DLG6Vxt4syiwsoSXGn8MQugs4nlBT+0qWl8YamIR+o7KkAYPc2QWkXlzEDfNeIW8 JNCZA3sy7VGi1ytorZGx16sQsEWnyRG9a6/WV20Dr+HVs1SKPcFzIfG6mVngR07T mQMHnbAF6Z5d8VTcPQfMxd7UH48s1bHtk5lcSTa3j0Cw+GkA6ytTmjPdJ1qRcdkH dl9jAfADe4O6frG+9XH7FEFqhmkghVI7bOCA4ZOhClVaIcDGgEZc2y7sY9/oZ7P4 KXBD2R5X1caCUM0UtzwL7/8ddOtPtHIrFnhY+7+I6ijt9qmI0BY= =Ttgq -----END PGP SIGNATURE----- Merge tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and netfilter. Current release - regressions: - bpf: do not reject when the stack read size is different from the tracked scalar size - net: fix premature exit from NAPI state polling in napi_disable() - riscv, bpf: fix RV32 broken build, and silence RV64 warning Current release - new code bugs: - net: fix possible NULL deref in sock_reserve_memory - amt: fix error return code in amt_init(); fix stopping the workqueue - ax88796c: use the correct ioctl callback Previous releases - always broken: - bpf: stop caching subprog index in the bpf_pseudo_func insn - security: fixups for the security hooks in sctp - nfc: add necessary privilege flags in netlink layer, limit operations to admin only - vsock: prevent unnecessary refcnt inc for non-blocking connect - net/smc: fix sk_refcnt underflow on link down and fallback - nfnetlink_queue: fix OOB when mac header was cleared - can: j1939: ignore invalid messages per standard - bpf, sockmap: - fix race in ingress receive verdict with redirect to self - fix incorrect sk_skb data_end access when src_reg = dst_reg - strparser, and tls are reusing qdisc_skb_cb and colliding - ethtool: fix ethtool msg len calculation for pause stats - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries to access an unregistering real_dev - udp6: make encap_rcv() bump the v6 not v4 stats - drv: prestera: add explicit padding to fix m68k build - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge - drv: mvpp2: fix wrong SerDes reconfiguration order Misc & small latecomers: - ipvs: auto-load ipvs on genl access - mctp: sanity check the struct sockaddr_mctp padding fields - libfs: support RENAME_EXCHANGE in simple_rename() - avoid double accounting for pure zerocopy skbs" * tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (123 commits) selftests/net: udpgso_bench_rx: fix port argument net: wwan: iosm: fix compilation warning cxgb4: fix eeprom len when diagnostics not implemented net: fix premature exit from NAPI state polling in napi_disable() net/smc: fix sk_refcnt underflow on linkdown and fallback net/mlx5: Lag, fix a potential Oops with mlx5_lag_create_definer() gve: fix unmatched u64_stats_update_end() net: ethernet: lantiq_etop: Fix compilation error selftests: forwarding: Fix packet matching in mirroring selftests vsock: prevent unnecessary refcnt inc for nonblocking connect net: marvell: mvpp2: Fix wrong SerDes reconfiguration order net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory net: stmmac: allow a tc-taprio base-time of zero selftests: net: test_vxlan_under_vrf: fix HV connectivity test net: hns3: allow configure ETS bandwidth of all TCs net: hns3: remove check VF uc mac exist when set by PF net: hns3: fix some mac statistics is always 0 in device version V2 net: hns3: fix kernel crash when unload VF while it is being reset net: hns3: sync rx ring head in echo common pull net: hns3: fix pfc packet number incorrect after querying pfc parameters ...	2021-11-11 09:49:36 -08:00
Kumar Kartikeya Dwivedi	3a74ac2d11	libbpf: Compile using -std=gnu89 The minimum supported C standard version is C89, with use of GNU extensions, hence make sure to catch any instances that would break the build for this mode by passing -std=gnu89. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211105234243.390179-4-memxor@gmail.com	2021-11-09 13:27:52 -08:00
Andrii Nakryiko	8f7b239ea8	libbpf: Free up resources used by inner map definition It's not enough to just free(map->inner_map), as inner_map itself can have extra memory allocated, like map name. Fixes: `646f02ffdd` ("libbpf: Add BTF-defined map-in-map support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com> Link: https://lore.kernel.org/bpf/20211107165521.9240-3-andrii@kernel.org	2021-11-07 09:14:15 -08:00
Andrii Nakryiko	5c5edcdebf	libbpf: Remove deprecation attribute from struct bpf_prog_prep_result This deprecation annotation has no effect because for struct deprecation attribute has to be declared after struct definition. But instead of moving it to the end of struct definition, remove it. When deprecation will go in effect at libbpf v0.7, this deprecation attribute will cause libbpf's own source code compilation to trigger deprecation warnings, which is unavoidable because libbpf still has to support that API. So keep deprecation of APIs, but don't mark structs used in API as deprecated. Fixes: `e21d585cb3` ("libbpf: Deprecate multi-instance bpf_program APIs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211103220845.2676888-8-andrii@kernel.org	2021-11-07 08:34:23 -08:00
Andrii Nakryiko	bcc40fc002	libbpf: Stop using to-be-deprecated APIs Remove all the internal uses of libbpf APIs that are slated to be deprecated in v0.7. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211103220845.2676888-6-andrii@kernel.org	2021-11-07 08:34:23 -08:00
Andrii Nakryiko	e32660ac6f	libbpf: Remove internal use of deprecated bpf_prog_load() variants Remove all the internal uses of bpf_load_program_xattr(), which is slated for deprecation in v0.7. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211103220845.2676888-5-andrii@kernel.org	2021-11-07 08:34:23 -08:00
Andrii Nakryiko	d10ef2b825	libbpf: Unify low-level BPF_PROG_LOAD APIs into bpf_prog_load() Add a new unified OPTS-based low-level API for program loading, bpf_prog_load() ([0]). bpf_prog_load() accepts few "mandatory" parameters as input arguments (program type, name, license, instructions) and all the other optional (as in not required to specify for all types of BPF programs) fields into struct bpf_prog_load_opts. This makes all the other non-extensible APIs variant for BPF_PROG_LOAD obsolete and they are slated for deprecation in libbpf v0.7: - bpf_load_program(); - bpf_load_program_xattr(); - bpf_verify_program(). Implementation-wise, internal helper libbpf__bpf_prog_load is refactored to become a public bpf_prog_load() API. struct bpf_prog_load_params used internally is replaced by public struct bpf_prog_load_opts. Unfortunately, while conceptually all this is pretty straightforward, the biggest complication comes from the already existing bpf_prog_load() high-level API, which has nothing to do with BPF_PROG_LOAD command. We try really hard to have a new API named bpf_prog_load(), though, because it maps naturally to BPF_PROG_LOAD command. For that, we rename old bpf_prog_load() into bpf_prog_load_deprecated() and mark it as COMPAT_VERSION() for shared library users compiled against old version of libbpf. Statically linked users and shared lib users compiled against new version of libbpf headers will get "rerouted" to bpf_prog_deprecated() through a macro helper that decides whether to use new or old bpf_prog_load() based on number of input arguments (see ___libbpf_overload in libbpf_common.h). To test that existing bpf_prog_load()-using code compiles and works as expected, I've compiled and ran selftests as is. I had to remove (locally) selftest/bpf/Makefile -Dbpf_prog_load=bpf_prog_test_load hack because it was conflicting with the macro-based overload approach. I don't expect anyone else to do something like this in practice, though. This is testing-specific way to replace bpf_prog_load() calls with special testing variant of it, which adds extra prog_flags value. After testing I kept this selftests hack, but ensured that we use a new bpf_prog_load_deprecated name for this. This patch also marks bpf_prog_load() and bpf_prog_load_xattr() as deprecated. bpf_object interface has to be used for working with struct bpf_program. Libbpf doesn't support loading just a bpf_program. The silver lining is that when we get to libbpf 1.0 all these complication will be gone and we'll have one clean bpf_prog_load() low-level API with no backwards compatibility hackery surrounding it. [0] Closes: https://github.com/libbpf/libbpf/issues/284 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211103220845.2676888-4-andrii@kernel.org	2021-11-07 08:34:23 -08:00
Andrii Nakryiko	45493cbaf5	libbpf: Pass number of prog load attempts explicitly Allow to control number of BPF_PROG_LOAD attempts from outside the sys_bpf_prog_load() helper. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211103220845.2676888-3-andrii@kernel.org	2021-11-07 08:34:23 -08:00
Andrii Nakryiko	be80e9cdbc	libbpf: Rename DECLARE_LIBBPF_OPTS into LIBBPF_OPTS It's confusing that libbpf-provided helper macro doesn't start with LIBBPF. Also "declare" vs "define" is confusing terminology, I can never remember and always have to look up previous examples. Bypass both issues by renaming DECLARE_LIBBPF_OPTS into a short and clean LIBBPF_OPTS. To avoid breaking existing code, provide: #define DECLARE_LIBBPF_OPTS LIBBPF_OPTS in libbpf_legacy.h. We can decide later if we ever want to remove it or we'll keep it forever because it doesn't add any maintainability burden. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211103220845.2676888-2-andrii@kernel.org	2021-11-07 08:34:22 -08:00
Andrii Nakryiko	b8b5cb55f5	libbpf: Fix non-C89 loop variable declaration in gen_loader.c Fix the `int i` declaration inside the for statement. This is non-C89 compliant. See [0] for user report breaking BCC build. [0] https://github.com/libbpf/libbpf/issues/403 Fixes: `18f4fccbf3` ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20211105191055.3324874-1-andrii@kernel.org	2021-11-06 16:16:52 -07:00
Arnaldo Carvalho de Melo	7f9f879243	Merge remote-tracking branch 'torvalds/master' into perf/core To pick up some tools/perf/ patches that went via tip/perf/core, such as: tools/perf: Add mem_hops field in perf_mem_data_src structure Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-11-06 15:49:33 -03:00
Mehrdad Arshad Rad	64165ddf8e	libbpf: Fix lookup_and_delete_elem_flags error reporting Fix bpf_map_lookup_and_delete_elem_flags() to pass the return code through libbpf_err_errno() as we do similarly in bpf_map_lookup_and_delete_elem(). Fixes: `f12b654327` ("libbpf: Streamline error reporting for low-level APIs") Signed-off-by: Mehrdad Arshad Rad <arshad.rad@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211104171354.11072-1-arshad.rad@gmail.com	2021-11-05 16:26:08 +01:00
Andrii Nakryiko	be2f2d1680	libbpf: Deprecate bpf_program__load() API Mark bpf_program__load() as deprecated ([0]) since v0.6. Also rename few internal program loading bpf_object helper functions to have more consistent naming. [0] Closes: https://github.com/libbpf/libbpf/issues/301 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211103051449.1884903-1-andrii@kernel.org	2021-11-03 14:29:56 -07:00
Andrii Nakryiko	b7332d2820	libbpf: Improve ELF relo sanitization Add few sanity checks for relocations to prevent div-by-zero and out-of-bounds array accesses in libbpf. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-6-andrii@kernel.org	2021-11-03 13:25:37 -07:00
Andrii Nakryiko	0d6988e16a	libbpf: Fix section counting logic e_shnum does include section #0 and as such is exactly the number of ELF sections that we need to allocate memory for to use section indices as array indices. Fix the off-by-one error. This is purely accounting fix, previously we were overallocating one too many array items. But no correctness errors otherwise. Fixes: `25bbbd7a44` ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-5-andrii@kernel.org	2021-11-03 13:25:37 -07:00
Andrii Nakryiko	62554d52e7	libbpf: Validate that .BTF and .BTF.ext sections contain data .BTF and .BTF.ext ELF sections should have SHT_PROGBITS type and contain data. If they are not, ELF is invalid or corrupted, so bail out. Otherwise this can lead to data->d_buf being NULL and SIGSEGV later on. Reported by oss-fuzz project. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-4-andrii@kernel.org	2021-11-03 13:25:37 -07:00
Andrii Nakryiko	88918dc12d	libbpf: Improve sanity checking during BTF fix up If BTF is corrupted DATASEC's variable type ID might be incorrect. Prevent this easy to detect situation with extra NULL check. Reported by oss-fuzz project. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-3-andrii@kernel.org	2021-11-03 13:25:37 -07:00
Andrii Nakryiko	833907876b	libbpf: Detect corrupted ELF symbols section Prevent divide-by-zero if ELF is corrupted and has zero sh_entsize. Reported by oss-fuzz project. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-2-andrii@kernel.org	2021-11-03 13:25:37 -07:00
Dave Marchevsky	f5aafbc2af	libbpf: Deprecate bpf_program__get_prog_info_linear As part of the road to libbpf 1.0, and discussed in libbpf issue tracker [0], bpf_program__get_prog_info_linear and its associated structs and helper functions should be deprecated. The functionality is too specific to the needs of 'perf', and there's little/no out-of-tree usage to preclude introduction of a more general helper in the future. [0] Closes: https://github.com/libbpf/libbpf/issues/313 Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211101224357.2651181-5-davemarchevsky@fb.com	2021-11-03 11:25:36 -07:00
Jakub Kicinski	b7b98f8689	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-11-01 We've added 181 non-merge commits during the last 28 day(s) which contain a total of 280 files changed, 11791 insertions(+), 5879 deletions(-). The main changes are: 1) Fix bpf verifier propagation of 64-bit bounds, from Alexei. 2) Parallelize bpf test_progs, from Yucong and Andrii. 3) Deprecate various libbpf apis including af_xdp, from Andrii, Hengqi, Magnus. 4) Improve bpf selftests on s390, from Ilya. 5) bloomfilter bpf map type, from Joanne. 6) Big improvements to JIT tests especially on Mips, from Johan. 7) Support kernel module function calls from bpf, from Kumar. 8) Support typeless and weak ksym in light skeleton, from Kumar. 9) Disallow unprivileged bpf by default, from Pawan. 10) BTF_KIND_DECL_TAG support, from Yonghong. 11) Various bpftool cleanups, from Quentin. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (181 commits) libbpf: Deprecate AF_XDP support kbuild: Unify options for BTF generation for vmlinux and modules selftests/bpf: Add a testcase for 64-bit bounds propagation issue. bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit. bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off. selftests/bpf: Fix also no-alu32 strobemeta selftest bpf: Add missing map_delete_elem method to bloom filter map selftests/bpf: Add bloom map success test for userspace calls bpf: Add alignment padding for "map_extra" + consolidate holes bpf: Bloom filter map naming fixups selftests/bpf: Add test cases for struct_ops prog bpf: Add dummy BPF STRUCT_OPS for test purpose bpf: Factor out helpers for ctx access checking bpf: Factor out a helper to prepare trampoline for struct_ops prog selftests, bpf: Fix broken riscv build riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h tools, build: Add RISC-V to HOSTARCH parsing riscv, bpf: Increase the maximum number of iterations selftests, bpf: Add one test for sockmap with strparser selftests, bpf: Fix test_txmsg_ingress_parser error ... ==================== Link: https://lore.kernel.org/r/20211102013123.9005-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-11-01 19:59:46 -07:00
Magnus Karlsson	0b170456e0	libbpf: Deprecate AF_XDP support Deprecate AF_XDP support in libbpf ([0]). This has been moved to libxdp as it is a better fit for that library. The AF_XDP support only uses the public libbpf functions and can therefore just use libbpf as a library from libxdp. The libxdp APIs are exactly the same so it should just be linking with libxdp instead of libbpf for the AF_XDP functionality. If not, please submit a bug report. Linking with both libraries is supported but make sure you link in the correct order so that the new functions in libxdp are used instead of the deprecated ones in libbpf. Libxdp can be found at https://github.com/xdp-project/xdp-tools. [0] Closes: https://github.com/libbpf/libbpf/issues/270 Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20211029090111.4733-1-magnus.karlsson@gmail.com	2021-11-01 18:12:44 -07:00
Björn Töpel	589fed479b	riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h Add macros for 64-bit RISC-V PT_REGS to bpf_tracing.h. Signed-off-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211028161057.520552-4-bjorn@kernel.org	2021-11-01 17:08:21 +01:00
Kumar Kartikeya Dwivedi	92274e24b0	libbpf: Use O_CLOEXEC uniformly when opening fds There are some instances where we don't use O_CLOEXEC when opening an fd, fix these up. Otherwise, it is possible that a parallel fork causes these fds to leak into a child process on execve. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211028063501.2239335-6-memxor@gmail.com	2021-10-28 16:30:07 -07:00
Kumar Kartikeya Dwivedi	549a632386	libbpf: Ensure that BPF syscall fds are never 0, 1, or 2 Add a simple wrapper for passing an fd and getting a new one >= 3 if it is one of 0, 1, or 2. There are two primary reasons to make this change: First, libbpf relies on the assumption a certain BPF fd is never 0 (e.g. most recently noticed in [0]). Second, Alexei pointed out in [1] that some environments reset stdin, stdout, and stderr if they notice an invalid fd at these numbers. To protect against both these cases, switch all internal BPF syscall wrappers in libbpf to always return an fd >= 3. We only need to modify the syscall wrappers and not other code that assumes a valid fd by doing >= 0, to avoid pointless churn, and because it is still a valid assumption. The cost paid is two additional syscalls if fd is in range [0, 2]. [0]: `e31eec77e4` ("bpf: selftests: Fix fd cleanup in get_branch_snapshot") [1]: https://lore.kernel.org/bpf/CAADnVQKVKY8o_3aU8Gzke443+uHa-eGoM0h7W4srChMXU1S4Bg@mail.gmail.com Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211028063501.2239335-5-memxor@gmail.com	2021-10-28 16:30:07 -07:00
Kumar Kartikeya Dwivedi	585a357198	libbpf: Add weak ksym support to gen_loader This extends existing ksym relocation code to also support relocating weak ksyms. Care needs to be taken to zero out the src_reg (currently BPF_PSEUOD_BTF_ID, always set for gen_loader by bpf_object__relocate_data) when the BTF ID lookup fails at runtime. This is not a problem for libbpf as it only sets ext->is_set when BTF ID lookup succeeds (and only proceeds in case of failure if ext->is_weak, leading to src_reg remaining as 0 for weak unresolved ksym). A pattern similar to emit_relo_kfunc_btf is followed of first storing the default values and then jumping over actual stores in case of an error. For src_reg adjustment, we also need to perform it when copying the populated instruction, so depending on if copied insn[0].imm is 0 or not, we decide to jump over the adjustment. We cannot reach that point unless the ksym was weak and resolved and zeroed out, as the emit_check_err will cause us to jump to cleanup label, so we do not need to recheck whether the ksym is weak before doing the adjustment after copying BTF ID and BTF FD. This is consistent with how libbpf relocates weak ksym. Logging statements are added to show the relocation result and aid debugging. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211028063501.2239335-4-memxor@gmail.com	2021-10-28 16:30:06 -07:00
Kumar Kartikeya Dwivedi	c24941cd37	libbpf: Add typeless ksym support to gen_loader This uses the bpf_kallsyms_lookup_name helper added in previous patches to relocate typeless ksyms. The return value ENOENT can be ignored, and the value written to 'res' can be directly stored to the insn, as it is overwritten to 0 on lookup failure. For repeating symbols, we can simply copy the previously populated bpf_insn. Also, we need to take care to not close fds for typeless ksym_desc, so reuse the 'off' member's space to add a marker for typeless ksym and use that to skip them in cleanup_relos. We add a emit_ksym_relo_log helper that avoids duplicating common logging instructions between typeless and weak ksym (for future commit). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211028063501.2239335-3-memxor@gmail.com	2021-10-28 16:30:06 -07:00
Joanne Koong	47512102cd	libbpf: Add "map_extra" as a per-map-type extra flag This patch adds the libbpf infrastructure for supporting a per-map-type "map_extra" field, whose definition will be idiosyncratic depending on map type. For example, for the bloom filter map, the lower 4 bits of map_extra is used to denote the number of hash functions. Please note that until libbpf 1.0 is here, the "bpf_create_map_params" struct is used as a temporary means for propagating the map_extra field to the kernel. Signed-off-by: Joanne Koong <joannekoong@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211027234504.30744-3-joannekoong@fb.com	2021-10-28 13:22:49 -07:00
Joe Burton	689624f037	libbpf: Deprecate bpf_objects_list Add a flag to `enum libbpf_strict_mode' to disable the global `bpf_objects_list', preventing race conditions when concurrent threads call bpf_object__open() or bpf_object__close(). bpf_object__next() will return NULL if this option is set. Callers may achieve the same workflow by tracking bpf_objects in application code. [0] Closes: https://github.com/libbpf/libbpf/issues/293 Signed-off-by: Joe Burton <jevburton@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211026223528.413950-1-jevburton.kernel@gmail.com	2021-10-27 11:00:12 -07:00
Arnaldo Carvalho de Melo	3a55445f11	Merge remote-tracking branch 'torvalds/master' into perf/core To pick up the fixes from upstream. Fix simple conflict on session.c related to the file position fix that went upstream and is touched by the active decomp changes in perf/core. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-26 11:03:02 -03:00
Ilya Leoshkevich	3930198dc9	libbpf: Use __BYTE_ORDER__ Use the compiler-defined __BYTE_ORDER__ instead of the libc-defined __BYTE_ORDER for consistency. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211026010831.748682-3-iii@linux.ibm.com	2021-10-25 20:39:41 -07:00
Ilya Leoshkevich	45f2bebc80	libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED() __BYTE_ORDER is supposed to be defined by a libc, and __BYTE_ORDER__ - by a compiler. bpf_core_read.h checks __BYTE_ORDER == __LITTLE_ENDIAN, which is true if neither are defined, leading to incorrect behavior on big-endian hosts if libc headers are not included, which is often the case. Fixes: `ee26dade0e` ("libbpf: Add support for relocatable bitfields") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211026010831.748682-2-iii@linux.ibm.com	2021-10-25 20:39:41 -07:00
Andrii Nakryiko	c4813e969a	libbpf: Deprecate ambiguously-named bpf_program__size() API The name of the API doesn't convey clearly that this size is in number of bytes (there needed to be a separate comment to make this clear in libbpf.h). Further, measuring the size of BPF program in bytes is not exactly the best fit, because BPF programs always consist of 8-byte instructions. As such, bpf_program__insn_cnt() is a better alternative in pretty much any imaginable case. So schedule bpf_program__size() deprecation starting from v0.7 and it will be removed in libbpf 1.0. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211025224531.1088894-5-andrii@kernel.org	2021-10-25 18:37:21 -07:00
Andrii Nakryiko	e21d585cb3	libbpf: Deprecate multi-instance bpf_program APIs Schedule deprecation of a set of APIs that are related to multi-instance bpf_programs: - bpf_program__set_prep() ([0]); - bpf_program__{set,unset}_instance() ([1]); - bpf_program__nth_fd(). These APIs are obscure, very niche, and don't seem to be used much in practice. bpf_program__set_prep() is pretty useless for anything but the simplest BPF programs, as it doesn't allow to adjust BPF program load attributes, among other things. In short, it already bitrotted and will bitrot some more if not removed. With bpf_program__insns() API, which gives access to post-processed BPF program instructions of any given entry-point BPF program, it's now possible to do whatever necessary adjustments were possible with set_prep() API before, but also more. Given any such use case is automatically an advanced use case, requiring users to stick to low-level bpf_prog_load() APIs and managing their own prog FDs is reasonable. [0] Closes: https://github.com/libbpf/libbpf/issues/299 [1] Closes: https://github.com/libbpf/libbpf/issues/300 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211025224531.1088894-4-andrii@kernel.org	2021-10-25 18:37:21 -07:00
Andrii Nakryiko	65a7fa2e4e	libbpf: Add ability to fetch bpf_program's underlying instructions Add APIs providing read-only access to bpf_program BPF instructions ([0]). This is useful for diagnostics purposes, but it also allows a cleaner support for cloning BPF programs after libbpf did all the FD resolution and CO-RE relocations, subprog instructions appending, etc. Currently, cloning BPF program is possible only through hijacking a half-broken bpf_program__set_prep() API, which doesn't really work well for anything but most primitive programs. For instance, set_prep() API doesn't allow adjusting BPF program load parameters which are necessary for loading fentry/fexit BPF programs (the case where BPF program cloning is a necessity if doing some sort of mass-attachment functionality). Given bpf_program__set_prep() API is set to be deprecated, having a cleaner alternative is a must. libbpf internally already keeps track of linear array of struct bpf_insn, so it's not hard to expose it. The only gotcha is that libbpf previously freed instructions array during bpf_object load time, which would make this API much less useful overall, because in between bpf_object__open() and bpf_object__load() a lot of changes to instructions are done by libbpf. So this patch makes libbpf hold onto prog->insns array even after BPF program loading. I think this is a small price for added functionality and improved introspection of BPF program code. See retsnoop PR ([1]) for how it can be used in practice and code savings compared to relying on bpf_program__set_prep(). [0] Closes: https://github.com/libbpf/libbpf/issues/298 [1] https://github.com/anakryiko/retsnoop/pull/1 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211025224531.1088894-3-andrii@kernel.org	2021-10-25 18:37:21 -07:00
Andrii Nakryiko	de5d0dcef6	libbpf: Fix off-by-one bug in bpf_core_apply_relo() Fix instruction index validity check which has off-by-one error. Fixes: `3ee4f53355` ("libbpf: Split bpf_core_apply_relo() into bpf_program independent helper.") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211025224531.1088894-2-andrii@kernel.org	2021-10-25 18:37:21 -07:00
Andrii Nakryiko	c825f5fee1	libbpf: Fix BTF header parsing checks Original code assumed fixed and correct BTF header length. That's not always the case, though, so fix this bug with a proper additional check. And use actual header length instead of sizeof(struct btf_header) in sanity checks. Fixes: `8a138aed4a` ("bpf: btf: Add BTF support to libbpf") Reported-by: Evgeny Vereshchagin <evvers@ya.ru> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211023003157.726961-2-andrii@kernel.org	2021-10-22 17:33:31 -07:00
Andrii Nakryiko	5245dafe3d	libbpf: Fix overflow in BTF sanity checks btf_header's str_off+str_len or type_off+type_len can overflow as they are u32s. This will lead to bypassing the sanity checks during BTF parsing, resulting in crashes afterwards. Fix by using 64-bit signed integers for comparison. Fixes: `d812362450` ("libbpf: Fix BTF data layout checks and allow empty BTF") Reported-by: Evgeny Vereshchagin <evvers@ya.ru> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211023003157.726961-1-andrii@kernel.org	2021-10-22 17:33:31 -07:00
Stanislav Fomichev	a77f879ba1	libbpf: Use func name when pinning programs with LIBBPF_STRICT_SEC_NAME We can't use section name anymore because they are not unique and pinning objects with multiple programs with the same progtype/secname will fail. [0] Closes: https://github.com/libbpf/libbpf/issues/273 Fixes: `33a2c75c55` ("libbpf: add internal pin_name") Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20211021214814.1236114-2-sdf@google.com	2021-10-22 16:53:11 -07:00
Hengqi Chen	6a886de070	libbpf: Add btf__type_cnt() and btf__raw_data() APIs Add btf__type_cnt() and btf__raw_data() APIs and deprecate btf__get_nr_type() and btf__get_raw_data() since the old APIs don't follow the libbpf naming convention for getters which omit 'get' in the name (see [0]). btf__raw_data() is just an alias to the existing btf__get_raw_data(). btf__type_cnt() now returns the number of all types of the BTF object including 'void'. [0] Closes: https://github.com/libbpf/libbpf/issues/279 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211022130623.1548429-2-hengqi.chen@gmail.com	2021-10-22 16:09:14 -07:00
Mauricio Vásquez	1000298c76	libbpf: Fix memory leak in btf__dedup() Free btf_dedup if btf_ensure_modifiable() returns error. Fixes: `919d2b1dbb` ("libbpf: Allow modification of BTF and add btf__add_str API") Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211022202035.48868-1-mauricio@kinvolk.io	2021-10-22 16:00:53 -07:00
Andrii Nakryiko	fae1b05e6f	libbpf: Fix the use of aligned attribute Building libbpf sources out of kernel tree (in Github repo) we run into compilation error due to unknown __aligned attribute. It must be coming from some kernel header, which is not available to Github sources. Use explicit __attribute__((aligned(16))) instead. Fixes: `961632d541` ("libbpf: Fix dumping non-aligned __int128") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211022192502.2975553-1-andrii@kernel.org	2021-10-22 14:24:36 -07:00
David S. Miller	bdfa75ad70	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Lots of simnple overlapping additions. With a build fix from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-22 11:41:16 +01:00
Andrii Nakryiko	26071635ac	libbpf: Simplify look up by name of internal maps Map name that's assigned to internal maps (.rodata, .data, .bss, etc) consist of a small prefix of bpf_object's name and ELF section name as a suffix. This makes it hard for users to "guess" the name to use for looking up by name with bpf_object__find_map_by_name() API. One proposal was to drop object name prefix from the map name and just use ".rodata", ".data", etc, names. One downside called out was that when multiple BPF applications are active on the host, it will be hard to distinguish between multiple instances of .rodata and know which BPF object (app) they belong to. Having few first characters, while quite limiting, still can give a bit of a clue, in general. Note, though, that btf_value_type_id for such global data maps (ARRAY) points to DATASEC type, which encodes full ELF name, so tools like bpftool can take advantage of this fact to "recover" full original name of the map. This is also the reason why for custom .data.* and .rodata.* maps libbpf uses only their ELF names and doesn't prepend object name at all. Another downside of such approach is that it is not backwards compatible and, among direct use of bpf_object__find_map_by_name() API, will break any BPF skeleton generated using bpftool that was compiled with older libbpf version. Instead of causing all this pain, libbpf will still generate map name using a combination of object name and ELF section name, but it will allow looking such maps up by their natural names, which correspond to their respective ELF section names. This means non-truncated ELF section names longer than 15 characters are going to be expected and supported. With such set up, we get the best of both worlds: leave small bits of a clue about BPF application that instantiated such maps, as well as making it easy for user apps to lookup such maps at runtime. In this sense it closes corresponding libbpf 1.0 issue ([0]). BPF skeletons will continue using full names for lookups. [0] Closes: https://github.com/libbpf/libbpf/issues/275 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-10-andrii@kernel.org	2021-10-21 17:10:10 -07:00
Andrii Nakryiko	aed659170a	libbpf: Support multiple .rodata.* and .data.* BPF maps Add support for having multiple .rodata and .data data sections ([0]). .rodata/.data are supported like the usual, but now also .rodata.<whatever> and .data.<whatever> are also supported. Each such section will get its own backing BPF_MAP_TYPE_ARRAY, just like .rodata and .data. Multiple .bss maps are not supported, as the whole '.bss' name is confusing and might be deprecated soon, as well as user would need to specify custom ELF section with SEC() attribute anyway, so might as well stick to just .data.* and .rodata.* convention. User-visible map name for such new maps is going to be just their ELF section names. [0] https://github.com/libbpf/libbpf/issues/274 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-8-andrii@kernel.org	2021-10-21 17:10:10 -07:00
Andrii Nakryiko	25bbbd7a44	libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps Remove internal libbpf assumption that there can be only one .rodata, .data, and .bss map per BPF object. To achieve that, extend and generalize the scheme that was used for keeping track of relocation ELF sections. Now each ELF section has a temporary extra index that keeps track of logical type of ELF section (relocations, data, read-only data, BSS). Switch relocation to this scheme, as well as .rodata/.data/.bss handling. We don't yet allow multiple .rodata, .data, and .bss sections, but no libbpf internal code makes an assumption that there can be only one of each and thus they can be explicitly referenced by a single index. Next patches will actually allow multiple .rodata and .data sections. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-5-andrii@kernel.org	2021-10-21 17:10:10 -07:00
Andrii Nakryiko	ad23b72384	libbpf: Use Elf64-specific types explicitly for dealing with ELF Minimize the usage of class-agnostic gelf_xxx() APIs from libelf. These APIs require copying ELF data structures into local GElf_xxx structs and have a more cumbersome API. BPF ELF file is defined to be always 64-bit ELF object, even when intended to be run on 32-bit host architectures, so there is no need to do class-agnostic conversions everywhere. BPF static linker implementation within libbpf has been using Elf64-specific types since initial implementation. Add two simple helpers, elf_sym_by_idx() and elf_rel_by_idx(), for more succinct direct access to ELF symbol and relocation records within ELF data itself and switch all the GElf_xxx usage into Elf64_xxx equivalents. The only remaining place within libbpf.c that's still using gelf API is gelf_getclass(), as there doesn't seem to be a direct way to get underlying ELF bitness. No functional changes intended. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-4-andrii@kernel.org	2021-10-21 17:10:10 -07:00
Andrii Nakryiko	29a30ff501	libbpf: Extract ELF processing state into separate struct Name currently anonymous internal struct that keeps ELF-related state for bpf_object. Just a bit of clean up, no functional changes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-3-andrii@kernel.org	2021-10-21 17:10:10 -07:00
Andrii Nakryiko	b96c07f3b5	libbpf: Deprecate btf__finalize_data() and move it into libbpf.c There isn't a good use case where anyone but libbpf itself needs to call btf__finalize_data(). It was implemented for internal use and it's not clear why it was made into public API in the first place. To function, it requires active ELF data, which is stored inside bpf_object for the duration of opening phase only. But the only BTF that needs bpf_object's ELF is that bpf_object's BTF itself, which libbpf fixes up automatically during bpf_object__open() operation anyways. There is no need for any additional fix up and no reasonable scenario where it's useful and appropriate. Thus, btf__finalize_data() is just an API atavism and is better removed. So this patch marks it as deprecated immediately (v0.6+) and moves the code from btf.c into libbpf.c where it's used in the context of bpf_object opening phase. Such code co-location allows to make code structure more straightforward and remove bpf_object__section_size() and bpf_object__variable_offset() internal helpers from libbpf_internal.h, making them static. Their naming is also adjusted to not create a wrong illusion that they are some sort of method of bpf_object. They are internal helpers and are called appropriately. This is part of libbpf 1.0 effort ([0]). [0] Closes: https://github.com/libbpf/libbpf/issues/276 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-2-andrii@kernel.org	2021-10-21 17:10:10 -07:00
Ilya Leoshkevich	632f96d265	libbpf: Fix ptr_is_aligned() usages Currently ptr_is_aligned() takes size, and not alignment, as a parameter, which may be overly pessimistic e.g. for __i128 on s390, which must be only 8-byte aligned. Fix by using btf__align_of(). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211021104658.624944-2-iii@linux.ibm.com	2021-10-21 15:50:04 -07:00
Ilya Leoshkevich	961632d541	libbpf: Fix dumping non-aligned __int128 Non-aligned integers are dumped as bitfields, which is supported for at most 64-bit integers. Fix by using the same trick as btf_dump_float_data(): copy non-aligned values to the local buffer. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211013160902.428340-4-iii@linux.ibm.com	2021-10-20 11:40:45 -07:00
Ilya Leoshkevich	c9e982b879	libbpf: Fix dumping big-endian bitfields On big-endian arches not only bytes, but also bits are numbered in reverse order (see e.g. S/390 ELF ABI Supplement, but this is also true for other big-endian arches as well). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211013160902.428340-3-iii@linux.ibm.com	2021-10-20 11:40:01 -07:00
Dave Marchevsky	ebc7b50a38	libbpf: Migrate internal use of bpf_program__get_prog_info_linear In preparation for bpf_program__get_prog_info_linear deprecation, move the single use in libbpf to call bpf_obj_get_info_by_fd directly. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211011082031.4148337-2-davemarchevsky@fb.com	2021-10-20 10:35:49 -07:00
Adrian Hunter	6175047358	perf tools: Add support for PERF_RECORD_AUX_OUTPUT_HW_ID The PERF_RECORD_AUX_OUTPUT_HW_ID event provides a way to match AUX output data like Intel PT PEBS-via-PT back to the event that it came from, by providing a hardware ID that is present in the AUX output. Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20210907163903.11820-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-20 11:22:27 -03:00
Ian Rogers	92ec3cc94c	tools lib: Adopt list_sort() from the kernel sources Add list_sort.[ch] from the main kernel tree. The linux/bug.h #include is removed due to conflicting definitions. Add check-headers and modify perf build accordingly. MANIFEST and python-ext-sources fixes suggested by Arnaldo. Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Kilroy <andrew.kilroy@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Changbin Du <changbin.du@intel.com> Cc: Denys Zagorui <dzagorui@cisco.com> Cc: Fabian Hemmer <copy@copy.sh> Cc: Felix Fietkau <nbd@nbd.name> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joakim Zhang <qiangqing.zhang@nxp.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kees Kook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nicholas Fraser <nfraser@codeweavers.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: ShihCheng Tu <mrtoastcheng@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Wan Jiabing <wanjiabing@vivo.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20211015172132.1162559-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-20 10:30:59 -03:00
Quentin Monnet	d51b6b2287	libbpf: Remove Makefile warnings on out-of-sync netlink.h/if_link.h Although relying on some definitions from the netlink.h and if_link.h headers copied into tools/include/uapi/linux/, libbpf does not need those headers to stay entirely up-to-date with their original versions, and the warnings emitted by the Makefile when it detects a difference are usually just noise. Let's remove those warnings. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211010002528.9772-1-quentin@isovalent.com	2021-10-19 16:33:10 -07:00
Yonghong Song	223f903e9c	bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG Patch set [1] introduced BTF_KIND_TAG to allow tagging declarations for struct/union, struct/union field, var, func and func arguments and these tags will be encoded into dwarf. They are also encoded to btf by llvm for the bpf target. After BTF_KIND_TAG is introduced, we intended to use it for kernel __user attributes. But kernel __user is actually a type attribute. Upstream and internal discussion showed it is not a good idea to mix declaration attribute and type attribute. So we proposed to introduce btf_type_tag as a type attribute and existing btf_tag renamed to btf_decl_tag ([2]). This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some other declarations with _tag to _decl_tag to make it clear the tag is for declaration. In the future, BTF_KIND_TYPE_TAG might be introduced per [3]. [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/ [2] https://reviews.llvm.org/D111588 [3] https://reviews.llvm.org/D111199 Fixes: `b5ea834dde` ("bpf: Support for new btf kind BTF_KIND_TAG") Fixes: `5b84bd1036` ("libbpf: Add support for BTF_KIND_TAG") Fixes: `5c07f2fec0` ("bpftool: Add support for BTF_KIND_TAG") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com	2021-10-18 18:35:36 -07:00
Shunsuke Nakamura	3ff6d64e68	libperf tests: Fix test_stat_cpu The `cpu` argument of perf_evsel__read() must specify the cpu index. perf_cpu_map__for_each_cpu() is for iterating the cpu number (not index) and is thus not appropriate for use with perf_evsel__read(). So, if there is an offline CPU, the cpu number specified in the argument may point out of range because the cpu number and the cpu index are different. Fix test_stat_cpu(). Testing it: # make tests -C tools/lib/perf/ make: Entering directory '/home/nakamura/kernel_src/linux-5.15-rc4_fix/tools/lib/perf' running static: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK running dynamic: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK make: Leaving directory '/home/nakamura/kernel_src/linux-5.15-rc4_fix/tools/lib/perf' Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20211011083704.4108720-1-nakamura.shun@fujitsu.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-14 15:41:35 -03:00
Shunsuke Nakamura	f304c8d949	libperf test evsel: Fix build error on !x86 architectures In test_stat_user_read, following build error occurs except i386 and x86_64 architectures: tests/test-evsel.c:129:31: error: variable 'pc' set but not used [-Werror=unused-but-set-variable] struct perf_event_mmap_page *pc; Fix build error. Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20211006095703.477826-1-nakamura.shun@fujitsu.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-14 15:41:35 -03:00
Hou Tao	ccaf12d621	libbpf: Support detecting and attaching of writable tracepoint program Program on writable tracepoint is BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, but its attachment is the same as BPF_PROG_TYPE_RAW_TRACEPOINT. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211004094857.30868-3-hotforest@gmail.com	2021-10-08 13:22:57 -07:00
Quentin Monnet	b79c2ce3ba	libbpf: Skip re-installing headers file if source is older than target The "install_headers" target in libbpf's Makefile would unconditionally export all API headers to the target directory. When those headers are installed to compile another application, this means that make always finds newer dependencies for the source files relying on those headers, and deduces that the targets should be rebuilt. Avoid that by making "install_headers" depend on the source header files, and (re-)install them only when necessary. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211007194438.34443-2-quentin@isovalent.com	2021-10-08 11:47:34 -07:00
Riccardo Mancini	73e40c9bd4	libperf cpumap: Use binary search in perf_cpu_map__idx() as array are sorted Since `7074674e73` ("perf cpumap: Maintain cpumaps ordered and without dups") perf_cpu_map elements are sorted in ascending order. This patch improves perf_cpu_map__idx() by using a binary search. Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/f1543c15797169c21e8b205a4a6751159180580d.1629490974.git.rickyman7@gmail.com [ Removed 'else' after if + return, declared variables where needed ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-08 11:29:16 -03:00
Jakub Kicinski	9fe1155233	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-07 15:24:06 -07:00
Hengqi Chen	2088a3a71d	libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7 Deprecate bpf_{map,program}__{prev,next} APIs. Replace them with a new set of APIs named bpf_object__{prev,next}_{program,map} which follow the libbpf API naming convention ([0]). No functionality changes. [0] Closes: https://github.com/libbpf/libbpf/issues/296 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211003165844.4054931-2-hengqi.chen@gmail.com	2021-10-06 12:34:02 -07:00
Hengqi Chen	4a404a7e8a	libbpf: Deprecate bpf_object__unload() API since v0.6 BPF objects are not reloadable after unload. Users are expected to use bpf_object__close() to unload and free up resources in one operation. No need to expose bpf_object__unload() as a public API, deprecate it ([0]). Add bpf_object__unload() as an alias to internal bpf_object_unload() and replace all bpf_object__unload() uses to avoid compilation errors. [0] Closes: https://github.com/libbpf/libbpf/issues/290 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211002161000.3854559-1-hengqi.chen@gmail.com	2021-10-06 12:34:02 -07:00
Quentin Monnet	929bef4677	bpf: Use $(pound) instead of \# in Makefiles Recent-ish versions of make do no longer consider number signs ("#") as comment symbols when they are inserted inside of a macro reference or in a function invocation. In such cases, the symbols should not be escaped. There are a few occurrences of "\#" in libbpf's and samples' Makefiles. In the former, the backslash is harmless, because grep associates no particular meaning to the escaped symbol and reads it as a regular "#". In samples' Makefile, recent versions of make will pass the backslash down to the compiler, making the probe fail all the time and resulting in the display of a warning about "make headers_install" being required, even after headers have been installed. A similar issue has been addressed at some other locations by commit `9564a8cf42` ("Kbuild: fix # escaping in .cmd files for future Make"). Let's address it for libbpf's and samples' Makefiles in the same fashion, by using a "$(pound)" variable (pulled from tools/scripts/Makefile.include for libbpf, or re-defined for the samples). Reference for the change in make: https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57 Fixes: `2f38304127` ("libbpf: Make libbpf_version.h non-auto-generated") Fixes: `07c3bbdb1a` ("samples: bpf: print a warning about headers_install") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211006111049.20708-1-quentin@isovalent.com	2021-10-06 12:34:02 -07:00
Andrii Nakryiko	7ca6112159	libbpf: Add API that copies all BTF types from one BTF object to another Add a bulk copying api, btf__add_btf(), that speeds up and simplifies appending entire contents of one BTF object to another one, taking care of copying BTF type data, adjusting resulting BTF type IDs according to their new locations in the destination BTF object, as well as copying and deduplicating all the referenced strings and updating all the string offsets in new BTF types as appropriate. This API is intended to be used from tools that are generating and otherwise manipulating BTFs generically, such as pahole. In pahole's case, this API is useful for speeding up parallelized BTF encoding, as it allows pahole to offload all the intricacies of BTF type copying to libbpf and handle the parallelization aspects of the process. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org	2021-10-06 15:35:46 +02:00
Kumar Kartikeya Dwivedi	18f4fccbf3	libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations This change updates the BPF syscall loader to relocate BTF_KIND_FUNC relocations, with support for weak kfunc relocations. The general idea is to move map_fds to loader map, and also use the data for storing kfunc BTF fds. Since both reuse the fd_array parameter, they need to be kept together. For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc, we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more chances of being <= INT16_MAX than treating data map as a sparse array and adding fd as needed. When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse array model, so that as long as it does remain <= INT16_MAX, we pass an index relative to the start of fd_array. We store all ksyms in an array where we try to avoid calling the bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was already stored. This also speeds up the loading process compared to emitting calls in all cases, in later tests. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com	2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi	466b2e1397	libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0 Preserve these calls as it allows verifier to succeed in loading the program if they are determined to be unreachable after dead code elimination during program load. If not, the verifier will fail at runtime. This is done for ext->is_weak symbols similar to the case for variable ksyms. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-8-memxor@gmail.com	2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi	9dbe601563	libbpf: Support kernel module function calls This patch adds libbpf support for kernel module function call support. The fd_array parameter is used during BPF program load to pass module BTFs referenced by the program. insn->off is set to index into this array, but starts from 1, because insn->off as 0 is reserved for btf_vmlinux. We try to use existing insn->off for a module, since the kernel limits the maximum distinct module BTFs for kfuncs to 256, and also because index must never exceed the maximum allowed value that can fit in insn->off (INT16_MAX). In the future, if kernel interprets signed offset as unsigned for kfunc calls, this limit can be increased to UINT16_MAX. Also introduce a btf__find_by_name_kind_own helper to start searching from module BTF's start id when we know that the BTF ID is not present in vmlinux BTF (in find_ksym_btf_id). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-7-memxor@gmail.com	2021-10-05 17:07:42 -07:00
Jakub Kicinski	6b7b0c3091	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== bpf-next 2021-10-02 We've added 85 non-merge commits during the last 15 day(s) which contain a total of 132 files changed, 13779 insertions(+), 6724 deletions(-). The main changes are: 1) Massive update on test_bpf.ko coverage for JITs as preparatory work for an upcoming MIPS eBPF JIT, from Johan Almbladh. 2) Add a batched interface for RX buffer allocation in AF_XDP buffer pool, with driver support for i40e and ice from Magnus Karlsson. 3) Add legacy uprobe support to libbpf to complement recently merged legacy kprobe support, from Andrii Nakryiko. 4) Add bpf_trace_vprintk() as variadic printk helper, from Dave Marchevsky. 5) Support saving the register state in verifier when spilling <8byte bounded scalar to the stack, from Martin Lau. 6) Add libbpf opt-in for stricter BPF program section name handling as part of libbpf 1.0 effort, from Andrii Nakryiko. 7) Add a document to help clarifying BPF licensing, from Alexei Starovoitov. 8) Fix skel_internal.h to propagate errno if the loader indicates an internal error, from Kumar Kartikeya Dwivedi. 9) Fix build warnings with -Wcast-function-type so that the option can later be enabled by default for the kernel, from Kees Cook. 10) Fix libbpf to ignore STT_SECTION symbols in legacy map definitions as it otherwise errors out when encountering them, from Toke Høiland-Jørgensen. 11) Teach libbpf to recognize specialized maps (such as for perf RB) and internally remove BTF type IDs when creating them, from Hengqi Chen. 12) Various fixes and improvements to BPF selftests. ==================== Link: https://lore.kernel.org/r/20211002001327.15169-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-01 19:58:02 -07:00
Hengqi Chen	f731052325	libbpf: Support uniform BTF-defined key/value specification across all BPF maps A bunch of BPF maps do not support specifying BTF types for key and value. This is non-uniform and inconvenient[0]. Currently, libbpf uses a retry logic which removes BTF type IDs when BPF map creation failed. Instead of retrying, this commit recognizes those specialized maps and removes BTF type IDs when creating BPF map. [0] Closes: https://github.com/libbpf/libbpf/issues/355 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210930161456.3444544-2-hengqi.chen@gmail.com	2021-10-01 15:31:50 -07:00
Andrii Nakryiko	b0e875bac0	libbpf: Fix memory leak in strset Free struct strset itself, not just its internal parts. Fixes: `90d76d3ece` ("libbpf: Extract internal set-of-strings datastructure APIs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20211001185910.86492-1-andrii@kernel.org	2021-10-01 22:54:38 +02:00
Jakub Kicinski	dd9a887b35	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/phy/bcm7xxx.c `d88fd1b546` ("net: phy: bcm7xxx: Fixed indirect MMD operations") `f68d08c437` ("net: phy: bcm7xxx: Add EPHY entry for 72165") net/sched/sch_api.c `b193e15ac6` ("net: prevent user from passing illegal stab size") `69508d4333` ("net_sched: Use struct_size() and flex_array_size() helpers") Both cases trivial - adjacent code additions. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-09-30 14:49:21 -07:00
Kumar Kartikeya Dwivedi	4729445b47	libbpf: Fix segfault in light skeleton for objects without BTF When fed an empty BPF object, bpftool gen skeleton -L crashes at btf__set_fd() since it assumes presence of obj->btf, however for the sequence below clang adds no .BTF section (hence no BTF). Reproducer: $ touch a.bpf.c $ clang -O2 -g -target bpf -c a.bpf.c $ bpftool gen skeleton -L a.bpf.o /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) / / THIS FILE IS AUTOGENERATED! */ struct a_bpf { struct bpf_loader_ctx ctx; Segmentation fault (core dumped) The same occurs for files compiled without BTF info, i.e. without clang's -g flag. Fixes: `6723474373` (libbpf: Generate loader program out of BPF ELF file.) Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210930061634.1840768-1-memxor@gmail.com	2021-09-30 23:19:58 +02:00
Kumar Kartikeya Dwivedi	e68ac00827	libbpf: Fix skel_internal.h to set errno on loader retval < 0 When the loader indicates an internal error (result of a checked bpf system call), it returns the result in attr.test.retval. However, tests that rely on ASSERT_OK_PTR on NULL (returned from light skeleton) may miss that NULL denotes an error if errno is set to 0. This would result in skel pointer being NULL, while ASSERT_OK_PTR returning 1, leading to a SEGV on dereference of skel, because libbpf_get_error relies on the assumption that errno is always set in case of error for ptr == NULL. In particular, this was observed for the ksyms_module test. When executed using `./test_progs -t ksyms`, prior tests manipulated errno and the test didn't crash when it failed at ksyms_module load, while using `./test_progs -t ksyms_module` crashed due to errno being untouched. Fixes: `6723474373` (libbpf: Generate loader program out of BPF ELF file.) Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210927145941.1383001-11-memxor@gmail.com	2021-09-29 20:42:32 -07:00
Toke Høiland-Jørgensen	161ecd5379	libbpf: Properly ignore STT_SECTION symbols in legacy map definitions The previous patch to ignore STT_SECTION symbols only added the ignore condition in one of them. This fails if there's more than one map definition in the 'maps' section, because the subsequent modulus check will fail, resulting in error messages like: libbpf: elf: unable to determine legacy map definition size in ./xdpdump_xdp.o Fix this by also ignoring STT_SECTION in the first loop. Fixes: `c3e8c44a90` ("libbpf: Ignore STT_SECTION symbols in 'maps' section") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210929213837.832449-1-toke@redhat.com	2021-09-29 15:50:32 -07:00
Alexei Starovoitov	66fe332417	libbpf: Make gen_loader data aligned. Align gen_loader data to 8 byte boundary to make sure union bpf_attr, bpf_insns and other structs are aligned. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210927145941.1383001-9-memxor@gmail.com	2021-09-29 13:27:19 -07:00
Andrii Nakryiko	7c80c87ad5	selftests/bpf: Switch sk_lookup selftests to strict SEC("sk_lookup") use Update "sk_lookup/" definition to be a stand-alone type specifier, with backwards-compatible prefix match logic in non-libbpf-1.0 mode. Currently in selftests all the "sk_lookup/<whatever>" uses just use <whatever> for duplicated unique name encoding, which is redundant as BPF program's name (C function name) uniquely and descriptively identifies the intended use for such BPF programs. With libbpf's SEC_DEF("sk_lookup") definition updated, switch existing sk_lookup programs to use "unqualified" SEC("sk_lookup") section names, with no random text after it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-11-andrii@kernel.org	2021-09-28 13:51:20 -07:00
Andrii Nakryiko	dd94d45cf0	libbpf: Add opt-in strict BPF program section name handling logic Implement strict ELF section name handling for BPF programs. It utilizes `libbpf_set_strict_mode()` framework and adds new flag: LIBBPF_STRICT_SEC_NAME. If this flag is set, libbpf will enforce exact section name matching for a lot of program types that previously allowed just partial prefix match. E.g., if previously SEC("xdp_whatever_i_want") was allowed, now in strict mode only SEC("xdp") will be accepted, which makes SEC("") definitions cleaner and more structured. SEC() now won't be used as yet another way to uniquely encode BPF program identifier (for that C function name is better and is guaranteed to be unique within bpf_object). Now SEC() is strictly BPF program type and, depending on program type, extra load/attach parameter specification. Libbpf completely supports multiple BPF programs in the same ELF section, so multiple BPF programs of the same type/specification easily co-exist together within the same bpf_object scope. Additionally, a new (for now internal) convention is introduced: section name that can be a stand-alone exact BPF program type specificator, but also could have extra parameters after '/' delimiter. An example of such section is "struct_ops", which can be specified by itself, but also allows to specify the intended operation to be attached to, e.g., "struct_ops/dctcp_init". Note, that "struct_ops_some_op" is not allowed. Such section definition is specified as "struct_ops+". This change is part of libbpf 1.0 effort ([0], [1]). [0] Closes: https://github.com/libbpf/libbpf/issues/271 [1] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-10-andrii@kernel.org	2021-09-28 13:51:19 -07:00
Andrii Nakryiko	d41ea045a6	libbpf: Complete SEC() table unification for BPF_APROG_SEC/BPF_EAPROG_SEC Complete SEC() table refactoring towards unified form by rewriting BPF_APROG_SEC and BPF_EAPROG_SEC definitions with SEC_DEF(SEC_ATTACHABLE_OPT) (for optional expected_attach_type) and SEC_DEF(SEC_ATTACHABLE) (mandatory expected_attach_type), respectively. Drop BPF_APROG_SEC, BPF_EAPROG_SEC, and BPF_PROG_SEC_IMPL macros after that, leaving SEC_DEF() macro as the only one used. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-9-andrii@kernel.org	2021-09-28 13:51:19 -07:00
Andrii Nakryiko	15ea31fadd	libbpf: Refactor ELF section handler definitions Refactor ELF section handler definitions table to use a set of flags and unified SEC_DEF() macro. This allows for more succinct and table-like set of definitions, and allows to more easily extend the logic without adding more verbosity (this is utilized in later patches in the series). This approach is also making libbpf-internal program pre-load callback not rely on bpf_sec_def definition, which demonstrates that future pluggable ELF section handlers will be able to achieve similar level of integration without libbpf having to expose extra types and APIs. For starters, update SEC_DEF() definitions and make them more succinct. Also convert BPF_PROG_SEC() and BPF_APROG_COMPAT() definitions to a common SEC_DEF() use. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-8-andrii@kernel.org	2021-09-28 13:51:19 -07:00
Andrii Nakryiko	13d35a0cf1	libbpf: Reduce reliance of attach_fns on sec_def internals Move closer to not relying on bpf_sec_def internals that won't be part of public API, when pluggable SEC() handlers will be allowed. Drop pre-calculated prefix length, and in various helpers don't rely on this prefix length availability. Also minimize reliance on knowing bpf_sec_def's prefix for few places where section prefix shortcuts are supported (e.g., tp vs tracepoint, raw_tp vs raw_tracepoint). Given checking some string for having a given string-constant prefix is such a common operation and so annoying to be done with pure C code, add a small macro helper, str_has_pfx(), and reuse it throughout libbpf.c where prefix comparison is performed. With __builtin_constant_p() it's possible to have a convenient helper that checks some string for having a given prefix, where prefix is either string literal (or compile-time known string due to compiler optimization) or just a runtime string pointer, which is quite convenient and saves a lot of typing and string literal duplication. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-7-andrii@kernel.org	2021-09-28 13:51:19 -07:00
Andrii Nakryiko	12d9466d8b	libbpf: Refactor internal sec_def handling to enable pluggability Refactor internals of libbpf to allow adding custom SEC() handling logic easily from outside of libbpf. To that effect, each SEC()-handling registration sets mandatory program type/expected attach type for a given prefix and can provide three callbacks called at different points of BPF program lifetime: - init callback for right after bpf_program is initialized and prog_type/expected_attach_type is set. This happens during bpf_object__open() step, close to the very end of constructing bpf_object, so all the libbpf APIs for querying and updating bpf_program properties should be available; - pre-load callback is called right before BPF_PROG_LOAD command is called in the kernel. This callbacks has ability to set both bpf_program properties, as well as program load attributes, overriding and augmenting the standard libbpf handling of them; - optional auto-attach callback, which makes a given SEC() handler support auto-attachment of a BPF program through bpf_program__attach() API and/or BPF skeletons <skel>__attach() method. Each callbacks gets a `long cookie` parameter passed in, which is specified during SEC() handling. This can be used by callbacks to lookup whatever additional information is necessary. This is not yet completely ready to be exposed to the outside world, mainly due to non-public nature of struct bpf_prog_load_params. Instead of making it part of public API, we'll wait until the planned low-level libbpf API improvements for BPF_PROG_LOAD and other typical bpf() syscall APIs, at which point we'll have a public, probably OPTS-based, way to fully specify BPF program load parameters, which will be used as an interface for custom pre-load callbacks. But this change itself is already a good first step to unify the BPF program hanling logic even within the libbpf itself. As one example, all the extra per-program type handling (sleepable bit, attach_btf_id resolution, unsetting optional expected attach type) is now more obvious and is gathered in one place. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-6-andrii@kernel.org	2021-09-28 13:51:19 -07:00
Andrii Nakryiko	9673268f03	libbpf: Add "tc" SEC_DEF which is a better name for "classifier" As argued in [0], add "tc" ELF section definition for SCHED_CLS BPF program type. "classifier" is a misleading terminology and should be migrated away from. [0] https://lore.kernel.org/bpf/270e27b1-e5be-5b1c-b343-51bd644d0747@iogearbox.net/ Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210928161946.2512801-2-andrii@kernel.org	2021-09-28 13:51:19 -07:00
David S. Miller	4ccb9f03fe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2021-09-28 The following pull-request contains BPF updates for your net tree. We've added 10 non-merge commits during the last 14 day(s) which contain a total of 11 files changed, 139 insertions(+), 53 deletions(-). The main changes are: 1) Fix MIPS JIT jump code emission for too large offsets, from Piotr Krysiuk. 2) Fix x86 JIT atomic/fetch emission when dst reg maps to rax, from Johan Almbladh. 3) Fix cgroup_sk_alloc corner case when called from interrupt, from Daniel Borkmann. 4) Fix segfault in libbpf's linker for objects without BTF, from Kumar Kartikeya Dwivedi. 5) Fix bpf_jit_charge_modmem for applications with CAP_BPF, from Lorenz Bauer. 6) Fix return value handling for struct_ops BPF programs, from Hou Tao. 7) Various fixes to BPF selftests, from Jiri Benc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> ,	2021-09-28 13:52:46 +01:00
Kumar Kartikeya Dwivedi	bcfd367c28	libbpf: Fix segfault in static linker for objects without BTF When a BPF object is compiled without BTF info (without -g), trying to link such objects using bpftool causes a SIGSEGV due to btf__get_nr_types accessing obj->btf which is NULL. Fix this by checking for the NULL pointer, and return error. Reproducer: $ cat a.bpf.c extern int foo(void); int bar(void) { return foo(); } $ cat b.bpf.c int foo(void) { return 0; } $ clang -O2 -target bpf -c a.bpf.c $ clang -O2 -target bpf -c b.bpf.c $ bpftool gen obj out a.bpf.o b.bpf.o Segmentation fault (core dumped) After fix: $ bpftool gen obj out a.bpf.o b.bpf.o libbpf: failed to find BTF info for object 'a.bpf.o' Error: failed to link 'a.bpf.o': Unknown error -22 (-22) Fixes: `a46349227c` (libbpf: Add linker extern resolution support for functions and global variables) Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com	2021-09-28 09:29:03 +02:00
Toke Høiland-Jørgensen	c3e8c44a90	libbpf: Ignore STT_SECTION symbols in 'maps' section When parsing legacy map definitions, libbpf would error out when encountering an STT_SECTION symbol. This becomes a problem because some versions of binutils will produce SECTION symbols for every section when processing an ELF file, so BPF files run through 'strip' will end up with such symbols, making libbpf refuse to load them. There's not really any reason why erroring out is strictly necessary, so change libbpf to just ignore SECTION symbols when parsing the ELF. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210927205810.715656-1-toke@redhat.com	2021-09-27 21:29:37 -07:00
Jakub Kicinski	2fcd14d0f7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net net/mptcp/protocol.c `977d293e23` ("mptcp: ensure tx skbs always have the MPTCP ext") `efe686ffce` ("mptcp: ensure tx skbs always have the MPTCP ext") same patch merged in both trees, keep net-next. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-09-23 11:19:49 -07:00
Andrii Nakryiko	cc10623c68	libbpf: Add legacy uprobe attaching support Similarly to recently added legacy kprobe attach interface support through tracefs, support attaching uprobes using the legacy interface if host kernel doesn't support newer FD-based interface. For uprobes event name consists of "libbpf_" prefix, PID, sanitized binary path and offset within that binary. Structuraly the code is aligned with kprobe logic refactoring in previous patch. struct bpf_link_perf is re-used and all the same legacy_probe_name and legacy_is_retprobe fields are used to ensure proper cleanup on bpf_link__destroy(). Users should be aware, though, that on old kernels which don't support FD-based interface for kprobe/uprobe attachment, if the application crashes before bpf_link__destroy() is called, uprobe legacy events will be left in tracefs. This is the same limitation as with legacy kprobe interfaces. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210921210036.1545557-5-andrii@kernel.org	2021-09-21 19:40:09 -07:00
Andrii Nakryiko	46ed5fc33d	libbpf: Refactor and simplify legacy kprobe code Refactor legacy kprobe handling code to follow the same logic as uprobe legacy logic added in the next patchs: - add append_to_file() helper that makes it simpler to work with tracefs file-based interface for creating and deleting probes; - move out probe/event name generation outside of the code that adds/removes it, which simplifies bookkeeping significantly; - change the probe name format to start with "libbpf_" prefix and include offset within kernel function; - switch 'unsigned long' to 'size_t' for specifying kprobe offsets, which is consistent with how uprobes define that, simplifies printf()-ing internally, and also avoids unnecessary complications on architectures where sizeof(long) != sizeof(void *). This patch also implicitly fixes the problem with invalid open() error handling present in poke_kprobe_events(), which (the function) this patch removes. Fixes: `ca304b40c2` ("libbpf: Introduce legacy kprobe events support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210921210036.1545557-4-andrii@kernel.org	2021-09-21 19:40:09 -07:00
Andrii Nakryiko	303a257223	libbpf: Fix memory leak in legacy kprobe attach logic In some error scenarios legacy_probe string won't be free()'d. Fix this. This was reported by Coverity static analysis. Fixes: `ca304b40c2` ("libbpf: Introduce legacy kprobe events support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210921210036.1545557-2-andrii@kernel.org	2021-09-21 19:40:08 -07:00
Grant Seltzer	97c140d94e	libbpf: Add doc comments in libbpf.h This adds comments above functions in libbpf.h which document their uses. These comments are of a format that doxygen and sphinx can pick up and render. These are rendered by libbpf.readthedocs.org These doc comments are for: - bpf_object__find_map_by_name() - bpf_map__fd() - bpf_map__is_internal() - libbpf_get_error() - libbpf_num_possible_cpus() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210918031457.36204-1-grantseltzer@gmail.com	2021-09-20 17:23:31 -07:00
Ian Rogers	aba5daeb64	libperf evsel: Make use of FD robust. FD uses xyarray__entry that may return NULL if an index is out of bounds. If NULL is returned then a segv happens as FD unconditionally dereferences the pointer. This was happening in a case of with perf iostat as shown below. The fix is to make FD an "int*" rather than an int and handle the NULL case as either invalid input or a closed fd. $ sudo gdb --args perf stat --iostat list ... Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50 50 { (gdb) bt #0 perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50 #1 0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410, threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792 #2 0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0) at util/evsel.c:2045 #3 0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0) at util/evsel.c:2065 #4 0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0, config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590 #5 0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0) at builtin-stat.c:833 #6 0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0) at builtin-stat.c:1048 #7 0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534 #8 0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3, argv=0x7fffffffe4a0) at perf.c:313 #9 0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365 #10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409 #11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539 ... (gdb) c Continuing. Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/). /bin/dmesg \| grep -i perf may provide additional information. Program received signal SIGSEGV, Segmentation fault. 0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166 166 if (FD(evsel, cpu, thread) >= 0) v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was backward. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-09-18 17:43:06 -03:00
Dave Marchevsky	6c66b0e7c9	libbpf: Use static const fmt string in __bpf_printk The __bpf_printk convenience macro was using a 'char' fmt string holder as it predates support for globals in libbpf. Move to more efficient 'static const char', but provide a fallback to the old way via BPF_NO_GLOBAL_DATA so users on old kernels can still use the macro. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210917182911.2426606-6-davemarchevsky@fb.com	2021-09-17 14:02:05 -07:00
Dave Marchevsky	c2758baa97	libbpf: Modify bpf_printk to choose helper based on arg count Instead of being a thin wrapper which calls into bpf_trace_printk, libbpf's bpf_printk convenience macro now chooses between bpf_trace_printk and bpf_trace_vprintk. If the arg count (excluding format string) is >3, use bpf_trace_vprintk, otherwise use the older helper. The motivation behind this added complexity - instead of migrating entirely to bpf_trace_vprintk - is to maintain good developer experience for users compiling against new libbpf but running on older kernels. Users who are passing <=3 args to bpf_printk will see no change in their bytecode. __bpf_vprintk functions similarly to BPF_SEQ_PRINTF and BPF_SNPRINTF macros elsewhere in the file - it allows use of bpf_trace_vprintk without manual conversion of varargs to u64 array. Previous implementation of bpf_printk macro is moved to __bpf_printk for use by the new implementation. This does change behavior of bpf_printk calls with >3 args in the "new libbpf, old kernels" scenario. Before this patch, attempting to use 4 args to bpf_printk results in a compile-time error. After this patch, using bpf_printk with 4 args results in a trace_vprintk helper call being emitted and a load-time failure on older kernels. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210917182911.2426606-5-davemarchevsky@fb.com	2021-09-17 14:02:05 -07:00
Andrii Nakryiko	942025c9f3	libbpf: Constify all high-level program attach APIs Attach APIs shouldn't need to modify bpf_program/bpf_map structs, so change all struct bpf_program and struct bpf_map pointers to const pointers. This is completely backwards compatible with no functional change. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-8-andrii@kernel.org	2021-09-17 09:05:41 -07:00
Andrii Nakryiko	91b555d73e	libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7 bpf_object_open_opts.attach_prog_fd makes a pretty strong assumption that bpf_object contains either only single freplace BPF program or all of BPF programs in BPF object are freplaces intended to replace different subprograms of the same target BPF program. This seems both a bit confusing, too assuming, and limiting. We've had bpf_program__set_attach_target() API which allows more fine-grained control over this, on a per-program level. As such, mark open_opts.attach_prog_fd as deprecated starting from v0.7, so that we have one more universal way of setting freplace targets. With previous change to allow NULL attach_func_name argument, and especially combined with BPF skeleton, arguable bpf_program__set_attach_target() is a more convenient and explicit API as well. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-7-andrii@kernel.org	2021-09-17 09:05:41 -07:00
Andrii Nakryiko	2d5ec1c66e	libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target() Allow to use bpf_program__set_attach_target to only set target attach program FD, while letting libbpf to use target attach function name from SEC() definition. This might be useful for some scenarios where bpf_object contains multiple related freplace BPF programs intended to replace different sub-programs in target BPF program. In such case all programs will have the same attach_prog_fd, but different attach_func_name. It's convenient to specify such target function names declaratively in SEC() definitions, but attach_prog_fd is a dynamic runtime setting. To simplify such scenario, allow bpf_program__set_attach_target() to delay BTF ID resolution till the BPF program load time by providing NULL attach_func_name. In that case the behavior will be similar to using bpf_object_open_opts.attach_prog_fd (which is marked deprecated since v0.7), but has the benefit of allowing more control by user in what is attached to what. Such setup allows having BPF programs attached to different target attach_prog_fd with target functions still declaratively recorded in BPF source code in SEC() definitions. Selftests changes in the next patch should make this more obvious. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-5-andrii@kernel.org	2021-09-17 09:05:00 -07:00
Andrii Nakryiko	277641859e	libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs It's relevant and hasn't been doing anything for a long while now. Deprecated it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-4-andrii@kernel.org	2021-09-17 09:04:12 -07:00
Andrii Nakryiko	f11f86a393	libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id() Don't perform another search for sec_def inside libbpf_find_attach_btf_id(), as each recognized bpf_program already has prog->sec_def set. Also remove unnecessary NULL check for prog->sec_name, as it can never be NULL. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-2-andrii@kernel.org	2021-09-17 09:04:12 -07:00
Grant Seltzer	69cd823956	libbpf: Add sphinx code documentation comments This adds comments above five functions in btf.h which document their uses. These comments are of a format that doxygen and sphinx can pick up and render. These are rendered by libbpf.readthedocs.org Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210915021951.117186-1-grantseltzer@gmail.com	2021-09-15 13:16:02 -07:00
Yonghong Song	5b84bd1036	libbpf: Add support for BTF_KIND_TAG Add BTF_KIND_TAG support for parsing and dedup. Also added sanitization for BTF_KIND_TAG. If BTF_KIND_TAG is not supported in the kernel, sanitize it to INTs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223025.246687-1-yhs@fb.com	2021-09-14 18:45:52 -07:00
Yonghong Song	30025e8bd8	libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag This patch renames functions btf_{hash,equal}_int() to btf_{hash,equal}_int_tag() so they can be reused for BTF_KIND_TAG support. There is no functionality change for this patch. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223020.245829-1-yhs@fb.com	2021-09-14 18:45:52 -07:00
Andrii Nakryiko	b6291a6f30	libbpf: Minimize explicit iterator of section definition array Remove almost all the code that explicitly iterated BPF program section definitions in favor of using find_sec_def(). The only remaining user of section_defs is libbpf_get_type_names that has to iterate all of them to construct its result. Having one internal API entry point for section definitions will simplify further refactorings around libbpf's program section definitions parsing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210914014733.2768-5-andrii@kernel.org	2021-09-14 15:49:24 -07:00
Andrii Nakryiko	5532dfd42e	libbpf: Simplify BPF program auto-attach code Remove the need to explicitly pass bpf_sec_def for auto-attachable BPF programs, as it is already recorded at bpf_object__open() time for all recognized type of BPF programs. This further reduces number of explicit calls to find_sec_def(), simplifying further refactorings. No functional changes are done by this patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210914014733.2768-4-andrii@kernel.org	2021-09-14 15:49:24 -07:00
Andrii Nakryiko	91b4d1d1d5	libbpf: Ensure BPF prog types are set before relocations Refactor bpf_object__open() sequencing to perform BPF program type detection based on SEC() definitions before we get to relocations collection. This allows to have more information about BPF program by the time we get to, say, struct_ops relocation gathering. This, subsequently, simplifies struct_ops logic and removes the need to perform extra find_sec_def() resolution. With this patch libbpf will require all struct_ops BPF programs to be marked with SEC("struct_ops") or SEC("struct_ops/xxx") annotations. Real-world applications are already doing that through something like selftests's BPF_STRUCT_OPS() macro. This change streamlines libbpf's internal handling of SEC() definitions and is in the sprit of upcoming libbpf-1.0 section strictness changes ([0]). [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210914014733.2768-3-andrii@kernel.org	2021-09-14 15:49:24 -07:00
Rafael David Tinoco	ca304b40c2	libbpf: Introduce legacy kprobe events support Allow kprobe tracepoint events creation through legacy interface, as the kprobe dynamic PMUs support, used by default, was only created in v4.17. Store legacy kprobe name in struct bpf_perf_link, instead of creating a new "subclass" off of bpf_perf_link. This is ok as it's just two new fields, which are also going to be reused for legacy uprobe support in follow up patches. Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210912064844.3181742-1-rafaeldtinoco@gmail.com	2021-09-14 14:44:45 -07:00
Andrii Nakryiko	2f38304127	libbpf: Make libbpf_version.h non-auto-generated Turn previously auto-generated libbpf_version.h header into a normal header file. This prevents various tricky Makefile integration issues, simplifies the overall build process, but also allows to further extend it with some more versioning-related APIs in the future. To prevent accidental out-of-sync versions as defined by libbpf.map and libbpf_version.h, Makefile checks their consistency at build time. Simultaneously with this change bump libbpf.map to v0.6. Also undo adding libbpf's output directory into include path for kernel/bpf/preload, bpftool, and resolve_btfids, which is not necessary because libbpf_version.h is just a normal header like any other. Fixes: `0b46b75505` ("libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210913222309.3220849-1-andrii@kernel.org	2021-09-13 15:36:47 -07:00
Quentin Monnet	0b46b75505	libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations Introduce a macro LIBBPF_DEPRECATED_SINCE(major, minor, message) to prepare the deprecation of two API functions. This macro marks functions as deprecated when libbpf's version reaches the values passed as an argument. As part of this change libbpf_version.h header is added with recorded major (LIBBPF_MAJOR_VERSION) and minor (LIBBPF_MINOR_VERSION) libbpf version macros. They are now part of libbpf public API and can be relied upon by user code. libbpf_version.h is installed system-wide along other libbpf public headers. Due to this new build-time auto-generated header, in-kernel applications relying on libbpf (resolve_btfids, bpftool, bpf_preload) are updated to include libbpf's output directory as part of a list of include search paths. Better fix would be to use libbpf's make_install target to install public API headers, but that clean up is left out as a future improvement. The build changes were tested by building kernel (with KBUILD_OUTPUT and O= specified explicitly), bpftool, libbpf, selftests/bpf, and resolve_btfids builds. No problems were detected. Note that because of the constraints of the C preprocessor we have to write a few lines of macro magic for each version used to prepare deprecation (0.6 for now). Also, use LIBBPF_DEPRECATED_SINCE() to schedule deprecation of btf__get_from_id() and btf__load(), which are replaced by btf__load_from_kernel_by_id() and btf__load_into_kernel(), respectively, starting from future libbpf v0.6. This is part of libbpf 1.0 effort ([0]). [0] Closes: https://github.com/libbpf/libbpf/issues/278 Co-developed-by: Quentin Monnet <quentin@isovalent.com> Co-developed-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210908213226.1871016-1-andrii@kernel.org	2021-09-09 23:28:05 +02:00
Andrii Nakryiko	006a5099fc	libbpf: Fix build with latest gcc/binutils with LTO After updating to binutils 2.35, the build began to fail with an assembler error. A bug was opened on the Red Hat Bugzilla a few days later for the same issue. Work around the problem by using the new `symver` attribute (introduced in GCC 10) as needed instead of assembler directives. This addresses Red Hat ([0]) and OpenSUSE ([1]) bug reports, as well as libbpf issue ([2]). [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1863059 [1]: https://bugzilla.opensuse.org/show_bug.cgi?id=1188749 [2]: Closes: https://github.com/libbpf/libbpf/issues/338 Co-developed-by: Patrick McCarty <patrick.mccarty@intel.com> Co-developed-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210907221023.2660953-1-andrii@kernel.org	2021-09-07 19:32:04 -07:00
Matt Smith	08a6f22ef6	libbpf: Change bpf_object_skeleton data field to const pointer This change was necessary to enforce the implied contract that bpf_object_skeleton->data should not be mutated. The data will be cast to `void ` during assignment to handle the case where a user is compiling with older libbpf headers to avoid a compiler warning of `const void ` data being cast to `void *` Signed-off-by: Matt Smith <alastorze@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210901194439.3853238-2-alastorze@fb.com	2021-09-07 17:33:49 -07:00
Toke Høiland-Jørgensen	03e601f48b	libbpf: Don't crash on object files with no symbol tables If libbpf encounters an ELF file that has been stripped of its symbol table, it will crash in bpf_object__add_programs() when trying to dereference the obj->efile.symbols pointer. Fix this by erroring out of bpf_object__elf_collect() if it is not able able to find the symbol table. v2: - Move check into bpf_object__elf_collect() and add nice error message Fixes: `6245947c1b` ("libbpf: Allow gaps in BPF program sections to support overriden weak functions") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210901114812.204720-1-toke@redhat.com	2021-09-07 17:18:48 -07:00
Linus Torvalds	27151f1778	perf tools changes for v5.15: New features: - Improvements for the flamegraph python script, including: - Display perf.data header - Display PIDs of user stacks - Added option to change color scheme - Default to blue/green color scheme to improve accessibility - Correctly identify kernel stacks when debuginfo is available - Improvements for 'perf bench futex': - Add --mlockall parameter - Add --broadcast and --pi to the 'requeue' sub benchmark - Add support for PMU aliases. - Introduce an ARM Coresight ETE decoder. - Add a 'perf bench' entry for evlist open/close operations, to help quantify improvements with multithreading 'perf record'. - Allow reporting the [un]throttle PERF_RECORD_ meta event in 'perf script's python scripting. - Add a 'perf test' entry for PMU aliases. - Add a 'perf test' entry for 'perf record/perf report/perf script' pipe mode. Fixes: - perf script dlfilter (API for filtering via dynamically loaded shared object introduced in v5.14) fixes and a 'perf test' entry for it. - Fix get_current_dir_name() compilation on Android. - Fix issues with asciidoc and double dashes uses. - Fix memory leaks in the BTF handling code. - Fix leftover problems in the Documentation from the infrastructure originally lifted from the git codebase. - Fix probe_vfs_getname.sh 'perf test' failures. - Handle fd gaps in 'perf test's test__dso_data_reopen(). - Make sure to show disasembly warnings for 'perf annotate --stdio'. - Fix output from pipe to file and vice-versa in 'perf record/report/script'. - Correct 'perf data -h' output. - Fix wrong comm in system-wide mode with 'perf record --delay'. - Do not allow --for-each-cgroup without cpu in 'perf stat' - Make 'perf test --skip' work on shell tests. - Fix libperf's verbose printing. Misc improvements: - Preparatory patches for multithreading varios 'perf record' phases (synthesizing, opening, recording, etc). - Add sparse context/locking annotations in compiler-types.h, also to help with the multithreading effort. - Optimize the generation of the arch specific erno tables used in 'perf trace'. - Optimize libperf's perf_cpu_map__max(). - Improve ARM's CoreSight warnings. - Report collisions in AUX records. - Improve warnings for the LLVM 'perf test' entry. - Improve the PMU events 'perf test' codebase. - perf test: Do not compare overheads in the zstd comp test - Better support annotation on ARM. - Update 'perf trace's cmd string table to decode sys_bpf() first arg. Vendor events: - Add JSON events and metrics for Intel's Ice Lake, Tiger Lake and Elhart Lake. - Update JSON eventsand metrics for Intel's Cascade Lake and Sky Lake servers. Hardware tracing: - Improvements for the ARM hardware tracing auxtrace support. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYTO8OAAKCRCyPKLppCJ+ J/N0AQCTAfxKsVyh9vjIY5vwHvNkYMDM4Wvs7TmlfXbVgKLP3AEA0wYlTyar/teI NC6jQpipxpWASFUJbLSmU8qn/XFXwQo= =4r0w -----END PGP SIGNATURE----- Merge tag 'perf-tools-for-v5.15-2021-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tool updates from Arnaldo Carvalho de Melo: "New features: - Improvements for the flamegraph python script, including: - Display perf.data header - Display PIDs of user stacks - Added option to change color scheme - Default to blue/green color scheme to improve accessibility - Correctly identify kernel stacks when debuginfo is available - Improvements for 'perf bench futex': - Add --mlockall parameter - Add --broadcast and --pi to the 'requeue' sub benchmark - Add support for PMU aliases. - Introduce an ARM Coresight ETE decoder. - Add a 'perf bench' entry for evlist open/close operations, to help quantify improvements with multithreading 'perf record'. - Allow reporting the [un]throttle PERF_RECORD_ meta event in 'perf script's python scripting. - Add a 'perf test' entry for PMU aliases. - Add a 'perf test' entry for 'perf record/perf report/perf script' pipe mode. Fixes: - perf script dlfilter (API for filtering via dynamically loaded shared object introduced in v5.14) fixes and a 'perf test' entry for it. - Fix get_current_dir_name() compilation on Android. - Fix issues with asciidoc and double dashes uses. - Fix memory leaks in the BTF handling code. - Fix leftover problems in the Documentation from the infrastructure originally lifted from the git codebase. - Fix probe_vfs_getname.sh 'perf test' failures. - Handle fd gaps in 'perf test's test__dso_data_reopen(). - Make sure to show disasembly warnings for 'perf annotate --stdio'. - Fix output from pipe to file and vice-versa in 'perf record/report/script'. - Correct 'perf data -h' output. - Fix wrong comm in system-wide mode with 'perf record --delay'. - Do not allow --for-each-cgroup without cpu in 'perf stat' - Make 'perf test --skip' work on shell tests. - Fix libperf's verbose printing. Misc improvements: - Preparatory patches for multithreading various 'perf record' phases (synthesizing, opening, recording, etc). - Add sparse context/locking annotations in compiler-types.h, also to help with the multithreading effort. - Optimize the generation of the arch specific erno tables used in 'perf trace'. - Optimize libperf's perf_cpu_map__max(). - Improve ARM's CoreSight warnings. - Report collisions in AUX records. - Improve warnings for the LLVM 'perf test' entry. - Improve the PMU events 'perf test' codebase. - perf test: Do not compare overheads in the zstd comp test - Better support annotation on ARM. - Update 'perf trace's cmd string table to decode sys_bpf() first arg. Vendor events: - Add JSON events and metrics for Intel's Ice Lake, Tiger Lake and Elhart Lake. - Update JSON eventsand metrics for Intel's Cascade Lake and Sky Lake servers. Hardware tracing: - Improvements for the ARM hardware tracing auxtrace support" * tag 'perf-tools-for-v5.15-2021-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (130 commits) perf tests: Add test for PMU aliases perf pmu: Add PMU alias support perf session: Report collisions in AUX records perf script python: Allow reporting the [un]throttle PERF_RECORD_ meta event perf build: Report failure for testing feature libopencsd perf cs-etm: Show a warning for an unknown magic number perf cs-etm: Print the decoder name perf cs-etm: Create ETE decoder perf cs-etm: Update OpenCSD decoder for ETE perf cs-etm: Fix typo perf cs-etm: Save TRCDEVARCH register perf cs-etm: Refactor out ETMv4 header saving perf cs-etm: Initialise architecture based on TRCIDR1 perf cs-etm: Refactor initialisation of decoder params. tools build: Fix feature detect clean for out of source builds perf evlist: Add evlist__for_each_entry_from() macro perf evsel: Handle precise_ip fallback in evsel__open_cpu() perf evsel: Move bpf_counter__install_pe() to success path in evsel__open_cpu() perf evsel: Move test_attr__open() to success path in evsel__open_cpu() perf evsel: Move ignore_missing_thread() to fallback code ...	2021-09-05 11:56:18 -07:00
Riccardo Mancini	6e93bc534f	libperf cpumap: Take into advantage it is sorted to optimize perf_cpu_map__max() From commit `7074674e73` ("perf cpumap: Maintain cpumaps ordered and without dups"), perf_cpu_map elements are sorted in ascending order. This patch improves the perf_cpu_map__max function by returning the last element. Committer notes: Do it as a ternary to keep it in just one return line, add a comment explaining it is sorted and what functions does it. Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/fb79f02e7b86ea8044d563adb1e9890c906f982f.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-08-31 16:17:02 -03:00
Riccardo Mancini	b75f299d69	libsubcmd: add OPT_UINTEGER_OPTARG option type This patch adds OPT_UINTEGER_OPTARG, which is the same as OPT_UINTEGER, but also makes it possible to use the option without any value, setting the variable to a default value, d. Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/c46749b3dff796729078352ff164d363457a3587.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-08-31 15:44:05 -03:00
Arnaldo Carvalho de Melo	c635813fef	Merge remote-tracking branch 'torvalds/master' into perf/core To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-08-30 10:05:46 -03:00
Shunsuke Nakamura	37c3193fa4	libperf tests: Fix verbose printing libperf's verbose printing checks the -v option every time the macro _T_ START is called. Since there are currently four libperf tests registered, the macro _T_ START is called four times, but verbose printing after the second time is not output. Resets the index of the element processed by getopt() and fix verbose printing so that it prints in all tests. Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Acked-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210820093908.734503-3-nakamura.shun@fujitsu.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-08-24 15:05:35 -03:00
Andrii Nakryiko	5e3b8356de	libbpf: Add uprobe ref counter offset support for USDT semaphores When attaching to uprobes through perf subsystem, it's possible to specify offset of a so-called USDT semaphore, which is just a reference counted u16, used by kernel to keep track of how many tracers are attached to a given location. Support for this feature was added in [0], so just wire this through uprobe_opts. This is important to enable implementing USDT attachment and tracing through libbpf's bpf_program__attach_uprobe_opts() API. [0] `a6ca88b241` ("trace_uprobe: support reference counter in fd-based uprobe") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-16-andrii@kernel.org	2021-08-17 00:45:08 +02:00
Andrii Nakryiko	47faff3717	libbpf: Add bpf_cookie to perf_event, kprobe, uprobe, and tp attach APIs Wire through bpf_cookie for all attach APIs that use perf_event_open under the hood: - for kprobes, extend existing bpf_kprobe_opts with bpf_cookie field; - for perf_event, uprobe, and tracepoint APIs, add their _opts variants and pass bpf_cookie through opts. For kernel that don't support BPF_LINK_CREATE for perf_events, and thus bpf_cookie is not supported either, return error and log warning for user. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-12-andrii@kernel.org	2021-08-17 00:45:08 +02:00
Andrii Nakryiko	3ec84f4b16	libbpf: Add bpf_cookie support to bpf_link_create() API Add ability to specify bpf_cookie value when creating BPF perf link with bpf_link_create() low-level API. Given BPF_LINK_CREATE command is growing and keeps getting new fields that are specific to the type of BPF_LINK, extend libbpf side of bpf_link_create() API and corresponding OPTS struct to accomodate such changes. Add extra checks to prevent using incompatible/unexpected combinations of fields. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-11-andrii@kernel.org	2021-08-17 00:45:08 +02:00
Andrii Nakryiko	668ace0ea5	libbpf: Use BPF perf link when supported by kernel Detect kernel support for BPF perf link and prefer it when attaching to perf_event, tracepoint, kprobe/uprobe. Underlying perf_event FD will be kept open until BPF link is destroyed, at which point both perf_event FD and BPF link FD will be closed. This preserves current behavior in which perf_event FD is open for the duration of bpf_link's lifetime and user is able to "disconnect" bpf_link from underlying FD (with bpf_link__disconnect()), so that bpf_link__destroy() doesn't close underlying perf_event FD.When BPF perf link is used, disconnect will keep both perf_event and bpf_link FDs open, so it will be up to (advanced) user to close them. This approach is demonstrated in bpf_cookie.c selftests, added in this patch set. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-10-andrii@kernel.org	2021-08-17 00:45:08 +02:00
Andrii Nakryiko	d88b71d4a9	libbpf: Remove unused bpf_link's destroy operation, but add dealloc bpf_link->destroy() isn't used by any code, so remove it. Instead, add ability to override deallocation procedure, with default doing plain free(link). This is necessary for cases when we want to "subclass" struct bpf_link to keep extra information, as is the case in the next patch adding struct bpf_link_perf. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-9-andrii@kernel.org	2021-08-17 00:45:08 +02:00
Andrii Nakryiko	61c7aa5020	libbpf: Re-build libbpf.so when libbpf.map changes Ensure libbpf.so is re-built whenever libbpf.map is modified. Without this, changes to libbpf.map are not detected and versioned symbols mismatch error will be reported until `make clean && make` is used, which is a suboptimal developer experience. Fixes: `306b267cb3` ("libbpf: Verify versioned symbols") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210815070609.987780-8-andrii@kernel.org	2021-08-17 00:45:07 +02:00
Hao Luo	2211c825e7	libbpf: Support weak typed ksyms. Currently weak typeless ksyms have default value zero, when they don't exist in the kernel. However, weak typed ksyms are rejected by libbpf if they can not be resolved. This means that if a bpf object contains the declaration of a nonexistent weak typed ksym, it will be rejected even if there is no program that references the symbol. Nonexistent weak typed ksyms can also default to zero just like typeless ones. This allows programs that access weak typed ksyms to be accepted by verifier, if the accesses are guarded. For example, extern const int bpf_link_fops3 __ksym __weak; /* then in BPF program / if (&bpf_link_fops3) { / use bpf_link_fops3 */ } If actual use of nonexistent typed ksym is not guarded properly, verifier would see that register is not PTR_TO_BTF_ID and wouldn't allow to use it for direct memory reads or passing it to BPF helpers. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210812003819.2439037-1-haoluo@google.com	2021-08-13 15:56:28 -07:00
Jakub Kicinski	f4083a752a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h `9e26680733` ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") `9e518f2580` ("bnxt_en: 1PPS functions to configure TSIO pins") `099fdeda65` ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h `a2baf4e8bb` ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") `c7603cfa04` ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c `5957cc557d` ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") `2d0b41a376` ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS `7b637cd52f` ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") `7d901a1e87` ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-13 06:41:22 -07:00
Jin Yao	2696d6e59c	libperf: Add perf_cpu_map__default_new() libperf already has a static function called 'cpu_map__default_new()'. Add a new API perf_cpu_map__default_new() to export the function. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https //lore.kernel.org/r/20210723063433.7318-2-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-08-11 16:03:36 -03:00
Daniel Xu	c34c338a40	libbpf: Do not close un-owned FD 0 on errors Before this patch, btf_new() was liable to close an arbitrary FD 0 if BTF parsing failed. This was because: * btf->fd was initialized to 0 through the calloc() * btf__free() (in the `done` label) closed any FDs >= 0 * btf->fd is left at 0 if parsing fails This issue was discovered on a system using libbpf v0.3 (without BTF_KIND_FLOAT support) but with a kernel that had BTF_KIND_FLOAT types in BTF. Thus, parsing fails. While this patch technically doesn't fix any issues b/c upstream libbpf has BTF_KIND_FLOAT support, it'll help prevent issues in the future if more BTF types are added. It also allow the fix to be backported to older libbpf's. Fixes: `3289959b97` ("libbpf: Support BTF loading and raw data output in both endianness") Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/5969bb991adedb03c6ae93e051fd2a00d293cf25.1627513670.git.dxu@dxuuu.xyz	2021-08-07 01:39:15 +02:00
Robin Gögge	78d14bda86	libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT This patch fixes the probe for BPF_PROG_TYPE_CGROUP_SOCKOPT, so the probe reports accurate results when used by e.g. bpftool. Fixes: `4cdbfb59c4` ("libbpf: support sockopt hooks") Signed-off-by: Robin Gögge <r.goegge@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20210728225825.2357586-1-r.goegge@gmail.com	2021-08-07 01:38:52 +02:00
Hengqi Chen	a710eed386	libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf Add two new APIs: btf__load_vmlinux_btf and btf__load_module_btf. btf__load_vmlinux_btf is just an alias to the existing API named libbpf_find_kernel_btf, rename to be more precisely and consistent with existing BTF APIs. btf__load_module_btf can be used to load module BTF, add it for completeness. These two APIs are useful for implementing tracing tools and introspection tools. This is part of the effort towards libbpf 1.0 ([0]). [0] Closes: https://github.com/libbpf/libbpf/issues/280 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210730114012.494408-1-hengqi.chen@gmail.com	2021-07-30 12:21:29 -07:00
Quentin Monnet	61fc51b1d3	libbpf: Add split BTF support for btf__load_from_kernel_by_id() Add a new API function btf__load_from_kernel_by_id_split(), which takes a pointer to a base BTF object in order to support split BTF objects when retrieving BTF information from the kernel. Reference: https://github.com/libbpf/libbpf/issues/314 Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210729162028.29512-8-quentin@isovalent.com	2021-07-29 17:23:59 -07:00
Quentin Monnet	6cc93e2f2c	libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id() Rename function btf__get_from_id() as btf__load_from_kernel_by_id() to better indicate what the function does. Change the new function so that, instead of requiring a pointer to the pointer to update and returning with an error code, it takes a single argument (the id of the BTF object) and returns the corresponding pointer. This is more in line with the existing constructors. The other tools calling the (soon-to-be) deprecated btf__get_from_id() function will be updated in a future commit. References: - https://github.com/libbpf/libbpf/issues/278 - https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210729162028.29512-4-quentin@isovalent.com	2021-07-29 17:09:19 -07:00
Quentin Monnet	3c7e585906	libbpf: Rename btf__load() as btf__load_into_kernel() As part of the effort to move towards a v1.0 for libbpf, rename btf__load() function, used to "upload" BTF information into the kernel, as btf__load_into_kernel(). This new name better reflects what the function does. References: - https://github.com/libbpf/libbpf/issues/278 - https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210729162028.29512-3-quentin@isovalent.com	2021-07-29 17:03:41 -07:00
Quentin Monnet	6d2d73cdd6	libbpf: Return non-null error on failures in libbpf_find_prog_btf_id() Variable "err" is initialised to -EINVAL so that this error code is returned when something goes wrong in libbpf_find_prog_btf_id(). However, a recent change in the function made use of the variable in such a way that it is set to 0 if retrieving linear information on the program is successful, and this 0 value remains if we error out on failures at later stages. Let's fix this by setting err to -EINVAL later in the function. Fixes: `e9fc3ce99b` ("libbpf: Streamline error reporting for high-level APIs") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210729162028.29512-2-quentin@isovalent.com	2021-07-29 17:03:41 -07:00
Martynas Pumputis	043c5bb3c4	libbpf: Fix race when pinning maps in parallel When loading in parallel multiple programs which use the same to-be pinned map, it is possible that two instances of the loader will call bpf_object__create_maps() at the same time. If the map doesn't exist when both instances call bpf_object__reuse_map(), then one of the instances will fail with EEXIST when calling bpf_map__pin(). Fix the race by retrying reusing a map if bpf_map__pin() returns EEXIST. The fix is similar to the one in iproute2: e4c4685fd6e4 ("bpf: Fix race condition with map pinning"). Before retrying the pinning, we don't do any special cleaning of an internal map state. The closer code inspection revealed that it's not required: - bpf_object__create_map(): map->inner_map is destroyed after a successful call, map->fd is closed if pinning fails. - bpf_object__populate_internal_map(): created map elements is destroyed upon close(map->fd). - init_map_slots(): slots are freed after their initialization. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210726152001.34845-1-m@lambda.lt	2021-07-27 14:32:03 -07:00
Jason Wang	c139e40a51	libbpf: Fix comment typo Remove the repeated word 'the' in line 48. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210727115928.74600-1-wangborong@cdjrlc.com	2021-07-27 14:18:47 -07:00
Alexei Starovoitov	b0588390db	libbpf: Split CO-RE logic into relo_core.c. Move CO-RE logic into separate file. The internal interface between libbpf and CO-RE is through bpf_core_apply_relo_insn() function and few structs defined in relo_core.h. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210721000822.40958-5-alexei.starovoitov@gmail.com	2021-07-26 12:29:14 -07:00
Alexei Starovoitov	301ba4d710	libbpf: Move CO-RE types into relo_core.h. In order to make a clean split of CO-RE logic move its types into independent header file. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210721000822.40958-4-alexei.starovoitov@gmail.com	2021-07-26 12:19:41 -07:00
Alexei Starovoitov	3ee4f53355	libbpf: Split bpf_core_apply_relo() into bpf_program independent helper. bpf_core_apply_relo() doesn't need to know bpf_program internals and hashmap details. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210721000822.40958-3-alexei.starovoitov@gmail.com	2021-07-26 12:13:05 -07:00
Alexei Starovoitov	6e43b28607	libbpf: Cleanup the layering between CORE and bpf_program. CO-RE processing functions don't need to know 'struct bpf_program' details. Cleanup the layering to eventually be able to move CO-RE logic into a separate file. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210721000822.40958-2-alexei.starovoitov@gmail.com	2021-07-26 12:11:23 -07:00
Evgeniy Litvinenko	e244d34d0e	libbpf: Add bpf_map__pin_path function Add bpf_map__pin_path, so that the inconsistently named bpf_map__get_pin_path can be deprecated later. This is part of the effort towards libbpf v1.0: https://github.com/libbpf/libbpf/issues/307 Also, add a selftest for the new function. Signed-off-by: Evgeniy Litvinenko <evgeniyl@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210723221511.803683-1-evgeniyl@fb.com	2021-07-23 16:57:03 -07:00
Jiri Olsa	da97553ec6	libbpf: Export bpf_program__attach_kprobe_opts function Export bpf_program__attach_kprobe_opts as a public API. Rename bpf_program_attach_kprobe_opts to bpf_kprobe_opts and turn it into OPTS struct. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20210721215810.889975-4-jolsa@kernel.org	2021-07-22 20:09:16 -07:00
Jiri Olsa	e3f9bc35ea	libbpf: Allow decimal offset for kprobes Allow to specify decimal offset in SEC macro, like: SEC("kprobe/bpf_fentry_test7+5") Add selftest for that. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20210721215810.889975-3-jolsa@kernel.org	2021-07-22 20:09:16 -07:00
Jiri Olsa	1f71a468a7	libbpf: Fix func leak in attach_kprobe Add missing free() for func pointer in attach_kprobe function. Fixes: `a2488b5f48` ("libbpf: Allow specification of "kprobe/function+offset"") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20210721215810.889975-2-jolsa@kernel.org	2021-07-22 20:08:39 -07:00
Alan Maguire	720c29fca9	libbpf: Propagate errors when retrieving enum value for typed data display When retrieving the enum value associated with typed data during "is data zero?" checking in btf_dump_type_data_check_zero(), the return value of btf_dump_get_enum_value() is not passed to the caller if the function returns a non-zero (error) value. Currently, 0 is returned if the function returns an error. We should instead propagate the error to the caller. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626770993-11073-4-git-send-email-alan.maguire@oracle.com	2021-07-20 13:49:25 -07:00
Alan Maguire	a1d3cc3c5e	libbpf: Avoid use of __int128 in typed dump display __int128 is not supported for some 32-bit platforms (arm and i386). __int128 was used in carrying out computations on bitfields which aid display, but the same calculations could be done with __u64 with the small effect of not supporting 128-bit bitfields. With these changes, a big-endian issue with casting 128-bit integers to 64-bit for enum bitfields is solved also, as we now use 64-bit integers for bitfield calculations. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626770993-11073-2-git-send-email-alan.maguire@oracle.com	2021-07-20 13:49:25 -07:00
Martynas Pumputis	a21ab4c59e	libbpf: Fix removal of inner map in bpf_object__create_map If creating an outer map of a BTF-defined map-in-map fails (via bpf_object__create_map()), then the previously created its inner map won't be destroyed. Fix this by ensuring that the destroy routines are not bypassed in the case of a failure. Fixes: `646f02ffdd` ("libbpf: Add BTF-defined map-in-map support") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210719173838.423148-2-m@lambda.lt	2021-07-19 15:48:20 -07:00
Alan Maguire	add192f81a	libbpf: Btf typed dump does not need to allocate dump data By using the stack for this small structure, we avoid the need for freeing memory in error paths. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626475617-25984-4-git-send-email-alan.maguire@oracle.com	2021-07-16 17:29:56 -07:00
Alan Maguire	04eb4dff6a	libbpf: Fix compilation errors on ppc64le for btf dump typed data __s64 can be defined as either long or long long, depending on the architecture. On ppc64le it's defined as long, giving this error: In file included from btf_dump.c:22: btf_dump.c: In function 'btf_dump_type_data_check_overflow': libbpf_internal.h:111:22: error: format '%lld' expects argument of type 'long long int', but argument 3 has type '__s64' {aka 'long int'} [-Werror=format=] 111 \| libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \ \| ^~~~~~~~~~ libbpf_internal.h:114:27: note: in expansion of macro '__pr' 114 \| #define pr_warn(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__) \| ^~~~ btf_dump.c:1992:3: note: in expansion of macro 'pr_warn' 1992 \| pr_warn("unexpected size [%lld] for id [%u]\n", \| ^~~~~~~ btf_dump.c:1992:32: note: format string is defined here 1992 \| pr_warn("unexpected size [%lld] for id [%u]\n", \| ~~~^ \| \| \| long long int \| %ld Cast to size_t and use %zu instead. Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626475617-25984-3-git-send-email-alan.maguire@oracle.com	2021-07-16 17:25:57 -07:00
Alan Maguire	8d44c3578b	libbpf: Clarify/fix unaligned data issues for btf typed dump If data is packed, data structures can store it outside of usual boundaries. For example a 4-byte int can be stored on a unaligned boundary in a case like this: struct s { char f1; int f2; } __attribute((packed)); ...the int is stored at an offset of one byte. Some platforms have problems dereferencing data that is not aligned with its size, and code exists to handle most cases of this for BTF typed data display. However pointer display was missed, and a simple function to test if "ptr_is_aligned(data, data_sz)" would help clarify this code. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626475617-25984-2-git-send-email-alan.maguire@oracle.com	2021-07-16 17:24:55 -07:00
Alan Maguire	920d16af9b	libbpf: BTF dumper support for typed data Add a BTF dumper for typed data, so that the user can dump a typed version of the data provided. The API is int btf_dump__dump_type_data(struct btf_dump d, __u32 id, void data, size_t data_sz, const struct btf_dump_type_data_opts opts); ...where the id is the BTF id of the data pointed to by the "void " argument; for example the BTF id of "struct sk_buff" for a "struct skb " data pointer. Options supported are - a starting indent level (indent_lvl) - a user-specified indent string which will be printed once per indent level; if NULL, tab is chosen but any string <= 32 chars can be provided. - a set of boolean options to control dump display, similar to those used for BPF helper bpf_snprintf_btf(). Options are - compact : omit newlines and other indentation - skip_names: omit member names - emit_zeroes: show zero-value members Default output format is identical to that dumped by bpf_snprintf_btf(), for example a "struct sk_buff" representation would look like this: struct sk_buff){ (union){ (struct){ .next = (struct sk_buff )0xffffffffffffffff, .prev = (struct sk_buff )0xffffffffffffffff, (union){ .dev = (struct net_device )0xffffffffffffffff, .dev_scratch = (long unsigned int)18446744073709551615, }, }, ... If the data structure is larger than the data_sz number of bytes that are available in data, as much of the data as possible will be dumped and -E2BIG will be returned. This is useful as tracers will sometimes not be able to capture all of the data associated with a type; for example a "struct task_struct" is ~16k. Being able to specify that only a subset is available is important for such cases. On success, the amount of data dumped is returned. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626362126-27775-2-git-send-email-alan.maguire@oracle.com	2021-07-16 13:46:59 -07:00
Shuyi Cheng	18353c87e0	libbpf: Fix the possible memory leak on error If the strdup() fails then we need to call bpf_object__close(obj) to avoid a resource leak. Fixes: `166750bc1d` ("libbpf: Support libbpf-provided extern variables") Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626180159-112996-3-git-send-email-chengshuyi@linux.alibaba.com	2021-07-16 13:22:47 -07:00
Shuyi Cheng	1373ff5995	libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts' btf_custom_path allows developers to load custom BTF which libbpf will subsequently use for CO-RE relocation instead of vmlinux BTF. Having btf_custom_path in bpf_object_open_opts one can directly use the skeleton's <objname>_bpf__open_opts() API to pass in the btf_custom_path parameter, as opposed to using bpf_object__load_xattr() which is slated to be deprecated ([0]). This work continues previous work started by another developer ([1]). [0] https://lore.kernel.org/bpf/CAEf4BzbJZLjNoiK8_VfeVg_Vrg=9iYFv+po-38SMe=UzwDKJ=Q@mail.gmail.com/#t [1] https://yhbt.net/lore/all/CAEf4Bzbgw49w2PtowsrzKQNcxD4fZRE6AKByX-5-dMo-+oWHHA@mail.gmail.com/ Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626180159-112996-2-git-send-email-chengshuyi@linux.alibaba.com	2021-07-16 13:22:47 -07:00
David S. Miller	82a1ffe57e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-07-15 The following pull-request contains BPF updates for your net-next tree. We've added 45 non-merge commits during the last 15 day(s) which contain a total of 52 files changed, 3122 insertions(+), 384 deletions(-). The main changes are: 1) Introduce bpf timers, from Alexei. 2) Add sockmap support for unix datagram socket, from Cong. 3) Fix potential memleak and UAF in the verifier, from He. 4) Add bpf_get_func_ip helper, from Jiri. 5) Improvements to generic XDP mode, from Kumar. 6) Support for passing xdp_md to XDP programs in bpf_prog_run, from Zvi. =================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-15 22:40:10 -07:00
Alan Maguire	a2488b5f48	libbpf: Allow specification of "kprobe/function+offset" kprobes can be placed on most instructions in a function, not just entry, and ftrace and bpftrace support the function+offset notification for probe placement. Adding parsing of func_name into func+offset to bpf_program__attach_kprobe() allows the user to specify SEC("kprobe/bpf_fentry_test5+0x6") ...for example, and the offset can be passed to perf_event_open_probe() to support kprobe attachment. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-8-jolsa@kernel.org	2021-07-15 17:59:14 -07:00
Jiri Olsa	ac0ed48829	libbpf: Add bpf_program__attach_kprobe_opts function Adding bpf_program__attach_kprobe_opts that does the same as bpf_program__attach_kprobe, but takes opts argument. Currently opts struct holds just retprobe bool, but we will add new field in following patch. The function is not exported, so there's no need to add size to the struct bpf_program_attach_kprobe_opts for now. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-7-jolsa@kernel.org	2021-07-15 17:59:14 -07:00
Linus Torvalds	8096acd744	Networking fixes for 5.14-rc2, including fixes from bpf and netfilter. Current release - regressions: - sock: fix parameter order in sock_setsockopt() Current release - new code bugs: - netfilter: nft_last: - fix incorrect arithmetic when restoring last used - honor NFTA_LAST_SET on restoration Previous releases - regressions: - udp: properly flush normal packet at GRO time - sfc: ensure correct number of XDP queues; don't allow enabling the feature if there isn't sufficient resources to Tx from any CPU - dsa: sja1105: fix address learning getting disabled on the CPU port - mptcp: addresses a rmem accounting issue that could keep packets in subflow receive buffers longer than necessary, delaying MPTCP-level ACKs - ip_tunnel: fix mtu calculation for ETHER tunnel devices - do not reuse skbs allocated from skbuff_fclone_cache in the napi skb cache, we'd try to return them to the wrong slab cache - tcp: consistently disable header prediction for mptcp Previous releases - always broken: - bpf: fix subprog poke descriptor tracking use-after-free - ipv6: - allocate enough headroom in ip6_finish_output2() in case iptables TEE is used - tcp: drop silly ICMPv6 packet too big messages to avoid expensive and pointless lookups (which may serve as a DDOS vector) - make sure fwmark is copied in SYNACK packets - fix 'disable_policy' for forwarded packets (align with IPv4) - netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state - netfilter: conntrack: do not mark RST in the reply direction coming after SYN packet for an out-of-sync entry - mptcp: cleanly handle error conditions with MP_JOIN and syncookies - mptcp: fix double free when rejecting a join due to port mismatch - validate lwtstate->data before returning from skb_tunnel_info() - tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path - mt76: mt7921: continue to probe driver when fw already downloaded - bonding: fix multiple issues with offloading IPsec to (thru?) bond - stmmac: ptp: fix issues around Qbv support and setting time back - bcmgenet: always clear wake-up based on energy detection Misc: - sctp: move 198 addresses from unusable to private scope - ptp: support virtual clocks and timestamping - openvswitch: optimize operation for key comparison -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDu3mMACgkQMUZtbf5S Irsjxg//UwcPJMYFmXV+fGkEsWYe1Kf29FcUDEeANFtbltfAcIfZ0GoTbSDRnrVb HcYAKcm4XRx5bWWdQrQsQq/yiLbnS/rSLc7VRB+uRHWRKl3eYcaUB2rnCXsxrjGw wQJgOmztDCJS4BIky24iQpF/8lg7p/Gj2Ih532gh93XiYo612FrEJKkYb2/OQfYX GkbnZ0kL2Y1SV+bhy6aT5azvhHKM4/3eA4fHeJ2p8e2gOZ5ni0vpX0xEzdzKOCd0 vwR/Wu3h/+2QuFYVcSsVguuM++JXACG8MAS/Tof78dtNM4a3kQxzqeh5Bv6IkfTu rokENLq4pjNRy+nBAOeQZj8Jd0K0kkf/PN9WMdGQtplMoFhjjV25R6PeRrV9wwPo peozIz2MuQo7Kfof1D+44h2foyLfdC28/Z0CvRbDpr5EHOfYynvBbrnhzIGdQp6V xgftKTOdgz2Djgg8HiblZund1FA44OYerddVAASrIsnSFnIz1VLVQIsfV+GLBwwc FawrIZ6WfIjzRSrDGOvDsbAQI47T/1jbaPJeK6XgjWkQmjEd6UtRWRZLYCxemQEw 4HP3sWC96BOehuD8ylipVE1oFqrxCiOB/fZxezXqjo8dSX3NLdak4cCHTHoW5SuZ eEAxQRaBliKd+P7hoy9cZ57CAu3zUa8kijfM5QRlCAHF+zSxaPs= =QFnb -----END PGP SIGNATURE----- Merge tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski. "Including fixes from bpf and netfilter. Current release - regressions: - sock: fix parameter order in sock_setsockopt() Current release - new code bugs: - netfilter: nft_last: - fix incorrect arithmetic when restoring last used - honor NFTA_LAST_SET on restoration Previous releases - regressions: - udp: properly flush normal packet at GRO time - sfc: ensure correct number of XDP queues; don't allow enabling the feature if there isn't sufficient resources to Tx from any CPU - dsa: sja1105: fix address learning getting disabled on the CPU port - mptcp: addresses a rmem accounting issue that could keep packets in subflow receive buffers longer than necessary, delaying MPTCP-level ACKs - ip_tunnel: fix mtu calculation for ETHER tunnel devices - do not reuse skbs allocated from skbuff_fclone_cache in the napi skb cache, we'd try to return them to the wrong slab cache - tcp: consistently disable header prediction for mptcp Previous releases - always broken: - bpf: fix subprog poke descriptor tracking use-after-free - ipv6: - allocate enough headroom in ip6_finish_output2() in case iptables TEE is used - tcp: drop silly ICMPv6 packet too big messages to avoid expensive and pointless lookups (which may serve as a DDOS vector) - make sure fwmark is copied in SYNACK packets - fix 'disable_policy' for forwarded packets (align with IPv4) - netfilter: conntrack: - do not renew entry stuck in tcp SYN_SENT state - do not mark RST in the reply direction coming after SYN packet for an out-of-sync entry - mptcp: cleanly handle error conditions with MP_JOIN and syncookies - mptcp: fix double free when rejecting a join due to port mismatch - validate lwtstate->data before returning from skb_tunnel_info() - tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path - mt76: mt7921: continue to probe driver when fw already downloaded - bonding: fix multiple issues with offloading IPsec to (thru?) bond - stmmac: ptp: fix issues around Qbv support and setting time back - bcmgenet: always clear wake-up based on energy detection Misc: - sctp: move 198 addresses from unusable to private scope - ptp: support virtual clocks and timestamping - openvswitch: optimize operation for key comparison" * tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (158 commits) net: dsa: properly check for the bridge_leave methods in dsa_switch_bridge_leave() sfc: add logs explaining XDP_TX/REDIRECT is not available sfc: ensure correct number of XDP queues sfc: fix lack of XDP TX queues - error XDP TX failed (-22) net: fddi: fix UAF in fza_probe net: dsa: sja1105: fix address learning getting disabled on the CPU port net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload net: Use nlmsg_unicast() instead of netlink_unicast() octeontx2-pf: Fix uninitialized boolean variable pps ipv6: allocate enough headroom in ip6_finish_output2() net: hdlc: rename 'mod_init' & 'mod_exit' functions to be module-specific net: bridge: multicast: fix MRD advertisement router port marking race net: bridge: multicast: fix PIM hello router port marking race net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340 dsa: fix for_each_child.cocci warnings virtio_net: check virtqueue_add_sgs() return value mptcp: properly account bulk freed memory selftests: mptcp: fix case multiple subflows limited by server mptcp: avoid processing packet if a subflow reset mptcp: fix syncookie process if mptcp can not_accept new subflow ...	2021-07-14 09:24:32 -07:00
Martynas Pumputis	97eb31384a	libbpf: Fix reuse of pinned map on older kernel When loading a BPF program with a pinned map, the loader checks whether the pinned map can be reused, i.e. their properties match. To derive such of the pinned map, the loader invokes BPF_OBJ_GET_INFO_BY_FD and then does the comparison. Unfortunately, on < 4.12 kernels the BPF_OBJ_GET_INFO_BY_FD is not available, so loading the program fails with the following error: libbpf: failed to get map info for map FD 5: Invalid argument libbpf: couldn't reuse pinned map at '/sys/fs/bpf/tc/globals/cilium_call_policy': parameter mismatch" libbpf: map 'cilium_call_policy': error reusing pinned map libbpf: map 'cilium_call_policy': failed to create: Invalid argument(-22) libbpf: failed to load object 'bpf_overlay.o' To fix this, fallback to derivation of the map properties via /proc/$PID/fdinfo/$MAP_FD if BPF_OBJ_GET_INFO_BY_FD fails with EINVAL, which can be used as an indicator that the kernel doesn't support the latter. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210712125552.58705-1-m@lambda.lt	2021-07-12 16:15:27 -07:00
Jiri Olsa	afd4ad01ff	libperf: Add tests for perf_evlist__set_leader() Add a test for the newly added perf_evlist__set_leader() function. Committer testing: $ cd tools/lib/perf/ $ sudo make tests [sudo] password for acme: running static: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK running dynamic: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK $ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-09 15:34:41 -03:00
Arnaldo Carvalho de Melo	e2c18168c3	libperf: Remove BUG_ON() from library code in get_group_fd() We shouldn't just panic, return a value that doesn't clash with what perf_evsel__open() was already returning in case of error, i.e. errno when sys_perf_event_open() fails. Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Link: http://lore.kernel.org/lkml/YOiOA5zOtVH9IBbE@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-09 15:33:39 -03:00
Jiri Olsa	3fd35de168	libperf: Add group support to perf_evsel__open() Add support to set group_fd in perf_evsel__open() and make it follow the group setup. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-09 14:49:47 -03:00
Jiri Olsa	2e6263ab54	libperf: Adopt evlist__set_leader() from tools/perf as perf_evlist__set_leader() Move the implementation of evlist__set_leader() to a new libperf perf_evlist__set_leader() function with the same functionality make it a libperf exported API. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-09 14:04:32 -03:00
Jiri Olsa	3a683120d8	libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group interface to libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-09 14:04:32 -03:00
Jiri Olsa	fba7c86601	libperf: Move 'leader' from tools/perf to perf_evsel::leader Move evsel::leader to perf_evsel::leader, so we can move the group interface to libperf. Also add several evsel helpers to ease up the transition: struct evsel evsel__leader(struct evsel evsel); - get leader evsel bool evsel__has_leader(struct evsel evsel, struct evsel leader); - true if evsel has leader as leader bool evsel__is_leader(struct evsel evsel); - true if evsel is itw own leader void evsel__set_leader(struct evsel evsel, struct evsel *leader); - set leader for evsel Committer notes: Fix this when building with 'make BUILD_BPF_SKEL=1' tools/perf/util/bpf_counter.c - if (evsel->leader->core.nr_members > 1) { + if (evsel->core.leader->nr_members > 1) { Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-09 14:04:31 -03:00
Jiri Olsa	38fe0e0156	libperf: Move 'idx' from tools/perf to perf_evsel::idx Move evsel::idx to perf_evsel::idx, so we can move the group interface to libperf. Committer notes: Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that appeared in my tree in my local tree. Also fixed up these: $ find tools/perf/ -name "*.[ch]" \| xargs grep 'evsel->idx' tools/perf/ui/gtk/annotate.c: evsel->idx + i); tools/perf/ui/gtk/annotate.c: evsel->idx); $ That running 'make -C tools/perf build-test' caught. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-09 14:04:28 -03:00
Jiri Olsa	3d970601da	libperf: Change tests to single static and shared binaries Make tests to be two binaries 'tests_static' and 'tests_shared', so the maintenance is easier. Adding tests under libperf build system, so we define all the flags just once. Adding make-tests tule to just compile tests without running them. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Link: http://lore.kernel.org/lkml/20210706151704.73662-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-07-07 11:41:58 -03:00
Toke Høiland-Jørgensen	af0efa050c	libbpf: Restore errno return for functions that were already returning it The update to streamline libbpf error reporting intended to change all functions to return the errno as a negative return value if LIBBPF_STRICT_DIRECT_ERRS is set. However, if the flag is not set, the return value changes for the two functions that were already returning a negative errno unconditionally: bpf_link__unpin() and perf_buffer__poll(). This is a user-visible API change that breaks applications; so let's revert these two functions back to unconditionally returning a negative errno value. Fixes: `e9fc3ce99b` ("libbpf: Streamline error reporting for high-level APIs") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210706122355.236082-1-toke@redhat.com	2021-07-06 21:13:08 -07:00
Linus Torvalds	406254918b	perf tools changes for v5.14: Tools: - Add cgroup support for 'perf top' (-G). - Add support for KVM MSRs in 'perf kvm stat' - Support probes on init functions in 'perf probe', to support the bootconfig format. - Improve error reporting in 'perf probe'. - No need to synthesize BUILD_ID records in 'perf inject' if the MMAP2 records have build ids already. - Allow toggling source code ('s' hotkey) in 'perf annotate' in all lines. - Add itrace options support to 'perf annotate'. - Support to custom DSO filters for 'perf script'. Hardware enablement: - Support the HYBRID_TOPOLOGY and HYBRID_CPU_PMU_CAPS features in the perf.data file header. - Support PMU prefix for mem-load and mem-store events, to support hybrid (BIG little) CPUs such as Intel's Alderlake. - Support hybrid CPUs in 'perf mem' and 'perf c2c'. Hardware tracing: - Intel PT now supports tracing KVM guests. - Timestamp improvements for ARM's Coresight. Build: - Add 'make -C tools/perf build-test' entries for libopencsd/CORESIGHT=1 and libbpf/LIBBPF_DYNAMIC=1. - Use bison's --file-prefix-map option to avoid storing full paths when using O= in the perf build. Tests: - Improve the 'perf test' entries for libpfm4 and BPF counters. Misc: - Sync msr-index.h, mount.h, kvm headers with the kernel originals. - Add vendor events and metrics for Intel's Icelake Server & Client. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYN5WhAAKCRCyPKLppCJ+ J3b9APwO5iDjSMvVgKT84njXo1EqURMz6nmV3kkjBkaMo0KK2wEAvXysIEgwx1cu hakfFw63ztxVQctcWShOzP7jnJOOwwg= =PpGz -----END PGP SIGNATURE----- Merge tag 'perf-tools-for-v5.14-2021-07-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tool updates from Arnaldo Carvalho de Melo: "Tools: - Add cgroup support for 'perf top' (-G). - Add support for KVM MSRs in 'perf kvm stat' - Support probes on init functions in 'perf probe', to support the bootconfig format. - Improve error reporting in 'perf probe'. - No need to synthesize BUILD_ID records in 'perf inject' if the MMAP2 records have build ids already. - Allow toggling source code ('s' hotkey) in 'perf annotate' in all lines. - Add itrace options support to 'perf annotate'. - Support to custom DSO filters for 'perf script'. Hardware enablement: - Support the HYBRID_TOPOLOGY and HYBRID_CPU_PMU_CAPS features in the perf.data file header. - Support PMU prefix for mem-load and mem-store events, to support hybrid (BIG little) CPUs such as Intel's Alderlake. - Support hybrid CPUs in 'perf mem' and 'perf c2c'. Hardware tracing: - Intel PT now supports tracing KVM guests. - Timestamp improvements for ARM's Coresight. Build: - Add 'make -C tools/perf build-test' entries for libopencsd/CORESIGHT=1 and libbpf/LIBBPF_DYNAMIC=1. - Use bison's --file-prefix-map option to avoid storing full paths when using O= in the perf build. Tests: - Improve the 'perf test' entries for libpfm4 and BPF counters. Misc: - Sync msr-index.h, mount.h, kvm headers with the kernel originals. - Add vendor events and metrics for Intel's Icelake Server & Client" * tag 'perf-tools-for-v5.14-2021-07-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (123 commits) perf session: Add missing evlist__delete when deleting a session perf annotate: Allow 's' on source code lines perf dlfilter: Add object_code() to perf_dlfilter_fns perf dlfilter: Add attr() to perf_dlfilter_fns perf dlfilter: Add srcline() to perf_dlfilter_fns perf dlfilter: Add insn() to perf_dlfilter_fns perf dlfilter: Add resolve_address() to perf_dlfilter_fns perf build: Install perf_dlfilter.h perf script: Add option to pass arguments to dlfilters perf script: Add option to list dlfilters perf script: Add dlfilter__filter_event_early() perf script: Add API for filtering via dynamically loaded shared object perf llvm: Return -ENOMEM when asprintf() fails perf cs-etm: Delay decode of non-timeless data until cs_etm__flush_events() tools headers UAPI: Synch KVM's svm.h header with the kernel tools kvm headers arm64: Update KVM headers from the kernel sources tools headers UAPI: Sync linux/kvm.h with the kernel sources tools headers cpufeatures: Sync with the kernel sources tools include UAPI: Update linux/mount.h copy tools arch x86: Sync the msr-index.h copy with the kernel sources ...	2021-07-02 12:24:31 -07:00
Linus Torvalds	dbe69e4337	Networking changes for 5.14. Core: - BPF: - add syscall program type and libbpf support for generating instructions and bindings for in-kernel BPF loaders (BPF loaders for BPF), this is a stepping stone for signed BPF programs - infrastructure to migrate TCP child sockets from one listener to another in the same reuseport group/map to improve flexibility of service hand-off/restart - add broadcast support to XDP redirect - allow bypass of the lockless qdisc to improving performance (for pktgen: +23% with one thread, +44% with 2 threads) - add a simpler version of "DO_ONCE()" which does not require jump labels, intended for slow-path usage - virtio/vsock: introduce SOCK_SEQPACKET support - add getsocketopt to retrieve netns cookie - ip: treat lowest address of a IPv4 subnet as ordinary unicast address allowing reclaiming of precious IPv4 addresses - ipv6: use prandom_u32() for ID generation - ip: add support for more flexible field selection for hashing across multi-path routes (w/ offload to mlxsw) - icmp: add support for extended RFC 8335 PROBE (ping) - seg6: add support for SRv6 End.DT46 behavior - mptcp: - DSS checksum support (RFC 8684) to detect middlebox meddling - support Connection-time 'C' flag - time stamping support - sctp: packetization Layer Path MTU Discovery (RFC 8899) - xfrm: speed up state addition with seq set - WiFi: - hidden AP discovery on 6 GHz and other HE 6 GHz improvements - aggregation handling improvements for some drivers - minstrel improvements for no-ack frames - deferred rate control for TXQs to improve reaction times - switch from round robin to virtual time-based airtime scheduler - add trace points: - tcp checksum errors - openvswitch - action execution, upcalls - socket errors via sk_error_report Device APIs: - devlink: add rate API for hierarchical control of max egress rate of virtual devices (VFs, SFs etc.) - don't require RCU read lock to be held around BPF hooks in NAPI context - page_pool: generic buffer recycling New hardware/drivers: - mobile: - iosm: PCIe Driver for Intel M.2 Modem - support for Qualcomm MSM8998 (ipa) - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU) - NXP SJA1110 Automotive Ethernet 10-port switch - Qualcomm QCA8327 switch support (qca8k) - Mikrotik 10/25G NIC (atl1c) Driver changes: - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP (our first foray into MAC/PHY description via ACPI) - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx - Mellanox/Nvidia NIC (mlx5) - NIC VF offload of L2 bridging - support IRQ distribution to Sub-functions - Marvell (prestera): - add flower and match all - devlink trap - link aggregation - Netronome (nfp): connection tracking offload - Intel 1GE (igc): add AF_XDP support - Marvell DPU (octeontx2): ingress ratelimit offload - Google vNIC (gve): new ring/descriptor format support - Qualcomm mobile (rmnet & ipa): inline checksum offload support - MediaTek WiFi (mt76) - mt7915 MSI support - mt7915 Tx status reporting - mt7915 thermal sensors support - mt7921 decapsulation offload - mt7921 enable runtime pm and deep sleep - Realtek WiFi (rtw88) - beacon filter support - Tx antenna path diversity support - firmware crash information via devcoredump - Qualcomm 60GHz WiFi (wcn36xx) - Wake-on-WLAN support with magic packets and GTK rekeying - Micrel PHY (ksz886x/ksz8081): add cable test support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDb+fUACgkQMUZtbf5S Irs2Jg//aqN0Q8CgIvYCVhPxQw1tY7pTAbgyqgBZ01vwjyvtIOgJiWzSfFEU84mX M8fcpFX5eTKrOyJ9S6UFfQ/JG114n3hjAxFFT4Hxk2gC1Tg0vHuFQTDHcUl28bUE mTm61e1YpdorILnv2k5JVQ/wu0vs5QKDrjcYcrcPnh+j93wvnPOgAfDBV95nZzjS OTt4q2fR8GzLcSYWWsclMbDNkzyTG50RW/0Yd6aGjr5QGvXfrMeXfUJNz533PMf/ w5lNyjRKv+x9mdTZJzU0+msNUrZgUdRz7W8Ey8lD3hJZRE+D6/uU7FtsE8Mi3+uc HWxeZUyzA3YF1MfVl/eesbxyPT7S/OkLzk4O5B35FbqP0YltaP+bOjq1/nM3ce1/ io9Dx9pIl/2JANUgRCAtLi8Z2dkvRoqTaBxZ/nPudCCljFwDwl6joTMJ7Ow22i5Y 5aIkcXFmZq4LbJDiHvbTlqT7yiuaEvu2UK/23bSIg/K3nF4eAmkY9Y1EgiMf60OF 78Ttw0wk2tUegwaS5MZnCniKBKDyl9gM2F6rbZ/IxQRR2LTXFc1B6gC+ynUxgXfh Ub8O++6qGYGYZ0XvQH4pzco79p3qQWBTK5beIp2eu6BOAjBVIXq4AibUfoQLACsu hX7jMPYd0kc3WFgUnKgQP8EnjFSwbf4XiaE7fIXvWBY8hzCw2h4= =LvtX -----END PGP SIGNATURE----- Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - BPF: - add syscall program type and libbpf support for generating instructions and bindings for in-kernel BPF loaders (BPF loaders for BPF), this is a stepping stone for signed BPF programs - infrastructure to migrate TCP child sockets from one listener to another in the same reuseport group/map to improve flexibility of service hand-off/restart - add broadcast support to XDP redirect - allow bypass of the lockless qdisc to improving performance (for pktgen: +23% with one thread, +44% with 2 threads) - add a simpler version of "DO_ONCE()" which does not require jump labels, intended for slow-path usage - virtio/vsock: introduce SOCK_SEQPACKET support - add getsocketopt to retrieve netns cookie - ip: treat lowest address of a IPv4 subnet as ordinary unicast address allowing reclaiming of precious IPv4 addresses - ipv6: use prandom_u32() for ID generation - ip: add support for more flexible field selection for hashing across multi-path routes (w/ offload to mlxsw) - icmp: add support for extended RFC 8335 PROBE (ping) - seg6: add support for SRv6 End.DT46 behavior - mptcp: - DSS checksum support (RFC 8684) to detect middlebox meddling - support Connection-time 'C' flag - time stamping support - sctp: packetization Layer Path MTU Discovery (RFC 8899) - xfrm: speed up state addition with seq set - WiFi: - hidden AP discovery on 6 GHz and other HE 6 GHz improvements - aggregation handling improvements for some drivers - minstrel improvements for no-ack frames - deferred rate control for TXQs to improve reaction times - switch from round robin to virtual time-based airtime scheduler - add trace points: - tcp checksum errors - openvswitch - action execution, upcalls - socket errors via sk_error_report Device APIs: - devlink: add rate API for hierarchical control of max egress rate of virtual devices (VFs, SFs etc.) - don't require RCU read lock to be held around BPF hooks in NAPI context - page_pool: generic buffer recycling New hardware/drivers: - mobile: - iosm: PCIe Driver for Intel M.2 Modem - support for Qualcomm MSM8998 (ipa) - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU) - NXP SJA1110 Automotive Ethernet 10-port switch - Qualcomm QCA8327 switch support (qca8k) - Mikrotik 10/25G NIC (atl1c) Driver changes: - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP (our first foray into MAC/PHY description via ACPI) - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx - Mellanox/Nvidia NIC (mlx5) - NIC VF offload of L2 bridging - support IRQ distribution to Sub-functions - Marvell (prestera): - add flower and match all - devlink trap - link aggregation - Netronome (nfp): connection tracking offload - Intel 1GE (igc): add AF_XDP support - Marvell DPU (octeontx2): ingress ratelimit offload - Google vNIC (gve): new ring/descriptor format support - Qualcomm mobile (rmnet & ipa): inline checksum offload support - MediaTek WiFi (mt76) - mt7915 MSI support - mt7915 Tx status reporting - mt7915 thermal sensors support - mt7921 decapsulation offload - mt7921 enable runtime pm and deep sleep - Realtek WiFi (rtw88) - beacon filter support - Tx antenna path diversity support - firmware crash information via devcoredump - Qualcomm WiFi (wcn36xx) - Wake-on-WLAN support with magic packets and GTK rekeying - Micrel PHY (ksz886x/ksz8081): add cable test support" * tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits) tcp: change ICSK_CA_PRIV_SIZE definition tcp_yeah: check struct yeah size at compile time gve: DQO: Fix off by one in gve_rx_dqo() stmmac: intel: set PCI_D3hot in suspend stmmac: intel: Enable PHY WOL option in EHL net: stmmac: option to enable PHY WOL with PMT enabled net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del} net: use netdev_info in ndo_dflt_fdb_{add,del} ptp: Set lookup cookie when creating a PTP PPS source. net: sock: add trace for socket errors net: sock: introduce sk_error_report net: dsa: replay the local bridge FDB entries pointing to the bridge dev too net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev net: dsa: include fdb entries pointing to bridge in the host fdb list net: dsa: include bridge addresses which are local in the host fdb list net: dsa: sync static FDB entries on foreign interfaces to hardware net: dsa: install the host MDB and FDB entries in the master's RX filter net: dsa: reference count the FDB addresses at the cross-chip notifier level net: dsa: introduce a separate cross-chip notifier type for host FDBs net: dsa: reference count the MDB entries at the cross-chip notifier level ...	2021-06-30 15:51:09 -07:00
Alexey Bayduraev	f20510d552	tools lib: Adopt bitmap_intersects() operation from the kernel sources Adopt bitmap_intersects() routine that tests whether bitmaps bitmap1 and bitmap2 intersects. This routine will be used during thread masks initialization. Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Link: http://lore.kernel.org/lkml/f75aa738d8ff8f9cffd7532d671f3ef3deb97a7c.1625065643.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-06-30 15:28:00 -03:00
Linus Torvalds	36824f198c	ARM: - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes PPC: - Support for the H_RPT_INVALIDATE hypercall - Conversion of Book3S entry/exit to C - Bug fixes S390: - new HW facilities for guests - make inline assembly more robust with KASAN and co x86: - Allow userspace to handle emulation errors (unknown instructions) - Lazy allocation of the rmap (host physical -> guest physical address) - Support for virtualizing TSC scaling on VMX machines - Optimizations to avoid shattering huge pages at the beginning of live migration - Support for initializing the PDPTRs without loading them from memory - Many TLB flushing cleanups - Refuse to load if two-stage paging is available but NX is not (this has been a requirement in practice for over a year) - A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from CR0/CR4/EFER, using the MMU mode everywhere once it is computed from the CPU registers - Use PM notifier to notify the guest about host suspend or hibernate - Support for passing arguments to Hyper-V hypercalls using XMM registers - Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap on AMD processors - Hide Hyper-V hypercalls that are not included in the guest CPUID - Fixes for live migration of virtual machines that use the Hyper-V "enlightened VMCS" optimization of nested virtualization - Bugfixes (not many) Generic: - Support for retrieving statistics without debugfs - Cleanups for the KVM selftests API -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDV9UYUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroOIRgf/XX8fKLh24RnTOs2ldIu2AfRGVrT4 QMrr8MxhmtukBAszk2xKvBt8/6gkUjdaIC3xqEnVjxaDaUvZaEtP7CQlF5JV45rn iv1zyxUKucXrnIOr+gCioIT7qBlh207zV35ArKioP9Y83cWx9uAs22pfr6g+7RxO h8bJZlJbSG6IGr3voANCIb9UyjU1V/l8iEHqRwhmr/A5rARPfD7g8lfMEQeGkzX6 +/UydX2fumB3tl8e2iMQj6vLVdSOsCkehvpHK+Z33EpkKhan7GwZ2sZ05WmXV/nY QLAYfD10KegoNWl5Ay4GTp4hEAIYVrRJCLC+wnLdc0U8udbfCuTC31LK4w== =NcRh -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "This covers all architectures (except MIPS) so I don't expect any other feature pull requests this merge window. ARM: - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes PPC: - Support for the H_RPT_INVALIDATE hypercall - Conversion of Book3S entry/exit to C - Bug fixes S390: - new HW facilities for guests - make inline assembly more robust with KASAN and co x86: - Allow userspace to handle emulation errors (unknown instructions) - Lazy allocation of the rmap (host physical -> guest physical address) - Support for virtualizing TSC scaling on VMX machines - Optimizations to avoid shattering huge pages at the beginning of live migration - Support for initializing the PDPTRs without loading them from memory - Many TLB flushing cleanups - Refuse to load if two-stage paging is available but NX is not (this has been a requirement in practice for over a year) - A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from CR0/CR4/EFER, using the MMU mode everywhere once it is computed from the CPU registers - Use PM notifier to notify the guest about host suspend or hibernate - Support for passing arguments to Hyper-V hypercalls using XMM registers - Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap on AMD processors - Hide Hyper-V hypercalls that are not included in the guest CPUID - Fixes for live migration of virtual machines that use the Hyper-V "enlightened VMCS" optimization of nested virtualization - Bugfixes (not many) Generic: - Support for retrieving statistics without debugfs - Cleanups for the KVM selftests API" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (314 commits) KVM: x86: rename apic_access_page_done to apic_access_memslot_enabled kvm: x86: disable the narrow guest module parameter on unload selftests: kvm: Allows userspace to handle emulation errors. kvm: x86: Allow userspace to handle emulation errors KVM: x86/mmu: Let guest use GBPAGES if supported in hardware and TDP is on KVM: x86/mmu: Get CR4.SMEP from MMU, not vCPU, in shadow page fault KVM: x86/mmu: Get CR0.WP from MMU, not vCPU, in shadow page fault KVM: x86/mmu: Drop redundant rsvd bits reset for nested NPT KVM: x86/mmu: Optimize and clean up so called "last nonleaf level" logic KVM: x86: Enhance comments for MMU roles and nested transition trickiness KVM: x86/mmu: WARN on any reserved SPTE value when making a valid SPTE KVM: x86/mmu: Add helpers to do full reserved SPTE checks w/ generic MMU KVM: x86/mmu: Use MMU's role to determine PTTYPE KVM: x86/mmu: Collapse 32-bit PAE and 64-bit statements for helpers KVM: x86/mmu: Add a helper to calculate root from role_regs KVM: x86/mmu: Add helper to update paging metadata KVM: x86/mmu: Don't update nested guest's paging bitmasks if CR0.PG=0 KVM: x86/mmu: Consolidate reset_rsvds_bits_mask() calls KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper KVM: x86/mmu: Get nested MMU's root level from the MMU's role ...	2021-06-28 15:40:51 -07:00
David S. Miller	e1289cfb63	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-28 The following pull-request contains BPF updates for your net-next tree. We've added 37 non-merge commits during the last 12 day(s) which contain a total of 56 files changed, 394 insertions(+), 380 deletions(-). The main changes are: 1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney. 2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski. 3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev. 4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria. 5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards. 6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa. 7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi. 8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim. 9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young. 10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer. 11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski. 12) Fix up bpfilter startup log-level to info level, from Gary Lin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-28 15:28:03 -07:00
Paolo Bonzini	b8917b4ae4	KVM/arm64 updates for v5.14. - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmDV2bEPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDEr8P/ivwROx5NwGcHGmU5RfUCT3aFqhtVHHwD/lu jPcgoO61kz9TelOu6QRaVuK+mVHxcq3iP4R8nPq/QCkUlEXTmK2xkyhXhGXSYpH4 6jM8+BbC3eG7iAxx6H0UM4JTl4Riwat6ZZtXpWEWs9TKqOHOQYFpMkxSttwVZ1CZ SjbtFvXLEdzKn6PzUWnKdBNMV/mHsdAtohZit9oJOc4ttc8072XxETQ4TFQ+MSvA j9zY9QPmWzgcZnotqRRu9sbTGO2vxtXuUtY3sjdD8+C9OgSe9qvpnNjymcmfwaMu 1fBkfh65oaO4ItJBdGOUOoEcFqwN5imPiI7CB/O+ZYkO9sBCuTUPSQwPkyiwXb9r bUkTaQw2nZiNWsqR1x07fQ2sGYbMp5mnmgmqiV4MUWkLmFp9LZATCWYTTn24cBNS 6SjVP6/8S0r3EhLnYjH0Pn1we5PooU1EF6RlCAd3ewYoo+9fPnwjNYwIWH5i5wB7 +tnei44NACAw9cfbos+BYQQ/dY15OSFzLzIMomlabB7OpXOdDg3H6tJnPbFwWwXb 9nF8XdHqxeDVVVrDCAx1BSodSXm9xqgnQM2RDGTUnpVcAfqAr3MXX6VsyKQDzj8T QXF9qOVCBAABv6BXAvSQ6mvMJZDUVbUPEPhf7kXzF46JsRd6A7wWoU/OnMGHQ/w7 wjvH8HVy =fWBV -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for v5.14. - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes	2021-06-25 11:24:24 -04:00
Sean Christopherson	167f8a5cae	KVM: x86/mmu: Rename "nxe" role bit to "efer_nx" for macro shenanigans Rename "nxe" to "efer_nx" so that future macro magic can use the pattern <reg>_<bit> for all CR0, CR4, and EFER bits that included in the role. Using "efer_nx" also makes it clear that the role bit reflects EFER.NX, not the NX bit in the corresponding PTE. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622175739.3610207-25-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-24 18:00:41 -04:00
Kumar Kartikeya Dwivedi	ee62a5c6bb	libbpf: Switch to void * casting in netlink helpers Netlink helpers I added in `8bbb77b7c7` ("libbpf: Add various netlink helpers") used char * casts everywhere, and there were a few more that existed from before. Convert all of them to void * cast, as it is treated equivalently by clang/gcc for the purposes of pointer arithmetic and to follow the convention elsewhere in the kernel/libbpf. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210619041454.417577-2-memxor@gmail.com	2021-06-22 17:04:02 +02:00
Kumar Kartikeya Dwivedi	0ae64fb6b6	libbpf: Add request buffer type for netlink messages Coverity complains about OOB writes to nlmsghdr. There is no OOB as we write to the trailing buffer, but static analyzers and compilers may rightfully be confused as the nlmsghdr pointer has subobject provenance (and hence subobject bounds). Fix this by using an explicit request structure containing the nlmsghdr, struct tcmsg/ifinfomsg, and attribute buffer. Also switch nh_tail (renamed to req_tail) to cast req * to char * so that it can be understood as arithmetic on pointer to the representation array (hence having same bound as request structure), which should further appease analyzers. As a bonus, callers don't have to pass sizeof(req) all the time now, as size is implicitly obtained using the pointer. While at it, also reduce the size of attribute buffer to 128 bytes (132 for ifinfomsg using functions due to the padding). Summary of problem: Even though C standard allows interconvertibility of pointer to first member and pointer to struct, for the purposes of alias analysis it would still consider the first as having pointer value "pointer to T" where T is type of first member hence having subobject bounds, allowing analyzers within reason to complain when object is accessed beyond the size of pointed to object. The only exception to this rule may be when a char * is formed to a member subobject. It is not possible for the compiler to be able to tell the intent of the programmer that it is a pointer to member object or the underlying representation array of the containing object, so such diagnosis is suppressed. Fixes: `715c5ce454` ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210619041454.417577-1-memxor@gmail.com	2021-06-22 17:03:52 +02:00
Jonathan Edwards	5c10a3dbe9	libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading eBPF has been backported for RHEL 7 w/ kernel 3.10-940+ [0]. However only the following program types are supported [1]: BPF_PROG_TYPE_KPROBE BPF_PROG_TYPE_TRACEPOINT BPF_PROG_TYPE_PERF_EVENT For libbpf this causes an EINVAL return during the bpf_object__probe_loading call which only checks to see if programs of type BPF_PROG_TYPE_SOCKET_FILTER can load. The following will try BPF_PROG_TYPE_TRACEPOINT as a fallback attempt before erroring out. BPF_PROG_TYPE_KPROBE was not a good candidate because on some kernels it requires knowledge of the LINUX_VERSION_CODE. [0] https://www.redhat.com/en/blog/introduction-ebpf-red-hat-enterprise-linux-7 [1] https://access.redhat.com/articles/3550581 Signed-off-by: Jonathan Edwards <jonathan.edwards@165gc.onmicrosoft.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210619151007.GA6963@165gc.onmicrosoft.com	2021-06-21 17:21:07 +02:00
Jakub Kicinski	adc2e56ebe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Trivial conflicts in net/can/isotp.c and tools/testing/selftests/net/mptcp/mptcp_connect.sh scaled_ppm_to_ppb() was moved from drivers/ptp/ptp_clock.c to include/linux/ptp_clock_kernel.h in -next so re-apply the fix there. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-06-18 19:47:02 -07:00
Grant Seltzer	f42cfb469f	bpf: Add documentation for libbpf including API autogen This patch is meant to start the initiative to document libbpf. It includes .rst files which are text documentation describing building, API naming convention, as well as an index to generated API documentation. In this approach the generated API documentation is enabled by the kernels existing kernel documentation system which uses sphinx. The resulting docs would then be synced to kernel.org/doc You can test this by running `make htmldocs` and serving the html in Documentation/output. Since libbpf does not yet have comments in kernel doc format, see kernel.org/doc/html/latest/doc-guide/kernel-doc.html for an example so you can test this. The advantage of this approach is to use the existing sphinx infrastructure that the kernel has, and have libbpf docs in the same place as everything else. The current plan is to have the libbpf mirror sync the generated docs and version them based on the libbpf releases which are cut on github. This patch includes the addition of libbpf_api.rst which pulls comment documentation from header files in libbpf under tools/lib/bpf/. The comment docs would be of the standard kernel doc format. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210618140459.9887-2-grantseltzer@gmail.com	2021-06-18 23:06:06 +02:00
David S. Miller	a52171ae7b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-17 The following pull-request contains BPF updates for your net-next tree. We've added 50 non-merge commits during the last 25 day(s) which contain a total of 148 files changed, 4779 insertions(+), 1248 deletions(-). The main changes are: 1) BPF infrastructure to migrate TCP child sockets from a listener to another in the same reuseport group/map, from Kuniyuki Iwashima. 2) Add a provably sound, faster and more precise algorithm for tnum_mul() as noted in https://arxiv.org/abs/2105.05398, from Harishankar Vishwanathan. 3) Streamline error reporting changes in libbpf as planned out in the 'libbpf: the road to v1.0' effort, from Andrii Nakryiko. 4) Add broadcast support to xdp_redirect_map(), from Hangbin Liu. 5) Extends bpf_map_lookup_and_delete_elem() functionality to 4 more map types, that is, {LRU_,PERCPU_,LRU_PERCPU_,}HASH, from Denis Salopek. 6) Support new LLVM relocations in libbpf to make them more linker friendly, also add a doc to describe the BPF backend relocations, from Yonghong Song. 7) Silence long standing KUBSAN complaints on register-based shifts in interpreter, from Daniel Borkmann and Eric Biggers. 8) Add dummy PT_REGS macros in libbpf to fail BPF program compilation when target arch cannot be determined, from Lorenz Bauer. 9) Extend AF_XDP to support large umems with 1M+ pages, from Magnus Karlsson. 10) Fix two minor libbpf tc BPF API issues, from Kumar Kartikeya Dwivedi. 11) Move libbpf BPF_SEQ_PRINTF/BPF_SNPRINTF macros that can be used by BPF programs to bpf_helpers.h header, from Florent Revest. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-17 11:54:56 -07:00
Lorenz Bauer	4a638d581a	libbpf: Fail compilation if target arch is missing bpf2go is the Go equivalent of libbpf skeleton. The convention is that the compiled BPF is checked into the repository to facilitate distributing BPF as part of Go packages. To make this portable, bpf2go by default generates both bpfel and bpfeb variants of the C. Using bpf_tracing.h is inherently non-portable since the fields of struct pt_regs differ between platforms, so CO-RE can't help us here. The only way of working around this is to compile for each target platform independently. bpf2go can't do this by default since there are too many platforms. Define the various PT_... macros when no target can be determined and turn them into compilation failures. This works because bpf2go always compiles for bpf targets, so the compiler fallback doesn't kick in. Conditionally define __BPF_MISSING_TARGET so that we can inject a more appropriate error message at build time. The user can then choose which platform to target explicitly. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210616083635.11434-1-lmb@cloudflare.com	2021-06-16 20:15:30 -07:00
Kuniyuki Iwashima	50501271e7	libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT. This commit introduces a new section (sk_reuseport/migrate) and sets expected_attach_type to two each section in BPF_PROG_TYPE_SK_REUSEPORT program. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-11-kuniyu@amazon.co.jp	2021-06-15 18:01:06 +02:00
Kumar Kartikeya Dwivedi	bbf29d3a2e	libbpf: Set NLM_F_EXCL when creating qdisc This got lost during the refactoring across versions. We always use NLM_F_EXCL when creating some TC object, so reflect what the function says and set the flag. Fixes: `715c5ce454` ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210612023502.1283837-3-memxor@gmail.com	2021-06-15 14:00:30 +02:00
Kumar Kartikeya Dwivedi	4e164f8716	libbpf: Remove unneeded check for flags during tc detach Coverity complained about this being unreachable code. It is right because we already enforce flags to be unset, so a check validating the flag value is redundant. Fixes: `715c5ce454` ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210612023502.1283837-2-memxor@gmail.com	2021-06-15 13:58:56 +02:00
Wang Hai	3b3af91cb6	libbpf: Simplify the return expression of bpf_object__init_maps function There is no need for special treatment of the 'ret == 0' case. This patch simplifies the return expression. Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210609115651.3392580-1-wanghai38@huawei.com	2021-06-11 15:30:52 -07:00
Michal Suchanek	edc0571c5f	libbpf: Fix pr_warn type warnings on 32bit The printed value is ptrdiff_t and is formatted wiht %ld. This works on 64bit but produces a warning on 32bit. Fix the format specifier to %td. Fixes: `6723474373` ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210604112448.32297-1-msuchanek@suse.de	2021-06-08 22:02:10 +02:00
Kev Jackson	11fc79fc9f	libbpf: Fixes incorrect rx_ring_setup_done When calling xsk_socket__create_shared(), the logic at line 1097 marks a boolean flag true within the xsk_umem structure to track setup progress in order to support multiple calls to the function. However, instead of marking umem->tx_ring_setup_done, the code incorrectly sets umem->rx_ring_setup_done. This leads to improper behaviour when creating and destroying xsk and umem structures. Multiple calls to this function is documented as supported. Fixes: `ca7a83e248` ("libbpf: Only create rx and tx XDP rings when necessary") Signed-off-by: Kev Jackson <foamdino@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/YL4aU4f3Aaik7CN0@linux-dev	2021-06-07 17:44:03 -07:00
Andrii Nakryiko	7d8a819dd3	libbpf: Install skel_internal.h header used from light skeletons Light skeleton code assumes skel_internal.h header to be installed system-wide by libbpf package. Make sure it is actually installed. Fixes: `6723474373` ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210603004026.2698513-4-andrii@kernel.org	2021-06-03 15:50:11 +02:00
Andrii Nakryiko	232c9e8bd5	libbpf: Refactor header installation portions of Makefile As we gradually get more headers that have to be installed, it's quite annoying to copy/paste long $(call) commands. So extract that logic and do a simple $(foreach) over the list of headers. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210603004026.2698513-3-andrii@kernel.org	2021-06-03 15:50:11 +02:00
Andrii Nakryiko	16cac00606	libbpf: Move few APIs from 0.4 to 0.5 version Official libbpf 0.4 release doesn't include three APIs that were tentatively put into 0.4 section. Fix libbpf.map and move these three APIs: - bpf_map__initial_value; - bpf_map_lookup_and_delete_elem_flags; - bpf_object__gen_loader. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210603004026.2698513-2-andrii@kernel.org	2021-06-03 15:50:05 +02:00
Jakub Kicinski	5ada57a9a6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net cdc-wdm: s/kill_urbs/poison_urbs/ to fix build Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-05-27 09:55:10 -07:00
Florent Revest	d6a6a55518	libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h These macros are convenient wrappers around the bpf_seq_printf and bpf_snprintf helpers. They are currently provided by bpf_tracing.h which targets low level tracing primitives. bpf_helpers.h is a better fit. The __bpf_narg and __bpf_apply are needed in both files and provided twice. __bpf_empty isn't used anywhere and is removed from bpf_tracing.h Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210526164643.2881368-1-revest@chromium.org	2021-05-26 10:45:41 -07:00
Andrii Nakryiko	e9fc3ce99b	libbpf: Streamline error reporting for high-level APIs Implement changes to error reporting for high-level libbpf APIs to make them less surprising and less error-prone to users: - in all the cases when error happens, errno is set to an appropriate error value; - in libbpf 1.0 mode, all pointer-returning APIs return NULL on error and error code is communicated through errno; this applies both to APIs that already returned NULL before (so now they communicate more detailed error codes), as well as for many APIs that used ERR_PTR() macro and encoded error numbers as fake pointers. - in legacy (default) mode, those APIs that were returning ERR_PTR(err), continue doing so, but still set errno. With these changes, errno can be always used to extract actual error, regardless of legacy or libbpf 1.0 modes. This is utilized internally in libbpf in places where libbpf uses it's own high-level APIs. libbpf_get_error() is adapted to handle both cases completely transparently to end-users (and is used by libbpf consistently as well). More context, justification, and discussion can be found in "Libbpf: the road to v1.0" document ([0]). [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210525035935.1461796-5-andrii@kernel.org	2021-05-25 17:32:35 -07:00
Andrii Nakryiko	f12b654327	libbpf: Streamline error reporting for low-level APIs Ensure that low-level APIs behave uniformly across the libbpf as follows: - in case of an error, errno is always set to the correct error code; - when libbpf 1.0 mode is enabled with LIBBPF_STRICT_DIRECT_ERRS option to libbpf_set_strict_mode(), return -Exxx error value directly, instead of -1; - by default, until libbpf 1.0 is released, keep returning -1 directly. More context, justification, and discussion can be found in "Libbpf: the road to v1.0" document ([0]). [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210525035935.1461796-4-andrii@kernel.org	2021-05-25 17:32:35 -07:00
Andrii Nakryiko	5981881d21	libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0 behaviors Add libbpf_set_strict_mode() API that allows application to simulate libbpf 1.0 breaking changes before libbpf 1.0 is released. This will help users migrate gradually and with confidence. For now only ALL or NONE options are available, subsequent patches will add more flags. This patch is preliminary for selftests/bpf changes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210525035935.1461796-2-andrii@kernel.org	2021-05-25 17:32:34 -07:00
Yonghong Song	9f0c317f6a	libbpf: Add support for new llvm bpf relocations LLVM patch https://reviews.llvm.org/D102712 narrowed the scope of existing R_BPF_64_64 and R_BPF_64_32 relocations, and added three new relocations, R_BPF_64_ABS64, R_BPF_64_ABS32 and R_BPF_64_NODYLD32. The main motivation is to make relocations linker friendly. This change, unfortunately, breaks libbpf build, and we will see errors like below: libbpf: ELF relo #0 in section #6 has unexpected type 2 in /home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o Error: failed to link '/home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o': Unknown error -22 (-22) The new relocation R_BPF_64_ABS64 is generated and libbpf linker sanity check doesn't understand it. Relocation section '.rel.struct_ops' at offset 0x1410 contains 1 entries: Offset Info Type Symbol's Value Symbol's Name 0000000000000018 0000000700000002 R_BPF_64_ABS64 0000000000000000 nogpltcp_init Look at the selftests/bpf/bpf_tcp_nogpl.c, void BPF_STRUCT_OPS(nogpltcp_init, struct sock sk) { } SEC(".struct_ops") struct tcp_congestion_ops bpf_nogpltcp = { .init = (void )nogpltcp_init, .name = "bpf_nogpltcp", }; The new llvm relocation scheme categorizes 'nogpltcp_init' reference as R_BPF_64_ABS64 instead of R_BPF_64_64 which is used to specify ld_imm64 relocation in the new scheme. Let us fix the linker sanity checking by including R_BPF_64_ABS64 and R_BPF_64_ABS32. There is no need to check R_BPF_64_NODYLD32 which is used for .BTF and .BTF.ext. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210522162341.3687617-1-yhs@fb.com	2021-05-24 21:03:05 -07:00
Denis Salopek	d59b9f2d1b	bpf: Extend libbpf with bpf_map_lookup_and_delete_elem_flags Add bpf_map_lookup_and_delete_elem_flags() libbpf API in order to use the BPF_F_LOCK flag with the map_lookup_and_delete_elem() function. Signed-off-by: Denis Salopek <denis.salopek@sartura.hr> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/15b05dafe46c7e0750d110f233977372029d1f62.1620763117.git.denis.salopek@sartura.hr	2021-05-24 13:30:51 -07:00
Stanislav Fomichev	f9bceaa59c	libbpf: Skip bpf_object__probe_loading for light skeleton I'm getting the following error when running 'gen skeleton -L' as regular user: libbpf: Error in bpf_object__probe_loading():Operation not permitted(1). Couldn't load trivial BPF program. Make sure your kernel supports BPF (CONFIG_BPF_SYSCALL=y) and/or that RLIMIT_MEMLOCK is set to big enough value. Fixes: `6723474373` ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210521030653.2626513-1-sdf@google.com	2021-05-24 12:22:09 -07:00
Alexei Starovoitov	5d67f34959	bpf: Add cmd alias BPF_PROG_RUN Add BPF_PROG_RUN command as an alias to BPF_RPOG_TEST_RUN to better indicate the full range of use cases done by the command. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210519014032.20908-1-alexei.starovoitov@gmail.com	2021-05-19 15:35:12 +02:00
Alexei Starovoitov	7723256bf2	libbpf: Introduce bpf_map__initial_value(). Introduce bpf_map__initial_value() to read initial contents of mmaped data/rodata/bss maps. Note that bpf_map__set_initial_value() doesn't allow modifying kconfig map while bpf_map__initial_value() allows reading its values. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-17-alexei.starovoitov@gmail.com	2021-05-19 00:40:56 +02:00
Alexei Starovoitov	30f51aedab	libbpf: Cleanup temp FDs when intermediate sys_bpf fails. Fix loader program to close temporary FDs when intermediate sys_bpf command fails. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-16-alexei.starovoitov@gmail.com	2021-05-19 00:40:44 +02:00
Alexei Starovoitov	6723474373	libbpf: Generate loader program out of BPF ELF file. The BPF program loading process performed by libbpf is quite complex and consists of the following steps: "open" phase: - parse elf file and remember relocations, sections - collect externs and ksyms including their btf_ids in prog's BTF - patch BTF datasec (since llvm couldn't do it) - init maps (old style map_def, BTF based, global data map, kconfig map) - collect relocations against progs and maps "load" phase: - probe kernel features - load vmlinux BTF - resolve externs (kconfig and ksym) - load program BTF - init struct_ops - create maps - apply CO-RE relocations - patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID - reposition subprograms and adjust call insns - sanitize and load progs During this process libbpf does sys_bpf() calls to load BTF, create maps, populate maps and finally load programs. Instead of actually doing the syscalls generate a trace of what libbpf would have done and represent it as the "loader program". The "loader program" consists of single map with: - union bpf_attr(s) - BTF bytes - map value bytes - insns bytes and single bpf program that passes bpf_attr(s) and data into bpf_sys_bpf() helper. Executing such "loader program" via bpf_prog_test_run() command will replay the sequence of syscalls that libbpf would have done which will result the same maps created and programs loaded as specified in the elf file. The "loader program" removes libelf and majority of libbpf dependency from program loading process. kconfig, typeless ksym, struct_ops and CO-RE are not supported yet. The order of relocate_data and relocate_calls had to change, so that bpf_gen__prog_load() can see all relocations for a given program with correct insn_idx-es. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-15-alexei.starovoitov@gmail.com	2021-05-19 00:39:40 +02:00
Alexei Starovoitov	e2fa0156a4	libbpf: Preliminary support for fd_idx Prep libbpf to use FD_IDX kernel feature when generating loader program. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-14-alexei.starovoitov@gmail.com	2021-05-19 00:33:41 +02:00
Alexei Starovoitov	9ca1f56aba	libbpf: Add bpf_object pointer to kernel_supports(). Add a pointer to 'struct bpf_object' to kernel_supports() helper. It will be used in the next patch. No functional changes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-13-alexei.starovoitov@gmail.com	2021-05-19 00:33:40 +02:00
Alexei Starovoitov	b126882672	libbpf: Change the order of data and text relocations. In order to be able to generate loader program in the later patches change the order of data and text relocations. Also improve the test to include data relos. If the kernel supports "FD array" the map_fd relocations can be processed before text relos since generated loader program won't need to manually patch ld_imm64 insns with map_fd. But ksym and kfunc relocations can only be processed after all calls are relocated, since loader program will consist of a sequence of calls to bpf_btf_find_by_name_kind() followed by patching of btf_id and btf_obj_fd into corresponding ld_imm64 insns. The locations of those ld_imm64 insns are specified in relocations. Hence process all data relocations (maps, ksym, kfunc) together after call relos. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-12-alexei.starovoitov@gmail.com	2021-05-19 00:33:40 +02:00
Alexei Starovoitov	5452fc9a17	libbpf: Support for syscall program type Trivial support for syscall program type. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-5-alexei.starovoitov@gmail.com	2021-05-19 00:33:40 +02:00
Kumar Kartikeya Dwivedi	715c5ce454	libbpf: Add low level TC-BPF management API This adds functions that wrap the netlink API used for adding, manipulating, and removing traffic control filters. The API summary: A bpf_tc_hook represents a location where a TC-BPF filter can be attached. This means that creating a hook leads to creation of the backing qdisc, while destruction either removes all filters attached to a hook, or destroys qdisc if requested explicitly (as discussed below). The TC-BPF API functions operate on this bpf_tc_hook to attach, replace, query, and detach tc filters. All functions return 0 on success, and a negative error code on failure. bpf_tc_hook_create - Create a hook Parameters: @hook - Cannot be NULL, ifindex > 0, attach_point must be set to proper enum constant. Note that parent must be unset when attach_point is one of BPF_TC_INGRESS or BPF_TC_EGRESS. Note that as an exception BPF_TC_INGRESS\|BPF_TC_EGRESS is also a valid value for attach_point. Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM. bpf_tc_hook_destroy - Destroy a hook Parameters: @hook - Cannot be NULL. The behaviour depends on value of attach_point. If BPF_TC_INGRESS, all filters attached to the ingress hook will be detached. If BPF_TC_EGRESS, all filters attached to the egress hook will be detached. If BPF_TC_INGRESS\|BPF_TC_EGRESS, the clsact qdisc will be deleted, also detaching all filters. As before, parent must be unset for these attach_points, and set for BPF_TC_CUSTOM. It is advised that if the qdisc is operated on by many programs, then the program at least check that there are no other existing filters before deleting the clsact qdisc. An example is shown below: DECLARE_LIBBPF_OPTS(bpf_tc_hook, .ifindex = if_nametoindex("lo"), .attach_point = BPF_TC_INGRESS); /* set opts as NULL, as we're not really interested in * getting any info for a particular filter, but just * detecting its presence. / r = bpf_tc_query(&hook, NULL); if (r == -ENOENT) { / no filters / hook.attach_point = BPF_TC_INGRESS\|BPF_TC_EGREESS; return bpf_tc_hook_destroy(&hook); } else { / failed or r == 0, the latter means filters do exist / return r; } Note that there is a small race between checking for no filters and deleting the qdisc. This is currently unavoidable. Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM. bpf_tc_attach - Attach a filter to a hook Parameters: @hook - Cannot be NULL. Represents the hook the filter will be attached to. Requirements for ifindex and attach_point are same as described in bpf_tc_hook_create, but BPF_TC_CUSTOM is also supported. In that case, parent must be set to the handle where the filter will be attached (using BPF_TC_PARENT). E.g. to set parent to 1:16 like in tc command line, the equivalent would be BPF_TC_PARENT(1, 16). @opts - Cannot be NULL. The following opts are optional: handle - The handle of the filter * priority - The priority of the filter Must be >= 0 and <= UINT16_MAX Note that when left unset, they will be auto-allocated by the kernel. The following opts must be set: * prog_fd - The fd of the loaded SCHED_CLS prog The following opts must be unset: * prog_id - The ID of the BPF prog The following opts are optional: * flags - Currently only BPF_TC_F_REPLACE is allowed. It allows replacing an existing filter instead of failing with -EEXIST. The following opts will be filled by bpf_tc_attach on a successful attach operation if they are unset: * handle - The handle of the attached filter * priority - The priority of the attached filter * prog_id - The ID of the attached SCHED_CLS prog This way, the user can know what the auto allocated values for optional opts like handle and priority are for the newly attached filter, if they were unset. Note that some other attributes are set to fixed default values listed below (this holds for all bpf_tc_* APIs): protocol as ETH_P_ALL, direct action mode, chain index of 0, and class ID of 0 (this can be set by writing to the skb->tc_classid field from the BPF program). bpf_tc_detach Parameters: @hook - Cannot be NULL. Represents the hook the filter will be detached from. Requirements are same as described above in bpf_tc_attach. @opts - Cannot be NULL. The following opts must be set: * handle, priority The following opts must be unset: * prog_fd, prog_id, flags bpf_tc_query Parameters: @hook - Cannot be NULL. Represents the hook where the filter lookup will be performed. Requirements are same as described above in bpf_tc_attach(). @opts - Cannot be NULL. The following opts must be set: * handle, priority The following opts must be unset: * prog_fd, prog_id, flags The following fields will be filled by bpf_tc_query upon a successful lookup: * prog_id Some usage examples (using BPF skeleton infrastructure): BPF program (test_tc_bpf.c): #include <linux/bpf.h> #include <bpf/bpf_helpers.h> SEC("classifier") int cls(struct __sk_buff skb) { return 0; } Userspace loader: struct test_tc_bpf skel = NULL; int fd, r; skel = test_tc_bpf__open_and_load(); if (!skel) return -ENOMEM; fd = bpf_program__fd(skel->progs.cls); DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex = if_nametoindex("lo"), .attach_point = BPF_TC_INGRESS); /* Create clsact qdisc / r = bpf_tc_hook_create(&hook); if (r < 0) goto end; DECLARE_LIBBPF_OPTS(bpf_tc_opts, opts, .prog_fd = fd); r = bpf_tc_attach(&hook, &opts); if (r < 0) goto end; / Print the auto allocated handle and priority / printf("Handle=%u", opts.handle); printf("Priority=%u", opts.priority); opts.prog_fd = opts.prog_id = 0; bpf_tc_detach(&hook, &opts); end: test_tc_bpf__destroy(skel); This is equivalent to doing the following using tc command line: # tc qdisc add dev lo clsact # tc filter add dev lo ingress bpf obj foo.o sec classifier da # tc filter del dev lo ingress handle <h> prio <p> bpf ... where the handle and priority can be found using: # tc filter show dev lo ingress Another example replacing a filter (extending prior example): / We can also choose both (or one), let's try replacing an * existing filter. / DECLARE_LIBBPF_OPTS(bpf_tc_opts, replace_opts, .handle = opts.handle, .priority = opts.priority, .prog_fd = fd); r = bpf_tc_attach(&hook, &replace_opts); if (r == -EEXIST) { / Expected, now use BPF_TC_F_REPLACE to replace it / replace_opts.flags = BPF_TC_F_REPLACE; return bpf_tc_attach(&hook, &replace_opts); } else if (r < 0) { return r; } / There must be no existing filter with these * attributes, so cleanup and return an error. / replace_opts.prog_fd = replace_opts.prog_id = 0; bpf_tc_detach(&hook, &replace_opts); return -1; To obtain info of a particular filter: / Find info for filter with handle 1 and priority 50 */ DECLARE_LIBBPF_OPTS(bpf_tc_opts, info_opts, .handle = 1, .priority = 50); r = bpf_tc_query(&hook, &info_opts); if (r == -ENOENT) printf("Filter not found"); else if (r < 0) return r; printf("Prog ID: %u", info_opts.prog_id); return 0; Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> # libbpf API design [ Daniel: also did major patch cleanup ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210512103451.989420-3-memxor@gmail.com	2021-05-17 17:48:45 +02:00
Kumar Kartikeya Dwivedi	8bbb77b7c7	libbpf: Add various netlink helpers This change introduces a few helpers to wrap open coded attribute preparation in netlink.c. It also adds a libbpf_netlink_send_recv() that is useful to wrap send + recv handling in a generic way. Subsequent patch will also use this function for sending and receiving a netlink response. The libbpf_nl_get_link() helper has been removed instead, moving socket creation into the newly named libbpf_netlink_send_recv(). Every nested attribute's closure must happen using the helper nlattr_end_nested(), which sets its length properly. NLA_F_NESTED is enforced using nlattr_begin_nested() helper. Other simple attributes can be added directly. The maxsz parameter corresponds to the size of the request structure which is being filled in, so for instance with req being: struct { struct nlmsghdr nh; struct tcmsg t; char buf[4096]; } req; Then, maxsz should be sizeof(req). This change also converts the open coded attribute preparation with these helpers. Note that the only failure the internal call to nlattr_add() could result in the nested helper would be -EMSGSIZE, hence that is what we return to our caller. The libbpf_netlink_send_recv() call takes care of opening the socket, sending the netlink message, receiving the response, potentially invoking callbacks, and return errors if any, and then finally close the socket. This allows users to avoid identical socket setup code in different places. The only user of libbpf_nl_get_link() has been converted to make use of it. __bpf_set_link_xdp_fd_replace() has also been refactored to use it. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> [ Daniel: major patch cleanup ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210512103451.989420-2-memxor@gmail.com	2021-05-17 17:48:36 +02:00
Andrii Nakryiko	513f485ca5	libbpf: Reject static entry-point BPF programs Detect use of static entry-point BPF programs (those with SEC() markings) and emit error message. This is similar to `c1cccec9c6` ("libbpf: Reject static maps") but for BPF programs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210514195534.1440970-1-andrii@kernel.org	2021-05-14 16:11:09 -07:00
Andrii Nakryiko	c1cccec9c6	libbpf: Reject static maps Static maps never really worked with libbpf, because all such maps were always silently resolved to the very first map. Detect static maps (both legacy and BTF-defined) and report user-friendly error. Tested locally by switching few maps (legacy and BTF-defined) in selftests to static ones and verifying that now libbpf rejects them loudly. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210513233643.194711-2-andrii@kernel.org	2021-05-13 17:23:58 -07:00
David S. Miller	df6f823703	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2021-05-11 The following pull-request contains BPF updates for your net tree. We've added 13 non-merge commits during the last 8 day(s) which contain a total of 21 files changed, 817 insertions(+), 382 deletions(-). The main changes are: 1) Fix multiple ringbuf bugs in particular to prevent writable mmap of read-only pages, from Andrii Nakryiko & Thadeu Lima de Souza Cascardo. 2) Fix verifier alu32 known-const subregister bound tracking for bitwise operations and/or/xor, from Daniel Borkmann. 3) Reject trampoline attachment for functions with variable arguments, and also add a deny list of other forbidden functions, from Jiri Olsa. 4) Fix nested bpf_bprintf_prepare() calls used by various helpers by switching to per-CPU buffers, from Florent Revest. 5) Fix kernel compilation with BTF debug info on ppc64 due to pahole missing TCP-CC functions like cubictcp_init, from Martin KaFai Lau. 6) Add a kconfig entry to provide an option to disallow unprivileged BPF by default, from Daniel Borkmann. 7) Fix libbpf compilation for older libelf when GELF_ST_VISIBILITY() macro is not available, from Arnaldo Carvalho de Melo. 8) Migrate test_tc_redirect to test_progs framework as prep work for upcoming skb_change_head() fix & selftest, from Jussi Maki. 9) Fix a libbpf segfault in add_dummy_ksym_var() if BTF is not present, from Ian Rogers. 10) Fix tx_only micro-benchmark in xdpsock BPF sample with proper frame size, from Magnus Karlsson. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-05-11 16:05:56 -07:00
Andrii Nakryiko	e5670fa029	libbpf: Treat STV_INTERNAL same as STV_HIDDEN for functions Do the same global -> static BTF update for global functions with STV_INTERNAL visibility to turn on static BPF verification mode. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210507054119.270888-7-andrii@kernel.org	2021-05-11 15:07:17 -07:00
Andrii Nakryiko	247b8634e6	libbpf: Fix ELF symbol visibility update logic Fix silly bug in updating ELF symbol's visibility. Fixes: `a46349227c` ("libbpf: Add linker extern resolution support for functions and global variables") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210507054119.270888-6-andrii@kernel.org	2021-05-11 15:07:17 -07:00
Andrii Nakryiko	fdbf5ddeb8	libbpf: Add per-file linker opts For better future extensibility add per-file linker options. Currently the set of available options is empty. This changes bpf_linker__add_file() API, but it's not a breaking change as bpf_linker APIs hasn't been released yet. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210507054119.270888-3-andrii@kernel.org	2021-05-11 15:07:17 -07:00
Arnaldo Carvalho de Melo	67e7ec0bd4	libbpf: Provide GELF_ST_VISIBILITY() define for older libelf Where that macro isn't available. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/YJaspEh0qZr4LYOc@kernel.org	2021-05-11 23:07:33 +02:00
Linus Torvalds	fc858a5231	Networking fixes for 5.13-rc1, including fixes from bpf, can and netfilter trees. Self-contained fixes, nothing risky. Current release - new code bugs: - dsa: ksz: fix a few bugs found by static-checker in the new driver - stmmac: fix frame preemption handshake not triggering after interface restart Previous releases - regressions: - make nla_strcmp handle more then one trailing null character - fix stack OOB reads while fragmenting IPv4 packets in openvswitch and net/sched - sctp: do asoc update earlier in sctp_sf_do_dupcook_a - sctp: delay auto_asconf init until binding the first addr - stmmac: clear receive all(RA) bit when promiscuous mode is off - can: mcp251x: fix resume from sleep before interface was brought up Previous releases - always broken: - bpf: fix leakage of uninitialized bpf stack under speculation - bpf: fix masking negation logic upon negative dst register - netfilter: don't assume that skb_header_pointer() will never fail - only allow init netns to set default tcp cong to a restricted algo - xsk: fix xp_aligned_validate_desc() when len == chunk_size to avoid false positive errors - ethtool: fix missing NLM_F_MULTI flag when dumping - can: m_can: m_can_tx_work_queue(): fix tx_skb race condition - sctp: fix a SCTP_MIB_CURRESTAB leak in sctp_sf_do_dupcook_b - bridge: fix NULL-deref caused by a races between assigning rx_handler_data and setting the IFF_BRIDGE_PORT bit Latecomer: - seg6: add counters support for SRv6 Behaviors Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCV3YoACgkQMUZtbf5S IrsQ2w//Q8/qbl6wGTKUfu6DZHYUU5j5sTwiHR823PKKSgXI+okWMN0KUlZszOsz qnPkH6GuojRooOE1s8PFLSlt9axKhQ0y7uzMTrWYafQ+JZTtgg9/MiPxQ8fdiE5i uOG1ngttZ+1jlE5tMPL4GAOSegg3rWVDclzqnJTdsPPOco3MWj6SL9xN0LDPxCEL BDysRqL/UiOIoh4v6IXQRx2UWjsNGu4biM1po+Jfumnd9T0zKoEpzu6UN6yPShbx 284LihZSQtughCbhGqkErBOxfjZcvpFOQrqmjEvI+Z/eYg4InfWZemt8Sa92/alE yAFjK76MUTaUxaAO/gk8XauhvkYOzJJwKpqhbOmlaM7oj55QdzT5/8JxMxVoA6hV pscHOixk15GVse49PdPV8v47cyTLc/Xi69i+/uUdNVVfuORL1wft1w1xbd0S6Pbe 7Gqax21S7zxcDsrUli7cFheYiqtbQAL0anlIUz8tUOZFz0VQ/zPuFd4rUYZ/o38V Mrevdk3t6CXNxS4CRXyUW4UejYB1O6Qw12sUue31e3h73d6LiN3NAiN5Qp7SEk1/ fvk+jfOf8vvmtimYvcUK2i0D+vqj4Ec/qRIE/XXuUDBcp22tPL9uWMfWavwTdAj1 Se4SzksTWF+NM0lO0ItonMyPh3ZXcSLhIv/gHrZwEKuWkXCGO4M= =JmWS -----END PGP SIGNATURE----- Merge tag 'net-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.13-rc1, including fixes from bpf, can and netfilter trees. Self-contained fixes, nothing risky. Current release - new code bugs: - dsa: ksz: fix a few bugs found by static-checker in the new driver - stmmac: fix frame preemption handshake not triggering after interface restart Previous releases - regressions: - make nla_strcmp handle more then one trailing null character - fix stack OOB reads while fragmenting IPv4 packets in openvswitch and net/sched - sctp: do asoc update earlier in sctp_sf_do_dupcook_a - sctp: delay auto_asconf init until binding the first addr - stmmac: clear receive all(RA) bit when promiscuous mode is off - can: mcp251x: fix resume from sleep before interface was brought up Previous releases - always broken: - bpf: fix leakage of uninitialized bpf stack under speculation - bpf: fix masking negation logic upon negative dst register - netfilter: don't assume that skb_header_pointer() will never fail - only allow init netns to set default tcp cong to a restricted algo - xsk: fix xp_aligned_validate_desc() when len == chunk_size to avoid false positive errors - ethtool: fix missing NLM_F_MULTI flag when dumping - can: m_can: m_can_tx_work_queue(): fix tx_skb race condition - sctp: fix a SCTP_MIB_CURRESTAB leak in sctp_sf_do_dupcook_b - bridge: fix NULL-deref caused by a races between assigning rx_handler_data and setting the IFF_BRIDGE_PORT bit Latecomer: - seg6: add counters support for SRv6 Behaviors" * tag 'net-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits) atm: firestream: Use fallthrough pseudo-keyword net: stmmac: Do not enable RX FIFO overflow interrupts mptcp: fix splat when closing unaccepted socket i40e: Remove LLDP frame filters i40e: Fix PHY type identifiers for 2.5G and 5G adapters i40e: fix the restart auto-negotiation after FEC modified i40e: Fix use-after-free in i40e_client_subtask() i40e: fix broken XDP support netfilter: nftables: avoid potential overflows on 32bit arches netfilter: nftables: avoid overflows in nft_hash_buckets() tcp: Specify cmsgbuf is user pointer for receive zerocopy. mlxsw: spectrum_mr: Update egress RIF list before route's action net: ipa: fix inter-EE IRQ register definitions can: m_can: m_can_tx_work_queue(): fix tx_skb race condition can: mcp251x: fix resume from sleep before interface was brought up can: mcp251xfd: mcp251xfd_probe(): add missing can_rx_offload_del() in error path can: mcp251xfd: mcp251xfd_probe(): fix an error pointer dereference in probe netfilter: nftables: Fix a memleak from userdata error path in new objects netfilter: remove BUG_ON() after skb_header_pointer() netfilter: nfnetlink_osf: Fix a missing skb_header_pointer() NULL check ...	2021-05-08 08:31:46 -07:00
Linus Torvalds	a48b0872e6	Merge branch 'akpm' (patches from Andrew) Merge yet more updates from Andrew Morton: "This is everything else from -mm for this merge window. 90 patches. Subsystems affected by this patch series: mm (cleanups and slub), alpha, procfs, sysctl, misc, core-kernel, bitmap, lib, compat, checkpatch, epoll, isofs, nilfs2, hpfs, exit, fork, kexec, gcov, panic, delayacct, gdb, resource, selftests, async, initramfs, ipc, drivers/char, and spelling" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (90 commits) mm: fix typos in comments mm: fix typos in comments treewide: remove editor modelines and cruft ipc/sem.c: spelling fix fs: fat: fix spelling typo of values kernel/sys.c: fix typo kernel/up.c: fix typo kernel/user_namespace.c: fix typos kernel/umh.c: fix some spelling mistakes include/linux/pgtable.h: few spelling fixes mm/slab.c: fix spelling mistake "disired" -> "desired" scripts/spelling.txt: add "overflw" scripts/spelling.txt: Add "diabled" typo scripts/spelling.txt: add "overlfow" arm: print alloc free paths for address in registers mm/vmalloc: remove vwrite() mm: remove xlate_dev_kmem_ptr() drivers/char: remove /dev/kmem for good mm: fix some typos and code style problems ipc/sem.c: mundane typo fixes ...	2021-05-07 00:34:51 -07:00
Yury Norov	eaae7841ba	tools: sync lib/find_bit implementation Add fast paths to find_*_bit() functions as per kernel implementation. Link: https://lkml.kernel.org/r/20210401003153.97325-12-yury.norov@gmail.com Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-05-06 19:24:12 -07:00
Yury Norov	ea81c1ef44	tools: sync find_next_bit implementation Sync the implementation with recent kernel changes. Link: https://lkml.kernel.org/r/20210401003153.97325-9-yury.norov@gmail.com Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-05-06 19:24:12 -07:00
Yury Norov	e5b9252d90	tools: bitmap: sync function declarations with the kernel Some functions in tools/include/linux/bitmap.h declare nbits as int. In the kernel nbits is declared as unsigned int. Link: https://lkml.kernel.org/r/20210401003153.97325-3-yury.norov@gmail.com Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Sterba <dsterba@suse.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jianpeng Ma <jianpeng.ma@intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-05-06 19:24:11 -07:00
Ian Rogers	9683e5775c	libbpf: Add NULL check to add_dummy_ksym_var Avoids a segv if btf isn't present. Seen on the call path __bpf_object__open calling bpf_object__collect_externs. Fixes: `5bd022ec01` (libbpf: Support extern kernel function) Suggested-by: Stanislav Fomichev <sdf@google.com> Suggested-by: Petar Penkov <ppenkov@google.com> Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210504234910.976501-1-irogers@google.com	2021-05-05 11:29:33 -07:00
Brendan Jackman	2a30f94406	libbpf: Fix signed overflow in ringbuf_process_ring One of our benchmarks running in (Google-internal) CI pushes data through the ringbuf faster htan than userspace is able to consume it. In this case it seems we're actually able to get >INT_MAX entries in a single ring_buffer__consume() call. ASAN detected that cnt overflows in this case. Fix by using 64-bit counter internally and then capping the result to INT_MAX before converting to the int return type. Do the same for the ring_buffer__poll(). Fixes: `bf99c936f9` (libbpf: Add BPF ring buffer support) Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210429130510.1621665-1-jackmanb@google.com	2021-05-03 09:54:12 -07:00
Linus Torvalds	10a3efd0fe	perf tools changes for v5.13: 1st batch perf stat: - Add support for hybrid PMUs to support systems such as Intel Alderlake and its BIG/little core/atom cpus. - Introduce 'bperf' to share hardware PMCs with BPF. - New --iostat option to collect and present IO stats on Intel hardware. This functionality is based on recently introduced sysfs attributes for Intel® Xeon® Scalable processor family (code name Skylake-SP): commit `bb42b3d397` ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping") It is intended to provide four I/O performance metrics in MB per each PCIe root port: - Inbound Read: I/O devices below root port read from the host memory - Inbound Write: I/O devices below root port write to the host memory - Outbound Read: CPU reads from I/O devices below root port - Outbound Write: CPU writes to I/O devices below root port - Align CSV output for summary. - Clarify --null use cases: Assess raw overhead of 'perf stat' or measure just wall clock time. - Improve readability of shadow stats. perf record: - Change the COMM when starting tha workload so that --exclude-perf doesn't seem to be not honoured. - Improve 'Workload failed' message printing events + what was exec'ed. - Fix cross-arch support for TIME_CONV. perf report: - Add option to disable raw event ordering. - Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'. - Improvements to --stat output, that shows information about PERF_RECORD_ events. - Preserve identifier id in OCaml demangler. perf annotate: - Show full source location with 'l' hotkey in the 'perf annotate' TUI. - Add line number like in TUI and source location at EOL to the 'perf annotate' --stdio mode. - Add --demangle and --demangle-kernel to 'perf annotate'. - Allow configuring annotate.demangle{,_kernel} in 'perf config'. - Fix sample events lost in stdio mode. perf data: - Allow converting a perf.data file to JSON. libperf: - Add support for user space counter access. - Update topdown documentation to permit rdpmc calls. perf test: - Add 'perf test' for 'perf stat' CSV output. - Add 'perf test' entries to test the hybrid PMU support. - Cleanup 'perf test daemon' if its 'perf test' is interrupted. - Handle metric reuse in pmu-events parsing 'perf test' entry. - Add test for PE executable support. - Add timeout for wait for daemon start in its 'perf test' entries. Build: - Enable libtraceevent dynamic linking. - Improve feature detection output. - Fix caching of feature checks caching. - First round of updates for tools copies of kernel headers. - Enable warnings when compiling BPF programs. Vendor specific events: Intel: - Add missing skylake & icelake model numbers. arm64: - Add Hisi hip08 L1, L2 and L3 metrics. - Add Fujitsu A64FX PMU events. PowerPC: - Initial JSON/events list for power10 platform. - Remove unsupported power9 metrics. AMD: - Add Zen3 events. - Fix broken L2 Cache Hits from L2 HWPF metric. - Use lowercases for all the eventcodes and umasks. Hardware tracing: arm64: - Update CoreSight ETM metadata format. - Fix bitmap for CS-ETM option. - Support PID tracing in config. - Detect pid in VMID for kernel running at EL2. Arch specific: MIPS: - Support MIPS unwinding and dwarf-regs. - Generate mips syscalls_n64.c syscall table. PowerPC: - Add support for PERF_SAMPLE_WEIGH_STRUCT on PowerPC. - Support pipeline stage cycles for powerpc. libbeauty: - Fix fsconfig generator. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYIshAwAKCRCyPKLppCJ+ J8oWAP9c1POclDQ7AZDe5/t/InZYSQKJFIku1sE1SNCSOupy7wEAuPBtaN7wDaRj BFBibfUGd4MNzLPvMMHneIhSY3DgJwg= =FLLr -----END PGP SIGNATURE----- Merge tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tool updates from Arnaldo Carvalho de Melo: "perf stat: - Add support for hybrid PMUs to support systems such as Intel Alderlake and its BIG/little core/atom cpus. - Introduce 'bperf' to share hardware PMCs with BPF. - New --iostat option to collect and present IO stats on Intel hardware. This functionality is based on recently introduced sysfs attributes for Intel® Xeon® Scalable processor family (code name Skylake-SP) in commit `bb42b3d397` ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping") It is intended to provide four I/O performance metrics in MB per each PCIe root port: - Inbound Read: I/O devices below root port read from the host memory - Inbound Write: I/O devices below root port write to the host memory - Outbound Read: CPU reads from I/O devices below root port - Outbound Write: CPU writes to I/O devices below root port - Align CSV output for summary. - Clarify --null use cases: Assess raw overhead of 'perf stat' or measure just wall clock time. - Improve readability of shadow stats. perf record: - Change the COMM when starting tha workload so that --exclude-perf doesn't seem to be not honoured. - Improve 'Workload failed' message printing events + what was exec'ed. - Fix cross-arch support for TIME_CONV. perf report: - Add option to disable raw event ordering. - Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'. - Improvements to --stat output, that shows information about PERF_RECORD_ events. - Preserve identifier id in OCaml demangler. perf annotate: - Show full source location with 'l' hotkey in the 'perf annotate' TUI. - Add line number like in TUI and source location at EOL to the 'perf annotate' --stdio mode. - Add --demangle and --demangle-kernel to 'perf annotate'. - Allow configuring annotate.demangle{,_kernel} in 'perf config'. - Fix sample events lost in stdio mode. perf data: - Allow converting a perf.data file to JSON. libperf: - Add support for user space counter access. - Update topdown documentation to permit rdpmc calls. perf test: - Add 'perf test' for 'perf stat' CSV output. - Add 'perf test' entries to test the hybrid PMU support. - Cleanup 'perf test daemon' if its 'perf test' is interrupted. - Handle metric reuse in pmu-events parsing 'perf test' entry. - Add test for PE executable support. - Add timeout for wait for daemon start in its 'perf test' entries. Build: - Enable libtraceevent dynamic linking. - Improve feature detection output. - Fix caching of feature checks caching. - First round of updates for tools copies of kernel headers. - Enable warnings when compiling BPF programs. Vendor specific events: - Intel: - Add missing skylake & icelake model numbers. - arm64: - Add Hisi hip08 L1, L2 and L3 metrics. - Add Fujitsu A64FX PMU events. - PowerPC: - Initial JSON/events list for power10 platform. - Remove unsupported power9 metrics. - AMD: - Add Zen3 events. - Fix broken L2 Cache Hits from L2 HWPF metric. - Use lowercases for all the eventcodes and umasks. Hardware tracing: - arm64: - Update CoreSight ETM metadata format. - Fix bitmap for CS-ETM option. - Support PID tracing in config. - Detect pid in VMID for kernel running at EL2. Arch specific updates: - MIPS: - Support MIPS unwinding and dwarf-regs. - Generate mips syscalls_n64.c syscall table. - PowerPC: - Add support for PERF_SAMPLE_WEIGH_STRUCT on PowerPC. - Support pipeline stage cycles for powerpc. libbeauty: - Fix fsconfig generator" * tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (132 commits) perf build: Defer printing detected features to the end of all feature checks tools build: Allow deferring printing the results of feature detection perf build: Regenerate the FEATURE_DUMP file after extra feature checks perf session: Dump PERF_RECORD_TIME_CONV event perf session: Add swap operation for event TIME_CONV perf jit: Let convert_timestamp() to be backwards-compatible perf tools: Change fields type in perf_record_time_conv perf tools: Enable libtraceevent dynamic linking perf Documentation: Document intel-hybrid support perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid perf tests: Support 'Convert perf time to TSC' test for hybrid perf tests: Support 'Session topology' test for hybrid perf tests: Support 'Parse and process metrics' test for hybrid perf tests: Support 'Track with sched_switch' test for hybrid perf tests: Skip 'Setup struct perf_event_attr' test for hybrid perf tests: Add hybrid cases for 'Roundtrip evsel->name' test perf tests: Add hybrid cases for 'Parse event definition strings' test perf record: Uniquify hybrid event name perf stat: Warn group events from different hybrid PMU perf stat: Filter out unmatched aggregation for hybrid event ...	2021-05-01 12:22:38 -07:00
Leo Yan	aa616f5a8a	perf jit: Let convert_timestamp() to be backwards-compatible Commit `d110162caf` ("perf tsc: Support cap_user_time_short for event TIME_CONV") supports the extended parameters for event TIME_CONV, but it broke the backwards compatibility, so any perf data file with old event format fails to convert timestamp. This patch introduces a helper event_contains() to check if an event contains a specific member or not. For the backwards-compatibility, if the event size confirms the extended parameters are supported in the event TIME_CONV, then copies these parameters. Committer notes: To make this compiler backwards compatible add this patch: - struct perf_tsc_conversion tc = { 0 }; + struct perf_tsc_conversion tc = { .time_shift = 0, }; Fixes: `d110162caf` ("perf tsc: Support cap_user_time_short for event TIME_CONV") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve MacLean <Steve.MacLean@Microsoft.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20210428120915.7123-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-04-29 10:31:00 -03:00
Leo Yan	e1d380ea8b	perf tools: Change fields type in perf_record_time_conv C standard claims "An object declared as type _Bool is large enough to store the values 0 and 1", bool type size can be 1 byte or larger than 1 byte. Thus it's uncertian for bool type size with different compilers. This patch changes the bool type in structure perf_record_time_conv to __u8 type, and pads extra bytes for 8-byte alignment; this can give reliable structure size. Fixes: `d110162caf` ("perf tsc: Support cap_user_time_short for event TIME_CONV") Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve MacLean <Steve.MacLean@Microsoft.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20210428120915.7123-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-04-29 10:31:00 -03:00
Song Liu	ec8149fba6	perf util: Move bpf_perf definitions to a libperf header By following the same protocol, other tools can share hardware PMCs with perf. Move perf_event_attr_map_entry and BPF_PERF_DEFAULT_ATTR_MAP_PATH to bpf_perf.h for other tools to use. Signed-off-by: Song Liu <song@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: kernel-team@fb.com Link: https://lore.kernel.org/r/20210425214333.1090950-2-song@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-04-29 10:30:58 -03:00
Andrii Nakryiko	0f20615d64	selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable bitfields. Missing breaks in a switch caused 8-byte reads always. This can confuse libbpf because it does strict checks that memory load size corresponds to the original size of the field, which in this case quite often would be wrong. After fixing that, we run into another problem, which quite subtle, so worth documenting here. The issue is in Clang optimization and CO-RE relocation interactions. Without that asm volatile construct (also known as barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and will apply BYTE_OFFSET 4 times for each switch case arm. This will result in the same error from libbpf about mismatch of memory load size and original field size. I.e., if we were reading u32, we'd still have (u8 ), (u16 ), (u32 ), and (u64 ) memory loads, three of which will fail. Using barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to calculate p, after which value of p is used without relocation in each of switch case arms, doing appropiately-sized memory load. Here's the list of relevant relocations and pieces of generated BPF code before and after this patch for test_core_reloc_bitfields_direct selftests. BEFORE ===== #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 157: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 159: 7b 12 20 01 00 00 00 00 (u64 )(r2 + 288) = r1 160: b7 02 00 00 04 00 00 00 r2 = 4 ; BYTE_SIZE relocation here ^^^ 161: 66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63> 162: 16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65> 163: 16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66> 164: 05 00 12 00 00 00 00 00 goto +18 <LBB0_69> 0000000000000528 <LBB0_66>: 165: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 167: 69 11 08 00 00 00 00 00 r1 = (u16 )(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 168: 05 00 0e 00 00 00 00 00 goto +14 <LBB0_69> 0000000000000548 <LBB0_63>: 169: 16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67> 170: 16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68> 171: 05 00 0b 00 00 00 00 00 goto +11 <LBB0_69> 0000000000000560 <LBB0_68>: 172: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 174: 79 11 08 00 00 00 00 00 r1 = (u64 )(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 175: 05 00 07 00 00 00 00 00 goto +7 <LBB0_69> 0000000000000580 <LBB0_65>: 176: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 178: 71 11 08 00 00 00 00 00 r1 = (u8 )(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 179: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 00000000000005a0 <LBB0_67>: 180: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 182: 61 11 08 00 00 00 00 00 r1 = (u32 )(r1 + 8) ; BYTE_OFFSET relo here w/ RIGHT size ^^^^^^^^^^^^^^^^ 00000000000005b8 <LBB0_69>: 183: 67 01 00 00 20 00 00 00 r1 <<= 32 184: b7 02 00 00 00 00 00 00 r2 = 0 185: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71> 186: c7 01 00 00 20 00 00 00 r1 s>>= 32 187: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000005e0 <LBB0_71>: 188: 77 01 00 00 20 00 00 00 r1 >>= 32 AFTER ===== #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 129: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 131: 7b 12 20 01 00 00 00 00 (u64 )(r2 + 288) = r1 132: b7 01 00 00 08 00 00 00 r1 = 8 ; BYTE_OFFSET relo here ^^^ ; no size check for non-memory dereferencing instructions 133: 0f 12 00 00 00 00 00 00 r2 += r1 134: b7 03 00 00 04 00 00 00 r3 = 4 ; BYTE_SIZE relocation here ^^^ 135: 66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63> 136: 16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65> 137: 16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66> 138: 05 00 0a 00 00 00 00 00 goto +10 <LBB0_69> 0000000000000458 <LBB0_66>: 139: 69 21 00 00 00 00 00 00 r1 = (u16 )(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 140: 05 00 08 00 00 00 00 00 goto +8 <LBB0_69> 0000000000000468 <LBB0_63>: 141: 16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67> 142: 16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68> 143: 05 00 05 00 00 00 00 00 goto +5 <LBB0_69> 0000000000000480 <LBB0_68>: 144: 79 21 00 00 00 00 00 00 r1 = (u64 )(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 145: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 0000000000000490 <LBB0_65>: 146: 71 21 00 00 00 00 00 00 r1 = (u8 )(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 147: 05 00 01 00 00 00 00 00 goto +1 <LBB0_69> 00000000000004a0 <LBB0_67>: 148: 61 21 00 00 00 00 00 00 r1 = (u32 )(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 00000000000004a8 <LBB0_69>: 149: 67 01 00 00 20 00 00 00 r1 <<= 32 150: b7 02 00 00 00 00 00 00 r2 = 0 151: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71> 152: c7 01 00 00 20 00 00 00 r1 s>>= 32 153: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000004d0 <LBB0_71>: 154: 77 01 00 00 20 00 00 00 r1 >>= 323 Fixes: `ee26dade0e` ("libbpf: Add support for relocatable bitfields") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-4-andrii@kernel.org	2021-04-26 18:37:13 -07:00
Andrii Nakryiko	6709a914c8	libbpf: Support BTF_KIND_FLOAT during type compatibility checks in CO-RE Add BTF_KIND_FLOAT support when doing CO-RE field type compatibility check. Without this, relocations against float/double fields will fail. Also adjust one error message to emit instruction index instead of less convenient instruction byte offset. Fixes: `22541a9eeb` ("libbpf: Add BTF_KIND_FLOAT support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-3-andrii@kernel.org	2021-04-26 18:37:13 -07:00
Arnaldo Carvalho de Melo	26bda3ca19	Merge remote-tracking branch 'torvalds/master' into perf/core To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-04-26 09:35:41 -03:00
David S. Miller	5f6c2f536d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-04-23 The following pull-request contains BPF updates for your net-next tree. We've added 69 non-merge commits during the last 22 day(s) which contain a total of 69 files changed, 3141 insertions(+), 866 deletions(-). The main changes are: 1) Add BPF static linker support for extern resolution of global, from Andrii. 2) Refine retval for bpf_get_task_stack helper, from Dave. 3) Add a bpf_snprintf helper, from Florent. 4) A bunch of miscellaneous improvements from many developers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-25 18:02:32 -07:00
Andrii Nakryiko	0a342457b3	libbpf: Support extern resolution for BTF-defined maps in .maps section Add extra logic to handle map externs (only BTF-defined maps are supported for linking). Re-use the map parsing logic used during bpf_object__open(). Map externs are currently restricted to always match complete map definition. So all the specified attributes will be compared (down to pining, map_flags, numa_node, etc). In the future this restriction might be relaxed with no backwards compatibility issues. If any attribute is mismatched between extern and actual map definition, linker will report an error, pointing out which one mismatches. The original intent was to allow for extern to specify attributes that matters (to user) to enforce. E.g., if you specify just key information and omit value, then any value fits. Similarly, it should have been possible to enforce map_flags, pinning, and any other possible map attribute. Unfortunately, that means that multiple externs can be only partially overlapping with each other, which means linker would need to combine their type definitions to end up with the most restrictive and fullest map definition. This requires an extra amount of BTF manipulation which at this time was deemed unnecessary and would require further extending generic BTF writer APIs. So that is left for future follow ups, if there will be demand for that. But the idea seems intresting and useful, so I want to document it here. Weak definitions are also supported, but are pretty strict as well, just like externs: all weak map definitions have to match exactly. In the follow up patches this most probably will be relaxed, with __weak map definitions being able to differ between each other (with non-weak definition always winning, of course). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-13-andrii@kernel.org	2021-04-23 14:05:27 -07:00
Andrii Nakryiko	a46349227c	libbpf: Add linker extern resolution support for functions and global variables Add BPF static linker logic to resolve extern variables and functions across multiple linked together BPF object files. For that, linker maintains a separate list of struct glob_sym structures, which keeps track of few pieces of metadata (is it extern or resolved global, is it a weak symbol, which ELF section it belongs to, etc) and ties together BTF type info and ELF symbol information and keeps them in sync. With adding support for extern variables/funcs, it's now possible for some sections to contain both extern and non-extern definitions. This means that some sections may start out as ephemeral (if only externs are present and thus there is not corresponding ELF section), but will be "upgraded" to actual ELF section as symbols are resolved or new non-extern definitions are appended. Additional care is taken to not duplicate extern entries in sections like .kconfig and .ksyms. Given libbpf requires BTF type to always be present for .kconfig/.ksym externs, linker extends this requirement to all the externs, even those that are supposed to be resolved during static linking and which won't be visible to libbpf. With BTF information always present, static linker will check not just ELF symbol matches, but entire BTF type signature match as well. That logic is stricter that BPF CO-RE checks. It probably should be re-used by .ksym resolution logic in libbpf as well, but that's left for follow up patches. To make it unnecessary to rewrite ELF symbols and minimize BTF type rewriting/removal, ELF symbols that correspond to externs initially will be updated in place once they are resolved. Similarly for BTF type info, VAR/FUNC and var_secinfo's (sec_vars in struct bpf_linker) are staying stable, but types they point to might get replaced when extern is resolved. This might leave some left-over types (even though we try to minimize this for common cases of having extern funcs with not argument names vs concrete function with names properly specified). That can be addresses later with a generic BTF garbage collection. That's left for a follow up as well. Given BTF type appending phase is separate from ELF symbol appending/resolution, special struct glob_sym->underlying_btf_id variable is used to communicate resolution and rewrite decisions. 0 means underlying_btf_id needs to be appended (it's not yet in final linker->btf), <0 values are used for temporary storage of source BTF type ID (not yet rewritten), so -glob_sym->underlying_btf_id is BTF type id in obj-btf. But by the end of linker_append_btf() phase, that underlying_btf_id will be remapped and will always be > 0. This is the uglies part of the whole process, but keeps the other parts much simpler due to stability of sec_var and VAR/FUNC types, as well as ELF symbol, so please keep that in mind while reviewing. BTF-defined maps require some extra custom logic and is addressed separate in the next patch, so that to keep this one smaller and easier to review. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-12-andrii@kernel.org	2021-04-23 14:05:27 -07:00
Andrii Nakryiko	83a157279f	libbpf: Tighten BTF type ID rewriting with error checking It should never fail, but if it does, it's better to know about this rather than end up with nonsensical type IDs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-11-andrii@kernel.org	2021-04-23 14:05:27 -07:00
Andrii Nakryiko	386b1d241e	libbpf: Extend sanity checking ELF symbols with externs validation Add logic to validate extern symbols, plus some other minor extra checks, like ELF symbol #0 validation, general symbol visibility and binding validations. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-10-andrii@kernel.org	2021-04-23 14:05:26 -07:00
Andrii Nakryiko	42869d2852	libbpf: Make few internal helpers available outside of libbpf.c Make skip_mods_and_typedefs(), btf_kind_str(), and btf_func_linkage() helpers available outside of libbpf.c, to be used by static linker code. Also do few cleanups (error code fixes, comment clean up, etc) that don't deserve their own commit. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-9-andrii@kernel.org	2021-04-23 14:05:26 -07:00
Andrii Nakryiko	beaa3711ad	libbpf: Factor out symtab and relos sanity checks Factor out logic for sanity checking SHT_SYMTAB and SHT_REL sections into separate sections. They are already quite extensive and are suffering from too deep indentation. Subsequent changes will extend SYMTAB sanity checking further, so it's better to factor each into a separate function. No functional changes are intended. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-8-andrii@kernel.org	2021-04-23 14:05:26 -07:00
Andrii Nakryiko	c7ef5ec957	libbpf: Refactor BTF map definition parsing Refactor BTF-defined maps parsing logic to allow it to be nicely reused by BPF static linker. Further, at least for BPF static linker, it's important to know which attributes of a BPF map were defined explicitly, so provide a bit set for each known portion of BTF map definition. This allows BPF static linker to do a simple check when dealing with extern map declarations. The same capabilities allow to distinguish attributes explicitly set to zero (e.g., __uint(max_entries, 0)) vs the case of not specifying it at all (no max_entries attribute at all). Libbpf is currently not utilizing that, but it could be useful for backwards compatibility reasons later. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-7-andrii@kernel.org	2021-04-23 14:05:26 -07:00
Andrii Nakryiko	6245947c1b	libbpf: Allow gaps in BPF program sections to support overriden weak functions Currently libbpf is very strict about parsing BPF program instruction sections. No gaps are allowed between sequential BPF programs within a given ELF section. Libbpf enforced that by keeping track of the next section offset that should start a new BPF (sub)program and cross-checks that by searching for a corresponding STT_FUNC ELF symbol. But this is too restrictive once we allow to have weak BPF programs and link together two or more BPF object files. In such case, some weak BPF programs might be "overridden" by either non-weak BPF program with the same name and signature, or even by another weak BPF program that just happened to be linked first. That, in turn, leaves BPF instructions of the "lost" BPF (sub)program intact, but there is no corresponding ELF symbol, because no one is going to be referencing it. Libbpf already correctly handles such cases in the sense that it won't append such dead code to actual BPF programs loaded into kernel. So the only change that needs to be done is to relax the logic of parsing BPF instruction sections. Instead of assuming next BPF (sub)program section offset, iterate available STT_FUNC ELF symbols to discover all available BPF subprograms and programs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-6-andrii@kernel.org	2021-04-23 14:05:26 -07:00
Andrii Nakryiko	aea28a602f	libbpf: Mark BPF subprogs with hidden visibility as static for BPF verifier Define __hidden helper macro in bpf_helpers.h, which is a short-hand for __attribute__((visibility("hidden"))). Add libbpf support to mark BPF subprograms marked with __hidden as static in BTF information to enforce BPF verifier's static function validation algorithm, which takes more information (caller's context) into account during a subprogram validation. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-5-andrii@kernel.org	2021-04-23 14:05:26 -07:00
Andrii Nakryiko	0fec7a3cee	libbpf: Suppress compiler warning when using SEC() macro with externs When used on externs SEC() macro will trigger compilation warning about inapplicable `__attribute__((used))`. That's expected for extern declarations, so suppress it with the corresponding _Pragma. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-4-andrii@kernel.org	2021-04-23 14:05:26 -07:00
Rob Herring	818869489b	libperf xyarray: Add bounds checks to xyarray__entry() xyarray__entry() is missing any bounds checking yet often the x and y parameters come from external callers. Add bounds checks and an unchecked __xyarray__entry(). Committer notes: Make the 'x' and 'y' arguments to the new xyarray__entry() that does bounds check to be of type 'size_t', so that we cover also the case where 'x' and 'y' could be negative, which is needed anyway as having them as 'int' breaks the build with: /home/acme/git/perf/tools/lib/perf/include/internal/xyarray.h: In function ‘xyarray__entry’: /home/acme/git/perf/tools/lib/perf/include/internal/xyarray.h:28:8: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare] 28 \| if (x >= xy->max_x \|\| y >= xy->max_y) \| ^~ /home/acme/git/perf/tools/lib/perf/include/internal/xyarray.h:28:26: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare] 28 \| if (x >= xy->max_x \|\| y >= xy->max_y) \| ^~ cc1: all warnings being treated as errors Signed-off-by: Rob Herring <robh@kernel.org> Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Suggested-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210414195758.4078803-1-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-04-20 08:11:33 -03:00
Rob Herring	47d01e7b99	libperf: Add support for user space counter access x86 and arm64 can both support direct access of event counters in userspace. The access sequence is less than trivial and currently exists in perf test code (tools/perf/arch/x86/tests/rdpmc.c) with copies in projects such as PAPI and libpfm4. In order to support userspace access, an event must be mmapped first with perf_evsel__mmap(). Then subsequent calls to perf_evsel__read() will use the fast path (assuming the arch supports it). Committer notes: Added a '__maybe_unused' attribute to the read_perf_counter() argument to fix the build on arches other than x86_64 and arm. Committer testing: Building and running the libperf tests in verbose mode (V=1) now shows those "loop = N, count = N" extra lines, testing user space counter access. # make V=1 -C tools/lib/perf tests make: Entering directory '/home/acme/git/perf/tools/lib/perf' make -f /home/acme/git/perf/tools/build/Makefile.build dir=. obj=libperf make -C /home/acme/git/perf/tools/lib/api/ O= libapi.a make -f /home/acme/git/perf/tools/build/Makefile.build dir=./fd obj=libapi make -f /home/acme/git/perf/tools/build/Makefile.build dir=./fs obj=libapi make -C tests gcc -I/home/acme/git/perf/tools/lib/perf/include -I/home/acme/git/perf/tools/include -I/home/acme/git/perf/tools/lib -g -Wall -o test-cpumap-a test-cpumap.c ../libperf.a /home/acme/git/perf/tools/lib/api/libapi.a gcc -I/home/acme/git/perf/tools/lib/perf/include -I/home/acme/git/perf/tools/include -I/home/acme/git/perf/tools/lib -g -Wall -o test-threadmap-a test-threadmap.c ../libperf.a /home/acme/git/perf/tools/lib/api/libapi.a gcc -I/home/acme/git/perf/tools/lib/perf/include -I/home/acme/git/perf/tools/include -I/home/acme/git/perf/tools/lib -g -Wall -o test-evlist-a test-evlist.c ../libperf.a /home/acme/git/perf/tools/lib/api/libapi.a gcc -I/home/acme/git/perf/tools/lib/perf/include -I/home/acme/git/perf/tools/include -I/home/acme/git/perf/tools/lib -g -Wall -o test-evsel-a test-evsel.c ../libperf.a /home/acme/git/perf/tools/lib/api/libapi.a gcc -I/home/acme/git/perf/tools/lib/perf/include -I/home/acme/git/perf/tools/include -I/home/acme/git/perf/tools/lib -g -Wall -L.. -o test-cpumap-so test-cpumap.c /home/acme/git/perf/tools/lib/api/libapi.a -lperf gcc -I/home/acme/git/perf/tools/lib/perf/include -I/home/acme/git/perf/tools/include -I/home/acme/git/perf/tools/lib -g -Wall -L.. -o test-threadmap-so test-threadmap.c /home/acme/git/perf/tools/lib/api/libapi.a -lperf gcc -I/home/acme/git/perf/tools/lib/perf/include -I/home/acme/git/perf/tools/include -I/home/acme/git/perf/tools/lib -g -Wall -L.. -o test-evlist-so test-evlist.c /home/acme/git/perf/tools/lib/api/libapi.a -lperf gcc -I/home/acme/git/perf/tools/lib/perf/include -I/home/acme/git/perf/tools/include -I/home/acme/git/perf/tools/lib -g -Wall -L.. -o test-evsel-so test-evsel.c /home/acme/git/perf/tools/lib/api/libapi.a -lperf make -C tests run running static: - running test-cpumap.c...OK - running test-threadmap.c...OK - running test-evlist.c...OK - running test-evsel.c... loop = 65536, count = 333926 loop = 131072, count = 655781 loop = 262144, count = 1311141 loop = 524288, count = 2630126 loop = 1048576, count = 5256955 loop = 65536, count = 524594 loop = 131072, count = 1058916 loop = 262144, count = 2097458 loop = 524288, count = 4205429 loop = 1048576, count = 8406606 OK running dynamic: - running test-cpumap.c...OK - running test-threadmap.c...OK - running test-evlist.c...OK - running test-evsel.c... loop = 65536, count = 328102 loop = 131072, count = 655782 loop = 262144, count = 1317494 loop = 524288, count = 2627851 loop = 1048576, count = 5255187 loop = 65536, count = 524601 loop = 131072, count = 1048923 loop = 262144, count = 2107917 loop = 524288, count = 4194606 loop = 1048576, count = 8409322 OK make: Leaving directory '/home/acme/git/perf/tools/lib/perf' # Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Itaru Kitayama <itaru.kitayama@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: http://lore.kernel.org/lkml/20210414155412.3697605-4-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-04-20 08:10:45 -03:00
Florent Revest	58c2b1f5e0	libbpf: Introduce a BPF_SNPRINTF helper macro Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an array of u64, making it more natural to call the bpf_snprintf helper. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-6-revest@chromium.org	2021-04-19 15:27:37 -07:00
Florent Revest	83cd92b464	libbpf: Initialize the bpf_seq_printf parameters array field by field When initializing the __param array with a one liner, if all args are const, the initial array value will be placed in the rodata section but because libbpf does not support relocation in the rodata section, any pointer in this array will stay NULL. Fixes: `c09add2fbc` ("tools/libbpf: Add bpf_iter support") Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-5-revest@chromium.org	2021-04-19 15:27:37 -07:00
Jakub Kicinski	8203c7ce4e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/ethernet/stmicro/stmmac/stmmac_main.c - keep the ZC code, drop the code related to reinit net/bridge/netfilter/ebtables.c - fix build after move to net_generic Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-17 11:08:07 -07:00
Alexei Starovoitov	d3d93e34bd	libbpf: Remove unused field. relo->processed is set, but not used. Remove it. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210415141817.53136-1-alexei.starovoitov@gmail.com	2021-04-15 15:34:16 -07:00
Rob Herring	d3003d9e68	libperf tests: Add support for verbose printing Add __T_VERBOSE() so tests can add verbose output. The verbose output is enabled with the '-v' command line option. Running 'make tests V=1' will enable the '-v' option when running the tests. It'll be used in the next patch, for a user space counter access test. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Itaru Kitayama <itaru.kitayama@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: http://lore.kernel.org/lkml/20210414155412.3697605-3-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-04-15 16:40:15 -03:00
Rob Herring	6cd70754f2	libperf: Add evsel mmap support In order to support usersapce access, an event must be mmapped. While there's already mmap support for evlist, the usecase is a bit different than the self monitoring with userspace access. So let's add new perf_evsel__mmap()/perf_evsel_munmap() functions to mmap/munmap an evsel. This allows implementing userspace access as a fastpath for perf_evsel__read(). The mmapped address is returned by perf_evsel__mmap_base() which primarily for users/tests to check if userspace access is enabled. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Itaru Kitayama <itaru.kitayama@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: http://lore.kernel.org/lkml/20210414155412.3697605-2-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-04-15 16:38:51 -03:00
Jakub Kicinski	8859a44ea0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts: MAINTAINERS - keep Chandrasekar drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine include/linux/bpf.h - trivial include/linux/ethtool.h - trivial, fix kdoc while at it include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped net/core/skmsg.c - add the sk = sk // sk = NULL around calls net/tipc/crypto.c - trivial Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-09 20:48:35 -07:00
Andrii Nakryiko	b3278099b2	libbpf: Add bpf_map__inner_map API The API gives access to inner map for map in map types (array or hash of map). It will be used to dynamically set max_entries in it. Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-7-yauheni.kaliuta@redhat.com	2021-04-08 23:54:48 -07:00
Ciara Loftus	afd0be7299	libbpf: Fix potential NULL pointer dereference Wait until after the UMEM is checked for null to dereference it. Fixes: `43f1bc1eff` ("libbpf: Restore umem state after socket create failure") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210408052009.7844-1-ciara.loftus@intel.com	2021-04-08 23:36:46 +02:00
Hengqi Chen	1e1032b0c4	libbpf: Fix KERNEL_VERSION macro Add missing ')' for KERNEL_VERSION macro. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210405040119.802188-1-hengqi.chen@gmail.com	2021-04-05 07:14:13 -07:00
Yang Yingliang	f07669df4c	libbpf: Remove redundant semi-colon Remove redundant semi-colon in finalize_btf_ext(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210402012634.1965453-1-yangyingliang@huawei.com	2021-04-03 01:49:38 +02:00
Ciara Loftus	ca7a83e248	libbpf: Only create rx and tx XDP rings when necessary Prior to this commit xsk_socket__create(_shared) always attempted to create the rx and tx rings for the socket. However this causes an issue when the socket being setup is that which shares the fd with the UMEM. If a previous call to this function failed with this socket after the rings were set up, a subsequent call would always fail because the rings are not torn down after the first call and when we try to set them up again we encounter an error because they already exist. Solve this by remembering whether the rings were set up by introducing new bools to struct xsk_umem which represent the ring setup status and using them to determine whether or not to set up the rings. Fixes: `1cad078842` ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-4-ciara.loftus@intel.com	2021-04-01 14:45:43 -07:00
Ciara Loftus	43f1bc1eff	libbpf: Restore umem state after socket create failure If the call to xsk_socket__create fails, the user may want to retry the socket creation using the same umem. Ensure that the umem is in the same state on exit if the call fails by: 1. ensuring the umem _save pointers are unmodified. 2. not unmapping the set of umem rings that were set up with the umem during xsk_umem__create, since those maps existed before the call to xsk_socket__create and should remain in tact even in the event of failure. Fixes: `2f6324a393` ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-3-ciara.loftus@intel.com	2021-04-01 14:45:43 -07:00
Ciara Loftus	df66201631	libbpf: Ensure umem pointer is non-NULL before dereferencing Calls to xsk_socket__create dereference the umem to access the fill_save and comp_save pointers. Make sure the umem is non-NULL before doing this. Fixes: `2f6324a393` ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20210331061218.1647-2-ciara.loftus@intel.com	2021-04-01 14:45:43 -07:00
Maciej Fijalkowski	10397994d3	libbpf: xsk: Use bpf_link Currently, if there are multiple xdpsock instances running on a single interface and in case one of the instances is terminated, the rest of them are left in an inoperable state due to the fact of unloaded XDP prog from interface. Consider the scenario below: // load xdp prog and xskmap and add entry to xskmap at idx 10 $ sudo ./xdpsock -i ens801f0 -t -q 10 // add entry to xskmap at idx 11 $ sudo ./xdpsock -i ens801f0 -t -q 11 terminate one of the processes and another one is unable to work due to the fact that the XDP prog was unloaded from interface. To address that, step away from setting bpf prog in favour of bpf_link. This means that refcounting of BPF resources will be done automatically by bpf_link itself. Provide backward compatibility by checking if underlying system is bpf_link capable. Do this by looking up/creating bpf_link on loopback device. If it failed in any way, stick with netlink-based XDP prog. therwise, use bpf_link-based logic. When setting up BPF resources during xsk socket creation, check whether bpf_link for a given ifindex already exists via set of calls to bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd and comparing the ifindexes from bpf_link and xsk socket. For case where resources exist but they are not AF_XDP related, bail out and ask user to remove existing prog and then retry. Lastly, do a bit of refactoring within __xsk_setup_xdp_prog and pull out existing code branches based on prog_id value onto separate functions that are responsible for resource initialization if prog_id was 0 and for lookup existing resources for non-zero prog_id as that implies that XDP program is present on the underlying net device. This in turn makes it easier to follow, especially the teardown part of both branches. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-7-maciej.fijalkowski@intel.com	2021-03-30 09:24:38 -07:00
Andrii Nakryiko	05d817031f	libbpf: Fix memory leak when emitting final btf_ext Free temporary allocated memory used to construct finalized .BTF.ext data. Found by Coverity static analysis on libbpf's Github repo. Fixes: `8fd27bf69b` ("libbpf: Add BPF static linker BTF and BTF.ext support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210327042502.969745-1-andrii@kernel.org	2021-03-30 07:38:36 -07:00
Martin KaFai Lau	5bd022ec01	libbpf: Support extern kernel function This patch is to make libbpf able to handle the following extern kernel function declaration and do the needed relocations before loading the bpf program to the kernel. extern int foo(struct sock *) __attribute__((section(".ksyms"))) In the collect extern phase, needed changes is made to bpf_object__collect_externs() and find_extern_btf_id() to collect extern function in ".ksyms" section. The func in the BTF datasec also needs to be replaced by an int var. The idea is similar to the existing handling in extern var. In case the BTF may not have a var, a dummy ksym var is added at the beginning of bpf_object__collect_externs() if there is func under ksyms datasec. It will also change the func linkage from extern to global which the kernel can support. It also assigns a param name if it does not have one. In the collect relo phase, it will record the kernel function call as RELO_EXTERN_FUNC. bpf_object__resolve_ksym_func_btf_id() is added to find the func btf_id of the running kernel. During actual relocation, it will patch the BPF_CALL instruction with src_reg = BPF_PSEUDO_FUNC_CALL and insn->imm set to the running kernel func's btf_id. The required LLVM patch: https://reviews.llvm.org/D93563 Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015234.1548923-1-kafai@fb.com	2021-03-26 20:41:51 -07:00
Martin KaFai Lau	aa0b8d43e9	libbpf: Record extern sym relocation first This patch records the extern sym relocs first before recording subprog relocs. The later patch will have relocs for extern kernel function call which is also using BPF_JMP \| BPF_CALL. It will be easier to handle the extern symbols first in the later patch. is_call_insn() helper is added. The existing is_ldimm64() helper is renamed to is_ldimm64_insn() for consistency. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015227.1548623-1-kafai@fb.com	2021-03-26 20:41:51 -07:00
Martin KaFai Lau	0c091e5c2d	libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR This patch renames RELO_EXTERN to RELO_EXTERN_VAR. It is to avoid the confusion with a later patch adding RELO_EXTERN_FUNC. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015221.1547722-1-kafai@fb.com	2021-03-26 20:41:51 -07:00
Martin KaFai Lau	774e132e83	libbpf: Refactor codes for finding btf id of a kernel symbol This patch refactors code, that finds kernel btf_id by kind and symbol name, to a new function find_ksym_btf_id(). It also adds a new helper __btf_kind_str() to return a string by the numeric kind value. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015214.1547069-1-kafai@fb.com	2021-03-26 20:41:51 -07:00
Martin KaFai Lau	933d1aa324	libbpf: Refactor bpf_object__resolve_ksyms_btf_id This patch refactors most of the logic from bpf_object__resolve_ksyms_btf_id() into a new function bpf_object__resolve_ksym_var_btf_id(). It is to get ready for a later patch adding bpf_object__resolve_ksym_func_btf_id() which resolves a kernel function to the running kernel btf_id. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015207.1546749-1-kafai@fb.com	2021-03-26 20:41:51 -07:00
Andrii Nakryiko	36e7985160	libbpf: Preserve empty DATASEC BTFs during static linking Ensure that BPF static linker preserves all DATASEC BTF types, even if some of them might not have any variable information at all. This may happen if the compiler promotes local initialized variable contents into .rodata section and there are no global or static functions in the program. For example, $ cat t.c struct t { char a; char b; char c; }; void bar(struct t*); void find() { struct t tmp = {1, 2, 3}; bar(&tmp); } $ clang -target bpf -O2 -g -S t.c .long 104 # BTF_KIND_DATASEC(id = 8) .long 251658240 # 0xf000000 .long 0 .ascii ".rodata" # string offset=104 $ clang -target bpf -O2 -g -c t.c $ readelf -S t.o \| grep data [ 4] .rodata PROGBITS 0000000000000000 00000090 Fixes: `8fd27bf69b` ("libbpf: Add BPF static linker BTF and BTF.ext support") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210326043036.3081011-1-andrii@kernel.org	2021-03-26 17:45:17 +01:00
Pedro Tammela	6032ebb54c	libbpf: Fix bail out from 'ringbuf_process_ring()' on error The current code bails out with negative and positive returns. If the callback returns a positive return code, 'ring_buffer__consume()' and 'ring_buffer__poll()' will return a spurious number of records consumed, but mostly important will continue the processing loop. This patch makes positive returns from the callback a no-op. Fixes: `bf99c936f9` ("libbpf: Add BPF ring buffer support") Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325150115.138750-1-pctammela@mojatatu.com	2021-03-25 21:13:24 -07:00
Rafael David Tinoco	155f556d64	libbpf: Add bpf object kern_version attribute setter Unfortunately some distros don't have their kernel version defined accurately in <linux/version.h> due to different long term support reasons. It is important to have a way to override the bpf kern_version attribute during runtime: some old kernels might still check for kern_version attribute during bpf_prog_load(). Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210323040952.2118241-1-rafaeldtinoco@ubuntu.com	2021-03-25 19:23:27 -07:00
Andrii Nakryiko	a46410d5e4	libbpf: Constify few bpf_program getters bpf_program__get_type() and bpf_program__get_expected_attach_type() shouldn't modify given bpf_program, so mark input parameter as const struct bpf_program. This eliminates unnecessary compilation warnings or explicit casts in user programs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210324172941.2609884-1-andrii@kernel.org	2021-03-26 01:17:04 +01:00
David S. Miller	241949e488	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-03-24 The following pull-request contains BPF updates for your net-next tree. We've added 37 non-merge commits during the last 15 day(s) which contain a total of 65 files changed, 3200 insertions(+), 738 deletions(-). The main changes are: 1) Static linking of multiple BPF ELF files, from Andrii. 2) Move drop error path to devmap for XDP_REDIRECT, from Lorenzo. 3) Spelling fixes from various folks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:30:46 -07:00
David S. Miller	efd13b71a3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 15:31:22 -07:00
Andrii Nakryiko	78b226d481	libbpf: Skip BTF fixup if object file has no BTF Skip BTF fixup step when input object file is missing BTF altogether. Fixes: `8fd27bf69b` ("libbpf: Add BPF static linker BTF and BTF.ext support") Reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/bpf/20210319205909.1748642-3-andrii@kernel.org	2021-03-22 18:58:25 -07:00
Jean-Philippe Brucker	901ee1d750	libbpf: Fix BTF dump of pointer-to-array-of-struct The vmlinux.h generated from BTF is invalid when building drivers/phy/ti/phy-gmii-sel.c with clang: vmlinux.h:61702:27: error: array type has incomplete element type ‘struct reg_field’ 61702 \| const struct reg_field (regfields)[3]; \| ^~~~~~~~~ bpftool generates a forward declaration for this struct regfield, which compilers aren't happy about. Here's a simplified reproducer: struct inner { int val; }; struct outer { struct inner (ptr_to_array)[2]; } A; After build with clang -> bpftool btf dump c -> clang/gcc: ./def-clang.h:11:23: error: array has incomplete element type 'struct inner' struct inner (ptr_to_array)[2]; Member ptr_to_array of struct outer is a pointer to an array of struct inner. In the DWARF generated by clang, struct outer appears before struct inner, so when converting BTF of struct outer into C, bpftool issues a forward declaration to struct inner. With GCC the DWARF info is reversed so struct inner gets fully defined. That forward declaration is not sufficient when compilers handle an array of the struct, even when it's only used through a pointer. Note that we can trigger the same issue with an intermediate typedef: struct inner { int val; }; typedef struct inner inner2_t[2]; struct outer { inner2_t ptr_to_array; } A; Becomes: struct inner; typedef struct inner inner2_t[2]; And causes: ./def-clang.h:10:30: error: array has incomplete element type 'struct inner' typedef struct inner inner2_t[2]; To fix this, clear through_ptr whenever we encounter an intermediate array, to make the inner struct part of a strong link and force full declaration. Fixes: `351131b51c` ("libbpf: add btf_dump API for BTF-to-C conversion") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210319112554.794552-2-jean-philippe@linaro.org	2021-03-19 14:14:44 -07:00
KP Singh	ea24b19562	libbpf: Add explicit padding to btf_dump_emit_type_decl_opts Similar to https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org/ When DECLARE_LIBBPF_OPTS is used with inline field initialization, e.g: DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts, .field_name = var_ident, .indent_level = 2, .strip_mods = strip_mods, ); and compiled in debug mode, the compiler generates code which leaves the padding uninitialized and triggers errors within libbpf APIs which require strict zero initialization of OPTS structs. Adding anonymous padding field fixes the issue. Fixes: `9f81654eeb` ("libbpf: Expose BTF-to-C type declaration emitting API") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210319192117.2310658-1-kpsingh@kernel.org	2021-03-19 14:03:39 -07:00
Andrii Nakryiko	8fd27bf69b	libbpf: Add BPF static linker BTF and BTF.ext support Add .BTF and .BTF.ext static linking logic. When multiple BPF object files are linked together, their respective .BTF and .BTF.ext sections are merged together. BTF types are not just concatenated, but also deduplicated. .BTF.ext data is grouped by type (func info, line info, core_relos) and target section names, and then all the records are concatenated together, preserving their relative order. All the BTF type ID references and string offsets are updated as necessary, to take into account possibly deduplicated strings and types. BTF DATASEC types are handled specially. Their respective var_secinfos are accumulated separately in special per-section data and then final DATASEC types are emitted at the very end during bpf_linker__finalize() operation, just before emitting final ELF output file. BTF data can also provide "section annotations" for some extern variables. Such concept is missing in ELF, but BTF will have DATASEC types for such special extern datasections (e.g., .kconfig, .ksyms). Such sections are called "ephemeral" internally. Internally linker will keep metadata for each such section, collecting variables information, but those sections won't be emitted into the final ELF file. Also, given LLVM/Clang during compilation emits BTF DATASECS that are incomplete, missing section size and variable offsets for static variables, BPF static linker will initially fix up such DATASECs, using ELF symbols data. The final DATASECs will preserve section sizes and all variable offsets. This is handled correctly by libbpf already, so won't cause any new issues. On the other hand, it's actually a nice property to have a complete BTF data without runtime adjustments done during bpf_object__open() by libbpf. In that sense, BPF static linker is also a BTF normalizer. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-8-andrii@kernel.org	2021-03-18 16:14:22 -07:00
Andrii Nakryiko	faf6ed321c	libbpf: Add BPF static linker APIs Introduce BPF static linker APIs to libbpf. BPF static linker allows to perform static linking of multiple BPF object files into a single combined resulting object file, preserving all the BPF programs, maps, global variables, etc. Data sections (.bss, .data, .rodata, .maps, maps, etc) with the same name are concatenated together. Similarly, code sections are also concatenated. All the symbols and ELF relocations are also concatenated in their respective ELF sections and are adjusted accordingly to the new object file layout. Static variables and functions are handled correctly as well, adjusting BPF instructions offsets to reflect new variable/function offset within the combined ELF section. Such relocations are referencing STT_SECTION symbols and that stays intact. Data sections in different files can have different alignment requirements, so that is taken care of as well, adjusting sizes and offsets as necessary to satisfy both old and new alignment requirements. DWARF data sections are stripped out, currently. As well as LLLVM_ADDRSIG section, which is ignored by libbpf in bpf_object__open() anyways. So, in a way, BPF static linker is an analogue to `llvm-strip -g`, which is a pretty nice property, especially if resulting .o file is then used to generate BPF skeleton. Original string sections are ignored and instead we construct our own set of unique strings using libbpf-internal `struct strset` API. To reduce the size of the patch, all the .BTF and .BTF.ext processing was moved into a separate patch. The high-level API consists of just 4 functions: - bpf_linker__new() creates an instance of BPF static linker. It accepts output filename and (currently empty) options struct; - bpf_linker__add_file() takes input filename and appends it to the already processed ELF data; it can be called multiple times, one for each BPF ELF object file that needs to be linked in; - bpf_linker__finalize() needs to be called to dump final ELF contents into the output file, specified when bpf_linker was created; after bpf_linker__finalize() is called, no more bpf_linker__add_file() and bpf_linker__finalize() calls are allowed, they will return error; - regardless of whether bpf_linker__finalize() was called or not, bpf_linker__free() will free up all the used resources. Currently, BPF static linker doesn't resolve cross-object file references (extern variables and/or functions). This will be added in the follow up patch set. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-7-andrii@kernel.org	2021-03-18 16:14:22 -07:00
Andrii Nakryiko	9af44bc5d4	libbpf: Add generic BTF type shallow copy API Add btf__add_type() API that performs shallow copy of a given BTF type from the source BTF into the destination BTF. All the information and type IDs are preserved, but all the strings encountered are added into the destination BTF and corresponding offsets are rewritten. BTF type IDs are assumed to be correct or such that will be (somehow) modified afterwards. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-6-andrii@kernel.org	2021-03-18 16:14:22 -07:00
Andrii Nakryiko	90d76d3ece	libbpf: Extract internal set-of-strings datastructure APIs Extract BTF logic for maintaining a set of strings data structure, used for BTF strings section construction in writable mode, into separate re-usable API. This data structure is going to be used by bpf_linker to maintains ELF STRTAB section, which has the same layout as BTF strings section. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-5-andrii@kernel.org	2021-03-18 16:14:22 -07:00
Andrii Nakryiko	3b029e06f6	libbpf: Rename internal memory-management helpers Rename btf_add_mem() and btf_ensure_mem() helpers that abstract away details of dynamically resizable memory to use libbpf_ prefix, as they are not BTF-specific. No functional changes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-4-andrii@kernel.org	2021-03-18 16:14:22 -07:00
Andrii Nakryiko	f36e99a45d	libbpf: Generalize BTF and BTF.ext type ID and strings iteration Extract and generalize the logic to iterate BTF type ID and string offset fields within BTF types and .BTF.ext data. Expose this internally in libbpf for re-use by bpf_linker. Additionally, complete strings deduplication handling for BTF.ext (e.g., CO-RE access strings), which was previously missing. There previously was no case of deduplicating .BTF.ext data, but bpf_linker is going to use it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-3-andrii@kernel.org	2021-03-18 16:14:22 -07:00
Andrii Nakryiko	e14ef4bf01	libbpf: Expose btf_type_by_id() internally btf_type_by_id() is internal-only convenience API returning non-const pointer to struct btf_type. Expose it outside of btf.c for re-use. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-2-andrii@kernel.org	2021-03-18 16:14:22 -07:00
Andrii Nakryiko	9ae2c26e43	libbpf: provide NULL and KERNEL_VERSION macros in bpf_helpers.h Given that vmlinux.h is not compatible with headers like stddef.h, NULL poses an annoying problem: it is defined as #define, so is not captured in BTF, so is not emitted into vmlinux.h. This leads to users either sticking to explicit 0, or defining their own NULL (as progs/skb_pkt_end.c does). But it's easy for bpf_helpers.h to provide (conditionally) NULL definition. Similarly, KERNEL_VERSION is another commonly missed macro that came up multiple times. So this patch adds both of them, along with offsetof(), that also is typically defined in stddef.h, just like NULL. This might cause compilation warning for existing BPF applications defining their own NULL and/or KERNEL_VERSION already: progs/skb_pkt_end.c:7:9: warning: 'NULL' macro redefined [-Wmacro-redefined] #define NULL 0 ^ /tmp/linux/tools/testing/selftests/bpf/tools/include/vmlinux.h:4:9: note: previous definition is here #define NULL ((void *)0) ^ It is trivial to fix, though, so long-term benefits outweight temporary inconveniences. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20210317200510.1354627-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-03-17 18:48:05 -07:00
Kumar Kartikeya Dwivedi	58bfd95b55	libbpf: Use SOCK_CLOEXEC when opening the netlink socket Otherwise, there exists a small window between the opening and closing of the socket fd where it may leak into processes launched by some other thread. Fixes: `949abbe884` ("libbpf: add function to setup XDP") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210317115857.6536-1-memxor@gmail.com	2021-03-18 00:50:21 +01:00

... 4 5 6 7 8 ...

2166 Commits