We moved swap_src_dst_mac to xdp_sample.bpf.h to be shared with other
potential users, so drop it while moving code to the new file.
Also, consistently use SEC("xdp") naming instead.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-16-memxor@gmail.com
Use the libbpf skeleton facility and other utilities provided by XDP
samples helper.
A lot of the code in xdp_monitor and xdp_redirect_cpu has been moved to
the xdp_sample_user.o helper, so we remove the duplicate functions here
that are no longer needed.
Thanks to BPF skeleton, we no longer depend on order of tracepoints to
uninstall them on startup. Instead, the sample mask is used to install
the needed tracepoints.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-15-memxor@gmail.com
We already moved all the functionality it provided in XDP samples helper
userspace and kernel BPF object, so just delete the unneeded code.
We also add generation of BPF skeleton and compilation using clang
-target bpf for files ending with .bpf.c suffix (to denote that they use
vmlinux.h).
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-14-memxor@gmail.com
Also, take this opportunity to depend on in-tree bpftool, so that we can
use static linking support in subsequent commits for XDP samples BPF
helper object.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-13-memxor@gmail.com
The samples/bpf Makefile currently compiles BPF files in a way that will
produce an .eh_frame section, which will in turn confuse libbpf and produce
errors when loading BPF programs, like:
libbpf: elf: skipping unrecognized data section(32) .eh_frame
libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
Fix this by instruction Clang not to produce this section, as it's useless
for BPF anyway.
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210705103841.180260-1-toke@redhat.com
This is a sample for xdp redirect broadcast. In the sample we could forward
all packets between given interfaces. There is also an option -X that could
enable 2nd xdp_prog on egress interface.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210519090747.1655268-4-liuhangbin@gmail.com
There exists many build errors when make M=samples/bpf on the Loongson
platform. This issue is MIPS related, x86 compiles just fine.
Here are some errors:
CLANG-bpf samples/bpf/sockex2_kern.o
In file included from samples/bpf/sockex2_kern.c:2:
In file included from ./include/uapi/linux/in.h:24:
In file included from ./include/linux/socket.h:8:
In file included from ./include/linux/uio.h:8:
In file included from ./include/linux/kernel.h:11:
In file included from ./include/linux/bitops.h:32:
In file included from ./arch/mips/include/asm/bitops.h:19:
In file included from ./arch/mips/include/asm/barrier.h:11:
./arch/mips/include/asm/addrspace.h:13:10: fatal error: 'spaces.h' file not found
^~~~~~~~~~
1 error generated.
CLANG-bpf samples/bpf/sockex2_kern.o
In file included from samples/bpf/sockex2_kern.c:2:
In file included from ./include/uapi/linux/in.h:24:
In file included from ./include/linux/socket.h:8:
In file included from ./include/linux/uio.h:8:
In file included from ./include/linux/kernel.h:11:
In file included from ./include/linux/bitops.h:32:
In file included from ./arch/mips/include/asm/bitops.h:22:
In file included from ./arch/mips/include/asm/cpu-features.h:13:
In file included from ./arch/mips/include/asm/cpu-info.h:15:
In file included from ./include/linux/cache.h:6:
./arch/mips/include/asm/cache.h:12:10: fatal error: 'kmalloc.h' file not found
^~~~~~~~~~~
1 error generated.
CLANG-bpf samples/bpf/sockex2_kern.o
In file included from samples/bpf/sockex2_kern.c:2:
In file included from ./include/uapi/linux/in.h:24:
In file included from ./include/linux/socket.h:8:
In file included from ./include/linux/uio.h:8:
In file included from ./include/linux/kernel.h:11:
In file included from ./include/linux/bitops.h:32:
In file included from ./arch/mips/include/asm/bitops.h:22:
./arch/mips/include/asm/cpu-features.h:15:10: fatal error: 'cpu-feature-overrides.h' file not found
^~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
$ find arch/mips/include/asm -name spaces.h | sort
arch/mips/include/asm/mach-ar7/spaces.h
...
arch/mips/include/asm/mach-generic/spaces.h
...
arch/mips/include/asm/mach-loongson64/spaces.h
...
arch/mips/include/asm/mach-tx49xx/spaces.h
$ find arch/mips/include/asm -name kmalloc.h | sort
arch/mips/include/asm/mach-generic/kmalloc.h
arch/mips/include/asm/mach-ip32/kmalloc.h
arch/mips/include/asm/mach-tx49xx/kmalloc.h
$ find arch/mips/include/asm -name cpu-feature-overrides.h | sort
arch/mips/include/asm/mach-ath25/cpu-feature-overrides.h
...
arch/mips/include/asm/mach-generic/cpu-feature-overrides.h
...
arch/mips/include/asm/mach-loongson64/cpu-feature-overrides.h
...
arch/mips/include/asm/mach-tx49xx/cpu-feature-overrides.h
In the arch/mips/Makefile, there exists the following board-dependent
options:
include arch/mips/Kbuild.platforms
cflags-y += -I$(srctree)/arch/mips/include/asm/mach-generic
So we can do the similar things in samples/bpf/Makefile, just add
platform specific and generic include dir for MIPS Loongson64 to
fix the build errors.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1611669925-25315-1-git-send-email-yangtiezhu@loongson.cn
There exists many build warnings when make M=samples/bpf on the Loongson
platform, this issue is MIPS related, x86 compiles just fine.
Here are some warnings:
CC samples/bpf/ibumad_user.o
samples/bpf/ibumad_user.c: In function ‘dump_counts’:
samples/bpf/ibumad_user.c:46:24: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
printf("0x%02x : %llu\n", key, value);
~~~^ ~~~~~
%lu
CC samples/bpf/offwaketime_user.o
samples/bpf/offwaketime_user.c: In function ‘print_ksym’:
samples/bpf/offwaketime_user.c:34:17: warning: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
printf("%s/%llx;", sym->name, addr);
~~~^ ~~~~
%lx
samples/bpf/offwaketime_user.c: In function ‘print_stack’:
samples/bpf/offwaketime_user.c:68:17: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
printf(";%s %lld\n", key->waker, count);
~~~^ ~~~~~
%ld
MIPS needs __SANE_USERSPACE_TYPES__ before <linux/types.h> to select
'int-ll64.h' in arch/mips/include/uapi/asm/types.h, then it can avoid
build warnings when printing __u64 with %llu, %llx or %lld.
The header tools/include/linux/types.h defines __SANE_USERSPACE_TYPES__,
it seems that we can include <linux/types.h> in the source files which
have build warnings, but it has no effect due to actually it includes
usr/include/linux/types.h instead of tools/include/linux/types.h, the
problem is that "usr/include" is preferred first than "tools/include"
in samples/bpf/Makefile, that sounds like a ugly hack to -Itools/include
before -Iusr/include.
So define __SANE_USERSPACE_TYPES__ for MIPS in samples/bpf/Makefile
is proper, if add "TPROGS_CFLAGS += -D__SANE_USERSPACE_TYPES__" in
samples/bpf/Makefile, it appears the following error:
Auto-detecting system features:
... libelf: [ on ]
... zlib: [ on ]
... bpf: [ OFF ]
BPF API too old
make[3]: *** [Makefile:293: bpfdep] Error 1
make[2]: *** [Makefile:156: all] Error 2
With #ifndef __SANE_USERSPACE_TYPES__ in tools/include/linux/types.h,
the above error has gone and this ifndef change does not hurt other
compilations.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/1611551146-14052-1-git-send-email-yangtiezhu@loongson.cn
The current LLVM and Clang build procedure in samples/bpf/README.rst is
out of date. See below that the links are not accessible any more.
$ git clone http://llvm.org/git/llvm.git
Cloning into 'llvm'...
fatal: unable to access 'http://llvm.org/git/llvm.git/': Maximum (20) redirects followed
$ git clone --depth 1 http://llvm.org/git/clang.git
Cloning into 'clang'...
fatal: unable to access 'http://llvm.org/git/clang.git/': Maximum (20) redirects followed
The LLVM community has adopted new ways to build the compiler. There are
different ways to build LLVM and Clang, the Clang Getting Started page [1]
has one way. As Yonghong said, it is better to copy the build procedure
in Documentation/bpf/bpf_devel_QA.rst to keep consistent.
I verified the procedure and it is proved to be feasible, so we should
update README.rst to reflect the reality. At the same time, update the
related comment in Makefile.
Additionally, as Fangrui said, the dir llvm-project/llvm/build/install is
not used, BUILD_SHARED_LIBS=OFF is the default option [2], so also change
Documentation/bpf/bpf_devel_QA.rst together.
At last, we recommend that developers who want the fastest incremental
builds use the Ninja build system [1], you can find it in your system's
package manager, usually the package is ninja or ninja-build [3], so add
ninja to build dependencies suggested by Nathan.
[1] https://clang.llvm.org/get_started.html
[2] https://www.llvm.org/docs/CMake.html
[3] https://github.com/ninja-build/ninja/wiki/Pre-built-Ninja-packages
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/bpf/1611279584-26047-1-git-send-email-yangtiezhu@loongson.cn
Introduce a sample program to demonstrate the control and data
plane split. For the control plane part a new program called
xdpsock_ctrl_proc is introduced. For the data plane part, some code
was added to xdpsock_user.c to act as the data plane entity.
Application xdpsock_ctrl_proc works as control entity with sudo
privileges (CAP_SYS_ADMIN and CAP_NET_ADMIN are sufficient) and the
extended xdpsock as data plane entity with CAP_NET_RAW capability
only.
Usage example:
sudo ./samples/bpf/xdpsock_ctrl_proc -i <interface>
sudo ./samples/bpf/xdpsock -i <interface> -q <queue_id>
-n <interval> -N -l -R
Signed-off-by: Mariusz Dudek <mariuszx.dudek@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20201203090546.11976-3-mariuszx.dudek@intel.com
Numerous refactoring that rewrites BPF programs written with bpf_load
to use the libbpf loader was finally completed, resulting in BPF
programs using bpf_load within the kernel being completely no longer
present.
This commit removes bpf_load, an outdated bpf loader that is difficult
to keep up with the latest kernel BPF and causes confusion.
Also, this commit removes the unused trace_helper and bpf_load from
samples/bpf target objects from Makefile.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-8-danieltimlee@gmail.com
This commit refactors the existing program with libbpf bpf loader.
Since the kprobe, tracepoint and raw_tracepoint bpf program can be
attached with single bpf_program__attach() interface, so the
corresponding function of libbpf is used here.
Rather than specifying the number of cpus inside the code, this commit
uses the number of available cpus with _SC_NPROCESSORS_ONLN.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-6-danieltimlee@gmail.com
This commit refactors the existing ibumad program with libbpf bpf
loader. Attach/detach of Tracepoint bpf programs has been managed
with the generic bpf_program__attach() and bpf_link__destroy() from
the libbpf.
Also, instead of using the previous BPF MAP definition, this commit
refactors ibumad MAP definition with the new BTF-defined MAP format.
To verify that this bpf program works without an infiniband device,
try loading ib_umad kernel module and test the program as follows:
# modprobe ib_umad
# ./ibumad
Moreover, TRACE_HELPERS has been removed from the Makefile since it is
not used on this program.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-5-danieltimlee@gmail.com
This commit refactors the existing kprobe program with libbpf bpf
loader. To attach bpf program, this uses generic bpf_program__attach()
approach rather than using bpf_load's load_bpf_file().
To attach bpf to perf_event, instead of using previous ioctl method,
this commit uses bpf_program__attach_perf_event since it manages the
enable of perf_event and attach of BPF programs to it, which is much
more intuitive way to achieve.
Also, explicit close(fd) has been removed since event will be closed
inside bpf_link__destroy() automatically.
Furthermore, to prevent conflict of same named uprobe events, O_TRUNC
flag has been used to clear 'uprobe_events' interface.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-4-danieltimlee@gmail.com
This commit refactors the existing cgroup program with libbpf bpf
loader. The original test_cgrp2_sock2 has keeped the bpf program
attached to the cgroup hierarchy even after the exit of user program.
To implement the same functionality with libbpf, this commit uses the
BPF_LINK_PINNING to pin the link attachment even after it is closed.
Since this uses LINK instead of ATTACH, detach of bpf program from
cgroup with 'test_cgrp2_sock' is not used anymore.
The code to mount the bpf was added to the .sh file in case the bpff
was not mounted on /sys/fs/bpf. Additionally, to fix the problem that
shell script cannot find the binary object from the current path,
relative path './' has been added in front of binary.
Fixes: 554ae6e792 ("samples/bpf: add userspace example for prohibiting sockets")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-3-danieltimlee@gmail.com
This commit refactors the existing cgroup programs with libbpf
bpf loader. Since bpf_program__attach doesn't support cgroup program
attachment, this explicitly attaches cgroup bpf program with
bpf_program__attach_cgroup(bpf_prog, cg1).
Also, to change attach_type of bpf program, this uses libbpf's
bpf_program__set_expected_attach_type helper to switch EGRESS to
INGRESS. To keep bpf program attached to the cgroup hierarchy even
after the exit, this commit uses the BPF_LINK_PINNING to pin the link
attachment even after it is closed.
Besides, this program was broken due to the typo of BPF MAP definition.
But this commit solves the problem by fixing this from 'queue_stats' map
struct hvm_queue_stats -> hbm_queue_stats.
Fixes: 36b5d47113 ("selftests/bpf: samples/bpf: Split off legacy stuff from bpf_helpers.h")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-2-danieltimlee@gmail.com
>From commit d7a18ea7e8 ("libbpf: Add generic bpf_program__attach()"),
for some BPF programs, it is now possible to attach BPF programs
with __attach() instead of explicitly calling __attach_<type>().
This commit refactors the __attach_tracepoint() with libbpf's generic
__attach() method. In addition, this refactors the logic of setting
the map FD to simplify the code. Also, the missing removal of
bpf_load.o in Makefile has been fixed.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201010181734.1109-3-danieltimlee@gmail.com
To avoid confusion caused by the increasing fragmentation of the BPF
Loader program, this commit would like to change to the libbpf loader
instead of using the bpf_load.
Thanks to libbpf's bpf_link interface, managing the tracepoint BPF
program is much easier. bpf_program__attach_tracepoint manages the
enable of tracepoint event and attach of BPF programs to it with a
single interface bpf_link, so there is no need to manage event_fd and
prog_fd separately.
This commit refactors xdp_monitor with using this libbpf API, and the
bpf_load is removed and migrated to libbpf.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201010181734.1109-2-danieltimlee@gmail.com
With latest llvm trunk, bpf programs under samples/bpf
directory, if using CORE, may experience the following
errors:
LLVM ERROR: Cannot select: intrinsic %llvm.preserve.struct.access.index
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: llc -march=bpf -filetype=obj -o samples/bpf/test_probe_write_user_kern.o
1. Running pass 'Function Pass Manager' on module '<stdin>'.
2. Running pass 'BPF DAG->DAG Pattern Instruction Selection' on function '@bpf_prog1'
#0 0x000000000183c26c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int)
(/data/users/yhs/work/llvm-project/llvm/build.cur/install/bin/llc+0x183c26c)
...
#7 0x00000000017c375e (/data/users/yhs/work/llvm-project/llvm/build.cur/install/bin/llc+0x17c375e)
#8 0x00000000016a75c5 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*)
(/data/users/yhs/work/llvm-project/llvm/build.cur/install/bin/llc+0x16a75c5)
#9 0x00000000016ab4f8 llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*,
unsigned int) (/data/users/yhs/work/llvm-project/llvm/build.cur/install/bin/llc+0x16ab4f8)
...
Aborted (core dumped) | llc -march=bpf -filetype=obj -o samples/bpf/test_probe_write_user_kern.o
The reason is due to llvm change https://reviews.llvm.org/D87153
where the CORE relocation global generation is moved from the beginning
of target dependent optimization (llc) to the beginning
of target independent optimization (opt).
Since samples/bpf programs did not use vmlinux.h and its clang compilation
uses native architecture, we need to adjust arch triple at opt level
to do CORE relocation global generation properly. Otherwise, the above
error will appear.
This patch fixed the issue by introduce opt and llvm-dis to compilation chain,
which will do proper CORE relocation global generation as well as O2 level
optimization. Tested with llvm10, llvm11 and trunk/llvm12.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20201006043427.1891742-1-yhs@fb.com
This sample code illustrates the packet forwarding between multiple
AF_XDP sockets in multi-threading environment. All the threads and
sockets are sharing a common buffer pool, with each socket having
its own private buffer cache. The sockets are created with the
xsk_socket__create_shared() function, which allows multiple AF_XDP
sockets to share the same UMEM object.
Example 1: Single thread handling two sockets. Packets received
from socket A (on top of interface IFA, queue QA) are forwarded
to socket B (on top of interface IFB, queue QB) and vice-versa.
The thread is affinitized to CPU core C:
./xsk_fwd -i IFA -q QA -i IFB -q QB -c C
Example 2: Two threads, each handling two sockets. Packets from
socket A are sent to socket B (by thread X), packets
from socket B are sent to socket A (by thread X); packets from
socket C are sent to socket D (by thread Y), packets from socket
D are sent to socket C (by thread Y). The two threads are bound
to CPU cores CX and CY:
./xdp_fwd -i IFA -q QA -i IFB -q QB -i IFC -q QC -i IFD -q QD -c CX -c CY
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/1598603189-32145-15-git-send-email-magnus.karlsson@intel.com
For the problem of increasing fragmentation of the bpf loader programs,
instead of using bpf_loader.o, which is used in samples/bpf, this
commit refactors the existing tracepoint tracing programs with libbbpf
bpf loader.
- Adding a tracepoint event and attaching a bpf program to it was done
through bpf_program_attach().
- Instead of using the existing BPF MAP definition, MAP definition
has been refactored with the new BTF-defined MAP format.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-4-danieltimlee@gmail.com
For the problem of increasing fragmentation of the bpf loader programs,
instead of using bpf_loader.o, which is used in samples/bpf, this
commit refactors the existing kprobe tracing programs with libbbpf
bpf loader.
- For kprobe events pointing to system calls, the SYSCALL() macro in
trace_common.h was used.
- Adding a kprobe event and attaching a bpf program to it was done
through bpf_program_attach().
- Instead of using the existing BPF MAP definition, MAP definition
has been refactored with the new BTF-defined MAP format.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-3-danieltimlee@gmail.com
Since commit cc7f641d63 ("samples: bpf: Refactor BPF map performance
test with libbpf") has ommited the removal of bpf_load.o from Makefile,
this commit removes the bpf_load.o rule for targets where bpf_load.o is
not used.
Fixes: cc7f641d63 ("samples: bpf: Refactor BPF map performance test with libbpf")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-2-danieltimlee@gmail.com
From commit 646f02ffdd ("libbpf: Add BTF-defined map-in-map
support"), a way to define internal map in BTF-defined map has been
added.
Instead of using previous 'inner_map_idx' definition, the structure to
be used for the inner map can be directly defined using array directive.
__array(values, struct inner_map)
This commit refactors map in map test program with libbpf by explicitly
defining inner map with BTF-defined format.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200707184855.30968-3-danieltimlee@gmail.com
BPF tail call uses the BPF_MAP_TYPE_PROG_ARRAY type map for calling
into other BPF programs and this PROG_ARRAY should be filled prior to
use. Currently, samples with the PROG_ARRAY type MAP fill this program
array with bpf_load. For bpf_load to fill this map, kernel BPF program
must specify the section with specific format of <prog_type>/<array_idx>
(e.g. SEC("socket/0"))
But by using libbpf instead of bpf_load, user program can specify which
programs should be added to PROG_ARRAY. The advantage of this approach
is that you can selectively add only the programs you want, rather than
adding all of them to PROG_ARRAY, and it's much more intuitive than the
traditional approach.
This commit refactors user programs with the PROG_ARRAY type MAP with
libbpf instead of using bpf_load.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200516040608.1377876-4-danieltimlee@gmail.com
Currently, the kprobe BPF program attachment method for bpf_load is
quite old. The implementation of bpf_load "directly" controls and
manages(create, delete) the kprobe events of DEBUGFS. On the other hand,
using using the libbpf automatically manages the kprobe event.
(under bpf_link interface)
By calling bpf_program__attach(_kprobe) in libbpf, the corresponding
kprobe is created and the BPF program will be attached to this kprobe.
To remove this, by simply invoking bpf_link__destroy will clean up the
event.
This commit refactors kprobe tracing programs (tracex{1~7}_user.c) with
libbpf using bpf_link interface and bpf_program__attach.
tracex2_kern.c, which tracks system calls (sys_*), has been modified to
append prefix depending on architecture.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200516040608.1377876-3-danieltimlee@gmail.com
The bpf_program__attach of libbpf(using bpf_link) is much more intuitive
than the previous method using ioctl.
bpf_program__attach_perf_event manages the enable of perf_event and
attach of BPF programs to it, so there's no neeed to do this
directly with ioctl.
In addition, bpf_link provides consistency in the use of API because it
allows disable (detach, destroy) for multiple events to be treated as
one bpf_link__destroy. Also, bpf_link__destroy manages the close() of
perf_event fd.
This commit refactors samples that attach the bpf program to perf_event
by using libbbpf instead of ioctl. Also the bpf_load in the samples were
removed and migrated to use libbbpf API.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200321100424.1593964-3-danieltimlee@gmail.com
To reduce the reliance of trace samples (trace*_user) on bpf_load,
move read_trace_pipe to trace_helpers. By moving this bpf_loader helper
elsewhere, trace functions can be easily migrated to libbbpf.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200321100424.1593964-2-danieltimlee@gmail.com
In old days, the "host-progs" syntax was used for specifying host
programs. It was renamed to the current "hostprogs-y" in 2004.
It is typically useful in scripts/Makefile because it allows Kbuild to
selectively compile host programs based on the kernel configuration.
This commit renames like follows:
always -> always-y
hostprogs-y -> hostprogs
So, scripts/Makefile will look like this:
always-$(CONFIG_BUILD_BIN2C) += ...
always-$(CONFIG_KALLSYMS) += ...
...
hostprogs := $(always-y) $(always-m)
I think this makes more sense because a host program is always a host
program, irrespective of the kernel configuration. We want to specify
which ones to compile by CONFIG options, so always-y will be handier.
The "always", "hostprogs-y", "hostprogs-m" will be kept for backward
compatibility for a while.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Fix all files in samples/bpf to include libbpf header files with the bpf/
prefix, to be consistent with external users of the library. Also ensure
that all includes of exported libbpf header files (those that are exported
on 'make install' of the library) use bracketed includes instead of quoted.
To make sure no new files are introduced that doesn't include the bpf/
prefix in its include, remove tools/lib/bpf from the include path entirely,
and use tools/lib instead.
Fixes: 6910d7d386 ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157952560911.1683545.8795966751309534150.stgit@toke.dk
The 'clean' rule in the samples/bpf Makefile tries to remove backup
files (ending in ~). However, if no such files exist, it will instead try
to remove the user's home directory. While the attempt is mostly harmless,
it does lead to a somewhat scary warning like this:
rm: cannot remove '~': Is a directory
Fix this by using find instead of shell expansion to locate any actual
backup files that need to be removed.
Fixes: b62a796c10 ("samples/bpf: allow make to be run from samples/bpf/ directory")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/157952560126.1683545.7273054725976032511.stgit@toke.dk
It seems Clang can in some cases turn on stack protection by default, which
doesn't work with BPF. This was reported once before[0], but it seems the
flag to explicitly turn off the stack protector wasn't added to the
Makefile, so do that now.
The symptom of this is compile errors like the following:
error: <unknown>:0:0: in function bpf_prog1 i32 (%struct.__sk_buff*): A call to built-in function '__stack_chk_fail' is not supported.
[0] https://www.spinics.net/lists/netdev/msg556400.html
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191216103819.359535-1-toke@redhat.com
Since libbpf now links against zlib, this needs to be included in the
linker invocation for the userspace programs in samples/bpf that link
statically against libbpf.
Fixes: 166750bc1d ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20191216102405.353834-1-toke@redhat.com
Add xdp_redirect and per_socket_stats_example in build targets.
They got removed from build targets in Makefile reorganization.
Fixes: 1d97c6c251 ("samples/bpf: Base target programs rules on Makefile.target")
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191216071619.25479-1-prashantbhole.linux@gmail.com
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-11-20
The following pull-request contains BPF updates for your *net-next* tree.
We've added 81 non-merge commits during the last 17 day(s) which contain
a total of 120 files changed, 4958 insertions(+), 1081 deletions(-).
There are 3 trivial conflicts, resolve it by always taking the chunk from
196e8ca748:
<<<<<<< HEAD
=======
void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
>>>>>>> 196e8ca748
<<<<<<< HEAD
void *bpf_map_area_alloc(u64 size, int numa_node)
=======
static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
>>>>>>> 196e8ca748
<<<<<<< HEAD
if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
=======
/* kmalloc()'ed memory can't be mmap()'ed */
if (!mmapable && size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
>>>>>>> 196e8ca748
The main changes are:
1) Addition of BPF trampoline which works as a bridge between kernel functions,
BPF programs and other BPF programs along with two new use cases: i) fentry/fexit
BPF programs for tracing with practically zero overhead to call into BPF (as
opposed to k[ret]probes) and ii) attachment of the former to networking related
programs to see input/output of networking programs (covering xdpdump use case),
from Alexei Starovoitov.
2) BPF array map mmap support and use in libbpf for global data maps; also a big
batch of libbpf improvements, among others, support for reading bitfields in a
relocatable manner (via libbpf's CO-RE helper API), from Andrii Nakryiko.
3) Extend s390x JIT with usage of relative long jumps and loads in order to lift
the current 64/512k size limits on JITed BPF programs there, from Ilya Leoshkevich.
4) Add BPF audit support and emit messages upon successful prog load and unload in
order to have a timeline of events, from Daniel Borkmann and Jiri Olsa.
5) Extension to libbpf and xdpsock sample programs to demo the shared umem mode
(XDP_SHARED_UMEM) as well as RX-only and TX-only sockets, from Magnus Karlsson.
6) Several follow-up bug fixes for libbpf's auto-pinning code and a new API
call named bpf_get_link_xdp_info() for retrieving the full set of prog
IDs attached to XDP, from Toke Høiland-Jørgensen.
7) Add BTF support for array of int, array of struct and multidimensional arrays
and enable it for skb->cb[] access in kfree_skb test, from Martin KaFai Lau.
8) Fix AF_XDP by using the correct number of channels from ethtool, from Luigi Rizzo.
9) Two fixes for BPF selftest to get rid of a hang in test_tc_tunnel and to avoid
xdping to be run as standalone, from Jiri Benc.
10) Various BPF selftest fixes when run with latest LLVM trunk, from Yonghong Song.
11) Fix a memory leak in BPF fentry test run data, from Colin Ian King.
12) Various smaller misc cleanups and improvements mostly all over BPF selftests and
samples, from Daniel T. Lee, Andre Guedes, Anders Roxell, Mao Wenan, Yue Haibing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Side effect of some kbuild changes resulted in breaking the
documented way to build samples/bpf/.
This patch change the samples/bpf/Makefile to work again, when
invoking make from the subdir samples/bpf/. Also update the
documentation in README.rst, to reflect the new way to build.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the XDP_SHARED_UMEM mode to the xdpsock sample
application. As libbpf does not have a built in XDP program for this
mode, we use an explicitly loaded XDP program. This also serves as an
example on how to write your own XDP program that can route to an
AF_XDP socket.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1573148860-30254-3-git-send-email-magnus.karlsson@intel.com
One conflict in the BPF samples Makefile, some fixes in 'net' whilst
we were converting over to Makefile.target rules in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
To remove that test_attr__{enabled/open} are used by perf-sys.h, we
set HAVE_ATTR_TEST to zero.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: KP Singh <kpsingh@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: http://lore.kernel.org/bpf/20191001113307.27796-3-bjorn.topel@gmail.com
Basically it only enables that was added by previous couple fixes.
Sysroot contains correct libs installed and its headers. Useful when
working with NFC or virtual machine.
Usage example:
clean (on demand)
make ARCH=arm -C samples/bpf clean
make ARCH=arm -C tools clean
make ARCH=arm clean
configure and install headers:
make ARCH=arm defconfig
make ARCH=arm headers_install
build samples/bpf:
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- samples/bpf/ \
SYSROOT="path/to/sysroot"
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191011002808.28206-15-ivan.khoronzhuk@linaro.org
No need in hacking HOSTCC to be cross-compiler any more, so drop
this trick and use target CC for HDR_PROBE.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-11-ivan.khoronzhuk@linaro.org
While compiling natively, the host's cflags and ldflags are equal to
ones used from HOSTCFLAGS and HOSTLDFLAGS. When cross compiling it
should have own, used for target arch. While verification, for arm,
arm64 and x86_64 the following flags were used always:
-Wall -O2
-fomit-frame-pointer
-Wmissing-prototypes
-Wstrict-prototypes
So, add them as they were verified and used before adding
Makefile.target and lets omit "-fomit-frame-pointer" as were proposed
while review, as no sense in such optimization for samples.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191011002808.28206-10-ivan.khoronzhuk@linaro.org
The main reason for that - HOSTCC and CC have different aims.
HOSTCC is used to build programs running on host, that can
cross-comple target programs with CC. It was tested for arm and arm64
cross compilation, based on linaro toolchain, but should work for
others.
So, in order to split cross compilation (CC) with host build (HOSTCC),
lets base samples on Makefile.target. It allows to cross-compile
samples/bpf programs with CC while auxialry tools running on host
built with HOSTCC.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191011002808.28206-9-ivan.khoronzhuk@linaro.org
Drop inclusion for bpf_load -I$(objtree)/usr/include as it is
included for all objects anyway, with above line:
KBUILD_HOSTCFLAGS += -I$(objtree)/usr/include
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-7-ivan.khoronzhuk@linaro.org
For arm, -D__LINUX_ARM_ARCH__=X is min version used as instruction
set selector and is absolutely required while parsing some parts of
headers. It's present in KBUILD_CFLAGS but not in autoconf.h, so let's
retrieve it from and add to programs cflags. In another case errors
like "SMP is not supported" for armv7 and bunch of other errors are
issued resulting to incorrect final object.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191011002808.28206-6-ivan.khoronzhuk@linaro.org
It can overlap with CFLAGS used for libraries built with gcc if
not now then in next patches. Correct it here for simplicity.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-5-ivan.khoronzhuk@linaro.org
For cross compiling the target triple can be inherited from
cross-compile prefix as it's done in CLANG_FLAGS from kernel makefile.
So copy-paste this decision from kernel Makefile.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-4-ivan.khoronzhuk@linaro.org
Don't list userspace "cookie_uid_helper_example" object in list for
bpf objects.
'always' target is used for listing bpf programs, but
'cookie_uid_helper_example.o' is a user space ELF file, and covered
by rule `per_socket_stats_example`, so shouldn't be in 'always'.
Let us remove `always += cookie_uid_helper_example.o`, which avoids
breaking cross compilation due to mismatched includes.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-3-ivan.khoronzhuk@linaro.org
echo should be replaced with echo -e to handle '\n' correctly, but
instead, replace it with printf as some systems can't handle echo -e.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-2-ivan.khoronzhuk@linaro.org
Move bpf_helpers.h, bpf_tracing.h, and bpf_endian.h into libbpf. Move
bpf_helper_defs.h generation into libbpf's Makefile. Ensure all those
headers are installed along the other libbpf headers. Also, adjust
selftests and samples include path to include libbpf now.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191008175942.1769476-6-andriin@fb.com
While $ARCH can be relatively flexible (see Makefile and
tools/scripts/Makefile.arch), $SRCARCH always corresponds to a directory
name under arch/.
Therefore, build samples with -D__TARGET_ARCH_$(SRCARCH), since that
matches the expectations of bpf_helpers.h.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Adds support for fq's Earliest Departure Time to HBM (Host Bandwidth
Manager). Includes a new BPF program supporting EDT, and also updates
corresponding programs.
It will drop packets with an EDT of more than 500us in the future
unless the packet belongs to a flow with less than 2 packets in flight.
This is done so each flow has at least 2 packets in flight, so they
will not starve, and also to help prevent delayed ACK timeouts.
It will also work with ECN enabled traffic, where the packets will be
CE marked if their EDT is more than 50us in the future.
The table below shows some performance numbers. The flows are back to
back RPCS. One server sending to another, either 2 or 4 flows.
One flow is a 10KB RPC, the rest are 1MB RPCs. When there are more
than one flow of a given RPC size, the numbers represent averages.
The rate limit applies to all flows (they are in the same cgroup).
Tests ending with "-edt" ran with the new BPF program supporting EDT.
Tests ending with "-hbt" ran on top HBT qdisc with the specified rate
(i.e. no HBM). The other tests ran with the HBM BPF program included
in the HBM patch-set.
EDT has limited value when using DCTCP, but it helps in many cases when
using Cubic. It usually achieves larger link utilization and lower
99% latencies for the 1MB RPCs.
HBM ends up queueing a lot of packets with its default parameter values,
reducing the goodput of the 10KB RPCs and increasing their latency. Also,
the RTTs seen by the flows are quite large.
Aggr 10K 10K 10K 1MB 1MB 1MB
Limit rate drops RTT rate P90 P99 rate P90 P99
Test rate Flows Mbps % us Mbps us us Mbps ms ms
-------- ---- ----- ---- ----- --- ---- ---- ---- ---- ---- ----
cubic 1G 2 904 0.02 108 257 511 539 647 13.4 24.5
cubic-edt 1G 2 982 0.01 156 239 656 967 743 14.0 17.2
dctcp 1G 2 977 0.00 105 324 408 744 653 14.5 15.9
dctcp-edt 1G 2 981 0.01 142 321 417 811 660 15.7 17.0
cubic-htb 1G 2 919 0.00 1825 40 2822 4140 879 9.7 9.9
cubic 200M 2 155 0.30 220 81 532 655 74 283 450
cubic-edt 200M 2 188 0.02 222 87 1035 1095 101 84 85
dctcp 200M 2 188 0.03 111 77 912 939 111 76 325
dctcp-edt 200M 2 188 0.03 217 74 1416 1738 114 76 79
cubic-htb 200M 2 188 0.00 5015 8 14ms 15ms 180 48 50
cubic 1G 4 952 0.03 110 165 516 546 262 38 154
cubic-edt 1G 4 973 0.01 190 111 1034 1314 287 65 79
dctcp 1G 4 951 0.00 103 180 617 905 257 37 38
dctcp-edt 1G 4 967 0.00 163 151 732 1126 272 43 55
cubic-htb 1G 4 914 0.00 3249 13 7ms 8ms 300 29 34
cubic 5G 4 4236 0.00 134 305 490 624 1310 10 17
cubic-edt 5G 4 4865 0.00 156 306 425 759 1520 10 16
dctcp 5G 4 4936 0.00 128 485 221 409 1484 7 9
dctcp-edt 5G 4 4924 0.00 148 390 392 623 1508 11 26
v1 -> v2: Incorporated Andrii's suggestions
v2 -> v3: Incorporated Yonghong's suggestions
v3 -> v4: Removed credit update that is not needed
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Due to recent change of include path at commit b552d33c80
("samples/bpf: fix include path in Makefile"), some of the
previous include options became unnecessary.
This commit removes duplicated include options in Makefile.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Recent commit included libbpf.h in selftests/bpf/bpf_util.h.
Since some samples use bpf_util.h and samples/bpf/Makefile doesn't
have libbpf.h path included, build was failing. Let's add the path
in samples/bpf/Makefile.
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Quentin reports that commit 07c3bbdb1a ("samples: bpf: print
a warning about headers_install") is producing the false
positive when make is invoked locally, from the samples/bpf/
directory.
When make is run locally it hits the "all" target, which
will recursively invoke make through the full build system.
Speed up the "local" run which doesn't actually build anything,
and avoid false positives by skipping all the probes if not in
kbuild environment (cover both the new warning and the BTF
probes).
Reported-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
It seems like periodically someone posts patches to "fix"
header includes. The issue is that samples expect the
include path to have the uAPI headers (from usr/) first,
and then tools/ headers, so that locally installed uAPI
headers take precedence. This means that if users didn't
run headers_install they will see all sort of strange
compilation errors, e.g.:
HOSTCC samples/bpf/test_lru_dist
samples/bpf/test_lru_dist.c:39:8: error: redefinition of ‘struct list_head’
struct list_head {
^~~~~~~~~
In file included from samples/bpf/test_lru_dist.c:9:0:
../tools/include/linux/types.h:69:8: note: originally defined here
struct list_head {
^~~~~~~~~
Try to detect this situation, and print a helpful warning.
v2: just use HOSTCC (Jiong).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Convert test_cgrp2_attach2 example into a proper test_cgroup_attach
kselftest. It's better because we do run kselftest on a constant
basis, so there are better chances to spot a potential regression.
Also make it slightly less verbose to conform kselftests output style.
Output example:
$ ./test_cgroup_attach
#override:PASS
#multi:PASS
test_cgroup_attach:PASS
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Provide a count of class types for a summary of MAD packets. The example
shows one way to filter the trace data based on management class.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The program nrm creates a cgroup and attaches a BPF program to the
cgroup for testing HBM (Host Bandwidth Manager) for egress traffic.
One still needs to create network traffic. This can be done through
netesto, netperf or iperf3.
A follow-up patch contains a script to create traffic.
USAGE: hbm [-d] [-l] [-n <id>] [-r <rate>] [-s] [-t <secs>]
[-w] [-h] [prog]
Where:
-d Print BPF trace debug buffer
-l Also limit flows doing loopback
-n <#> To create cgroup "/hbm#" and attach prog. Default is /nrm1
This is convenient when testing HBM in more than 1 cgroup
-r <rate> Rate limit in Mbps
-s Get HBM stats (marked, dropped, etc.)
-t <time> Exit after specified seconds (deault is 0)
-w Work conserving flag. cgroup can increase its bandwidth
beyond the rate limit specified while there is available
bandwidth. Current implementation assumes there is only
NIC (eth0), but can be extended to support multiple NICs.
Currrently only supported for egress. Note, this is just
a proof of concept.
-h Print this info
prog BPF program file name. Name defaults to hbm_out_kern.o
More information about HBM can be found in the paper "BPF Host Resource
Management" presented at the 2018 Linux Plumbers Conference, Networking Track
(http://vger.kernel.org/lpc_net2018_talks/LPC%20BPF%20Network%20Resource%20Paper.pdf)
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
A cgroup skb BPF program to limit cgroup output bandwidth.
It uses a modified virtual token bucket queue to limit average
egress bandwidth. The implementation uses credits instead of tokens.
Negative credits imply that queueing would have happened (this is
a virtual queue, so no queueing is done by it. However, queueing may
occur at the actual qdisc (which is not used for rate limiting).
This implementation uses 3 thresholds, one to start marking packets and
the other two to drop packets:
CREDIT
- <--------------------------|------------------------> +
| | | 0
| Large pkt |
| drop thresh |
Small pkt drop Mark threshold
thresh
The effect of marking depends on the type of packet:
a) If the packet is ECN enabled, then the packet is ECN ce marked.
The current mark threshold is tuned for DCTCP.
c) Else, it is dropped if it is a large packet.
If the credit is below the drop threshold, the packet is dropped.
Note that dropping a packet through the BPF program does not trigger CWR
(Congestion Window Reduction) in TCP packets. A future patch will add
support for triggering CWR.
This BPF program actually uses 2 drop thresholds, one threshold
for larger packets (>= 120 bytes) and another for smaller packets. This
protects smaller packets such as SYNs, ACKs, etc.
The default bandwidth limit is set at 1Gbps but this can be changed by
a user program through a shared BPF map. In addition, by default this BPF
program does not limit connections using loopback. This behavior can be
overwritten by the user program. There is also an option to calculate
some statistics, such as percent of packets marked or dropped, which
the user program can access.
A latter patch provides such a program (hbm.c)
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Some samples don't really need the magic of bpf_load,
switch them to libbpf.
v2: - specify program types.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
bpftool can do all the things load_sock_ops used to do, and more.
Point users to bpftool instead of maintaining this sample utility.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This commit converts the xdpsock sample application to use the AF_XDP
functions present in libbpf. This cuts down the size of it by nearly
300 lines of code.
The default ring sizes plus the batch size has been increased and the
size of the umem area has decreased. This so that the sample application
will provide higher throughput. Note also that the shared umem code
has been removed from the sample as this is not supported by libbpf
at this point in time.
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Some of XDP samples that are attaching the bpf program to the interface
via libbpf's bpf_set_link_xdp_fd are still using the bpf_load.c for
loading and manipulating the ebpf program and maps. Convert them to do
this through libbpf usage and remove bpf_load from the picture.
While at it remove what looks like debug leftover in
xdp_redirect_map_user.c
In xdp_redirect_cpu, change the way that the program to be loaded onto
interface is chosen - user now needs to pass the program's section name
instead of the relative number. In case of typo print out the section
names to choose from.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
x86 compilation has required asm goto support since 4.17.
Since clang does not support asm goto, at 4.17,
Commit b1ae32dbab ("x86/cpufeature: Guard asm_volatile_goto usage
for BPF compilation") worked around the issue by permitting an
alternative implementation without asm goto for clang.
At 5.0, more asm goto usages appeared.
[yhs@148 x86]$ egrep -r asm_volatile_goto
include/asm/cpufeature.h: asm_volatile_goto("1: jmp 6f\n"
include/asm/jump_label.h: asm_volatile_goto("1:"
include/asm/jump_label.h: asm_volatile_goto("1:"
include/asm/rmwcc.h: asm_volatile_goto (fullop "; j" #cc " %l[cc_label]" \
include/asm/uaccess.h: asm_volatile_goto("\n" \
include/asm/uaccess.h: asm_volatile_goto("\n" \
[yhs@148 x86]$
Compiling samples/bpf directories, most bpf programs failed
compilation with error messages like:
In file included from /home/yhs/work/bpf-next/samples/bpf/xdp_sample_pkts_kern.c:2:
In file included from /home/yhs/work/bpf-next/include/linux/ptrace.h:6:
In file included from /home/yhs/work/bpf-next/include/linux/sched.h:15:
In file included from /home/yhs/work/bpf-next/include/linux/sem.h:5:
In file included from /home/yhs/work/bpf-next/include/uapi/linux/sem.h:5:
In file included from /home/yhs/work/bpf-next/include/linux/ipc.h:9:
In file included from /home/yhs/work/bpf-next/include/linux/refcount.h:72:
/home/yhs/work/bpf-next/arch/x86/include/asm/refcount.h:70:9: error: 'asm goto' constructs are not supported yet
return GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl",
^
/home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:67:2: note: expanded from macro 'GEN_BINARY_SUFFIXED_RMWcc'
__GEN_RMWcc(op " %[val], %[var]\n\t" suffix, var, cc, \
^
/home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:21:2: note: expanded from macro '__GEN_RMWcc'
asm_volatile_goto (fullop "; j" #cc " %l[cc_label]" \
^
/home/yhs/work/bpf-next/include/linux/compiler_types.h:188:37: note: expanded from macro 'asm_volatile_goto'
#define asm_volatile_goto(x...) asm goto(x)
Most implementation does not even provide an alternative
implementation. And it is also not practical to make changes
for each call site.
This patch workarounded the asm goto issue by redefining the macro like below:
#define asm_volatile_goto(x...) asm volatile("invalid use of asm_volatile_goto")
If asm_volatile_goto is not used by bpf programs, which is typically the case, nothing bad
will happen. If asm_volatile_goto is used by bpf programs, which is incorrect, the compiler
will issue an error since "invalid use of asm_volatile_goto" is not valid assembly codes.
With this patch, all bpf programs under samples/bpf can pass compilation.
Note that bpf programs under tools/testing/selftests/bpf/ compiled fine as
they do not access kernel internal headers.
Fixes: e769742d35 ("Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"")
Fixes: 18fe58229d ("x86, asm: change the GEN_*_RMWcc() macros to not quote the condition")
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
These explicit rules are unneeded because scripts/Makefile.build
provides a pattern rule to create %.s from %.c
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Add additional checks in tools/testing/selftests/bpf and
samples/bpf such that if clang/llvm compiler can generate
BTF sections, do not use pahole.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Sample program which shows TCP_SAVE_SYN/TCP_SAVED_SYN usage example:
bpf program which is doing TOS/TCLASS reflection (server would reply
with a same TOS/TCLASS as client).
Signed-off-by: Nikita V. Shirokov <tehnerd@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Pull networking updates from David Miller:
"Highlights:
- Gustavo A. R. Silva keeps working on the implicit switch fallthru
changes.
- Support 802.11ax High-Efficiency wireless in cfg80211 et al, From
Luca Coelho.
- Re-enable ASPM in r8169, from Kai-Heng Feng.
- Add virtual XFRM interfaces, which avoids all of the limitations of
existing IPSEC tunnels. From Steffen Klassert.
- Convert GRO over to use a hash table, so that when we have many
flows active we don't traverse a long list during accumluation.
- Many new self tests for routing, TC, tunnels, etc. Too many
contributors to mention them all, but I'm really happy to keep
seeing this stuff.
- Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu.
- Lots of cleanups and fixes in L2TP code from Guillaume Nault.
- Add IPSEC offload support to netdevsim, from Shannon Nelson.
- Add support for slotting with non-uniform distribution to netem
packet scheduler, from Yousuk Seung.
- Add UDP GSO support to mlx5e, from Boris Pismenny.
- Support offloading of Team LAG in NFP, from John Hurley.
- Allow to configure TX queue selection based upon RX queue, from
Amritha Nambiar.
- Support ethtool ring size configuration in aquantia, from Anton
Mikaev.
- Support DSCP and flowlabel per-transport in SCTP, from Xin Long.
- Support list based batching and stack traversal of SKBs, this is
very exciting work. From Edward Cree.
- Busyloop optimizations in vhost_net, from Toshiaki Makita.
- Introduce the ETF qdisc, which allows time based transmissions. IGB
can offload this in hardware. From Vinicius Costa Gomes.
- Add parameter support to devlink, from Moshe Shemesh.
- Several multiplication and division optimizations for BPF JIT in
nfp driver, from Jiong Wang.
- Lots of prepatory work to make more of the packet scheduler layer
lockless, when possible, from Vlad Buslov.
- Add ACK filter and NAT awareness to sch_cake packet scheduler, from
Toke Høiland-Jørgensen.
- Support regions and region snapshots in devlink, from Alex Vesker.
- Allow to attach XDP programs to both HW and SW at the same time on
a given device, with initial support in nfp. From Jakub Kicinski.
- Add TLS RX offload and support in mlx5, from Ilya Lesokhin.
- Use PHYLIB in r8169 driver, from Heiner Kallweit.
- All sorts of changes to support Spectrum 2 in mlxsw driver, from
Ido Schimmel.
- PTP support in mv88e6xxx DSA driver, from Andrew Lunn.
- Make TCP_USER_TIMEOUT socket option more accurate, from Jon
Maxwell.
- Support for templates in packet scheduler classifier, from Jiri
Pirko.
- IPV6 support in RDS, from Ka-Cheong Poon.
- Native tproxy support in nf_tables, from Máté Eckl.
- Maintain IP fragment queue in an rbtree, but optimize properly for
in-order frags. From Peter Oskolkov.
- Improvde handling of ACKs on hole repairs, from Yuchung Cheng"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits)
bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT"
hv/netvsc: Fix NULL dereference at single queue mode fallback
net: filter: mark expected switch fall-through
xen-netfront: fix warn message as irq device name has '/'
cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0
net: dsa: mv88e6xxx: missing unlock on error path
rds: fix building with IPV6=m
inet/connection_sock: prefer _THIS_IP_ to current_text_addr
net: dsa: mv88e6xxx: bitwise vs logical bug
net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd()
ieee802154: hwsim: using right kind of iteration
net: hns3: Add vlan filter setting by ethtool command -K
net: hns3: Set tx ring' tc info when netdev is up
net: hns3: Remove tx ring BD len register in hns3_enet
net: hns3: Fix desc num set to default when setting channel
net: hns3: Fix for phy link issue when using marvell phy driver
net: hns3: Fix for information of phydev lost problem when down/up
net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero
net: hns3: Add support for serdes loopback selftest
bnxt_en: take coredump_record structure off stack
...
Convert xdpsock_user.c to use libbpf instead of bpf_load.o.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Convert xdp_fwd_user.c to use libbpf instead of bpf_load.o.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
To smoothly test BTF supported binary on samples/bpf,
let samples/bpf/Makefile probe llc, pahole and
llvm-objcopy for BPF support and use them
like tools/testing/selftests/bpf/Makefile
changed from the commit c0fa1b6c3e ("bpf: btf:
Add BTF tests").
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
In preparation for enabling command line LDLIBS, re-name HOST_LOADLIBES
to KBUILD_HOSTLDLIBS as the internal use only flags. Also rename
existing usage to HOSTLDLIBS for consistency. This should not have any
visible effects.
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
In preparation for enabling command line CFLAGS, re-name HOSTCFLAGS to
KBUILD_HOSTCFLAGS as the internal use only flags. This should not have
any visible effects.
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Add an example program showing how to sample packets from XDP using the
perf event buffer. The example userspace program just prints the ethernet
header for every packet sampled.
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2018-05-24
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Björn Töpel cleans up AF_XDP (removes rebind, explicit cache alignment from uapi, etc).
2) David Ahern adds mtu checks to bpf_ipv{4,6}_fib_lookup() helpers.
3) Jesper Dangaard Brouer adds bulking support to ndo_xdp_xmit.
4) Jiong Wang adds support for indirect and arithmetic shifts to NFP
5) Martin KaFai Lau cleans up BTF uapi and makes the btf_header extensible.
6) Mathieu Xhonneux adds an End.BPF action to seg6local with BPF helpers allowing
to edit/grow/shrink a SRH and apply on a packet generic SRv6 actions.
7) Sandipan Das adds support for bpf2bpf function calls in ppc64 JIT.
8) Yonghong Song adds BPF_TASK_FD_QUERY command for introspection of tracing events.
9) other misc fixes from Gustavo A. R. Silva, Sirio Balmelli, John Fastabend, and Magnus Karlsson
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This is mostly to test kprobe/uprobe which needs kernel headers.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
S390 bpf_jit.S is removed in net-next and had changes in 'net',
since that code isn't used any more take the removal.
TLS data structures split the TX and RX components in 'net-next',
put the new struct members from the bug fix in 'net' into the RX
part.
The 'net-next' tree had some reworking of how the ERSPAN code works in
the GRE tunneling code, overlapping with a one-line headroom
calculation fix in 'net'.
Overlapping changes in __sock_map_ctx_update_elem(), keep the bits
that read the prog members via READ_ONCE() into local variables
before using them.
Signed-off-by: David S. Miller <davem@davemloft.net>
Building samples with clang ignores the $(Q) setting, always
printing full command to the output. Make it less verbose.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Make complains that it doesn't know how to make libbpf.a:
scripts/Makefile.host:106: target 'samples/bpf/../../tools/lib/bpf/libbpf.a' doesn't match the target pattern
Now that we have it as a dependency of the sources simply add libbpf.a
to libraries not objects.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
There are many ways users may compile samples, some of them got
broken by commit 5f9380572b ("samples: bpf: compile and link
against full libbpf"). Improve path resolution and make libbpf
building a dependency of source files to force its build.
Samples should now again build with any of:
cd samples/bpf; make
make samples/bpf/
make -C samples/bpf
cd samples/bpf; make O=builddir
make samples/bpf/ O=builddir
make -C samples/bpf O=builddir
export KBUILD_OUTPUT=builddir
make samples/bpf/
make -C samples/bpf
Fixes: 5f9380572b ("samples: bpf: compile and link against full libbpf")
Reported-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that we can use full powers of libbpf in BPF samples, we
should perhaps make the simplest XDP programs not depend on
bpf_load helpers. This way newcomers will be exposed to the
recommended library from the start.
Use of bpf_prog_load_xattr() will also make it trivial to later
on request offload of the programs by simply adding ifindex to
the xattr.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
There are two copies of event reading loop - in bpftool and
trace_helpers "library". Consolidate them and move the code
to libbpf. Return codes from trace_helpers are kept, but
renamed to include LIBBPF prefix.
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
samples/bpf currently cherry-picks object files from tools/lib/bpf
to link against. Just compile the full library and link statically
against it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Simple example of fast-path forwarding. It has a serious flaw
in not verifying the egress device index supports XDP forwarding.
If the egress device does not packets are dropped.
Take this only as a simple example of fast-path forwarding.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This is a sample application for AF_XDP sockets. The application
supports three different modes of operation: rxdrop, txonly and l2fwd.
To show-case a simple round-robin load-balancing between a set of
sockets in an xskmap, set the RR_LB compile time define option to 1 in
"xdpsock.h".
v2: The entries variable was calculated twice in {umem,xq}_nb_avail.
Co-authored-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
There is no functionality change in this patch. The common-purpose
trace functions, including perf_event polling and ksym lookup,
are moved from trace_output_user.c and bpf_load.c to
selftests/bpf/trace_helpers.c so that these function can
be reused later in selftests.
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Move the testsuite to
selftests/bpf/{test_tunnel_kern.c, test_tunnel.sh}
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
adding bpf's sample program which is using bpf_xdp_adjust_tail helper
by generating ICMPv4 "packet to big" message if ingress packet's size is
bigger then 600 bytes
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
add empty raw_tracepoint bpf program to test overhead similar
to kprobe and traditional tracepoint tests
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
CPU is active when have running tasks on it and CPUFreq governor can
select different operating points (OPP) according to different workload;
we use 'pstate' to present CPU state which have running tasks with one
specific OPP. On the other hand, CPU is idle which only idle task on
it, CPUIdle governor can select one specific idle state to power off
hardware logics; we use 'cstate' to present CPU idle state.
Based on trace events 'cpu_idle' and 'cpu_frequency' we can accomplish
the duration statistics for every state. Every time when CPU enters
into or exits from idle states, the trace event 'cpu_idle' is recorded;
trace event 'cpu_frequency' records the event for CPU OPP changing, so
it's easily to know how long time the CPU stays in the specified OPP,
and the CPU must be not in any idle state.
This patch is to utilize the mentioned trace events for pstate and
cstate statistics. To achieve more accurate profiling data, the program
uses below sequence to insure CPU running/idle time aren't missed:
- Before profiling the user space program wakes up all CPUs for once, so
can avoid to missing account time for CPU staying in idle state for
long time; the program forces to set 'scaling_max_freq' to lowest
frequency and then restore 'scaling_max_freq' to highest frequency,
this can ensure the frequency to be set to lowest frequency and later
after start to run workload the frequency can be easily to be changed
to higher frequency;
- User space program reads map data and update statistics for every 5s,
so this is same with other sample bpf programs for avoiding big
overload introduced by bpf program self;
- When send signal to terminate program, the signal handler wakes up
all CPUs, set lowest frequency and restore highest frequency to
'scaling_max_freq'; this is exactly same with the first step so
avoid to missing account CPU pstate and cstate time during last
stage. Finally it reports the latest statistics.
The program has been tested on Hikey board with octa CA53 CPUs, below
is one example for statistics result, the format mainly follows up
Jesper Dangaard Brouer suggestion.
Jesper reminds to 'get printf to pretty print with thousands separators
use %' and setlocale(LC_NUMERIC, "en_US")', tried three different arm64
GCC toolchains (5.4.0 20160609, 6.2.1 20161016, 6.3.0 20170516) but all
of them cannot support printf flag character %' on arm64 platform, so go
back print number without grouping mode.
CPU states statistics:
state(ms) cstate-0 cstate-1 cstate-2 pstate-0 pstate-1 pstate-2 pstate-3 pstate-4
CPU-0 767 6111 111863 561 31 756 853 190
CPU-1 241 10606 107956 484 125 646 990 85
CPU-2 413 19721 98735 636 84 696 757 89
CPU-3 84 11711 79989 17516 909 4811 5773 341
CPU-4 152 19610 98229 444 53 649 708 1283
CPU-5 185 8781 108697 666 91 671 677 1365
CPU-6 157 21964 95825 581 67 566 684 1284
CPU-7 125 15238 102704 398 20 665 786 1197
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Parse netlink ext attribute to get the error message returned by
the card. Code is partially take from libnl.
We add netlink.h to the uapi include of tools. And we need to
avoid include of userspace netlink header to have a successful
build of sample so nlattr.h has a define to avoid
the inclusion. Using a direct define could have been an issue
as NLMSGERR_ATTR_MAX can change in the future.
We also define SOL_NETLINK if not defined to avoid to have to
copy socket.h for a fixed value.
Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Do not build lib/bpf/bpf.o with this Makefile but use the one from the
library directory. This avoid making a buggy bpf.o file (e.g. missing
symbols).
This patch is useful if some code (e.g. Landlock tests) needs both the
bpf.o (from tools/lib/bpf) and the bpf_load.o (from samples/bpf).
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Creating a bpf sample that shows howto use the XDP 'data_meta'
infrastructure, created by Daniel Borkmann. Very few drivers support
this feature, but I wanted a functional sample to begin with, when
working on adding driver support.
XDP data_meta is about creating a communication channel between BPF
programs. This can be XDP tail-progs, but also other SKB based BPF
hooks, like in this case the TC clsact hook. In this sample I show
that XDP can store info named "mark", and TC/clsact chooses to use
this info and store it into the skb->mark.
It is a bit annoying that XDP and TC samples uses different tools/libs
when attaching their BPF hooks. As the XDP and TC programs need to
cooperate and agree on a struct-layout, it is best/easiest if the two
programs can be contained within the same BPF restricted-C file.
As the bpf-loader, I choose to not use bpf_load.c (or libbpf), but
instead wrote a bash shell scripted named xdp2skb_meta.sh, which
demonstrate howto use the iproute cmdline tools 'tc' and 'ip' for
loading BPF programs. To make it easy for first time users, the shell
script have command line parsing, and support --verbose and --dry-run
mode, if you just want to see/learn the tc+ip command syntax:
# ./xdp2skb_meta.sh --dev ixgbe2 --dry-run
# Dry-run mode: enable VERBOSE and don't call TC+IP
tc qdisc del dev ixgbe2 clsact
tc qdisc add dev ixgbe2 clsact
tc filter add dev ixgbe2 ingress prio 1 handle 1 bpf da obj ./xdp2skb_meta_kern.o sec tc_mark
# Flush XDP on device: ixgbe2
ip link set dev ixgbe2 xdp off
ip link set dev ixgbe2 xdp obj ./xdp2skb_meta_kern.o sec xdp_mark
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This sample program can be used for monitoring and reporting how many
packets per sec (pps) are received per NIC RX queue index and which
CPU processed the packet. In itself it is a useful tool for quickly
identifying RSS imbalance issues, see below.
The default XDP action is XDP_PASS in-order to provide a monitor
mode. For benchmarking purposes it is possible to specify other XDP
actions on the cmdline --action.
Output below shows an imbalance RSS case where most RXQ's deliver to
CPU-0 while CPU-2 only get packets from a single RXQ. Looking at
things from a CPU level the two CPUs are processing approx the same
amount, BUT looking at the rx_queue_index levels it is clear that
RXQ-2 receive much better service, than other RXQs which all share CPU-0.
Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS
XDP stats CPU pps issue-pps
XDP-RX CPU 0 900,473 0
XDP-RX CPU 2 906,921 0
XDP-RX CPU total 1,807,395
RXQ stats RXQ:CPU pps issue-pps
rx_queue_index 0:0 180,098 0
rx_queue_index 0:sum 180,098
rx_queue_index 1:0 180,098 0
rx_queue_index 1:sum 180,098
rx_queue_index 2:2 906,921 0
rx_queue_index 2:sum 906,921
rx_queue_index 3:0 180,098 0
rx_queue_index 3:sum 180,098
rx_queue_index 4:0 180,082 0
rx_queue_index 4:sum 180,082
rx_queue_index 5:0 180,093 0
rx_queue_index 5:sum 180,093
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>