mirror of
https://git.proxmox.com/git/mirror_ubuntu-kernels.git
synced 2026-01-19 17:06:22 +00:00
The common use-case in production is to have multiple cgroup-bpf programs per attach type that cover multiple use-cases. Such programs are attached with BPF_F_ALLOW_MULTI and can be maintained by different people.

Order of programs usually matters. For example, imagine two egress programs: the first one drops packets and the second one counts packets. If they're swapped, the result of the counting program will be different.

This brings operational challenges with updating cgroup-bpf program(s) attached with BPF_F_ALLOW_MULTI, since there is no way to replace a program:

* One way to update is to detach all programs first and then attach the new version(s) again in the right order. This introduces an interruption in the work a program is doing and may not be acceptable (e.g. if it's an egress firewall);
* Another way is to attach the new version of a program first and only then detach the old version. This introduces a time interval when two versions of the same program are working, which may not be acceptable if a program is not idempotent. It also imposes an additional burden on program developers to make sure that two versions of their program can co-exist.

Solve the problem by introducing a "replace" mode in the BPF_PROG_ATTACH command for cgroup-bpf programs being attached with the BPF_F_ALLOW_MULTI flag. This mode is enabled by the newly introduced BPF_F_REPLACE attach flag and the bpf_attr.replace_bpf_fd attribute, which passes the fd of the old program to replace.

That way the user can replace any program among those attached with the BPF_F_ALLOW_MULTI flag without the problems described above.

Details of the new API:

* If BPF_F_REPLACE is set but replace_bpf_fd doesn't hold a valid descriptor of a BPF program, BPF_PROG_ATTACH will return the corresponding error (EINVAL or EBADF).
* If replace_bpf_fd holds a valid descriptor of a BPF program but that program is not attached to the specified cgroup, BPF_PROG_ATTACH will return ENOENT.

BPF_F_REPLACE is introduced to make the user intent clear, since replace_bpf_fd alone can't be used for this (its default value, 0, is a valid fd). BPF_F_REPLACE also makes it possible to extend the API in the future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Narkyiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com

| Name | | |
|---|---|---|
| .. | ||
| tc_act | ||
| bpf_common.h | ||
| bpf_perf_event.h | ||
| bpf.h | ||
| btf.h | ||
| const.h | ||
| erspan.h | ||
| ethtool.h | ||
| fadvise.h | ||
| fcntl.h | ||
| fs.h | ||
| fscrypt.h | ||
| hw_breakpoint.h | ||
| if_link.h | ||
| if_tun.h | ||
| if_xdp.h | ||
| in.h | ||
| kcmp.h | ||
| kvm.h | ||
| lirc.h | ||
| mman.h | ||
| mount.h | ||
| netlink.h | ||
| perf_event.h | ||
| pkt_cls.h | ||
| pkt_sched.h | ||
| prctl.h | ||
| sched.h | ||
| seg6_local.h | ||
| seg6.h | ||
| stat.h | ||
| tls.h | ||
| usbdevice_fs.h | ||
| vhost.h | ||
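
The "replace" mode described in the commit message above can be sketched from user space roughly as follows. This is a minimal illustration, not the kernel's or libbpf's own helper: it assumes a `linux/bpf.h` new enough to define `BPF_F_REPLACE` and `bpf_attr.replace_bpf_fd`, and the function names `fill_replace_attr` and `cgroup_prog_replace` are hypothetical. The caller is assumed to already hold file descriptors for the cgroup, the new program, and the currently attached old program.

```c
#define _GNU_SOURCE
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: fill a bpf_attr for BPF_PROG_ATTACH in "replace"
 * mode. Only the attr is prepared here; no fd is opened or validated. */
void fill_replace_attr(union bpf_attr *attr, int cgroup_fd,
                       int new_prog_fd, int old_prog_fd,
                       enum bpf_attach_type type)
{
    memset(attr, 0, sizeof(*attr));
    attr->target_fd = cgroup_fd;        /* cgroup to attach to */
    attr->attach_bpf_fd = new_prog_fd;  /* new version of the program */
    attr->attach_type = type;           /* e.g. BPF_CGROUP_INET_EGRESS */
    /* BPF_F_REPLACE states the intent explicitly, because a
     * replace_bpf_fd of 0 would otherwise look like a valid fd. */
    attr->attach_flags = BPF_F_ALLOW_MULTI | BPF_F_REPLACE;
    attr->replace_bpf_fd = old_prog_fd; /* attached program to replace */
}

/* Hypothetical wrapper around the raw bpf(2) syscall.
 * Returns 0 on success, -1 with errno set on failure. Per the commit
 * message: EINVAL or EBADF if old_prog_fd is not a valid BPF program
 * descriptor, ENOENT if that program is not attached to the cgroup. */
int cgroup_prog_replace(int cgroup_fd, int new_prog_fd, int old_prog_fd,
                        enum bpf_attach_type type)
{
    union bpf_attr attr;

    fill_replace_attr(&attr, cgroup_fd, new_prog_fd, old_prog_fd, type);
    return (int)syscall(__NR_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
}
```

A caller would invoke something like `cgroup_prog_replace(cg_fd, new_fd, old_fd, BPF_CGROUP_INET_EGRESS)` and check `errno` on failure. Since the old program is swapped for the new one in a single BPF_PROG_ATTACH call, this avoids both the detach-then-reattach gap and the window where two versions run side by side.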