Necessary to understand what is going on when bpf_program_load fails
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
There are directly calls in libbpf for bpf program load/attach.
So we could just use two wrapper functions for ipvrf and convert
them with libbpf support.
Function bpf_prog_load() is removed as it's conflict with libbpf
function name.
bpf.c is moved to bpf_legacy.c for later main libbpf support in
iproute2.
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch prepares infrastructure for matching sockets by cgroups.
Two helper functions are added for transformation between cgroup v2 ID
and pathname. Cgroup v2 cache is implemented as hash table indexed by ID.
This cache is needed for faster lookups of socket cgroup.
v2:
- style fixes (David Ahern)
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Signed-off-by: David Ahern <dsahern@gmail.com>
On vrf exec, reset the VRF associations in the child process, via the
new hook added to cmd_exec(). In this way, the parent doesn't have to
reset the VRF associations before spawning other processes.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
'ip netns exec' changes the current netns just before executing a child
process, and restores it after forking. This is needed if we're running
in batch or do_all mode.
Some cleanups must be done both in the parent and in the child: the
parent must restore the previous netns, while the child must reset any
VRF association.
Unfortunately, if do_all is set, the VRF are not reset in the child, and
the spawned processes are started with the wrong VRF context. This can
be triggered with this script:
# ip -b - <<-'EOF'
link add type vrf table 100
link set vrf0 up
link add type dummy
link set dummy0 vrf vrf0 up
netns add ns1
EOF
# ip -all -b - <<-'EOF'
vrf exec vrf0 true
netns exec setsid -f sleep 1h
EOF
# ip vrf pids vrf0
314 sleep
# ps 314
PID TTY STAT TIME COMMAND
314 ? Ss 0:00 sleep 1h
Refactor cmd_exec() and pass to it a function pointer which is called in
the child before the final exec. In the netns exec case the function just
resets the VRF and switches netns.
Doing it in the child is less error prone and safer, because the parent
environment is always kept unaltered.
After this refactor some utility functions became unused, so remove them.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Every tool in the iproute2 package have one or more function to show
an help message to the user. Some of these functions print the help
line by line with a series of printf call, e.g. ip/xfrm_state.c does
60 fprintf calls.
If we group all the calls to a single one and just concatenate strings,
we save a lot of libc calls and thus object size. The size difference
of the compiled binaries calculated with bloat-o-meter is:
ip/ip:
add/remove: 0/0 grow/shrink: 5/15 up/down: 103/-4796 (-4693)
Total: Before=672591, After=667898, chg -0.70%
ip/rtmon:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-54 (-54)
Total: Before=48879, After=48825, chg -0.11%
tc/tc:
add/remove: 0/2 grow/shrink: 31/10 up/down: 882/-6133 (-5251)
Total: Before=351912, After=346661, chg -1.49%
bridge/bridge:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-459 (-459)
Total: Before=70502, After=70043, chg -0.65%
misc/lnstat:
add/remove: 0/1 grow/shrink: 1/0 up/down: 48/-486 (-438)
Total: Before=9960, After=9522, chg -4.40%
tipc/tipc:
add/remove: 0/0 grow/shrink: 1/1 up/down: 18/-62 (-44)
Total: Before=79182, After=79138, chg -0.06%
While at it, indent some strings which were starting at column 0,
and use tabs where possible, to have a consistent style across helps.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Split ip_linkaddr_list into one function that generates a list of devices
and a second that generates the list of addresses.
Signed-off-by: David Ahern <dsahern@gmail.com>
This is simpler and cleaner, and avoids having to include the header
from every file where the functions are used. The prototypes of the
internal implementation are in this header, so utils.h will have to be
included anyway for those.
Fixes: 508f3c231e ("Use libbsd for strlcpy if available")
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
If libc does not provide strlcpy check for libbsd with pkg-config to
avoid relying on inline version.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ip vrf exec requires root or CAP_NET_ADMIN, CAP_SYS_ADMIN and
CAP_DAC_OVERRIDE. It is not possible to run unprivileged commands like
ping as non-root or non-cap-enabled due to this requirement.
To allow users and administrators to safely add the required
capabilities to the binary, drop all capabilities on start if not
invoked with "vrf exec".
Update the manpage with the requirements.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch converts spots where manual buffer termination was missing to
strlcpy() since that does what is needed.
Signed-off-by: Phil Sutter <phil@nwl.cc>
'ip vrf pids' is used to list processes bound to a vrf, but it only
shows the pid leaving a lot of work for the user. Add the command
name to the output. With this patch you get the more user friendly:
$ ip vrf pids mgmt
1121 ntpd
1418 gdm-session-wor
1488 gnome-session
1491 dbus-launch
1492 dbus-daemon
1565 sshd
...
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Since cgroups are not namespace aware, the directory heirarchy used by
ip vrf should account for network namespaces. In this case, change the
path from CGRP/BASE/vrf/NAME to CGRP/BASE/NETNS/vrf/NAME where CGRP is
the cgroup2 mount path, BASE in any base heirarchy inherited before VRF
is applied and NAME is the VRF name.
The intent is as follows: a user logs into the box into some namespace
with a name known to iproute2. Some other policy may have put the
process into a BASE heirarchy. From there the user executes a task in
a VRF and in doing so the task heirarchy becomes CGRP/BASE/NETNS/vrf/NAME.
The namespace level is omitted for the default namespace.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add support for VRF in a pre-existing hierarchy. For example, if the
current process is running in CGRP/foo/bar, the 'ip vrf exec NAME CMD'
should run CMD in the cgroup CGRP/foo/bar/vrf/NAME.
When listing process ids in a VRF, search for the directory vrf/NAME
regardless of base path (foo/bar/vrf/NAME and vrf/NAME) are still
running against the same vrf NAME.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Next up a non-root user gets various bpf related error messages:
$ ip vrf exec mgmt bash
Failed to load BPF prog: 'Operation not permitted'
Kernel compiled with CGROUP_BPF enabled?
Catch the EPERM error and do not show the kernel config option.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
A vrf is local to a namespace. Drop any VRF association before trying
to exec a command in the new namespace.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Path in vrf_switch for "default" VRF is supposed to be MNT/vrf not
MNT/default. Also, default_vrf flag is redundant with ifindex. Remove
the flag in favor of ifindex != 0.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Split ipvrf_identify into arg processing and a function that does the
actual cgroup file parsing. The latter function is used in a follow
on patch.
In the process, convert the reading of the cgroups file to use fopen
and fgets just in case the file ever grows beyond 4k. Move printing
of any error message and the vrf name to the caller of the new
vrf_identify.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Move the hint about CGROUP_BPF enabled to prog_load failure since
it fails before the attach. Update the existing error message to
print to stderr.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
'ip vrf' follows the user semnatics established by 'ip netns'.
The 'ip vrf' subcommand supports 3 usages:
1. Run a command against a given vrf:
ip vrf exec NAME CMD
Uses the recently committed cgroup/sock BPF option. vrf directory
is added to cgroup2 mount. Individual vrfs are created under it. BPF
filter attached to vrf/NAME cgroup2 to set sk_bound_dev_if to the VRF
device index. From there the current process (ip's pid) is addded to
the cgroups.proc file and the given command is exected. In doing so
all AF_INET/AF_INET6 (ipv4/ipv6) sockets are automatically bound to
the VRF domain.
The association is inherited parent to child allowing the command to
be a shell from which other commands are run relative to the VRF.
2. Show the VRF a process is bound to:
ip vrf id
This command essentially looks at /proc/pid/cgroup for a "::/vrf/"
entry with the VRF name following.
3. Show process ids bound to a VRF
ip vrf pids NAME
This command dumps the file MNT/vrf/NAME/cgroup.procs since that file
shows the process ids in the particular vrf cgroup.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>