mirror_ubuntu-kernels/kernel/bpf/log.c
Andrii Nakryiko fa1c7d5cc4 bpf: Keep track of total log content size in both fixed and rolling modes
Change how we do accounting in BPF_LOG_FIXED mode and adopt log->end_pos
as a *logical* log position. This means that we can now go beyond the
physical log buffer size and still be able to tell what log buffer size
would fit the entire log contents without -ENOSPC.

To do this for BPF_LOG_FIXED mode, we need to remove the short-circuiting
logic of not vsnprintf()'ing further log content once we have filled up
the user-provided buffer, which is done by the bpf_verifier_log_needed()
checks. We modify these checks to always keep going if log->level is
non-zero (i.e., a log is requested), even if log->ubuf was NULL'ed out
due to a fault while copying data to user-space, or if the entire log
buffer is physically full. We adapt the bpf_verifier_vlog() routine to
work correctly with the log->ubuf == NULL condition, performing log
formatting into the temporary kernel buffer and doing all the necessary
accounting, but avoiding the copy-out if the buffer is full or NULL'ed
out.
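
For instance, in the fixed-mode path of bpf_verifier_vlog() (see the
full file below), position accounting now happens unconditionally,
while the actual copy-out is guarded on both a live ubuf and there
being room left:

    cur_pos = log->end_pos;
    log->end_pos += n - 1; /* don't count terminating '\0' */

    if (log->ubuf && new_n &&
        copy_to_user(log->ubuf + cur_pos, log->kbuf, new_n))
            goto fail;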

With these changes, it's now possible to determine the required log
contents size in both BPF_LOG_FIXED and the default rolling log mode.
We need to keep in mind bpf_vlog_reset(), though, which shrinks the log
contents after successful verification of a particular code path. This
log reset means that log->end_pos isn't always increasing, so to report
back to users what log buffer size would fit all log content without
causing -ENOSPC even in the presence of log resetting, we need to keep
the maximum over the "lifetime" of logging. We do this accounting in
the bpf_vlog_update_len_max() helper.
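
As a minimal illustration (the byte counts are made up, and len_max
includes the terminating zero), the accounting behaves like this:

    bpf_verifier_vlog(...)  /*  99 chars */  -> end_pos =  99, len_max = 100
    bpf_verifier_vlog(...)  /* 150 chars */  -> end_pos = 249, len_max = 250
    bpf_vlog_reset(log, 99)                  -> end_pos =  99, len_max = 250
    bpf_verifier_vlog(...)  /*  50 chars */  -> end_pos = 149, len_max = 250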

A related and subtle aspect is that with this logical log->end_pos,
even in BPF_LOG_FIXED mode we could temporarily "overflow" the buffer
and then reset it back with bpf_vlog_reset() to a position inside the
user-supplied log_buf. In such a situation we still want to properly
maintain the terminating zero. We will eventually return -ENOSPC even
if the final log buffer is too small (we detect this through the
log->len_max check). This behavior is simpler to reason about and is
consistent with the current behavior of the verifier log. Handling this
required a small addition to the bpf_vlog_reset() logic to avoid doing
put_user() beyond the physical log buffer dimensions.
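
Concretely, that addition (visible in bpf_vlog_reset() below) makes the
terminating-zero write conditional on the target offset still falling
within the physical buffer:

    if (log->ubuf && pos < log->len_total && put_user(zero, log->ubuf + pos))
            log->ubuf = NULL;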

Another issue to keep in mind is that we limit the log buffer size to
a 32-bit value and keep the log length as a u32, but theoretically the
verifier could produce a huge log stretching beyond 4GB. Instead of
keeping (and later returning) a 64-bit log length, we cap it at
UINT_MAX. The current UAPI makes it impossible to specify a log buffer
size bigger than 4GB anyway, so we don't really lose anything here and
keep everything consistently 32-bit in the UAPI. This property will be
utilized in the next patch.
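
The capping itself lives in bpf_vlog_update_len_max() below: the
logical length is computed in 64 bits and saturated at UINT_MAX:

    u64 len = log->end_pos + add_len;

    if (len > UINT_MAX)
            log->len_max = UINT_MAX;
    else if (len > log->len_max)
            log->len_max = len;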

Doing the same determination of the maximum log buffer size for rolling
mode is trivial, as log->end_pos and log->start_pos are already logical
positions, so there is nothing new there.

These changes also incidentally fix one small issue with the previous
logging logic. Previously, if the user provided a log buffer of size N
and the actual log output was exactly N-1 bytes + terminating \0, the
kernel logic couldn't distinguish this condition from the log
truncation scenario, which would also end up with truncated log
contents of N-1 bytes + terminating \0.

But now, with log->end_pos being a logical position that can go beyond
the actual log buffer size, we can distinguish these two conditions,
which we do in this patch. This plays nicely with returning
log_size_actual (implemented in the UAPI in the next patch), as we can
now guarantee that if the user takes such log_size_actual and provides
a log buffer of that exact size, they will not get -ENOSPC in return.
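
This is also why truncation detection reduces to comparing the logical
maximum against the physical size, as bpf_vlog_truncated() below does:

    bool bpf_vlog_truncated(const struct bpf_verifier_log *log)
    {
            return log->len_max > log->len_total;
    }

An exact fit leaves len_max == len_total (the terminating zero is
already counted in len_max), while any truncation pushes len_max
strictly past len_total.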

All in all, these changes conceptually unify the fixed and rolling log
modes much better and allow a nice feature requested by users: knowing
what the buffer size should be to avoid -ENOSPC.

We'll plumb this through the UAPI and the code in the next patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/bpf/20230406234205.323208-12-andrii@kernel.org
2023-04-11 18:05:43 +02:00

// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
* Copyright (c) 2016 Facebook
* Copyright (c) 2018 Covalent IO, Inc. http://covalent.io
*/
#include <uapi/linux/btf.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/bpf.h>
#include <linux/bpf_verifier.h>
#include <linux/math64.h>

bool bpf_verifier_log_attr_valid(const struct bpf_verifier_log *log)
{
	return log->len_total > 0 && log->len_total <= UINT_MAX >> 2 &&
	       log->level && log->ubuf && !(log->level & ~BPF_LOG_MASK);
}

static void bpf_vlog_update_len_max(struct bpf_verifier_log *log, u32 add_len)
{
	/* add_len includes terminal \0, so no need for +1. */
	u64 len = log->end_pos + add_len;

	/* log->len_max could be larger than our current len due to
	 * bpf_vlog_reset() calls, so we maintain the max of any length at any
	 * previous point
	 */
	if (len > UINT_MAX)
		log->len_max = UINT_MAX;
	else if (len > log->len_max)
		log->len_max = len;
}

void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
		       va_list args)
{
	u64 cur_pos;
	u32 new_n, n;

	n = vscnprintf(log->kbuf, BPF_VERIFIER_TMP_LOG_SIZE, fmt, args);

	WARN_ONCE(n >= BPF_VERIFIER_TMP_LOG_SIZE - 1,
		  "verifier log line truncated - local buffer too short\n");

	if (log->level == BPF_LOG_KERNEL) {
		bool newline = n > 0 && log->kbuf[n - 1] == '\n';

		pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n");
		return;
	}

	n += 1; /* include terminating zero */
	bpf_vlog_update_len_max(log, n);

	if (log->level & BPF_LOG_FIXED) {
		/* check if we have at least something to put into user buf */
		new_n = 0;
		if (log->end_pos < log->len_total) {
			new_n = min_t(u32, log->len_total - log->end_pos, n);
			log->kbuf[new_n - 1] = '\0';
		}

		cur_pos = log->end_pos;
		log->end_pos += n - 1; /* don't count terminating '\0' */

		if (log->ubuf && new_n &&
		    copy_to_user(log->ubuf + cur_pos, log->kbuf, new_n))
			goto fail;
	} else {
		u64 new_end, new_start;
		u32 buf_start, buf_end, new_n;

		new_end = log->end_pos + n;
		if (new_end - log->start_pos >= log->len_total)
			new_start = new_end - log->len_total;
		else
			new_start = log->start_pos;

		new_n = min(n, log->len_total);
		cur_pos = new_end - new_n;
		div_u64_rem(cur_pos, log->len_total, &buf_start);
		div_u64_rem(new_end, log->len_total, &buf_end);
		/* new_end and buf_end are exclusive indices, so if buf_end is
		 * exactly zero, then it actually points right to the end of
		 * ubuf and there is no wrap around
		 */
		if (buf_end == 0)
			buf_end = log->len_total;

		log->start_pos = new_start;
		log->end_pos = new_end - 1; /* don't count terminating '\0' */

		if (!log->ubuf)
			return;

		/* if buf_start > buf_end, we wrapped around;
		 * if buf_start == buf_end, then we fill ubuf completely; we
		 * can't have buf_start == buf_end to mean that there is
		 * nothing to write, because we always write at least
		 * something, even if terminal '\0'
		 */
		if (buf_start < buf_end) {
			/* message fits within contiguous chunk of ubuf */
			if (copy_to_user(log->ubuf + buf_start,
					 log->kbuf + n - new_n,
					 buf_end - buf_start))
				goto fail;
		} else {
			/* message wraps around the end of ubuf, copy in two chunks */
			if (copy_to_user(log->ubuf + buf_start,
					 log->kbuf + n - new_n,
					 log->len_total - buf_start))
				goto fail;
			if (copy_to_user(log->ubuf,
					 log->kbuf + n - buf_end,
					 buf_end))
				goto fail;
		}
	}

	return;
fail:
	log->ubuf = NULL;
}

void bpf_vlog_reset(struct bpf_verifier_log *log, u64 new_pos)
{
	char zero = 0;
	u32 pos;

	if (WARN_ON_ONCE(new_pos > log->end_pos))
		return;

	if (!bpf_verifier_log_needed(log) || log->level == BPF_LOG_KERNEL)
		return;

	/* if position to which we reset is beyond current log window,
	 * then we didn't preserve any useful content and should adjust
	 * start_pos to end up with an empty log (start_pos == end_pos)
	 */
	log->end_pos = new_pos;
	if (log->end_pos < log->start_pos)
		log->start_pos = log->end_pos;

	if (log->level & BPF_LOG_FIXED)
		pos = log->end_pos + 1;
	else
		div_u64_rem(new_pos, log->len_total, &pos);

	if (log->ubuf && pos < log->len_total && put_user(zero, log->ubuf + pos))
		log->ubuf = NULL;
}

static void bpf_vlog_reverse_kbuf(char *buf, int len)
{
	int i, j;

	for (i = 0, j = len - 1; i < j; i++, j--)
		swap(buf[i], buf[j]);
}

static int bpf_vlog_reverse_ubuf(struct bpf_verifier_log *log, int start, int end)
{
	/* we split log->kbuf into two equal parts for both ends of array */
	int n = sizeof(log->kbuf) / 2, nn;
	char *lbuf = log->kbuf, *rbuf = log->kbuf + n;

	/* Read ubuf's section [start, end) two chunks at a time, from left
	 * and right side; within each chunk, swap all the bytes; after that
	 * reverse the order of lbuf and rbuf and write result back to ubuf.
	 * This way we'll end up with swapped contents of specified
	 * [start, end) ubuf segment.
	 */
	while (end - start > 1) {
		nn = min(n, (end - start) / 2);

		if (copy_from_user(lbuf, log->ubuf + start, nn))
			return -EFAULT;
		if (copy_from_user(rbuf, log->ubuf + end - nn, nn))
			return -EFAULT;

		bpf_vlog_reverse_kbuf(lbuf, nn);
		bpf_vlog_reverse_kbuf(rbuf, nn);

		/* we write lbuf to the right end of ubuf, while rbuf to the
		 * left one to end up with properly reversed overall ubuf
		 */
		if (copy_to_user(log->ubuf + start, rbuf, nn))
			return -EFAULT;
		if (copy_to_user(log->ubuf + end - nn, lbuf, nn))
			return -EFAULT;

		start += nn;
		end -= nn;
	}

	return 0;
}

bool bpf_vlog_truncated(const struct bpf_verifier_log *log)
{
	return log->len_max > log->len_total;
}

void bpf_vlog_finalize(struct bpf_verifier_log *log)
{
	u32 sublen;
	int err;

	if (!log || !log->level || !log->ubuf)
		return;
	if ((log->level & BPF_LOG_FIXED) || log->level == BPF_LOG_KERNEL)
		return;

	/* If we never truncated log, there is nothing to move around. */
	if (log->start_pos == 0)
		return;

	/* Otherwise we need to rotate log contents to make it start from the
	 * buffer beginning and be a continuous zero-terminated string. Note
	 * that if log->start_pos != 0 then we definitely filled up entire log
	 * buffer with no gaps, and we just need to shift buffer contents to
	 * the left by (log->start_pos % log->len_total) bytes.
	 *
	 * Unfortunately, user buffer could be huge and we don't want to
	 * allocate temporary kernel memory of the same size just to shift
	 * contents in a straightforward fashion. Instead, we'll be clever and
	 * do in-place array rotation. This is a leetcode-style problem, which
	 * could be solved by three reversals.
	 *
	 * Let's say we have log buffer that has to be shifted left by 7 bytes
	 * (spaces and vertical bar is just for demonstrative purposes):
	 *   E F G H I J K | A B C D
	 *
	 * First, we reverse entire array:
	 *   D C B A | K J I H G F E
	 *
	 * Then we reverse first 4 bytes (DCBA) and separately last 7 bytes
	 * (KJIHGFE), resulting in a properly rotated array:
	 *   A B C D | E F G H I J K
	 *
	 * We'll utilize log->kbuf to read user memory chunk by chunk, swap
	 * bytes, and write them back. Doing it byte-by-byte would be
	 * unnecessarily inefficient. Altogether we are going to read and
	 * write each byte twice, for total 4 memory copies between kernel and
	 * user space.
	 */

	/* length of the chopped off part that will be the beginning;
	 * len(ABCD) in the example above
	 */
	div_u64_rem(log->start_pos, log->len_total, &sublen);
	sublen = log->len_total - sublen;

	err = bpf_vlog_reverse_ubuf(log, 0, log->len_total);
	err = err ?: bpf_vlog_reverse_ubuf(log, 0, sublen);
	err = err ?: bpf_vlog_reverse_ubuf(log, sublen, log->len_total);
	if (err)
		log->ubuf = NULL;
}

/* log_level controls verbosity level of eBPF verifier.
 * bpf_verifier_log_write() is used to dump the verification trace to the log,
 * so the user can figure out what's wrong with the program
 */
__printf(2, 3) void bpf_verifier_log_write(struct bpf_verifier_env *env,
					   const char *fmt, ...)
{
	va_list args;

	if (!bpf_verifier_log_needed(&env->log))
		return;

	va_start(args, fmt);
	bpf_verifier_vlog(&env->log, fmt, args);
	va_end(args);
}
EXPORT_SYMBOL_GPL(bpf_verifier_log_write);

__printf(2, 3) void bpf_log(struct bpf_verifier_log *log,
			    const char *fmt, ...)
{
	va_list args;

	if (!bpf_verifier_log_needed(log))
		return;

	va_start(args, fmt);
	bpf_verifier_vlog(log, fmt, args);
	va_end(args);
}
EXPORT_SYMBOL_GPL(bpf_log);