Re: [PATCH bpf v2 1/2] bpf: allow UTF-8 literals in bpf_bprintf_prepare()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Yihan Ding" <dingyihan@uniontech.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH bpf v2 1/2] bpf: allow UTF-8 literals in bpf_bprintf_prepare()
Date: Wed, 15 Apr 2026 03:48:00 +0000	[thread overview]
Message-ID: <20260415034800.89F7EC2BCB4@smtp.kernel.org> (raw)
In-Reply-To: <20260415032126.1096299-2-dingyihan@uniontech.com>

Sashiko AI review found 2 potential issue(s):
- [Critical] Misuse of `snprintf` return value causes `tmp_buf` to advance out-of-bounds, leading to buffer overflow and memory corruption.
- [High] Truncated strings in `%s` specifiers lack a null terminator, causing `bstr_printf` to perform an out-of-bounds read and leak kernel memory.
--

commit 5e148b2d150df2934403df59a01d44e2a86c360d
Author: Yihan Ding <dingyihan@uniontech.com>

bpf: allow UTF-8 literals in bpf_bprintf_prepare()

bpf_bprintf_prepare() only needs ASCII parsing for conversion
specifiers. Plain text can safely carry bytes >= 0x80, so allow
UTF-8 literals outside '%' sequences while keeping ASCII control
bytes rejected and format specifiers ASCII-only.

> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index bb95e287b0dc8..b40add1cfa2c4 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c

[ ... ]

> @@ -867,6 +873,14 @@ int bpf_bprintf_prepare(const char *fmt, u32 fmt_size, const u64 *raw_args,
>  		 * always access fmt[i + 1], in the worst case it will be a 0
>  		 */
>  		i++;
> +		/*
> +		 * The format parser below only understands ASCII conversion
> +		 * specifiers and modifiers, so reject non-ASCII after '%'.
> +		 */
> +		if (!isascii((unsigned char)fmt[i])) {
> +			err = -EINVAL;
> +			goto out;
> +		}
>  
>  		/* skip optional "[0 +-][num]" width formatting field */
>  		while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||

This regression wasn't introduced by this patch, but there appears to be an
issue in how the return value of snprintf() is used later in this function.

When parsing %pB or %pI specifiers, the code formats the argument into the
bin_args buffer:

kernel/bpf/helpers.c:bpf_bprintf_prepare() {
    ...
    if (fmt[i + 1] == 'B') {
        if (tmp_buf)  {
            err = snprintf(tmp_buf,
                           (tmp_buf_end - tmp_buf),
                           "%pB",
                           (void *)(long)raw_args[num_spec]);
            tmp_buf += (err + 1);
        }
    ...
}

Does this code advance tmp_buf out-of-bounds if the remaining space is too
small?

The kernel's snprintf() returns the number of characters that would have
been written. If truncation occurs, it returns the un-truncated length.

Since the code unconditionally executes tmp_buf += (err + 1), truncation
would cause tmp_buf to advance past tmp_buf_end.

On the next loop iteration, the bounds check evaluates
(tmp_buf_end - tmp_buf < sizeof_cur_arg).

Since tmp_buf > tmp_buf_end, the left side evaluates to a negative ptrdiff_t.
This is compared to sizeof_cur_arg (a size_t), causing the negative value
to be promoted to a large unsigned integer.

Could this bypass the bounds check and allow out-of-bounds writes into
adjacent per-CPU memory?

There is another regression not introduced by this patch related to string
truncation.

When formatting %s, the code uses bpf_trace_copy_string() which internally
uses strncpy_from_kernel_nofault():

kernel/bpf/helpers.c:bpf_bprintf_prepare() {
    ...
    } else if (fmt[i] == 's') {
        ...
        unsafe_ptr = (char *)(long)raw_args[num_spec];
        err = bpf_trace_copy_string(tmp_buf, unsafe_ptr,
                                    fmt_ptype,
                                    tmp_buf_end - tmp_buf);
        if (err < 0) {
            tmp_buf[0] = '\0';
            err = 1;
        }

        tmp_buf += err;
        num_spec++;
        ...
}

Does this code lack a null terminator if the string is truncated?

If the source string length meets or exceeds the remaining space, exactly
tmp_buf_end - tmp_buf characters are copied without appending a null byte.

The code advances tmp_buf += err but fails to manually append a null
terminator.

Later, bstr_printf() unconditionally calls strlen(str_arg) for %s.

Because the string is not null-terminated, could strlen() scan past the end
of the bin_args buffer and leak potentially sensitive kernel data?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260415032126.1096299-1-dingyihan@uniontech.com?part=1

next prev parent reply	other threads:[~2026-04-15  3:48 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-15  3:21 [PATCH bpf v2 0/2] bpf: allow UTF-8 literals in bpf_bprintf_prepare() Yihan Ding
2026-04-15  3:21 ` [PATCH bpf v2 1/2] " Yihan Ding
2026-04-15  3:48   ` sashiko-bot [this message]
2026-04-15 10:44   ` Paul Chaignon
2026-04-15 10:49     ` Paul Chaignon
2026-04-15  3:21 ` [PATCH bpf v2 2/2] selftests/bpf: cover UTF-8 trace_printk output Yihan Ding
2026-04-15 10:46   ` Paul Chaignon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260415034800.89F7EC2BCB4@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=dingyihan@uniontech.com \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.