Re: [PATCH bpf v2 1/2] bpf: allow UTF-8 literals in bpf_bprintf_prepare()

public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Yihan Ding" <dingyihan@uniontech.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH bpf v2 1/2] bpf: allow UTF-8 literals in bpf_bprintf_prepare()
Date: Wed, 15 Apr 2026 03:48:00 +0000	[thread overview]
Message-ID: <20260415034800.89F7EC2BCB4@smtp.kernel.org> (raw)
In-Reply-To: <20260415032126.1096299-2-dingyihan@uniontech.com>

Sashiko AI review found 2 potential issue(s):
- [Critical] Misuse of `snprintf` return value causes `tmp_buf` to advance out-of-bounds, leading to buffer overflow and memory corruption.
- [High] Truncated strings in `%s` specifiers lack a null terminator, causing `bstr_printf` to perform an out-of-bounds read and leak kernel memory.
--

commit 5e148b2d150df2934403df59a01d44e2a86c360d
Author: Yihan Ding <dingyihan@uniontech.com>

bpf: allow UTF-8 literals in bpf_bprintf_prepare()

bpf_bprintf_prepare() only needs ASCII parsing for conversion
specifiers. Plain text can safely carry bytes >= 0x80, so allow
UTF-8 literals outside '%' sequences while keeping ASCII control
bytes rejected and format specifiers ASCII-only.

> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index bb95e287b0dc8..b40add1cfa2c4 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c

[ ... ]

> @@ -867,6 +873,14 @@ int bpf_bprintf_prepare(const char *fmt, u32 fmt_size, const u64 *raw_args,
>  		 * always access fmt[i + 1], in the worst case it will be a 0
>  		 */
>  		i++;
> +		/*
> +		 * The format parser below only understands ASCII conversion
> +		 * specifiers and modifiers, so reject non-ASCII after '%'.
> +		 */
> +		if (!isascii((unsigned char)fmt[i])) {
> +			err = -EINVAL;
> +			goto out;
> +		}
>  
>  		/* skip optional "[0 +-][num]" width formatting field */
>  		while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||

This regression wasn't introduced by this patch, but there appears to be an
issue in how the return value of snprintf() is used later in this function.

When parsing %pB or %pI specifiers, the code formats the argument into the
bin_args buffer:

kernel/bpf/helpers.c:bpf_bprintf_prepare() {
    ...
    if (fmt[i + 1] == 'B') {
        if (tmp_buf)  {
            err = snprintf(tmp_buf,
                           (tmp_buf_end - tmp_buf),
                           "%pB",
                           (void *)(long)raw_args[num_spec]);
            tmp_buf += (err + 1);
        }
    ...
}

Does this code advance tmp_buf out-of-bounds if the remaining space is too
small?

The kernel's snprintf() returns the number of characters that would have
been written. If truncation occurs, it returns the un-truncated length.

Since the code unconditionally executes tmp_buf += (err + 1), truncation
would cause tmp_buf to advance past tmp_buf_end.

On the next loop iteration, the bounds check evaluates
(tmp_buf_end - tmp_buf < sizeof_cur_arg).

Since tmp_buf > tmp_buf_end, the left side evaluates to a negative ptrdiff_t.
This is compared to sizeof_cur_arg (a size_t), causing the negative value
to be promoted to a large unsigned integer.

Could this bypass the bounds check and allow out-of-bounds writes into
adjacent per-CPU memory?

There is another regression not introduced by this patch related to string
truncation.

When formatting %s, the code uses bpf_trace_copy_string() which internally
uses strncpy_from_kernel_nofault():

kernel/bpf/helpers.c:bpf_bprintf_prepare() {
    ...
    } else if (fmt[i] == 's') {
        ...
        unsafe_ptr = (char *)(long)raw_args[num_spec];
        err = bpf_trace_copy_string(tmp_buf, unsafe_ptr,
                                    fmt_ptype,
                                    tmp_buf_end - tmp_buf);
        if (err < 0) {
            tmp_buf[0] = '\0';
            err = 1;
        }

        tmp_buf += err;
        num_spec++;
        ...
}

Does this code lack a null terminator if the string is truncated?

If the source string length meets or exceeds the remaining space, exactly
tmp_buf_end - tmp_buf characters are copied without appending a null byte.

The code advances tmp_buf += err but fails to manually append a null
terminator.

Later, bstr_printf() unconditionally calls strlen(str_arg) for %s.

Because the string is not null-terminated, could strlen() scan past the end
of the bin_args buffer and leak potentially sensitive kernel data?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260415032126.1096299-1-dingyihan@uniontech.com?part=1

next prev parent reply	other threads:[~2026-04-15  3:48 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-15  3:21 [PATCH bpf v2 0/2] bpf: allow UTF-8 literals in bpf_bprintf_prepare() Yihan Ding
2026-04-15  3:21 ` [PATCH bpf v2 1/2] " Yihan Ding
2026-04-15  3:48   ` sashiko-bot [this message]
2026-04-15 10:44   ` Paul Chaignon
2026-04-15 10:49     ` Paul Chaignon
2026-04-15  3:21 ` [PATCH bpf v2 2/2] selftests/bpf: cover UTF-8 trace_printk output Yihan Ding
2026-04-15 10:46   ` Paul Chaignon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260415034800.89F7EC2BCB4@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=dingyihan@uniontech.com \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox