From: Paul Chaignon <paul.chaignon@gmail.com>
To: Yihan Ding <dingyihan@uniontech.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, shuah@kernel.org, alan.maguire@oracle.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH bpf v3 1/2] bpf: allow UTF-8 literals in bpf_bprintf_prepare()
Date: Fri, 17 Apr 2026 00:32:57 +0200
Message-ID: <aeFjma_GYKD23X88@mail.gmail.com>
In-Reply-To: <20260416120142.1420646-2-dingyihan@uniontech.com>

On Thu, Apr 16, 2026 at 08:01:41PM +0800, Yihan Ding wrote:
> bpf_bprintf_prepare() only needs ASCII parsing for conversion
> specifiers. Plain text can safely carry bytes >= 0x80, so allow
> UTF-8 literals outside '%' sequences while keeping ASCII control
> bytes rejected and format specifiers ASCII-only.
> 
> This keeps existing parsing rules for format directives unchanged,
> while allowing helpers such as bpf_trace_printk() to emit UTF-8
> literal text.
> 
> Update test_snprintf_negative() in the same commit so selftests keep
> matching the new plain-text vs format-specifier split during bisection.
> 
> Fixes: 48cac3f4a96d ("bpf: Implement formatted output helpers with bstr_printf")
> Signed-off-by: Yihan Ding <dingyihan@uniontech.com>
> ---
>  kernel/bpf/helpers.c                            | 17 ++++++++++++++++-
>  .../testing/selftests/bpf/prog_tests/snprintf.c |  3 ++-
>  2 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 6eb6c82ed2ee..d51f1b612f1d 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -845,7 +845,13 @@ int bpf_bprintf_prepare(const char *fmt, u32 fmt_size, const u64 *raw_args,
>  		data->buf = buffers->buf;
>  
>  	for (i = 0; i < fmt_size; i++) {
> -		if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) {
> +		unsigned char c = fmt[i];

I'm not sure this extra variable is worth it, but it's probably not
worth sending a v4 just for that.

> +
> +		/*
> +		 * Permit bytes >= 0x80 in plain text so UTF-8 literals can pass
> +		 * through unchanged, while still rejecting ASCII control bytes.
> +		 */
> +		if (isascii(c) && !isprint(c) && !isspace(c)) {
>  			err = -EINVAL;
>  			goto out;
>  		}
> @@ -867,6 +873,15 @@ int bpf_bprintf_prepare(const char *fmt, u32 fmt_size, const u64 *raw_args,
>  		 * always access fmt[i + 1], in the worst case it will be a 0
>  		 */
>  		i++;
> +		c = fmt[i];
> +		/*
> +		 * The format parser below only understands ASCII conversion
> +		 * specifiers and modifiers, so reject non-ASCII after '%'.
> +		 */
> +		if (!isascii(c)) {
> +			err = -EINVAL;
> +			goto out;
> +		}
>  
>  		/* skip optional "[0 +-][num]" width formatting field */
>  		while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||
> diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> index 594441acb707..4e4a82d54f79 100644
> --- a/tools/testing/selftests/bpf/prog_tests/snprintf.c
> +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> @@ -114,7 +114,8 @@ static void test_snprintf_negative(void)
>  	ASSERT_ERR(load_single_snprintf("%--------"), "invalid specifier 5");
>  	ASSERT_ERR(load_single_snprintf("%lc"), "invalid specifier 6");
>  	ASSERT_ERR(load_single_snprintf("%llc"), "invalid specifier 7");
> -	ASSERT_ERR(load_single_snprintf("\x80"), "non ascii character");
> +	ASSERT_OK(load_single_snprintf("\x80"), "non ascii plain text");
> +	ASSERT_ERR(load_single_snprintf("%\x80"), "non ascii in specifier");

Acked-by: Paul Chaignon <paul.chaignon@gmail.com>

>  	ASSERT_ERR(load_single_snprintf("\x1"), "non printable character");
>  	ASSERT_ERR(load_single_snprintf("%p%"), "invalid specifier 8");
>  	ASSERT_ERR(load_single_snprintf("%s%"), "invalid specifier 9");
> -- 
> 2.20.1
> 
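
For anyone following along, the byte-level rule this patch introduces can
be approximated in userspace as below. This is only a rough sketch under
stated assumptions, not the kernel code: fmt_bytes_ok() and the is_*()
helpers are hypothetical names, the character classes are spelled out as
plain ASCII ranges instead of using the kernel's ctype table, the input is
assumed to be NUL-terminated, and the conversion specifier itself is not
validated (so a trailing "%" is not rejected here).

```c
#include <stddef.h>

/* ASCII-only character classes, written out to avoid locale effects. */
static int is_ascii(unsigned char c) { return c < 0x80; }
static int is_print(unsigned char c) { return c >= 0x20 && c <= 0x7e; }
static int is_space(unsigned char c) { return c == ' ' || (c >= '\t' && c <= '\r'); }

/*
 * Hypothetical userspace approximation of the two byte checks in the patch.
 * Returns 0 if every byte passes, -1 otherwise.
 */
int fmt_bytes_ok(const char *fmt)
{
	size_t i;

	for (i = 0; fmt[i] != '\0'; i++) {
		unsigned char c = (unsigned char)fmt[i];

		/* Plain text: reject ASCII control bytes, let bytes >= 0x80 through. */
		if (is_ascii(c) && !is_print(c) && !is_space(c))
			return -1;

		if (c != '%')
			continue;

		/* "%%" is a literal percent sign, skip both bytes. */
		if (fmt[i + 1] == '%') {
			i++;
			continue;
		}

		/* The byte right after '%' must be ASCII (worst case it is the NUL). */
		if (!is_ascii((unsigned char)fmt[i + 1]))
			return -1;
	}

	return 0;
}
```

With this sketch, "\x80" and "caf\xc3\xa9 %d\n" are accepted, while "\x1"
and "%\x80" are rejected, which matches the updated expectations in
test_snprintf_negative() for the cases it covers.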
