Netdev List
 help / color / mirror / Atom feed
From: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>
To: "Yuyang Huang" <yuyanghuang@google.com>
Cc: "Daniel Borkmann" <daniel@iogearbox.net>,
	bot+bpf-ci@kernel.org, "Alexei Starovoitov" <ast@kernel.org>,
	"Andrew Lunn" <andrew+netdev@lunn.ch>,
	"Andrii Nakryiko" <andrii@kernel.org>, Eduard <eddyz87@gmail.com>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Kumar Kartikeya Dwivedi" <memxor@gmail.com>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Nikolay Aleksandrov" <razor@blackwall.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Simon Horman" <horms@kernel.org>, "Song Liu" <song@kernel.org>,
	"Stanislav Fomichev" <sdf@fomichev.me>,
	"Yonghong Song" <yonghong.song@linux.dev>,
	bpf <bpf@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	"Network Development" <netdev@vger.kernel.org>,
	"Maciej Żenczykowski" <maze@google.com>,
	"Lorenzo Colitti" <lorenzo@google.com>,
	"Martin KaFai Lau" <martin.lau@kernel.org>,
	"Chris Mason" <clm@meta.com>,
	"Ihor Solodrai" <ihor.solodrai@linux.dev>,
	"Todd Kjos" <tkjos@google.com>,
	"Carlos Llamas" <cmllamas@google.com>,
	"Kalesh Singh" <kaleshsingh@google.com>
Subject: Re: [PATCH bpf-next 1/2] bpf: align syscall writeback behavior with caller-declared size
Date: Fri, 29 May 2026 13:36:31 -0700	[thread overview]
Message-ID: <DIVG6ANND95G.2MC7WWJG1P01A@gmail.com> (raw)
In-Reply-To: <CADXeF1E-QpKj-37T5ghiycXrVR_YN72fOc3M4OQg-Arz7JNjsQ@mail.gmail.com>

On Thu May 28, 2026 at 9:44 PM PDT, Yuyang Huang wrote:
> On Fri, May 29, 2026 at 9:03 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Mon May 25, 2026 at 12:21 AM PDT, Yuyang Huang wrote:
>
> Thanks for the reply.
>
>> Ok. This is fair. 120933984460 is indeed problematic.
>
> I'm glad to hear we agree that 120933984460 is problematic.
>
>> Not really. Your patch adds checks to what looks like "random" copy_to_user()
>> places. This thread is clear indication that it's not a "robust architectural path".
>>
>> True robust path would be to wrap every copy_to_user() into another helper
>> that takes uattr start pointer and size, does extra check, and
>> returns something like ENOSPC (and not EFAULT),
>> but that would be a significant churn.
>
> I agree that the truly robust path would be a proper copy_to_user()
> wrapper instead of doing the ad-hoc check in the current path, but
> indeed, that will be a big change.
>
>> So I prefer a minimal patch that add single check in bpf_prog_query()
>> that checks that user space supplied buffer is large enough for attr->query.
>> If not -> EFAULT. That would be better than overwritting.
>> One can argue that partial copy_to_user() in __cgroup_bpf_query()
>> should still be allowed, but I'm against partial and
>> inconsistent queries.
>> query.revision might seem benign, but if we do it for all copy_to_user()
>> we better return ENOSPC to differentiate vs EFAULT to tell user space
>> to fix itself.
>
> Okay, I understand the preference is for a minimal patch to implement
> the fix. Just to make sure I fully understand your suggestion:
>
> Are you proposing that we add a single check at the entry gate in
> `bpf_prog_query()` like this?
>
> ```c
> static int bpf_prog_query(const union bpf_attr *attr,
>                           union bpf_attr __user *uattr, u32 uattr_size)
> {
>         if (uattr_size < offsetofend(union bpf_attr, query.revision))
>                 return -EFAULT;
>         ...
> }
> ```
>
> If we implement this, I want to confirm if you are okay with the
> consequence for legacy cgroup queries (e.g.,
> `BPF_CGROUP_INET_INGRESS`):
>
> 1. Before 120933984460: Legacy userspace compiled years ago with a
> 40-byte `bpf_attr` layout (ending at `prog_attach_flags`) regularly
> queries cgroups passing `size = 40`, and it worked perfectly.
> 2. With this check: These unmodified legacy applications will now fail
> with `-EFAULT` on newer kernels because they pass `size = 40` which is
> less than 64. This means all user-space applicaion is expected to
> "fix" their code when along with the kernel upgrade.

I bet you fixed your android setup long ago,
so all of these "but what about backward compat" is getting annoying.
Whatever kernel patch get merged android won't even backport, so enough of it.

> Is your preference to explicitly fail these legacy `size = 40` cgroup
> queries (breaking backward compatibility for them) to avoid "partial
> and inconsistent queries"?
>
> If we want to keep these legacy queries working safely (without
> failing them and without causing memory corruption), we cannot use a
> single check in `bpf_prog_query()`. We would still be forced to plumb
> `uattr_size` to `__cgroup_bpf_query()` so we can conditionally skip
> the `revision` writeback when `size < 64`.

Replied earlier that I don't like random checks like this through the code.

  reply	other threads:[~2026-05-29 20:36 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15  7:15 [PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size Yuyang Huang
2026-05-15  7:15 ` [PATCH bpf-next 1/2] bpf: align syscall writeback behavior with caller-declared size Yuyang Huang
2026-05-15  8:14   ` bot+bpf-ci
2026-05-21 11:40     ` Yuyang Huang
2026-05-24 11:22       ` Alexei Starovoitov
2026-05-25  7:21         ` Yuyang Huang
2026-05-28  5:42           ` Leon Hwang
2026-05-28 13:20             ` Yuyang Huang
2026-05-28 14:36               ` Leon Hwang
2026-05-28 15:08                 ` Lorenzo Colitti
2026-05-28 16:01                   ` Yuyang Huang
2026-05-28 16:18                     ` Yuyang Huang
2026-05-29  1:03           ` Alexei Starovoitov
2026-05-29  4:44             ` Yuyang Huang
2026-05-29 20:36               ` Alexei Starovoitov [this message]
2026-05-29 23:42                 ` Yuyang Huang
2026-05-15  7:15 ` [PATCH bpf-next 2/2] selftests/bpf: Add verification for BPF_PROG_QUERY attr size boundaries Yuyang Huang
2026-05-18  3:29 ` [PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size Alexei Starovoitov
2026-05-18  5:07   ` Yuyang Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DIVG6ANND95G.2MC7WWJG1P01A@gmail.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bot+bpf-ci@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=cmllamas@google.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=ihor.solodrai@linux.dev \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kaleshsingh@google.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=lorenzo@google.com \
    --cc=martin.lau@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=maze@google.com \
    --cc=memxor@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=razor@blackwall.org \
    --cc=sdf@fomichev.me \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=tkjos@google.com \
    --cc=yonghong.song@linux.dev \
    --cc=yuyanghuang@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox