From: Leon Hwang <leon.hwang@linux.dev>
To: Yuyang Huang <yuyanghuang@google.com>
Cc: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
"Daniel Borkmann" <daniel@iogearbox.net>,
bot+bpf-ci@kernel.org, "Alexei Starovoitov" <ast@kernel.org>,
"Andrew Lunn" <andrew+netdev@lunn.ch>,
"Andrii Nakryiko" <andrii@kernel.org>, Eduard <eddyz87@gmail.com>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Jiri Olsa" <jolsa@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>,
"Kumar Kartikeya Dwivedi" <memxor@gmail.com>,
"Martin KaFai Lau" <martin.lau@linux.dev>,
"Nikolay Aleksandrov" <razor@blackwall.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Shuah Khan" <shuah@kernel.org>,
"Simon Horman" <horms@kernel.org>, "Song Liu" <song@kernel.org>,
"Stanislav Fomichev" <sdf@fomichev.me>,
"Yonghong Song" <yonghong.song@linux.dev>,
bpf <bpf@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@vger.kernel.org>,
"Network Development" <netdev@vger.kernel.org>,
"Maciej Żenczykowski" <maze@google.com>,
"Lorenzo Colitti" <lorenzo@google.com>,
"Martin KaFai Lau" <martin.lau@kernel.org>,
"Chris Mason" <clm@meta.com>,
"Ihor Solodrai" <ihor.solodrai@linux.dev>
Subject: Re: [PATCH bpf-next 1/2] bpf: align syscall writeback behavior with caller-declared size
Date: Thu, 28 May 2026 22:36:57 +0800 [thread overview]
Message-ID: <6083831a-dc6d-42e4-a1c8-8b7f6b966650@linux.dev> (raw)
In-Reply-To: <CADXeF1HRe=U7aWx2yx6RFZ_uH-mMbdsxMheofr2tMD9wae46oQ@mail.gmail.com>
On 2026/5/28 21:20, Yuyang Huang wrote:
> On Thu, May 28, 2026 at 1:43 PM Leon Hwang <leon.hwang@linux.dev> wrote:
>>
>> On 25/5/26 15:21, Yuyang Huang wrote:
>> [...]
>>>
>>> Feel free to let us know your thoughts.
>>>
>> I believe this is a user space issue instead of a kernel bug.
>>
>> I tried to use mmap() memory as uattr that got -EFAULT instead of crash.
>>
>> [................] /* mmap() memory */
>> ^ tail 40B as uattr
>> ^ 56B offset for copy_to_user()
>>
>> Thanks,
>> Leon
>>
>
> Thanks for testing this!
>
> There are some discussion in the original thread:
> https://lore.kernel.org/all/CANP3RGfZTXM_u=E_atoomPZXutoQJ02nOMkCCR-YBZbOm2suWA@mail.gmail.com/
> as follows, which might answer your question
>
It seems you haven't convinced Alexei in that thread.
>>>> If the uattr indeed has less than needed space, then for
>>>> if (copy_to_user(&uattr->query.revision, &revision, sizeof(revision)))
>>>> return -EFAULT;
>>>> the kernel will return -EFAULT to user space.
>>>>
>>>> Maybe userspace didn't handle the return code properly and causing
>>>> user space corruption and segfaults. This shouldn't be a kernel issue.
>>>> Maybe I missed something?
>>>
>>> That's not how that works at all.
>>>
>>> copy_to_user() will only fail and thus EFAULT will only be returned if
>>> the memory area copy_to_user() is trying to copy into isn't
>>> owned/mapped by the user (or perhaps is read-only protected, not sure
>>> about this last one).
>>>
>>> Because memory is mapped in (at least) 4K pages, the memory after a
>>> user buffer is almost always still valid memory. It might be unused,
>>> or it might be something on the stack - like a return address, or it
>>> might be on the heap - metadata tracking, or a different memory
>>> allocation perhaps entirely.
>
> You might hit the same case as maze@ mentioned in the thread.
>
> To trigger -EFAULT, you likely positioned `uattr` at the very end of a
> mapped page immediately followed by a protected page
>
> Could you share the test program you created so we can verify?
>
Attached below.
> Please check the test program I shared earlier in the thread (where
> uattr is stored on the stack); the BPF syscall returned 0, but stack
> corruption occurred.
>
To avoid such stack corruption, you should reserve enough space for the
query, e.g., by extracting union bpf_attr from kernel BTF vmlinux.
Thanks,
Leon
> If you think my test program contains a bug, feel free to let me know.
>
> Thanks,
>
> Yuyang
---
Assisted-by: Copilot:gemini-3-1-pro-preview
// SPDX-License-Identifier: GPL-2.0
#include <uapi/linux/if_link.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/bpf.h>
#include <net/if.h>
#include <test_progs.h>
#define loopback 1
#include "test_tc_link.skel.h"
#include "tc_helpers.h"
#define SHORT_QUERY_SIZE offsetofend(union bpf_attr,
query.prog_attach_flags)
/*
* test_tail_uattr_out_of_mmap:
*
* Places uattr at the very tail of a 1-page anonymous mmap so that the
* mandatory fields (target_ifindex..prog_attach_flags, 40 bytes) fit
inside the page
* but query.revision (+56) falls in the unmapped page immediately after.
*
* mmap layout (1 page, e.g. 4096 bytes):
*
* [0 ............ page_size - SHORT_QUERY_SIZE - 1] unused
* [page_size - 40 ................. page_size - 1 ] uattr:
target_ifindex..prog_cnt
* [page_size .................................. ] UNMAPPED
* ^
* uattr + 56 (revision) lands here
*/
static void test_tail_uattr_out_of_mmap(void)
{
long page_size = sysconf(_SC_PAGE_SIZE);
LIBBPF_OPTS(bpf_prog_attach_opts, opta);
LIBBPF_OPTS(bpf_prog_detach_opts, optd);
struct test_tc_link *skel;
union bpf_attr *attr;
void *mem, *tail;
int fd, err;
skel = test_tc_link__open_and_load();
if (!ASSERT_OK_PTR(skel, "skel_load"))
return;
fd = bpf_program__fd(skel->progs.tc1);
err = bpf_prog_attach_opts(fd, loopback, BPF_TCX_INGRESS, &opta);
if (!ASSERT_OK(err, "prog_attach"))
goto cleanup_skel;
/*
* Allocate 2 contiguous pages then immediately unmap the second one.
* This guarantees the page following the first is unmapped, regardless
* of what the runtime placed there beforehand.
* Place uattr at the tail: last SHORT_QUERY_SIZE (40) bytes of page 1.
* uattr + 56 (revision) therefore lands 16 bytes past the page end,
* in the unmapped region.
*/
mem = mmap(NULL, 2 * page_size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (!ASSERT_OK_PTR(mem, "mmap"))
goto detach;
munmap((char *)mem + page_size, page_size);
tail = (char *)mem + page_size - SHORT_QUERY_SIZE;
memset(tail, 0, SHORT_QUERY_SIZE);
attr = (union bpf_attr *)tail;
attr->query.target_ifindex = loopback;
attr->query.attach_type = BPF_TCX_INGRESS;
err = syscall(__NR_bpf, BPF_PROG_QUERY, tail, SHORT_QUERY_SIZE);
ASSERT_OK(err, "syscall");
ASSERT_OK(errno, "errno");
ASSERT_EQ(attr->query.prog_cnt, 1, "prog_cnt_written");
munmap(mem, page_size); /* second page already unmapped above */
detach:
bpf_prog_detach_opts(fd, loopback, BPF_TCX_INGRESS, &optd);
cleanup_skel:
test_tc_link__destroy(skel);
}
void test_mmap_uattr_corruption(void)
{
if (test__start_subtest("tail_uattr_out_of_mmap"))
test_tail_uattr_out_of_mmap();
}
next prev parent reply other threads:[~2026-05-28 14:37 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-15 7:15 [PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size Yuyang Huang
2026-05-15 7:15 ` [PATCH bpf-next 1/2] bpf: align syscall writeback behavior with caller-declared size Yuyang Huang
2026-05-15 8:14 ` bot+bpf-ci
2026-05-21 11:40 ` Yuyang Huang
2026-05-24 11:22 ` Alexei Starovoitov
2026-05-25 7:21 ` Yuyang Huang
2026-05-28 5:42 ` Leon Hwang
2026-05-28 13:20 ` Yuyang Huang
2026-05-28 14:36 ` Leon Hwang [this message]
2026-05-28 15:08 ` Lorenzo Colitti
2026-05-28 16:01 ` Yuyang Huang
2026-05-28 16:18 ` Yuyang Huang
2026-05-29 1:03 ` Alexei Starovoitov
2026-05-29 4:44 ` Yuyang Huang
2026-05-29 20:36 ` Alexei Starovoitov
2026-05-29 23:42 ` Yuyang Huang
2026-05-15 7:15 ` [PATCH bpf-next 2/2] selftests/bpf: Add verification for BPF_PROG_QUERY attr size boundaries Yuyang Huang
2026-05-18 3:29 ` [PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size Alexei Starovoitov
2026-05-18 5:07 ` Yuyang Huang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6083831a-dc6d-42e4-a1c8-8b7f6b966650@linux.dev \
--to=leon.hwang@linux.dev \
--cc=alexei.starovoitov@gmail.com \
--cc=andrew+netdev@lunn.ch \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bot+bpf-ci@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=clm@meta.com \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=ihor.solodrai@linux.dev \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=lorenzo@google.com \
--cc=martin.lau@kernel.org \
--cc=martin.lau@linux.dev \
--cc=maze@google.com \
--cc=memxor@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=razor@blackwall.org \
--cc=sdf@fomichev.me \
--cc=shuah@kernel.org \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
--cc=yuyanghuang@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox