BPF List
 help / color / mirror / Atom feed
From: Leon Hwang <leon.hwang@linux.dev>
To: Yuyang Huang <yuyanghuang@google.com>
Cc: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	bot+bpf-ci@kernel.org, "Alexei Starovoitov" <ast@kernel.org>,
	"Andrew Lunn" <andrew+netdev@lunn.ch>,
	"Andrii Nakryiko" <andrii@kernel.org>, Eduard <eddyz87@gmail.com>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Kumar Kartikeya Dwivedi" <memxor@gmail.com>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Nikolay Aleksandrov" <razor@blackwall.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Simon Horman" <horms@kernel.org>, "Song Liu" <song@kernel.org>,
	"Stanislav Fomichev" <sdf@fomichev.me>,
	"Yonghong Song" <yonghong.song@linux.dev>,
	bpf <bpf@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	"Network Development" <netdev@vger.kernel.org>,
	"Maciej Żenczykowski" <maze@google.com>,
	"Lorenzo Colitti" <lorenzo@google.com>,
	"Martin KaFai Lau" <martin.lau@kernel.org>,
	"Chris Mason" <clm@meta.com>,
	"Ihor Solodrai" <ihor.solodrai@linux.dev>
Subject: Re: [PATCH bpf-next 1/2] bpf: align syscall writeback behavior with caller-declared size
Date: Thu, 28 May 2026 22:36:57 +0800	[thread overview]
Message-ID: <6083831a-dc6d-42e4-a1c8-8b7f6b966650@linux.dev> (raw)
In-Reply-To: <CADXeF1HRe=U7aWx2yx6RFZ_uH-mMbdsxMheofr2tMD9wae46oQ@mail.gmail.com>

On 2026/5/28 21:20, Yuyang Huang wrote:
> On Thu, May 28, 2026 at 1:43 PM Leon Hwang <leon.hwang@linux.dev> wrote:
>>
>> On 25/5/26 15:21, Yuyang Huang wrote:
>> [...]
>>>
>>> Feel free to let us know your thoughts.
>>>
>> I believe this is a user space issue instead of a kernel bug.
>>
>> I tried to use mmap() memory as uattr that got -EFAULT instead of crash.
>>
>> [................] /* mmap() memory */
>>              ^ tail 40B as uattr
>>                     ^ 56B offset for copy_to_user()
>>
>> Thanks,
>> Leon
>>
> 
> Thanks for testing this!
> 
> There are some discussion in the original thread:
> https://lore.kernel.org/all/CANP3RGfZTXM_u=E_atoomPZXutoQJ02nOMkCCR-YBZbOm2suWA@mail.gmail.com/
> as follows, which might answer your question
> 

It seems you haven't convinced Alexei in that thread.

>>>> If the uattr indeed has less than needed space, then for
>>>>          if (copy_to_user(&uattr->query.revision, &revision, sizeof(revision)))
>>>>                  return -EFAULT;
>>>> the kernel will return -EFAULT to user space.
>>>>
>>>> Maybe userspace didn't handle the return code properly and causing
>>>> user space corruption and segfaults. This shouldn't be a kernel issue.
>>>> Maybe I missed something?
>>>
>>> That's not how that works at all.
>>>
>>> copy_to_user() will only fail and thus EFAULT will only be returned if
>>> the memory area copy_to_user() is trying to copy into isn't
>>> owned/mapped by the user (or perhaps is read-only protected, not sure
>>> about this last one).
>>>
>>> Because memory is mapped in (at least) 4K pages, the memory after a
>>> user buffer is almost always still valid memory.  It might be unused,
>>> or it might be something on the stack - like a return address, or it
>>> might be on the heap - metadata tracking, or a different memory
>>> allocation perhaps entirely.
> 
> You might hit the same case as maze@ mentioned in the thread.
> 
> To trigger -EFAULT, you likely positioned `uattr` at the very end of a
> mapped page immediately followed by a protected page
> 
> Could you share the test program you created so we can verify?
> 

Attached below.

> Please check the test program I shared earlier in the thread (where
> uattr is stored on the stack); the BPF syscall returned 0, but stack
> corruption occurred.
> 

To avoid such stack corruption, you should reserve enough space for the
query, e.g., by extracting union bpf_attr from kernel BTF vmlinux.

Thanks,
Leon

> If you think my test program contains a bug, feel free to let me know.
> 
> Thanks,
> 
> Yuyang

---

Assisted-by: Copilot:gemini-3-1-pro-preview

// SPDX-License-Identifier: GPL-2.0
#include <uapi/linux/if_link.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/bpf.h>
#include <net/if.h>
#include <test_progs.h>

#define loopback 1

#include "test_tc_link.skel.h"
#include "tc_helpers.h"

#define SHORT_QUERY_SIZE	offsetofend(union bpf_attr,
query.prog_attach_flags)

/*
 * test_tail_uattr_out_of_mmap:
 *
 * Places uattr at the very tail of a 1-page anonymous mmap so that the
 * mandatory fields (target_ifindex..prog_attach_flags, 40 bytes) fit
inside the page
 * but query.revision (+56) falls in the unmapped page immediately after.
 *
 *   mmap layout (1 page, e.g. 4096 bytes):
 *
 *     [0 ............ page_size - SHORT_QUERY_SIZE - 1]  unused
 *     [page_size - 40 ................. page_size - 1 ]  uattr:
target_ifindex..prog_cnt
 *     [page_size ..................................   ]  UNMAPPED
 *               ^
 *               uattr + 56 (revision) lands here
 */
static void test_tail_uattr_out_of_mmap(void)
{
	long page_size = sysconf(_SC_PAGE_SIZE);
	LIBBPF_OPTS(bpf_prog_attach_opts, opta);
	LIBBPF_OPTS(bpf_prog_detach_opts, optd);
	struct test_tc_link *skel;
	union bpf_attr *attr;
	void *mem, *tail;
	int fd, err;

	skel = test_tc_link__open_and_load();
	if (!ASSERT_OK_PTR(skel, "skel_load"))
		return;

	fd = bpf_program__fd(skel->progs.tc1);

	err = bpf_prog_attach_opts(fd, loopback, BPF_TCX_INGRESS, &opta);
	if (!ASSERT_OK(err, "prog_attach"))
		goto cleanup_skel;

	/*
	 * Allocate 2 contiguous pages then immediately unmap the second one.
	 * This guarantees the page following the first is unmapped, regardless
	 * of what the runtime placed there beforehand.
	 * Place uattr at the tail: last SHORT_QUERY_SIZE (40) bytes of page 1.
	 * uattr + 56 (revision) therefore lands 16 bytes past the page end,
	 * in the unmapped region.
	 */
	mem = mmap(NULL, 2 * page_size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (!ASSERT_OK_PTR(mem, "mmap"))
		goto detach;
	munmap((char *)mem + page_size, page_size);

	tail = (char *)mem + page_size - SHORT_QUERY_SIZE;
	memset(tail, 0, SHORT_QUERY_SIZE);

	attr = (union bpf_attr *)tail;
	attr->query.target_ifindex = loopback;
	attr->query.attach_type = BPF_TCX_INGRESS;

	err = syscall(__NR_bpf, BPF_PROG_QUERY, tail, SHORT_QUERY_SIZE);
	ASSERT_OK(err, "syscall");
	ASSERT_OK(errno, "errno");
	ASSERT_EQ(attr->query.prog_cnt, 1, "prog_cnt_written");

	munmap(mem, page_size);	/* second page already unmapped above */
detach:
	bpf_prog_detach_opts(fd, loopback, BPF_TCX_INGRESS, &optd);
cleanup_skel:
	test_tc_link__destroy(skel);
}

void test_mmap_uattr_corruption(void)
{
	if (test__start_subtest("tail_uattr_out_of_mmap"))
		test_tail_uattr_out_of_mmap();
}


  reply	other threads:[~2026-05-28 14:37 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15  7:15 [PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size Yuyang Huang
2026-05-15  7:15 ` [PATCH bpf-next 1/2] bpf: align syscall writeback behavior with caller-declared size Yuyang Huang
2026-05-15  8:14   ` bot+bpf-ci
2026-05-21 11:40     ` Yuyang Huang
2026-05-24 11:22       ` Alexei Starovoitov
2026-05-25  7:21         ` Yuyang Huang
2026-05-28  5:42           ` Leon Hwang
2026-05-28 13:20             ` Yuyang Huang
2026-05-28 14:36               ` Leon Hwang [this message]
2026-05-28 15:08                 ` Lorenzo Colitti
2026-05-28 16:01                   ` Yuyang Huang
2026-05-28 16:18                     ` Yuyang Huang
2026-05-29  1:03           ` Alexei Starovoitov
2026-05-29  4:44             ` Yuyang Huang
2026-05-29 20:36               ` Alexei Starovoitov
2026-05-29 23:42                 ` Yuyang Huang
2026-05-15  7:15 ` [PATCH bpf-next 2/2] selftests/bpf: Add verification for BPF_PROG_QUERY attr size boundaries Yuyang Huang
2026-05-18  3:29 ` [PATCH bpf-next 0/2] bpf: Align syscall writeback behavior with user-declared size Alexei Starovoitov
2026-05-18  5:07   ` Yuyang Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6083831a-dc6d-42e4-a1c8-8b7f6b966650@linux.dev \
    --to=leon.hwang@linux.dev \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bot+bpf-ci@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=ihor.solodrai@linux.dev \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=lorenzo@google.com \
    --cc=martin.lau@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=maze@google.com \
    --cc=memxor@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=razor@blackwall.org \
    --cc=sdf@fomichev.me \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    --cc=yuyanghuang@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox