Linux kernel -stable discussions
From: "Emil Tsalapatis" <emil@etsalapatis.com>
To: "Dawei Feng" <dawei.feng@seu.edu.cn>, <martin.lau@linux.dev>
Cc: <ast@kernel.org>, <daniel@iogearbox.net>, <andrii@kernel.org>,
	<eddyz87@gmail.com>, <memxor@gmail.com>, <song@kernel.org>,
	<yonghong.song@linux.dev>, <jolsa@kernel.org>, <kees@kernel.org>,
	<joel.granados@kernel.org>, <bpf@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
	<jianhao.xu@seu.edu.cn>, <stable@vger.kernel.org>,
	"Zilin Guan" <zilin@seu.edu.cn>
Subject: Re: [PATCH 2/2] bpf: cgroup: Use kvfree instead of kfree in __cgroup_bpf_run_filter_sysctl
Date: Tue, 26 May 2026 18:24:44 -0400
Message-ID: <DISYLID6QGFY.1HTQOMHVLTRWS@etsalapatis.com>
In-Reply-To: <20260526131035.1312864-3-dawei.feng@seu.edu.cn>

On Tue May 26, 2026 at 9:10 AM EDT, Dawei Feng wrote:
> proc_sys_call_handler() allocates its temporary sysctl buffer with
> kvzalloc() and passes it to __cgroup_bpf_run_filter_sysctl(). Since
> kvzalloc() may fall back to vmalloc() for large allocations, freeing
> that buffer with kfree() is wrong and can corrupt memory.
>
> Use kvfree(), which safely handles both kmalloc()- and vmalloc()-backed
> allocations.
>
> The bug was first flagged by an experimental analysis tool we are
> developing for kernel memory-management bugs while analyzing
> v6.13-rc1. The tool is still under development and is not yet publicly
> available. Manual inspection confirms that the bug is still
> present in v7.1-rc5.
>
> Reproduced the bug based on v7.1-rc4 in a QEMU x86_64 guest booted with
> KASAN and CONFIG_FAILSLAB enabled. The reproducer confines failslab
> injections to the proc_sys_call_handler() range, uses
> stacktrace-depth=32, and injects fail-nth=1 while writing 8191 bytes to
> /proc/sys/kernel/domainname from a task in the target cgroup. On the
> patch1-only kernel, fail-nth=1 triggered the fault:
>
>   BUG: unable to handle page fault for address: ffffeb0200024d48
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: Oops: 0000  SMP KASAN NOPTI
>   CPU: 2 UID: 0 PID: 209 Comm: repro_proc_sys_ Not tainted 7.1.0-rc4-00686-g97625979a5d4  PREEMPT(lazy)
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
>   RIP: 0010:kfree+0x6e/0x510
>   Code: 80 48 01 ef 0f 82 ae 04 00 00 48 c7 c0 00 00 00 80 48 2b 05 04 1b 23 04 48 01 c7 48 c1 ef 0c 48 c1 e7 06 48 03 3d e2 1a 23 04 <4c> 8b 57 08 4c 89 d0 83 e0 01 48 83 e8 01 49 09 c2 49 >
>   RSP: 0018:ffff888108de7ab8 EFLAGS: 00010282
>   RAX: 0000777f80000000 RBX: ffff88815af398c0 RCX: 0000000000000080
>   RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffeb0200024d40
>   RBP: ffffc90000935000 R08: 0000000000000001 R09: 0000000000000001
>   R10: ffffffff86b4b297 R11: 0000000000000000 R12: ffffffff819b71fd
>   R13: 0000000000000001 R14: ffff888108de7cc0 R15: 0000000000000000
>   FS:  00007f8988cc2b80(0000) GS:ffff8881d3256000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: ffffeb0200024d48 CR3: 0000000101d6b000 CR4: 0000000000350ef0
>   Call Trace:
>    <TASK>
>    ? __cgroup_bpf_run_filter_sysctl+0x626/0xc30
>    __cgroup_bpf_run_filter_sysctl+0x74d/0xc30
>    ? __pfx___cgroup_bpf_run_filter_sysctl+0x10/0x10
>    ? srso_return_thunk+0x5/0x5f
>    ? __kvmalloc_node_noprof+0x345/0x870
>    ? proc_sys_call_handler+0x250/0x480
>    ? srso_return_thunk+0x5/0x5f
>    proc_sys_call_handler+0x3a2/0x480
>    ? __pfx_proc_sys_call_handler+0x10/0x10
>    ? srso_return_thunk+0x5/0x5f
>    ? selinux_file_permission+0x39f/0x500
>    ? srso_return_thunk+0x5/0x5f
>    ? lock_is_held_type+0x9e/0x120
>    vfs_write+0x98e/0x1000
>    ? srso_return_thunk+0x5/0x5f
>    ? kmem_cache_free+0x308/0x550
>    ? __pfx_vfs_write+0x10/0x10
>    ? __pfx_do_sys_openat2+0x10/0x10
>    ksys_write+0xf2/0x1d0
>    ? __pfx_ksys_write+0x10/0x10
>    ? srso_return_thunk+0x5/0x5f
>    ? trace_irq_enable.constprop.0+0x110/0x140
>    do_syscall_64+0x115/0x690
>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>    RIP: 0033:0x7f8988dd8907
>    Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8  01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 >
>    RSP: 002b:00007fff4069b878 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f8988dd8907
>    RDX: 0000000000001fff RSI: 0000564f97ef46b0 RDI: 0000000000000005
>    RBP: 0000564f97ef46b0 R08: 0000000000000000 R09: 0000564f97ef46b0
>    R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000
>    R13: 0000000000001fff R14: 0000000000000005 R15: 0000000000000001
>    </TASK>
>
> With this fix applied, rerunning the reproducer with the same
> fail-nth=1 setup yields no corresponding Oops reports.
>
> Fixes: 4508943794ef ("proc: use kvzalloc for our kernel buffer")
> Cc: stable@vger.kernel.org
>
> Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
> Signed-off-by: Dawei Feng <dawei.feng@seu.edu.cn>
> ---

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

>  kernel/bpf/cgroup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 8715a014c21d..f4eefdacd453 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -1936,7 +1936,7 @@ int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head,
>  	kfree(ctx.cur_val);
>  
>  	if (!ret && ctx.new_updated) {
> -		kfree(*buf);
> +		kvfree(*buf);
>  		*buf = ctx.new_val;
>  		*pcount = ctx.new_len;
>  	} else {
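
For readers following the thread, the allocator mismatch described in the
commit message can be sketched in userspace. This is only a toy model, not
kernel code: every model_* name, the two static arenas, and the
fail_next_slab flag (standing in for failslab's fail-nth=1) are hypothetical
and exist only for illustration. It assumes the usual kvmalloc() behaviour of
trying the slab allocator first and falling back to vmalloc() only for
requests larger than a page, and it mirrors how kvfree() picks the release
path by checking which address range the pointer belongs to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Two disjoint "address ranges", one per modelled allocator. */
static unsigned char slab_arena[1 << 16];
static unsigned char vmalloc_arena[1 << 16];
static size_t slab_used, vmalloc_used;

static bool fail_next_slab;          /* hypothetical stand-in for fail-nth=1 */
static int slab_frees, vmalloc_frees;

/* Range check that plays the role of is_vmalloc_addr(). */
static bool model_is_vmalloc_addr(const void *p)
{
	uintptr_t a = (uintptr_t)p;
	uintptr_t lo = (uintptr_t)vmalloc_arena;

	return a >= lo && a < lo + sizeof(vmalloc_arena);
}

/* Bump allocator for the "slab" range; fails once when injection is armed. */
static void *model_kmalloc(size_t n)
{
	void *p;

	if (fail_next_slab) {
		fail_next_slab = false;
		return NULL;
	}
	if (n > sizeof(slab_arena) - slab_used)
		return NULL;
	p = slab_arena + slab_used;
	slab_used += n;
	return p;
}

/* Bump allocator for the "vmalloc" range. */
static void *model_vmalloc(size_t n)
{
	void *p;

	if (n > sizeof(vmalloc_arena) - vmalloc_used)
		return NULL;
	p = vmalloc_arena + vmalloc_used;
	vmalloc_used += n;
	return p;
}

/* Slab first; fall back to the vmalloc range only for multi-page sizes. */
static void *model_kvzalloc(size_t n)
{
	void *p = model_kmalloc(n);

	if (!p && n > MODEL_PAGE_SIZE)
		p = model_vmalloc(n);
	if (p)
		memset(p, 0, n);
	return p;
}

/* Returns false for a pointer the slab side does not own (the buggy case). */
static bool model_kfree(const void *p)
{
	if (model_is_vmalloc_addr(p))
		return false;
	slab_frees++;
	return true;
}

/* Dispatches on the address range, so either kind of pointer is safe. */
static void model_kvfree(const void *p)
{
	if (model_is_vmalloc_addr(p))
		vmalloc_frees++;
	else
		(void)model_kfree(p);
}
```

In this model, arming fail_next_slab and then calling model_kvzalloc(8192)
returns a pointer from the vmalloc range; model_kfree() rejects it, while
model_kvfree() releases it through the matching path. In the real kernel the
mismatched kfree() has no such check to fall back on, which is consistent
with the page fault inside kfree() in the quoted trace.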


Thread overview: 5+ messages
     [not found] <20260526131035.1312864-1-dawei.feng@seu.edu.cn>
2026-05-26 13:10 ` [PATCH 1/2] bpf: cgroup: fix sysctl new value replacement Dawei Feng
2026-05-26 13:55   ` bot+bpf-ci
2026-05-26 22:16   ` Emil Tsalapatis
2026-05-26 13:10 ` [PATCH 2/2] bpf: cgroup: Use kvfree instead of kfree in __cgroup_bpf_run_filter_sysctl Dawei Feng
2026-05-26 22:24   ` Emil Tsalapatis [this message]
