From: sashiko-bot@kernel.org
To: "Vernon Yang" <vernon2gm@gmail.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH 3/4] mm: introduce bpf_mthp_ops struct ops
Date: Sun, 03 May 2026 17:41:24 +0000 [thread overview]
Message-ID: <20260503174125.2C949C2BCB4@smtp.kernel.org> (raw)
In-Reply-To: <20260503165024.1526680-4-vernon2gm@gmail.com>
Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
- [Critical] Use-After-Free vulnerability in `bpf_mthp_choose()` due to improper SRCU locking.
- [High] NULL pointer dereference of `link` before validation in `bpf_mthp_ops_reg()` and `bpf_mthp_ops_unreg()`.
- [High] Newly created cgroups silently bypass the hierarchical BPF mTHP policy.
- [High] Silent registration failure leaves the cgroup hierarchy in an inconsistent state.
- [Medium] The BPF program can only restrict allowable THP orders, not expand or select new ones.
- [Low] Compilation failure due to dereferencing an incomplete type and a non-existent struct member.
--
commit e5c9c74b11f76b4de7ba4ef8ff94c10bae978e16
Author: Vernon Yang <yanglincheng@kylinos.cn>
mm: introduce bpf_mthp_ops struct ops
Introducing bpf_mthp_ops enables eBPF programs to register the
mthp_choose callback function via cgroup-ebpf.
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 2949e5acff35..80ec622213df 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
[ ... ]
> @@ -291,6 +292,11 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
> enum tva_type type,
> unsigned long orders)
> {
> + /* The eBPF-specified orders overrides which order is selected. */
> + orders &= bpf_mthp_choose(vma->vm_mm, orders);
Does this inadvertently prevent the BPF program from enabling or selecting new
orders? Since the return value is applied using a bitwise AND, it appears
restricted to only disabling existing orders rather than customizing them as
described in the commit message.
> + if (!orders)
> + return 0;
> +
> /*
> * Optimization to check if required orders are enabled early. Only
> * forced collapse ignores sysfs configs.
> diff --git a/mm/bpf_huge_memory.c b/mm/bpf_huge_memory.c
> new file mode 100644
> index 000000000000..e34e0a35edac
> --- /dev/null
> +++ b/mm/bpf_huge_memory.c
[ ... ]
> +unsigned long bpf_mthp_choose(struct mm_struct *mm, unsigned long orders)
> +{
> + struct cgroup *cgrp;
> + struct mem_cgroup *memcg;
> + struct bpf_mthp_ops *ops;
> + int idx;
> +
> + memcg = get_mem_cgroup_from_mm(mm);
> + if (!memcg)
> + return orders;
> +
> + cgrp = memcg->css.cgroup;
> + ops = READ_ONCE(cgrp->mthp_ops);
> + if (unlikely(ops)) {
> + idx = srcu_read_lock(&mthp_bpf_srcu);
> + if (ops->mthp_choose)
> + orders = ops->mthp_choose(cgrp, orders);
> + srcu_read_unlock(&mthp_bpf_srcu, idx);
> + }
Could reading cgrp->mthp_ops outside the SRCU critical section lead to a
use-after-free regression?
If a thread is preempted immediately after ops is read:
ops = READ_ONCE(cgrp->mthp_ops);
<preempted>
Another thread could execute bpf_mthp_ops_unreg(), set the pointer to NULL,
and complete synchronize_srcu(). The original thread would then resume,
acquire the lock, and dereference the freed ops pointer.
> +
> + mem_cgroup_put(memcg);
> +
> + return orders;
> +}
[ ... ]
> +static int bpf_mthp_ops_reg(void *kdata, struct bpf_link *link)
> +{
> + struct bpf_struct_ops_link *st_link = (struct bpf_struct_ops_link *)link;
> + struct bpf_mthp_ops *ops = kdata;
> + struct cgroup *cgrp = st_link->cgroup;
Can this cause a regression if link is passed as NULL? st_link is cast from
link, and st_link->cgroup implicitly dereferences it before the validation
check below.
Additionally, struct bpf_struct_ops_link is defined privately in
kernel/bpf/bpf_struct_ops.c and does not contain a cgroup field. Will this
cause a compilation failure due to dereferencing an incomplete type and a
non-existent struct member?
> + struct cgroup_subsys_state *pos;
> +
> + /* The link is not yet fully initialized, but cgroup should be set */
> + if (!link)
> + return -EOPNOTSUPP;
> +
> + cgroup_lock();
> + css_for_each_descendant_pre(pos, &cgrp->self) {
> + struct cgroup *child = pos->cgroup;
> +
> + if (READ_ONCE(child->mthp_ops)) {
> + /* TODO
> + * Do not destroy the cgroup hierarchy property.
> + * If an eBPF program already exists in the sub-cgroup,
> + * trigger an error and clear the already set
> + * bpf_mthp_ops data.
> + */
> + continue;
Is it intentional to return 0 at the end of this function when we skip over
descendants that already have mthp_ops configured? This appears to leave the
hierarchy in an inconsistent state while userspace assumes the registration
was fully successful.
Furthermore, is there a missing hook in the cgroup creation path? Since this
loop only applies to existing descendants, any newly created cgroups will
have their mthp_ops field initialized to NULL, allowing tasks moved there to
silently bypass the BPF program attached to their ancestors.
> + }
> + WRITE_ONCE(child->mthp_ops, ops);
> + }
> + cgroup_unlock();
> +
> + return 0;
> +}
> +
> +static void bpf_mthp_ops_unreg(void *kdata, struct bpf_link *link)
> +{
> + struct bpf_struct_ops_link *st_link = (struct bpf_struct_ops_link *)link;
> + struct bpf_mthp_ops *ops = kdata;
> + struct cgroup *cgrp = st_link->cgroup;
Does this code similarly dereference link before it can be validated?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260503165024.1526680-1-vernon2gm@gmail.com?part=3
next prev parent reply other threads:[~2026-05-03 17:41 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-03 16:50 [PATCH 0/4] mm: introduce mthp_ext via cgroup-bpf to make mTHP more transparent Vernon Yang
2026-05-03 16:50 ` [PATCH 1/4] psi: add psi_group_flush_stats() function Vernon Yang
2026-05-03 16:50 ` [PATCH 2/4] bpf: add bpf_cgroup_{flush_stats,stall} function Vernon Yang
2026-05-03 17:23 ` bot+bpf-ci
2026-05-06 12:38 ` Vernon Yang
2026-05-03 17:25 ` sashiko-bot
2026-05-06 12:55 ` Vernon Yang
2026-05-03 16:50 ` [PATCH 3/4] mm: introduce bpf_mthp_ops struct ops Vernon Yang
2026-05-03 17:35 ` bot+bpf-ci
2026-05-06 13:06 ` Vernon Yang
2026-05-03 17:41 ` sashiko-bot [this message]
2026-05-06 13:26 ` Vernon Yang
2026-05-03 16:50 ` [PATCH 4/4] samples: bpf: add mthp_ext Vernon Yang
2026-05-03 17:35 ` bot+bpf-ci
2026-05-06 13:30 ` Vernon Yang
2026-05-03 17:57 ` sashiko-bot
2026-05-06 13:50 ` Vernon Yang
2026-05-07 3:34 ` [PATCH 0/4] mm: introduce mthp_ext via cgroup-bpf to make mTHP more transparent Yafang Shao
2026-05-07 12:50 ` Vernon Yang
2026-05-07 13:18 ` Yafang Shao
2026-05-07 15:19 ` Vernon Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260503174125.2C949C2BCB4@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=sashiko@lists.linux.dev \
--cc=vernon2gm@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox