BPF List
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Vernon Yang" <vernon2gm@gmail.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH v2 3/4] mm: introduce bpf_mthp_ops struct ops
Date: Fri, 08 May 2026 22:29:17 +0000	[thread overview]
Message-ID: <20260508222918.40C2BC2BCB0@smtp.kernel.org> (raw)
In-Reply-To: <20260508150055.680136-4-vernon2gm@gmail.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Critical] BPF architectural layering violation resulting in a Use-After-Free risk.
- [High] Broken cgroup inheritance prevents overriding parent policies.
- [High] Unconditional atomic operations and SRCU locking on the page fault hot path.
- [Low] Build failures due to incomplete types and missing header includes.
--

commit 5151cfb43b99be8eef5398ab31786c76a122dc79
Author: Vernon Yang <yanglincheng@kylinos.cn>

mm: introduce bpf_mthp_ops struct ops
    
This commit introduces a new BPF struct_ops called bpf_mthp_ops to allow eBPF
programs to customize the sizes of multi-size THP (mTHP) for different
cgroups. It adds bpf_mthp_choose() to the page fault path to retrieve the
BPF-specified orders.

> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 1f084ee71443e..7db83ca81e6e7 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -5836,6 +5836,8 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
>  	if (ret)
>  		goto out_stat_exit;
>  
> +	cgroup_bpf_set_mthp_ops(cgrp, parent);

Does this file need to include <linux/bpf_huge_memory.h> for this function
call? This might cause a build failure due to a missing declaration.

> +
>  	for (tcgrp = cgrp; tcgrp; tcgrp = cgroup_parent(tcgrp))
>  		cgrp->ancestors[tcgrp->level] = tcgrp;

[ ... ]

> diff --git a/mm/bpf_huge_memory.c b/mm/bpf_huge_memory.c
> new file mode 100644
> index 0000000000000..851c6ebe29334
> --- /dev/null
> +++ b/mm/bpf_huge_memory.c
> @@ -0,0 +1,168 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
[ ... ]
> +#include <linux/bpf.h>
> +#include <linux/srcu.h>

Does this file also need to include <linux/memcontrol.h> and <linux/cgroup.h>?
Functions like get_mem_cgroup_from_mm() and css_for_each_descendant_pre() are
used below, which might lead to build failures.

> +
> +/* Protects cgrp->mthp_ops pointer for read and write. */
> +DEFINE_SRCU(mthp_bpf_srcu);
> +
> +unsigned long bpf_mthp_choose(struct mm_struct *mm, unsigned long orders)
> +{
> +	struct cgroup *cgrp;
> +	struct mem_cgroup *memcg;
> +	struct bpf_mthp_ops *ops;
> +	int idx;
> +
> +	memcg = get_mem_cgroup_from_mm(mm);

Since thp_vma_allowable_orders() calls this function unconditionally on every
THP allocation, does executing get_mem_cgroup_from_mm() and modifying atomic
reference counts on the page fault hot path introduce significant overhead?

Could a static key be used to bypass these atomic operations and the SRCU
locking below when the BPF feature is inactive?

> +	if (!memcg)
> +		return orders;
> +
> +	cgrp = memcg->css.cgroup;
> +
> +	idx = srcu_read_lock(&mthp_bpf_srcu);
> +	ops = READ_ONCE(cgrp->mthp_ops);
> +	if (unlikely(ops && ops->mthp_choose))
> +		orders = ops->mthp_choose(cgrp, orders);
> +	srcu_read_unlock(&mthp_bpf_srcu, idx);
> +
> +	mem_cgroup_put(memcg);
> +
> +	return orders;
> +}

[ ... ]

> +static int bpf_mthp_ops_reg(void *kdata, struct bpf_link *link)
> +{
> +	struct bpf_struct_ops_link *st_link = (struct bpf_struct_ops_link *)link;
> +	struct bpf_mthp_ops *ops = kdata;
> +	struct cgroup_subsys_state *child;
> +	struct cgroup *cgrp;
> +
> +	if (!link)
> +		return -EOPNOTSUPP;
> +
> +	cgrp = st_link->cgroup;

Can this code successfully compile? struct bpf_struct_ops_link appears to be an
incomplete type defined privately in kernel/bpf/bpf_struct_ops.c, and it lacks
a cgroup field.

Additionally, does bpf_mthp_ops_reg() need to acquire a reference count on the
target cgroup using cgroup_get()? If the cgroup is deleted while the link
remains active, could st_link->cgroup become a dangling pointer and lead to a
use-after-free when bpf_mthp_ops_unreg() is later called?

> +	if (!cgrp)
> +		return -EINVAL;
> +
> +	cgroup_lock();
> +	css_for_each_descendant_pre(child, &cgrp->self) {
> +		if (READ_ONCE(child->cgroup->mthp_ops)) {
> +			pr_warn("sub-cgroup has already registered.\n");
> +			cgroup_unlock();
> +			return -EBUSY;
> +		}
> +	}

Since cgroup_create() copies parent->mthp_ops directly to the child cgroup,
wouldn't all descendants inherently have mthp_ops set if their parent does?
Does this block users from registering an overriding BPF program on the child
cgroup, or block the parent if a child registers first?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260508150055.680136-1-vernon2gm@gmail.com?part=3

  parent reply	other threads:[~2026-05-08 22:29 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-08 15:00 [PATCH v2 0/4] mm: introduce mthp_ext via cgroup-bpf to make mTHP more transparent Vernon Yang
2026-05-08 15:00 ` [PATCH v2 1/4] psi: add psi_group_flush_stats() function Vernon Yang
2026-05-08 15:19   ` Lorenzo Stoakes
2026-05-08 21:36   ` sashiko-bot
2026-05-08 15:00 ` [PATCH v2 2/4] bpf: add bpf_cgroup_{flush_stats,stall} function Vernon Yang
2026-05-08 15:40   ` bot+bpf-ci
2026-05-08 22:01   ` sashiko-bot
2026-05-08 15:00 ` [PATCH v2 3/4] mm: introduce bpf_mthp_ops struct ops Vernon Yang
2026-05-08 15:40   ` bot+bpf-ci
2026-05-08 15:57   ` Lorenzo Stoakes
2026-05-08 20:54   ` David Hildenbrand (Arm)
2026-05-11 11:25     ` Lorenzo Stoakes
2026-05-08 22:29   ` sashiko-bot [this message]
2026-05-08 15:00 ` [PATCH v2 4/4] samples: bpf: add mthp_ext Vernon Yang
2026-05-08 15:40   ` bot+bpf-ci
2026-05-08 22:52   ` sashiko-bot
2026-05-08 15:14 ` [PATCH v2 0/4] mm: introduce mthp_ext via cgroup-bpf to make mTHP more transparent Lorenzo Stoakes
2026-05-08 16:05   ` Lorenzo Stoakes
2026-05-08 16:53     ` Vernon Yang
2026-05-11 11:20       ` Lorenzo Stoakes
2026-05-08 16:00 ` Pedro Falcato
2026-05-08 16:15   ` Lorenzo Stoakes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260508222918.40C2BC2BCB0@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=sashiko@lists.linux.dev \
    --cc=vernon2gm@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox