Re: [PATCH bpf-next v2 09/15] bpf: Add infrastructure to support attaching struct_ops to cgroups

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Amery Hung" <ameryhung@gmail.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next v2 09/15] bpf: Add infrastructure to support attaching struct_ops to cgroups
Date: Tue, 23 Jun 2026 22:53:59 +0000	[thread overview]
Message-ID: <20260623225359.77EE31F00A3A@smtp.kernel.org> (raw)
In-Reply-To: <20260623175006.3136053-10-ameryhung@gmail.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

New issues:
- [High] The `bpf_cgroup_struct_ops_foreach` macro does not filter out dummy sentinel elements, leading to type confusion and kernel panics.
- [High] Strict `rcu_dereference()` in `bpf_cgroup_struct_ops_foreach` causes lockdep splats or panics for sleepable struct_ops.
- [High] Array truncation during `purge_effective_progs` if the struct_ops type was registered without `cfi_stubs`.

Pre-existing issues:
- [High] `bpf_cgroup_array_copy_to_user()` incorrectly returns -ENOSPC when trailing dummy elements exist, breaking userspace queries.
--

commit a3bccbe6dc639128f470580a5e6281248194eb4e
Author: Martin KaFai Lau <martin.lau@kernel.org>

bpf: Add infrastructure to support attaching struct_ops to cgroups

This patch adds necessary infrastructure to attach a struct_ops map to a
cgroup. It reuses existing cgroup bpf structures and logic to handle the
attachment, hierarchy, and ordering of struct_ops maps. It also implements
a separate struct bpf_link_ops for cgroup's struct_ops link.

> diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
> index 4d0cc65976a14..8a75a6cd73096 100644
> --- a/include/linux/bpf-cgroup.h
> +++ b/include/linux/bpf-cgroup.h
> @@ -110,6 +112,18 @@ struct bpf_prog_list {
>  	u32 flags;
>  };
>  
> +#define bpf_cgroup_struct_ops_foreach(var, item, cgrp, atype)		\
> +	for (item = rcu_dereference((cgrp)->bpf.effective[atype])->items;\
> +	     ((var) = READ_ONCE(item->kdata));				\
> +	     item++)

[Severity: High]
Does this macro fail to filter out dummy sentinel elements?

When update_effective_progs() fails under memory pressure, the kernel falls
back to purge_effective_progs(), which overwrites the detached link's pointer
with a dummy sentinel value (struct_ops_cfi_stubs[atype]).

Since the macro yields every non-NULL element to the caller subsystem,
including these dummy sentinels, could this cause type confusion and kernel
panics if the subsystem accesses data fields on the dummy struct?

[Severity: High]
Will this strict rcu_dereference() cause lockdep splats or panics for
sleepable struct_ops?

The patch introduces free_after_mult_rcu_gp to support sleepable struct_ops,
meaning a subsystem executing a sleepable bpf program must hold
rcu_read_lock_trace() instead of the standard rcu_read_lock(). However,
rcu_dereference() rigidly enforces that rcu_read_lock() is held.

If the subsystem holds rcu_read_lock_trace(), lockdep will splat, and if
it erroneously holds rcu_read_lock(), the kernel will crash when the
sleepable bpf program actually sleeps.

Should this use rcu_dereference_check() or similar to accommodate sleepable
contexts?

> +
> +static inline bool cgroup_bpf_is_struct_ops_atype(enum cgroup_bpf_attach_type atype)
> +{
> +	return atype == CGROUP_TCP_SOCK_OPS;
> +}
> +void cgroup_bpf_struct_ops_register(int atype, u32 type_id, void *cfi_stubs, bool mult_trace);
> +int cgroup_bpf_struct_ops_attach(struct bpf_map *map, const union bpf_attr *attr);
> +
>  void __init cgroup_bpf_lifetime_notifier_init(void);

[ ... ]

> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 64572f85edc83..d591f306ace52 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c

[ ... ]

> @@ -9868,6 +9872,23 @@ btf_add_struct_ops(struct btf *btf, struct bpf_struct_ops *st_ops,
>  	if (err)
>  		return err;
>  
> +	if (cgroup_atype) {
> +		if (!cgroup_bpf_is_struct_ops_atype(cgroup_atype) ||
> +		    st_ops->reg || st_ops->unreg || st_ops->free_after_tasks_rcu_gp) {
> +			bpf_struct_ops_desc_release(&tab->ops[btf->struct_ops_tab->cnt]);
> +			return -EINVAL;
> +		}
> +
> +		/* There is no need to unregister from cgroup when the
> +		 * btf_free(). No struct_ops map and its cgroup link
> +		 * can be created once its btf is gone.
> +		 */
> +		cgroup_bpf_struct_ops_register(cgroup_atype,
> +					       tab->ops[btf->struct_ops_tab->cnt].type_id,
> +					       st_ops->cfi_stubs,
> +					       st_ops->free_after_mult_rcu_gp);

[Severity: High]
Does this path accept st_ops->cfi_stubs without validating that it is non-NULL?

If a struct_ops type is registered without providing cfi_stubs (e.g., if
CONFIG_CFI_CLANG is disabled), struct_ops_cfi_stubs[atype] will be NULL.

Later, if a link is detached and update_effective_progs() fails due to memory
pressure, the fallback bpf_cgroup_array_delete_safe_at() overwrites the
detached program's slot with bpf_cgroup_array_dummy().

Because cfi_stubs is NULL, it writes NULL into the middle of the array.

Since NULL is used as the array terminator, won't this prematurely truncate
the array and silently bypass all attached bpf programs that follow the
detached one?

> +	}
> +
>  	btf->struct_ops_tab->cnt++;
>  
>  	return 0;

[ ... ]

> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 081d81de18163..a808d2a31f11d 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c

[ ... ]

> @@ -355,7 +407,12 @@ static int bpf_cgroup_array_copy_to_user(struct bpf_prog_array *array,
>  	for (item = array->items; item->prog && i < cnt; item++) {
>  		if (item->prog == bpf_cgroup_array_dummy(atype))
>  			continue;
> -		id = item->prog->aux->id;
> +
> +		if (cgroup_bpf_is_struct_ops_atype(atype))
> +			id = bpf_struct_ops_kdata_map_id(item->kdata);
> +		else
> +			id = item->prog->aux->id;
> +
>  		if (copy_to_user(prog_ids + i, &id, sizeof(id)))
>  			return -EFAULT;
>  		i++;

[Severity: High]
This is a pre-existing issue, but does bpf_cgroup_array_copy_to_user()
incorrectly return -ENOSPC when trailing dummy elements exist?

If a cgroup link is detached under memory pressure, it leaves a trailing dummy
sentinel in the array.

When userspace calls bpf_prog_query with an exact-sized buffer, the loop
breaks via "i < cnt" exactly when i == cnt, leaving "item" pointing to the
dummy element.

Since item->prog is non-NULL (it is the dummy), the function evaluates
"return item->prog ? -ENOSPC : 0;" and erroneously returns -ENOSPC.

This breaks userspace queries, causing userspace tools to continually receive
-ENOSPC and attempt to allocate larger buffers, eventually exhausting memory.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260623175006.3136053-1-ameryhung@gmail.com?part=9

next prev parent reply	other threads:[~2026-06-23 22:53 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23 17:49 [PATCH bpf-next v2 00/15] bpf: A common way to attach struct_ops to a cgroup Amery Hung
2026-06-23 17:49 ` [PATCH bpf-next v2 01/15] bpf: Remove __rcu tagging in st_link->map Amery Hung
2026-06-23 17:49 ` [PATCH bpf-next v2 02/15] bpf: Make struct_ops tasks_rcu grace period optional Amery Hung
2026-06-23 17:49 ` [PATCH bpf-next v2 03/15] bpf: Add bpf_struct_ops accessor helpers Amery Hung
2026-06-23 21:40   ` sashiko-bot
2026-06-23 17:49 ` [PATCH bpf-next v2 04/15] bpf: Remove unnecessary prog_list_prog() check Amery Hung
2026-06-23 17:49 ` [PATCH bpf-next v2 05/15] bpf: Replace prog_list_prog() check with direct pl->prog and pl->link check Amery Hung
2026-06-23 17:49 ` [PATCH bpf-next v2 06/15] bpf: Add prog_list_init_item(), prog_list_replace_item(), and prog_list_id() Amery Hung
2026-06-23 17:49 ` [PATCH bpf-next v2 07/15] bpf: Move LSM trampoline unlink into bpf_cgroup_link_auto_detach() Amery Hung
2026-06-23 17:49 ` [PATCH bpf-next v2 08/15] bpf: Add a few bpf_cgroup_array_* helper functions Amery Hung
2026-06-23 22:25   ` sashiko-bot
2026-06-23 17:49 ` [PATCH bpf-next v2 09/15] bpf: Add infrastructure to support attaching struct_ops to cgroups Amery Hung
2026-06-23 22:53   ` sashiko-bot [this message]
2026-06-23 17:49 ` [PATCH bpf-next v2 10/15] bpf: Allow all struct_ops to use bpf_dynptr_from_skb() Amery Hung
2026-06-23 17:49 ` [PATCH bpf-next v2 11/15] bpf: tcp: Support selected sock_ops callbacks as struct_ops Amery Hung
2026-06-23 23:22   ` sashiko-bot
2026-06-23 17:50 ` [PATCH bpf-next v2 12/15] bpf: tcp: Support parse/len/write header option hooks in bpf_tcp_ops Amery Hung
2026-06-23 23:40   ` sashiko-bot
2026-06-23 17:50 ` [PATCH bpf-next v2 13/15] libbpf: Support attaching struct_ops to a cgroup Amery Hung
2026-06-23 23:53   ` sashiko-bot
2026-06-23 17:50 ` [PATCH bpf-next v2 14/15] selftests/bpf: Test " Amery Hung
2026-06-24  0:03   ` sashiko-bot
2026-06-23 17:50 ` [PATCH bpf-next v2 15/15] selftests/bpf: Add test for bpf_tcp_ops header option hooks Amery Hung
2026-06-24  0:14   ` sashiko-bot
     [not found] ` <0e6fb8d7eca4b5494e08a2230056e333bdd814d97c16784b3e1f7687b3640990@mail.kernel.org>
2026-06-23 21:11   ` [PATCH bpf-next v2 00/15] bpf: A common way to attach struct_ops to a cgroup Amery Hung

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623225359.77EE31F00A3A@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=ameryhung@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.