Re: [RFC PATCH bpf-next 02/12] bpf: Make struct_ops tasks_rcu grace period optional

BPF List
 help / color / mirror / Atom feed

From: Martin KaFai Lau <martin.lau@linux.dev>
To: sashiko-reviews@lists.linux.dev
Cc: bpf@vger.kernel.org, Amery Hung <ameryhung@gmail.com>
Subject: Re: [RFC PATCH bpf-next 02/12] bpf: Make struct_ops tasks_rcu grace period optional
Date: Tue, 19 May 2026 17:25:53 -0700	[thread overview]
Message-ID: <2026520006.6Mm9.martin.lau@linux.dev> (raw)
In-Reply-To: <20260519225421.3CBCC1F000E9@smtp.kernel.org>

On Tue, May 19, 2026 at 10:54:20PM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> - [Critical] The patch incorrectly skips the Tasks RCU grace period (`synchronize_rcu_tasks()`) before freeing executable trampoline memory for all `struct_ops` except `tcp_congestion_ops`, leading to Use-After-Free of executable memory.
> - [High] RCU-001 violation: `bpf_struct_ops_map_dissoc_progs()` removes the map from `prog->aux->st_ops_assoc` after the regular RCU grace period has already completed, leading to a Use-After-Free.
> --
> 
> commit 85a451ad8aefbb364a1840344e08d2f34596b96c
> Author: Martin KaFai Lau <martin.lau@kernel.org>
> 
> bpf: Make struct_ops tasks_rcu grace period optional
> 
> This commit makes the tasks RCU grace period optional during struct_ops map
> freeing. It introduces a map_free_pre_rcu callback to remove ksyms before the
> regular RCU grace period. The tasks RCU grace period is now only explicitly
> requested by tcp_congestion_ops.
> 
> > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> > index 08791180d71d9..28eab24ef0ed0 100644
> > --- a/kernel/bpf/bpf_struct_ops.c
> > +++ b/kernel/bpf/bpf_struct_ops.c
> > @@ -996,24 +1005,8 @@ static void bpf_struct_ops_map_free(struct bpf_map *map)
> >  
> >  	bpf_struct_ops_map_dissoc_progs(st_map);
> 
> [Severity: High]
> Does this violate the RCU remove-before-reclaim rule?
> 
> Because map->free_after_rcu_gp is set to true in bpf_struct_ops_map_alloc(),
> the regular RCU grace period occurs before bpf_struct_ops_map_free() is called.
> Leaving bpf_struct_ops_map_dissoc_progs() here means the map is removed
> from prog->aux->st_ops_assoc after the grace period has already finished.

After the grace period finished, all struct_ops's bpf prog has
finished running

> 
> Could a concurrent execution of the BPF program acquire the RCU read lock, read
> the still-valid aux->st_ops_assoc pointer, and then access the map after it is
> freed?

so this cannot happen.

afaik, for non struct_ops's prog, the st_ops_assoc has a refcnt.
[I could be wrong here. CC Amery.]

Overall, moving bpf_struct_ops_map_dissoc_progs to map_free_pre_rcu
could be a simpler read but don't think it is needed.

> 
> Should bpf_struct_ops_map_dissoc_progs() be moved to the new
> bpf_struct_ops_map_free_pre_rcu() callback instead?
> 
> > -	/* The struct_ops's function may switch to another struct_ops.
> > -	 *
> > -	 * For example, bpf_tcp_cc_x->init() may switch to
> > -	 * another tcp_cc_y by calling
> > -	 * setsockopt(TCP_CONGESTION, "tcp_cc_y").
> > -	 * During the switch,  bpf_struct_ops_put(tcp_cc_x) is called
> > -	 * and its refcount may reach 0 which then free its
> > -	 * trampoline image while tcp_cc_x is still running.
> > -	 *
> > -	 * A vanilla rcu gp is to wait for all bpf-tcp-cc prog
> > -	 * to finish. bpf-tcp-cc prog is non sleepable.
> > -	 * A rcu_tasks gp is to wait for the last few insn
> > -	 * in the tramopline image to finish before releasing
> > -	 * the trampoline image.
> > -	 */
> > -	synchronize_rcu_mult(call_rcu, call_rcu_tasks);
> > +	if (tasks_rcu && IS_ENABLED(CONFIG_TASKS_RCU))
> > +		synchronize_rcu_tasks();
> >  
> >  	__bpf_struct_ops_map_free(map);
> >  }
> 
> [Severity: Critical]
> Does skipping the Tasks RCU grace period for other struct_ops cause a
> use-after-free of executable trampoline memory?

I don't think so. Other subsystems using struct_ops wait for
the bpf prog finished running in unreg().

Note that in general this should only matter to the bpf-tcp-cc's struct_ops prog
that does a bpf_setsockopt(TCP_CONGESTION) which triggers a bpf_map_put() on
its own running struct_ops's map refcnt.

> 
> All struct_ops programs are executed via dynamically generated trampolines. When
> the program returns, the RCU read lock is dropped and preemption is re-enabled.
> If a task is preempted while still executing the trampoline epilogue, the
> regular RCU grace period will not wait for it.
> 
> Without the Tasks RCU wait, could the executable trampoline image be freed
> prematurely, causing preempted tasks to execute freed memory when they resume?
> 
> -- 
> Sashiko AI review · https://sashiko.dev/#/patchset/20260519215841.2984970-1-martin.lau@linux.dev?part=2

next prev parent reply	other threads:[~2026-05-20  0:26 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-19 21:58 [RFC PATCH bpf-next 00/12] bpf: A common way to attach struct_ops to a cgroup Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 01/12] bpf: Remove __rcu tagging in st_link->map Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 02/12] bpf: Make struct_ops tasks_rcu grace period optional Martin KaFai Lau
2026-05-19 22:54   ` sashiko-bot
2026-05-20  0:25     ` Martin KaFai Lau [this message]
2026-05-19 21:58 ` [RFC PATCH bpf-next 03/12] bpf: Add bpf_struct_ops accessor helpers Martin KaFai Lau
2026-05-19 22:25   ` sashiko-bot
2026-05-19 21:58 ` [RFC PATCH bpf-next 04/12] bpf: Remove unnecessary prog_list_prog() check Martin KaFai Lau
2026-05-19 22:49   ` sashiko-bot
2026-05-19 21:58 ` [RFC PATCH bpf-next 05/12] bpf: Replace prog_list_prog() check with direct pl->prog and pl->link check Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 06/12] bpf: Add prog_list_init_item(), prog_list_replace_item(), and prog_list_id() Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 07/12] bpf: Move LSM trampoline unlink into bpf_cgroup_link_auto_detach() Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 08/12] bpf: Add a few bpf_cgroup_array_* helper functions Martin KaFai Lau
2026-05-19 22:45   ` sashiko-bot
2026-05-19 22:50     ` Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 09/12] bpf: Add infrastructure to support attaching struct_ops to cgroups Martin KaFai Lau
2026-05-19 22:50   ` sashiko-bot
2026-05-19 23:56     ` Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 10/12] bpf: tcp: Support selected sock_ops callbacks as struct_ops Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 11/12] libbpf: Support attaching struct_ops to a cgroup Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 12/12] selftests/bpf: Test " Martin KaFai Lau
2026-05-19 23:03   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2026520006.6Mm9.martin.lau@linux.dev \
    --to=martin.lau@linux.dev \
    --cc=ameryhung@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox