From: Martin KaFai Lau <martin.lau@linux.dev>
To: sashiko-reviews@lists.linux.dev
Cc: bpf@vger.kernel.org, Amery Hung <ameryhung@gmail.com>
Subject: Re: [RFC PATCH bpf-next 02/12] bpf: Make struct_ops tasks_rcu grace period optional
Date: Tue, 19 May 2026 17:25:53 -0700 [thread overview]
Message-ID: <2026520006.6Mm9.martin.lau@linux.dev> (raw)
In-Reply-To: <20260519225421.3CBCC1F000E9@smtp.kernel.org>
On Tue, May 19, 2026 at 10:54:20PM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> - [Critical] The patch incorrectly skips the Tasks RCU grace period (`synchronize_rcu_tasks()`) before freeing executable trampoline memory for all `struct_ops` except `tcp_congestion_ops`, leading to Use-After-Free of executable memory.
> - [High] RCU-001 violation: `bpf_struct_ops_map_dissoc_progs()` removes the map from `prog->aux->st_ops_assoc` after the regular RCU grace period has already completed, leading to a Use-After-Free.
> --
>
> commit 85a451ad8aefbb364a1840344e08d2f34596b96c
> Author: Martin KaFai Lau <martin.lau@kernel.org>
>
> bpf: Make struct_ops tasks_rcu grace period optional
>
> This commit makes the tasks RCU grace period optional during struct_ops map
> freeing. It introduces a map_free_pre_rcu callback to remove ksyms before the
> regular RCU grace period. The tasks RCU grace period is now only explicitly
> requested by tcp_congestion_ops.
>
> > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> > index 08791180d71d9..28eab24ef0ed0 100644
> > --- a/kernel/bpf/bpf_struct_ops.c
> > +++ b/kernel/bpf/bpf_struct_ops.c
> > @@ -996,24 +1005,8 @@ static void bpf_struct_ops_map_free(struct bpf_map *map)
> >
> > bpf_struct_ops_map_dissoc_progs(st_map);
>
> [Severity: High]
> Does this violate the RCU remove-before-reclaim rule?
>
> Because map->free_after_rcu_gp is set to true in bpf_struct_ops_map_alloc(),
> the regular RCU grace period occurs before bpf_struct_ops_map_free() is called.
> Leaving bpf_struct_ops_map_dissoc_progs() here means the map is removed
> from prog->aux->st_ops_assoc after the grace period has already finished.
After the grace period finished, all struct_ops's bpf prog has
finished running
>
> Could a concurrent execution of the BPF program acquire the RCU read lock, read
> the still-valid aux->st_ops_assoc pointer, and then access the map after it is
> freed?
so this cannot happen.
afaik, for non struct_ops's prog, the st_ops_assoc has a refcnt.
[I could be wrong here. CC Amery.]
Overall, moving bpf_struct_ops_map_dissoc_progs to map_free_pre_rcu
could be a simpler read but don't think it is needed.
>
> Should bpf_struct_ops_map_dissoc_progs() be moved to the new
> bpf_struct_ops_map_free_pre_rcu() callback instead?
>
> > - /* The struct_ops's function may switch to another struct_ops.
> > - *
> > - * For example, bpf_tcp_cc_x->init() may switch to
> > - * another tcp_cc_y by calling
> > - * setsockopt(TCP_CONGESTION, "tcp_cc_y").
> > - * During the switch, bpf_struct_ops_put(tcp_cc_x) is called
> > - * and its refcount may reach 0 which then free its
> > - * trampoline image while tcp_cc_x is still running.
> > - *
> > - * A vanilla rcu gp is to wait for all bpf-tcp-cc prog
> > - * to finish. bpf-tcp-cc prog is non sleepable.
> > - * A rcu_tasks gp is to wait for the last few insn
> > - * in the tramopline image to finish before releasing
> > - * the trampoline image.
> > - */
> > - synchronize_rcu_mult(call_rcu, call_rcu_tasks);
> > + if (tasks_rcu && IS_ENABLED(CONFIG_TASKS_RCU))
> > + synchronize_rcu_tasks();
> >
> > __bpf_struct_ops_map_free(map);
> > }
>
> [Severity: Critical]
> Does skipping the Tasks RCU grace period for other struct_ops cause a
> use-after-free of executable trampoline memory?
I don't think so. Other subsystems using struct_ops wait for
the bpf prog finished running in unreg().
Note that in general this should only matter to the bpf-tcp-cc's struct_ops prog
that does a bpf_setsockopt(TCP_CONGESTION) which triggers a bpf_map_put() on
its own running struct_ops's map refcnt.
>
> All struct_ops programs are executed via dynamically generated trampolines. When
> the program returns, the RCU read lock is dropped and preemption is re-enabled.
> If a task is preempted while still executing the trampoline epilogue, the
> regular RCU grace period will not wait for it.
>
> Without the Tasks RCU wait, could the executable trampoline image be freed
> prematurely, causing preempted tasks to execute freed memory when they resume?
>
> --
> Sashiko AI review · https://sashiko.dev/#/patchset/20260519215841.2984970-1-martin.lau@linux.dev?part=2
next prev parent reply other threads:[~2026-05-20 0:26 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-19 21:58 [RFC PATCH bpf-next 00/12] bpf: A common way to attach struct_ops to a cgroup Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 01/12] bpf: Remove __rcu tagging in st_link->map Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 02/12] bpf: Make struct_ops tasks_rcu grace period optional Martin KaFai Lau
2026-05-19 22:54 ` sashiko-bot
2026-05-20 0:25 ` Martin KaFai Lau [this message]
2026-05-19 21:58 ` [RFC PATCH bpf-next 03/12] bpf: Add bpf_struct_ops accessor helpers Martin KaFai Lau
2026-05-19 22:25 ` sashiko-bot
2026-05-19 21:58 ` [RFC PATCH bpf-next 04/12] bpf: Remove unnecessary prog_list_prog() check Martin KaFai Lau
2026-05-19 22:49 ` sashiko-bot
2026-05-19 21:58 ` [RFC PATCH bpf-next 05/12] bpf: Replace prog_list_prog() check with direct pl->prog and pl->link check Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 06/12] bpf: Add prog_list_init_item(), prog_list_replace_item(), and prog_list_id() Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 07/12] bpf: Move LSM trampoline unlink into bpf_cgroup_link_auto_detach() Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 08/12] bpf: Add a few bpf_cgroup_array_* helper functions Martin KaFai Lau
2026-05-19 22:45 ` sashiko-bot
2026-05-19 22:50 ` Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 09/12] bpf: Add infrastructure to support attaching struct_ops to cgroups Martin KaFai Lau
2026-05-19 22:50 ` sashiko-bot
2026-05-19 23:56 ` Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 10/12] bpf: tcp: Support selected sock_ops callbacks as struct_ops Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 11/12] libbpf: Support attaching struct_ops to a cgroup Martin KaFai Lau
2026-05-19 21:58 ` [RFC PATCH bpf-next 12/12] selftests/bpf: Test " Martin KaFai Lau
2026-05-19 23:03 ` sashiko-bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2026520006.6Mm9.martin.lau@linux.dev \
--to=martin.lau@linux.dev \
--cc=ameryhung@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox