public inbox for bpf@vger.kernel.org
From: Puranjay Mohan <puranjay@kernel.org>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	kkd@meta.com, kernel-team@meta.com
Subject: Re: [PATCH bpf v1 1/2] bpf: Fix grace period wait for tracepoint bpf_link
Date: Mon, 30 Mar 2026 11:00:45 +0100
Message-ID: <m25x6d1vsi.fsf@kernel.org>
In-Reply-To: <CAP01T77HcqyBKZRzRbLbHuZeskJ7XJ+FU2GQpZX-WXTCPMyikw@mail.gmail.com>

Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:

> On Mon, 30 Mar 2026 at 05:21, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>>
>> Recently, tracepoints were switched from using disabled preemption
>> (which acts as RCU read section) to SRCU-fast when they are not
>> faultable. This means that to do a proper grace period wait for programs
>> running in such tracepoints, we must use SRCU's grace period wait.
>> This is only for non-faultable tracepoints, faultable ones continue
>> using RCU Tasks Trace.
>>
>> However, bpf_link_free() currently does call_rcu() for all cases when
>> the link is non-sleepable (hence, for tracepoints, non-faultable). Fix
>> this by doing a call_srcu() grace period wait.
>>
>> As far as RCU Tasks Trace gp -> RCU gp chaining is concerned, it is deemed
>> unnecessary for tracepoint programs. The link and program are either
>> accessed under RCU Tasks Trace protection, or SRCU-fast protection now.
>>
>> The earlier logic of chaining both RCU Tasks Trace and RCU gp waits was
>> meant to generalize the logic, even if it conceded an extra RCU gp wait.
>> However, that is unnecessary for tracepoints even before this change.
>> In practice no cost was paid since rcu_trace_implies_rcu_gp() was always
>> true.
>>
>> Hence we need not chain any SRCU gp waits after RCU Tasks Trace.
>
> ... or chaining RCU gp after SRCU gp, rather, the commit log should
> probably say that instead. The above might be confusing.
> But more eyes on this would be great, I went back and read a few
> discussions on why we were chaining RCU gp after RCU-tt gp and
> couldn't convince myself it was necessary for the tracepoint path.

Yeah, the commit message is a bit hard to follow. Let me try to lay out
why chaining isn't needed in either case; let me know if you agree with
this analysis:

For non-faultable tracepoints (the call_srcu path):

The tracepoint dispatch macro in __DECLARE_TRACE does:

        guard(srcu_fast_notrace)(&tracepoint_srcu);
        __DO_TRACE_CALL(name, args);

which calls into __bpf_trace_##call, which calls bpf_trace_runN, and
that ends up in __bpf_trace_run() where we have:

        struct bpf_prog *prog = link->link.prog;
        ...
        rcu_read_lock_dont_migrate();
        ...
        run_ctx.bpf_cookie = link->cookie;
        bpf_prog_run(prog, args);
        ...
        rcu_read_unlock_migrate();

Both the link dereference (link->link.prog) and the
rcu_read_lock_dont_migrate() happen inside the SRCU-fast read section
from the tracepoint macro. So classic RCU is nested inside SRCU-fast
here. When the SRCU grace period completes, all in-flight SRCU-fast
readers have finished, which means all their nested classic RCU read
sections have also finished. No need to chain a classic RCU GP after
the SRCU GP.
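
To make the nesting argument concrete, here is a toy userspace model. It is
not kernel code, and every toy_* name is made up for illustration. It treats
a grace period as "no reader is inside the section", which is a
simplification (a real grace period only waits for pre-existing readers).
The point it tries to show is that a properly nested inner section can never
outlive its outer section, so an outer-quiescent state is also
inner-quiescent.

```c
/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_reader {
	bool in_outer;	/* stands in for the SRCU-fast read section */
	bool in_inner;	/* stands in for the nested classic RCU read section */
};

void toy_outer_lock(struct toy_reader *r)
{
	r->in_outer = true;
}

void toy_inner_lock(struct toy_reader *r)
{
	/* The inner section is only ever entered inside the outer one. */
	assert(r->in_outer);
	r->in_inner = true;
}

void toy_inner_unlock(struct toy_reader *r)
{
	r->in_inner = false;
}

void toy_outer_unlock(struct toy_reader *r)
{
	/* Proper nesting: the inner section is closed before the outer one. */
	assert(!r->in_inner);
	r->in_outer = false;
}

/* True when no reader is inside its outer section. */
bool toy_outer_quiescent(const struct toy_reader *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].in_outer)
			return false;
	return true;
}

/* True when no reader is inside its inner section. */
bool toy_inner_quiescent(const struct toy_reader *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].in_inner)
			return false;
	return true;
}
```

With these helpers, any sequence of calls that respects the two asserts
keeps the invariant "toy_outer_quiescent() implies toy_inner_quiescent()",
which is the shape of the argument above.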

For faultable tracepoints (the call_rcu_tasks_trace path):

__DECLARE_TRACE_SYSCALL uses guard(rcu_tasks_trace)() instead of
SRCU-fast, so SRCU isn't involved at all on this path. The link and
program are accessed exclusively under RCU Tasks Trace protection.
A tasks trace GP is sufficient on its own, and since tasks trace GP
implies classic RCU GP, there's nothing to chain.

So in both cases, the outermost protection (SRCU-fast or tasks trace)
is what we wait for in bpf_link_free(), and the inner
rcu_read_lock_dont_migrate() in __bpf_trace_run() is subsumed by that
outer GP wait.
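
As a rough sketch of the resulting decision, the choice of grace-period
flavor could be modelled like this. This is a userspace illustration, not
the actual bpf_link_free() code: the enum, the function and both parameters
are hypothetical names, and the mapping for links that are not tracepoints
is my assumption based on the existing call_rcu() and
call_rcu_tasks_trace() paths.

```c
/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>

enum toy_gp_kind {
	TOY_GP_RCU,		/* models the call_rcu() path */
	TOY_GP_SRCU,		/* models call_srcu() on the tracepoint SRCU domain */
	TOY_GP_RCU_TASKS_TRACE,	/* models the call_rcu_tasks_trace() path */
};

/*
 * Pick the single grace period to wait for, with no chaining:
 * the outermost read-side protection decides.
 */
enum toy_gp_kind toy_pick_gp(bool sleepable, bool is_tracepoint)
{
	if (sleepable)
		return TOY_GP_RCU_TASKS_TRACE;	/* faultable tracepoints land here */
	if (is_tracepoint)
		return TOY_GP_SRCU;		/* non-faultable tracepoints */
	return TOY_GP_RCU;			/* assumed: other non-sleepable links */
}
```

Each case returns exactly one flavor, which mirrors the claim that the inner
rcu_read_lock_dont_migrate() section does not need a grace-period wait of
its own.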

Am I missing something?

Thanks,
Puranjay
