From: Yonghong Song <yonghong.song@linux.dev>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>, bot+bpf-ci@kernel.org
Cc: mykyta.yatsenko5@gmail.com, bpf@vger.kernel.org, ast@kernel.org,
andrii@kernel.org, daniel@iogearbox.net, kafai@meta.com,
kernel-team@meta.com, eddyz87@gmail.com, yatsenko@meta.com,
martin.lau@kernel.org, clm@meta.com, ihor.solodrai@linux.dev
Subject: Re: [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs
Date: Tue, 10 Mar 2026 10:38:26 -0700 [thread overview]
Message-ID: <b1c85fc5-e60a-46c4-bea9-4909dcf78579@linux.dev> (raw)
In-Reply-To: <CAP01T74FbF5FH70B2-6iB0BJEh8cLcM7Ok=vhUPJZe-SXeOYJg@mail.gmail.com>
On 3/5/26 8:52 PM, Kumar Kartikeya Dwivedi wrote:
> On Fri, 6 Mar 2026 at 05:23, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>> On Wed, 25 Feb 2026 at 18:14, <bot+bpf-ci@kernel.org> wrote:
>>>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>>>> index 9bc0dfd235af..2dd345e0fdb0 100644
>>>> --- a/kernel/trace/bpf_trace.c
>>>> +++ b/kernel/trace/bpf_trace.c
>>> [ ... ]
>>>
>>>> @@ -2085,12 +2085,26 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
>>>> run_ctx.bpf_cookie = link->cookie;
>>>> old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>>>>
>>>> - (void) bpf_prog_run(prog, args);
>>>> + if (prog->sleepable) {
>>>> + might_fault();
>>>> + (void)bpf_prog_run(prog, args);
>>> At this commit, __BPF_DECLARE_TRACE_SYSCALL in include/trace/bpf_probe.h
>>> still wraps the BPF callback in preempt_disable_notrace() /
>>> preempt_enable_notrace():
>>>
>>> include/trace/bpf_probe.h:__BPF_DECLARE_TRACE_SYSCALL() {
>>> might_fault();
>>> preempt_disable_notrace();
>>> bpf_trace_run##x(__data, ...);
>>> preempt_enable_notrace();
>>> }
>>>
>>> Since __bpf_trace_run() is inlined into bpf_trace_run##x(), which is
>>> called between the preempt_disable_notrace() and
>>> preempt_enable_notrace(), the might_fault() here will execute with
>>> preempt_count > 0, producing a warning on CONFIG_DEBUG_ATOMIC_SLEEP
>>> kernels.
>>>
>>> More importantly, if the sleepable program calls a helper like
>>> bpf_copy_from_user(), the page fault handler will see in_atomic()
>>> and return -EFAULT for any non-resident page instead of faulting it
>>> in, defeating the purpose of sleepable programs.
>>>
>>> The v1 RFC series included a separate patch to remove
>>> preempt_disable_notrace() from __BPF_DECLARE_TRACE_SYSCALL. Should
>>> that change be folded into this commit so the sleepable path works
>>> at each bisect point?
>> But sleepable programs are not yet permitted, so I think this is not
>> needed for bisection-friendliness.
>>
>>>> + } else {
>>>> + /*
>>>> + * Non-sleepable programs may run in the faultable context,
>>>> + * do cant_sleep() only if program is non-sleepable and context
>>>> + * is non-faultable.
>>>> + */
>>>> + if (!link->link.sleepable)
>>>> + cant_sleep();
>>>> + rcu_read_lock();
>>>> + (void)bpf_prog_run(prog, args);
>>>> + rcu_read_unlock();
>>>> + }
>>> [ ... ]
>>>
>>> The commit message states:
>>>
>>>> For non-sleepable programs, behavior is unchanged: cant_sleep() check,
>>>> rcu_read_lock() protection.
>>> The parent commit does not have a cant_sleep() call in
>>> __bpf_trace_run() -- it uses rcu_read_lock_dont_migrate(). The
>>> conditional cant_sleep() is new in this commit. Could this sentence
>>> be updated to reflect the actual change?
>> This seems worth adjusting.
> Actually, with the extra cant_sleep() check, we get this splat when
> running normal raw_tp tests.
> E.g.
>
> ...
> [ 295.211949] BUG: assuming atomic context at kernel/trace/bpf_trace.c:2098
> [ 295.212166] in_atomic(): 0, irqs_disabled(): 0, pid: 587, name: test_progs
> [ 295.212316] 3 locks held by test_progs/587:
> ...
> [ 295.213044] Call Trace:
> [ 295.213046] <TASK>
> [ 295.213048] dump_stack_lvl+0x54/0x70
> [ 295.213055] __cant_sleep+0xb7/0xd0
> [ 295.213060] bpf_trace_run1+0xce/0x340
> [ 295.213072] bpf_testmod_test_read+0x433/0x5e0 [bpf_testmod]
> [ 295.213080] ? srso_return_thunk+0x5/0x5f
> [ 295.213093] kernfs_fop_read_iter+0x166/0x220
> ...
>
> It means if the tracepoint is not marked faultable but is invoked in a
> non-atomic context, we will get a warning.
> It might be better to just drop this, since I don't think this adds much value.
Agree. This is due to the tracepoints defined in test_kmods where
tracepoints have default non-sleepable. So we have
prog: non-sleepable
test_kmods assumed context: non-sleepable
actual context: sleepable
This caused the warning.
BTW, can we merge patches 2 and 3 together? I know the sleepable tp_btf
is not enabled yet in verifier. But just looking at patch 2 is a little
bit confusing as its caller still has preempt_{disable,enable}_notrace().
>
>
>>>
>> Other than that LGTM.
>>
>> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>>
>>> [...]
next prev parent reply other threads:[~2026-03-10 17:38 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
2026-02-25 16:23 ` [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints Mykyta Yatsenko
2026-03-06 3:59 ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs Mykyta Yatsenko
2026-02-25 17:12 ` bot+bpf-ci
2026-03-06 4:23 ` Kumar Kartikeya Dwivedi
2026-03-06 4:52 ` Kumar Kartikeya Dwivedi
2026-03-10 17:38 ` Yonghong Song [this message]
2026-02-25 16:23 ` [PATCH bpf-next v2 3/6] bpf: Remove preempt_disable from faultable tracepoint BPF callbacks Mykyta Yatsenko
2026-03-06 4:25 ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 4/6] bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type Mykyta Yatsenko
2026-03-06 4:26 ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 5/6] libbpf: Add tp_btf.s section handler for sleepable raw tracepoints Mykyta Yatsenko
2026-03-06 4:26 ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs Mykyta Yatsenko
2026-03-06 4:41 ` Kumar Kartikeya Dwivedi
2026-03-06 23:56 ` Kumar Kartikeya Dwivedi
2026-03-09 21:11 ` Jiri Olsa
2026-03-10 0:22 ` Mykyta Yatsenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b1c85fc5-e60a-46c4-bea9-4909dcf78579@linux.dev \
--to=yonghong.song@linux.dev \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bot+bpf-ci@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=clm@meta.com \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=ihor.solodrai@linux.dev \
--cc=kafai@meta.com \
--cc=kernel-team@meta.com \
--cc=martin.lau@kernel.org \
--cc=memxor@gmail.com \
--cc=mykyta.yatsenko5@gmail.com \
--cc=yatsenko@meta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox