public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yonghong.song@linux.dev>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>, bot+bpf-ci@kernel.org
Cc: mykyta.yatsenko5@gmail.com, bpf@vger.kernel.org, ast@kernel.org,
	andrii@kernel.org, daniel@iogearbox.net, kafai@meta.com,
	kernel-team@meta.com, eddyz87@gmail.com, yatsenko@meta.com,
	martin.lau@kernel.org, clm@meta.com, ihor.solodrai@linux.dev
Subject: Re: [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs
Date: Tue, 10 Mar 2026 10:38:26 -0700	[thread overview]
Message-ID: <b1c85fc5-e60a-46c4-bea9-4909dcf78579@linux.dev> (raw)
In-Reply-To: <CAP01T74FbF5FH70B2-6iB0BJEh8cLcM7Ok=vhUPJZe-SXeOYJg@mail.gmail.com>



On 3/5/26 8:52 PM, Kumar Kartikeya Dwivedi wrote:
> On Fri, 6 Mar 2026 at 05:23, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>> On Wed, 25 Feb 2026 at 18:14, <bot+bpf-ci@kernel.org> wrote:
>>>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>>>> index 9bc0dfd235af..2dd345e0fdb0 100644
>>>> --- a/kernel/trace/bpf_trace.c
>>>> +++ b/kernel/trace/bpf_trace.c
>>> [ ... ]
>>>
>>>> @@ -2085,12 +2085,26 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
>>>>        run_ctx.bpf_cookie = link->cookie;
>>>>        old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>>>>
>>>> -     (void) bpf_prog_run(prog, args);
>>>> +     if (prog->sleepable) {
>>>> +             might_fault();
>>>> +             (void)bpf_prog_run(prog, args);
>>> At this commit, __BPF_DECLARE_TRACE_SYSCALL in include/trace/bpf_probe.h
>>> still wraps the BPF callback in preempt_disable_notrace() /
>>> preempt_enable_notrace():
>>>
>>> include/trace/bpf_probe.h:__BPF_DECLARE_TRACE_SYSCALL() {
>>>      might_fault();
>>>      preempt_disable_notrace();
>>>      bpf_trace_run##x(__data, ...);
>>>      preempt_enable_notrace();
>>> }
>>>
>>> Since __bpf_trace_run() is inlined into bpf_trace_run##x(), which is
>>> called between the preempt_disable_notrace() and
>>> preempt_enable_notrace(), the might_fault() here will execute with
>>> preempt_count > 0, producing a warning on CONFIG_DEBUG_ATOMIC_SLEEP
>>> kernels.
>>>
>>> More importantly, if the sleepable program calls a helper like
>>> bpf_copy_from_user(), the page fault handler will see in_atomic()
>>> and return -EFAULT for any non-resident page instead of faulting it
>>> in, defeating the purpose of sleepable programs.
>>>
>>> The v1 RFC series included a separate patch to remove
>>> preempt_disable_notrace() from __BPF_DECLARE_TRACE_SYSCALL.  Should
>>> that change be folded into this commit so the sleepable path works
>>> at each bisect point?
>> But sleepable programs are not yet permitted, so I think this is not
>> needed for bisection-friendliness.
>>
>>>> +     } else {
>>>> +             /*
>>>> +              * Non-sleepable programs may run in the faultable context,
>>>> +              * do cant_sleep() only if program is non-sleepable and context
>>>> +              * is non-faultable.
>>>> +              */
>>>> +             if (!link->link.sleepable)
>>>> +                     cant_sleep();
>>>> +             rcu_read_lock();
>>>> +             (void)bpf_prog_run(prog, args);
>>>> +             rcu_read_unlock();
>>>> +     }
>>> [ ... ]
>>>
>>> The commit message states:
>>>
>>>> For non-sleepable programs, behavior is unchanged: cant_sleep() check,
>>>> rcu_read_lock() protection.
>>> The parent commit does not have a cant_sleep() call in
>>> __bpf_trace_run() -- it uses rcu_read_lock_dont_migrate().  The
>>> conditional cant_sleep() is new in this commit.  Could this sentence
>>> be updated to reflect the actual change?
>> This seems worth adjusting.
> Actually, with the extra cant_sleep() check, we get this splat when
> running normal raw_tp tests.
> E.g.
>
> ...
> [  295.211949] BUG: assuming atomic context at kernel/trace/bpf_trace.c:2098
> [  295.212166] in_atomic(): 0, irqs_disabled(): 0, pid: 587, name: test_progs
> [  295.212316] 3 locks held by test_progs/587:
> ...
> [  295.213044] Call Trace:
> [  295.213046]  <TASK>
> [  295.213048]  dump_stack_lvl+0x54/0x70
> [  295.213055]  __cant_sleep+0xb7/0xd0
> [  295.213060]  bpf_trace_run1+0xce/0x340
> [  295.213072]  bpf_testmod_test_read+0x433/0x5e0 [bpf_testmod]
> [  295.213080]  ? srso_return_thunk+0x5/0x5f
> [  295.213093]  kernfs_fop_read_iter+0x166/0x220
> ...
>
> It means if the tracepoint is not marked faultable but is invoked in a
> non-atomic context, we will get a warning.
> It might be better to just drop this, since I don't think this adds much value.

Agree. This is due to the tracepoints defined in test_kmods where
tracepoints have default non-sleepable. So we have
   prog: non-sleepable
   test_kmods assumed context: non-sleepable
   actual context: sleepable

This caused the warning.

BTW, can we merge patches 2 and 3 together? I know the sleepable tp_btf
is not enabled yet in verifier. But just looking at patch 2 is a little
bit confusing as its caller still has preempt_{disable,enable}_notrace().

>
>
>>>
>> Other than that LGTM.
>>
>> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>>
>>> [...]


  reply	other threads:[~2026-03-10 17:38 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
2026-02-25 16:23 ` [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints Mykyta Yatsenko
2026-03-06  3:59   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs Mykyta Yatsenko
2026-02-25 17:12   ` bot+bpf-ci
2026-03-06  4:23     ` Kumar Kartikeya Dwivedi
2026-03-06  4:52       ` Kumar Kartikeya Dwivedi
2026-03-10 17:38         ` Yonghong Song [this message]
2026-02-25 16:23 ` [PATCH bpf-next v2 3/6] bpf: Remove preempt_disable from faultable tracepoint BPF callbacks Mykyta Yatsenko
2026-03-06  4:25   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 4/6] bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type Mykyta Yatsenko
2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 5/6] libbpf: Add tp_btf.s section handler for sleepable raw tracepoints Mykyta Yatsenko
2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs Mykyta Yatsenko
2026-03-06  4:41   ` Kumar Kartikeya Dwivedi
2026-03-06 23:56     ` Kumar Kartikeya Dwivedi
2026-03-09 21:11   ` Jiri Olsa
2026-03-10  0:22     ` Mykyta Yatsenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b1c85fc5-e60a-46c4-bea9-4909dcf78579@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bot+bpf-ci@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=ihor.solodrai@linux.dev \
    --cc=kafai@meta.com \
    --cc=kernel-team@meta.com \
    --cc=martin.lau@kernel.org \
    --cc=memxor@gmail.com \
    --cc=mykyta.yatsenko5@gmail.com \
    --cc=yatsenko@meta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox