All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yonghong.song@linux.dev>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>, bot+bpf-ci@kernel.org
Cc: mykyta.yatsenko5@gmail.com, bpf@vger.kernel.org, ast@kernel.org,
	andrii@kernel.org, daniel@iogearbox.net, kafai@meta.com,
	kernel-team@meta.com, eddyz87@gmail.com, yatsenko@meta.com,
	martin.lau@kernel.org, clm@meta.com, ihor.solodrai@linux.dev
Subject: Re: [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs
Date: Tue, 10 Mar 2026 10:38:26 -0700	[thread overview]
Message-ID: <b1c85fc5-e60a-46c4-bea9-4909dcf78579@linux.dev> (raw)
In-Reply-To: <CAP01T74FbF5FH70B2-6iB0BJEh8cLcM7Ok=vhUPJZe-SXeOYJg@mail.gmail.com>



On 3/5/26 8:52 PM, Kumar Kartikeya Dwivedi wrote:
> On Fri, 6 Mar 2026 at 05:23, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>> On Wed, 25 Feb 2026 at 18:14, <bot+bpf-ci@kernel.org> wrote:
>>>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>>>> index 9bc0dfd235af..2dd345e0fdb0 100644
>>>> --- a/kernel/trace/bpf_trace.c
>>>> +++ b/kernel/trace/bpf_trace.c
>>> [ ... ]
>>>
>>>> @@ -2085,12 +2085,26 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
>>>>        run_ctx.bpf_cookie = link->cookie;
>>>>        old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>>>>
>>>> -     (void) bpf_prog_run(prog, args);
>>>> +     if (prog->sleepable) {
>>>> +             might_fault();
>>>> +             (void)bpf_prog_run(prog, args);
>>> At this commit, __BPF_DECLARE_TRACE_SYSCALL in include/trace/bpf_probe.h
>>> still wraps the BPF callback in preempt_disable_notrace() /
>>> preempt_enable_notrace():
>>>
>>> include/trace/bpf_probe.h:__BPF_DECLARE_TRACE_SYSCALL() {
>>>      might_fault();
>>>      preempt_disable_notrace();
>>>      bpf_trace_run##x(__data, ...);
>>>      preempt_enable_notrace();
>>> }
>>>
>>> Since __bpf_trace_run() is inlined into bpf_trace_run##x(), which is
>>> called between the preempt_disable_notrace() and
>>> preempt_enable_notrace(), the might_fault() here will execute with
>>> preempt_count > 0, producing a warning on CONFIG_DEBUG_ATOMIC_SLEEP
>>> kernels.
>>>
>>> More importantly, if the sleepable program calls a helper like
>>> bpf_copy_from_user(), the page fault handler will see in_atomic()
>>> and return -EFAULT for any non-resident page instead of faulting it
>>> in, defeating the purpose of sleepable programs.
>>>
>>> The v1 RFC series included a separate patch to remove
>>> preempt_disable_notrace() from __BPF_DECLARE_TRACE_SYSCALL.  Should
>>> that change be folded into this commit so the sleepable path works
>>> at each bisect point?
>> But sleepable programs are not yet permitted, so I think this is not
>> needed for bisection-friendliness.
>>
>>>> +     } else {
>>>> +             /*
>>>> +              * Non-sleepable programs may run in the faultable context,
>>>> +              * do cant_sleep() only if program is non-sleepable and context
>>>> +              * is non-faultable.
>>>> +              */
>>>> +             if (!link->link.sleepable)
>>>> +                     cant_sleep();
>>>> +             rcu_read_lock();
>>>> +             (void)bpf_prog_run(prog, args);
>>>> +             rcu_read_unlock();
>>>> +     }
>>> [ ... ]
>>>
>>> The commit message states:
>>>
>>>> For non-sleepable programs, behavior is unchanged: cant_sleep() check,
>>>> rcu_read_lock() protection.
>>> The parent commit does not have a cant_sleep() call in
>>> __bpf_trace_run() -- it uses rcu_read_lock_dont_migrate().  The
>>> conditional cant_sleep() is new in this commit.  Could this sentence
>>> be updated to reflect the actual change?
>> This seems worth adjusting.
> Actually, with the extra cant_sleep() check, we get this splat when
> running normal raw_tp tests.
> E.g.
>
> ...
> [  295.211949] BUG: assuming atomic context at kernel/trace/bpf_trace.c:2098
> [  295.212166] in_atomic(): 0, irqs_disabled(): 0, pid: 587, name: test_progs
> [  295.212316] 3 locks held by test_progs/587:
> ...
> [  295.213044] Call Trace:
> [  295.213046]  <TASK>
> [  295.213048]  dump_stack_lvl+0x54/0x70
> [  295.213055]  __cant_sleep+0xb7/0xd0
> [  295.213060]  bpf_trace_run1+0xce/0x340
> [  295.213072]  bpf_testmod_test_read+0x433/0x5e0 [bpf_testmod]
> [  295.213080]  ? srso_return_thunk+0x5/0x5f
> [  295.213093]  kernfs_fop_read_iter+0x166/0x220
> ...
>
> It means if the tracepoint is not marked faultable but is invoked in a
> non-atomic context, we will get a warning.
> It might be better to just drop this, since I don't think this adds much value.

Agree. This is due to the tracepoints defined in test_kmods where
tracepoints have default non-sleepable. So we have
   prog: non-sleepable
   test_kmods assumed context: non-sleepable
   actual context: sleepable

This caused the warning.

BTW, can we merge patches 2 and 3 together? I know the sleepable tp_btf
is not enabled yet in verifier. But just looking at patch 2 is a little
bit confusing as its caller still has preempt_{disable,enable}_notrace().

>
>
>>>
>> Other than that LGTM.
>>
>> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>>
>>> [...]


  reply	other threads:[~2026-03-10 17:38 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-25 16:23 [PATCH bpf-next v2 0/6] bpf: Add support for sleepable raw tracepoint programs Mykyta Yatsenko
2026-02-25 16:23 ` [PATCH bpf-next v2 1/6] bpf: Reject sleepable raw_tp programs on non-faultable tracepoints Mykyta Yatsenko
2026-03-06  3:59   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 2/6] bpf: Add sleepable execution path for raw tracepoint programs Mykyta Yatsenko
2026-02-25 17:12   ` bot+bpf-ci
2026-03-06  4:23     ` Kumar Kartikeya Dwivedi
2026-03-06  4:52       ` Kumar Kartikeya Dwivedi
2026-03-10 17:38         ` Yonghong Song [this message]
2026-02-25 16:23 ` [PATCH bpf-next v2 3/6] bpf: Remove preempt_disable from faultable tracepoint BPF callbacks Mykyta Yatsenko
2026-03-06  4:25   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 4/6] bpf: Allow sleepable programs for BPF_TRACE_RAW_TP attach type Mykyta Yatsenko
2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 5/6] libbpf: Add tp_btf.s section handler for sleepable raw tracepoints Mykyta Yatsenko
2026-03-06  4:26   ` Kumar Kartikeya Dwivedi
2026-02-25 16:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add tests for sleepable raw tracepoint programs Mykyta Yatsenko
2026-03-06  4:41   ` Kumar Kartikeya Dwivedi
2026-03-06 23:56     ` Kumar Kartikeya Dwivedi
2026-03-09 21:11   ` Jiri Olsa
2026-03-10  0:22     ` Mykyta Yatsenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b1c85fc5-e60a-46c4-bea9-4909dcf78579@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bot+bpf-ci@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clm@meta.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=ihor.solodrai@linux.dev \
    --cc=kafai@meta.com \
    --cc=kernel-team@meta.com \
    --cc=martin.lau@kernel.org \
    --cc=memxor@gmail.com \
    --cc=mykyta.yatsenko5@gmail.com \
    --cc=yatsenko@meta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.