public inbox for linux-kernel@vger.kernel.org
From: Tao Chen <chen.dylane@linux.dev>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: song@kernel.org, jolsa@kernel.org, ast@kernel.org,
	daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev,
	eddyz87@gmail.com, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, bpf@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH bpf-next v2 1/2] bpf: Add preempt disable for bpf_get_stack
Date: Wed, 11 Feb 2026 15:10:53 +0800
Message-ID: <ecb7df62-d707-43dd-943b-98a452b2268e@linux.dev>
In-Reply-To: <CAEf4BzbSCbqg6y1KGg_j5xkK1=xsmOyK5ob9uTJiVcWgQ4jAJw@mail.gmail.com>

On 2026/2/7 01:12, Andrii Nakryiko wrote:
> On Fri, Feb 6, 2026 at 1:07 AM Tao Chen <chen.dylane@linux.dev> wrote:
>>
>> The buffer returned by get_perf_callchain() may be reused if the task is
>> preempted after the BPF program enters migrate-disable mode, so add
>> preempt_disable() around its use. As Andrii suggested, BPF can guarantee
>> that the perf callchain buffer won't be released during use: for
>> bpf_get_stackid, the BPF stack map keeps it alive by delaying
>> put_callchain_buffer() until freeing time; for
>> bpf_get_stack/bpf_get_task_stack, the BPF program itself keeps the
>> buffers alive until freeing time, which is delayed until after an
>> RCU Tasks Trace + RCU grace period.
>>
>> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>> Signed-off-by: Tao Chen <chen.dylane@linux.dev>
>> ---
>>
>> Change list:
>>   - v1 -> v2
>>     - add preempt_disable for bpf_get_stack in patch1
>>     - add patch2
>>   - v1: https://lore.kernel.org/bpf/20260128165710.928294-1-chen.dylane@linux.dev
>>
>>   kernel/bpf/stackmap.c | 13 ++++++-------
>>   1 file changed, 6 insertions(+), 7 deletions(-)
>>
> 
> Hm... looking at bpf_get_stack_pe(), I'm not sure what the exact
> guarantees are around the ctx->data->callchain that we pass as
> trace_in... It looks like it's the same temporary per-cpu callchain as
> in other places, just attached (temporarily) to ctx. So we probably
> want preemption disabled/enabled for that one as well, no? And to

See commit 1d7bf6b7d3e8 ("perf/bpf: Remove preempt disable around BPF
invocation"): bpf_overflow_handler() is called from NMI or at least
hard interrupt context, which is already non-preemptible. So no extra
preempt_disable() is needed for the trace_in case.

> achieve that, I think we'll need to split the build_id logic out of
> __bpf_get_stack() and do it after preemption is enabled in the
> callers. Luckily it's not that much code and logic, so it should be
> easy. But please analyze this carefully yourself.
> 
> pw-bot: cr
> 
> 
>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
>> index da3d328f5c1..1b100a03ef2 100644
>> --- a/kernel/bpf/stackmap.c
>> +++ b/kernel/bpf/stackmap.c
>> @@ -460,8 +460,8 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
>>
>>          max_depth = stack_map_calculate_max_depth(size, elem_size, flags);
>>
>> -       if (may_fault)
>> -               rcu_read_lock(); /* need RCU for perf's callchain below */
>> +       if (!trace_in)
>> +               preempt_disable();
>>
>>          if (trace_in) {
>>                  trace = trace_in;
>> @@ -474,8 +474,8 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
>>          }
>>
>>          if (unlikely(!trace) || trace->nr < skip) {
>> -               if (may_fault)
>> -                       rcu_read_unlock();
>> +               if (!trace_in)
>> +                       preempt_enable();
>>                  goto err_fault;
>>          }
>>
>> @@ -493,9 +493,8 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
>>                  memcpy(buf, ips, copy_len);
>>          }
>>
>> -       /* trace/ips should not be dereferenced after this point */
>> -       if (may_fault)
>> -               rcu_read_unlock();
>> +       if (!trace_in)
>> +               preempt_enable();
>>
>>          if (user_build_id)
>>                  stack_map_get_build_id_offset(buf, trace_nr, user, may_fault);
>> --
>> 2.48.1
>>
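
To make the ordering in the hunks above concrete, here is a rough
user-space model, not kernel code. It assumes the shape discussed in
this thread: copy the entries out of the shared per-CPU callchain
buffer while preemption is off, then do the slow build_id-style
post-processing on the private copy only after preemption is back on.
Every model_* name is hypothetical and exists only for this sketch;
the "offset" step is a stand-in and not what
stack_map_get_build_id_offset() actually computes.

```c
/* User-space model only; none of the model_* names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_DEPTH 8

/* Stand-in for the per-CPU entry returned by get_perf_callchain(). */
struct model_callchain {
	size_t nr;
	uint64_t ip[MODEL_MAX_DEPTH];
};

static struct model_callchain model_percpu_scratch; /* shared and reusable */
static int model_preempt_count;                     /* models preempt depth */

static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

/* Models another task reusing the scratch buffer once we are preemptible. */
static void model_other_task_runs(void)
{
	assert(model_preempt_count == 0);
	memset(&model_percpu_scratch, 0xff, sizeof(model_percpu_scratch));
}

/*
 * Phase 1: with "preemption" off, copy at most max entries (after
 * dropping the first skip entries) into the caller's private buffer.
 * Returns the number of entries copied, or -1 if nr < skip.
 */
static long model_copy_stack(uint64_t *buf, size_t max, size_t skip)
{
	const struct model_callchain *trace;
	long copied = -1;

	model_preempt_disable();
	trace = &model_percpu_scratch;
	if (trace->nr >= skip) {
		size_t n = trace->nr - skip;

		if (n > max)
			n = max;
		memcpy(buf, trace->ip + skip, n * sizeof(*buf));
		copied = (long)n;
	}
	/* the shared buffer must not be dereferenced after this point */
	model_preempt_enable();
	return copied;
}

/*
 * Phase 2: slow post-processing that touches only the private copy.
 * Keeping the low 12 bits is a placeholder for the real build_id work.
 */
static void model_resolve_offsets(uint64_t *buf, long n)
{
	for (long i = 0; i < n; i++)
		buf[i] &= 0xfff;
}
```

The point of the split is that phase 2 may sleep or fault without
holding preemption off, because it never reads the shared buffer.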


-- 
Best Regards
Tao Chen

