From: Tao Chen <chen.dylane@linux.dev>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
namhyung@kernel.org, mark.rutland@arm.com,
alexander.shishkin@linux.intel.com, jolsa@kernel.org,
irogers@google.com, adrian.hunter@intel.com,
kan.liang@linux.intel.com, song@kernel.org, ast@kernel.org,
daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev,
eddyz87@gmail.com, yonghong.song@linux.dev,
john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
haoluo@google.com, linux-perf-users@vger.kernel.org,
linux-kernel@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next v8 2/3] perf: Refactor get_perf_callchain
Date: Thu, 5 Feb 2026 14:16:00 +0800 [thread overview]
Message-ID: <bb62bd4f-199e-4907-817f-bc820c9198c9@linux.dev> (raw)
In-Reply-To: <CAEf4BzYMJfQN+0SwzBM-DhREHwZ54TwA6GUtFv9=xret9pzXrQ@mail.gmail.com>
在 2026/2/4 09:08, Andrii Nakryiko 写道:
> On Sun, Jan 25, 2026 at 11:45 PM Tao Chen <chen.dylane@linux.dev> wrote:
>>
>> From BPF stack map, we want to ensure that the callchain buffer
>> will not be overwritten by other preemptive tasks and we also aim
>> to reduce the preempt disable interval, Based on the suggestions from Peter
>> and Andrrii, export new API __get_perf_callchain and the usage scenarios
>> are as follows from BPF side:
>>
>> preempt_disable()
>> entry = get_callchain_entry()
>> preempt_enable()
>> __get_perf_callchain(entry)
>> put_callchain_entry(entry)
>>
>> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>> Signed-off-by: Tao Chen <chen.dylane@linux.dev>
>> ---
>> include/linux/perf_event.h | 5 +++++
>> kernel/events/callchain.c | 34 ++++++++++++++++++++++------------
>> 2 files changed, 27 insertions(+), 12 deletions(-)
>>
>
> Looking at the whole __bpf_get_stack() logic again, why didn't we just
> do something like this:
>
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index da3d328f5c15..80364561611c 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -460,8 +460,8 @@ static long __bpf_get_stack(struct pt_regs *regs,
> struct task_struct *task,
>
> max_depth = stack_map_calculate_max_depth(size, elem_size, flags);
>
> - if (may_fault)
> - rcu_read_lock(); /* need RCU for perf's callchain below */
> + if (!trace_in)
> + preempt_disable();
>
> if (trace_in) {
> trace = trace_in;
> @@ -474,8 +474,8 @@ static long __bpf_get_stack(struct pt_regs *regs,
> struct task_struct *task,
> }
>
> if (unlikely(!trace) || trace->nr < skip) {
> - if (may_fault)
> - rcu_read_unlock();
> + if (!trace_in)
> + preempt_enable();
> goto err_fault;
> }
>
> @@ -494,8 +494,8 @@ static long __bpf_get_stack(struct pt_regs *regs,
> struct task_struct *task,
> }
>
> /* trace/ips should not be dereferenced after this point */
> - if (may_fault)
> - rcu_read_unlock();
> + if (!trace_in)
> + preempt_enable();
>
> if (user_build_id)
> stack_map_get_build_id_offset(buf, trace_nr, user, may_fault);
>
>
> ?
>
> Build ID parsing is happening after we copied data from perf's
> callchain_entry into user-provided memory, so raw callchain retrieval
> can be done with preemption disabled, as it's supposed to be brief.
> Build ID parsing part which indeed might fault and be much slower will
> be done well after that (we even have a comment there saying that
> trace/ips should not be touched).
>
> Am I missing something?
Yes it looks good for bpf_get_stack, and I also proposed a similar
change previously. But Alexei suggested that we should reduce the
preemption-disabled section protected in bpf_get_stackid if we do like
bpf_get_stack. Maybe we can change it first for bpf_get_stack?
https://lore.kernel.org/bpf/20250922075333.1452803-1-chen.dylane@linux.dev/
>
> [...]
--
Best Regards
Tao Chen
next prev parent reply other threads:[~2026-02-05 6:16 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-26 7:43 [PATCH bpf-next v8 0/3] Pass external callchain entry to get_perf_callchain Tao Chen
2026-01-26 7:43 ` [PATCH bpf-next v8 1/3] perf: Add rctx in perf_callchain_entry Tao Chen
2026-01-26 8:03 ` bot+bpf-ci
2026-01-26 8:51 ` Tao Chen
2026-01-27 21:01 ` Andrii Nakryiko
2026-01-28 2:41 ` Tao Chen
2026-01-28 8:59 ` Peter Zijlstra
2026-01-28 16:52 ` Tao Chen
2026-01-28 18:59 ` Andrii Nakryiko
2026-01-29 3:03 ` Tao Chen
2026-01-26 7:43 ` [PATCH bpf-next v8 2/3] perf: Refactor get_perf_callchain Tao Chen
2026-01-27 21:07 ` Andrii Nakryiko
2026-01-28 2:42 ` Tao Chen
2026-01-28 9:10 ` Peter Zijlstra
2026-01-28 16:49 ` Tao Chen
2026-01-28 19:12 ` Andrii Nakryiko
2026-01-30 11:31 ` Peter Zijlstra
2026-01-30 20:04 ` Andrii Nakryiko
2026-02-02 19:59 ` Peter Zijlstra
2026-02-04 0:24 ` Andrii Nakryiko
2026-02-04 1:08 ` Andrii Nakryiko
2026-02-05 6:16 ` Tao Chen [this message]
2026-02-05 17:34 ` Andrii Nakryiko
2026-02-06 9:20 ` Tao Chen
2026-01-26 7:43 ` [PATCH bpf-next v8 3/3] bpf: Hold ther perf callchain entry until used completely Tao Chen
2026-01-27 21:35 ` Andrii Nakryiko
2026-01-28 4:21 ` Tao Chen
2026-01-28 19:13 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bb62bd4f-199e-4907-817f-bc820c9198c9@linux.dev \
--to=chen.dylane@linux.dev \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=haoluo@google.com \
--cc=irogers@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kan.liang@linux.intel.com \
--cc=kpsingh@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=martin.lau@linux.dev \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=sdf@fomichev.me \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.