public inbox for linux-kernel@vger.kernel.org
From: Tao Chen <chen.dylane@linux.dev>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@redhat.com, acme@kernel.org, namhyung@kernel.org,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@kernel.org, irogers@google.com, adrian.hunter@intel.com,
	kan.liang@linux.intel.com, song@kernel.org, ast@kernel.org,
	daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev,
	eddyz87@gmail.com, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next v8 2/3] perf: Refactor get_perf_callchain
Date: Thu, 29 Jan 2026 00:49:43 +0800
Message-ID: <e3e37e53-59ce-4fd8-8e4c-a3c05acda497@linux.dev>
In-Reply-To: <20260128091033.GG3372621@noisy.programming.kicks-ass.net>

On 2026/1/28 17:10, Peter Zijlstra wrote:
> On Mon, Jan 26, 2026 at 03:43:30PM +0800, Tao Chen wrote:
>> From the BPF stack map side, we want to ensure that the callchain buffer
>> will not be overwritten by other preempting tasks, and we also aim
>> to reduce the preempt-disabled interval. Based on the suggestions from Peter
>> and Andrii, export a new API, __get_perf_callchain. The usage scenario
>> from the BPF side is as follows:
>>
>> preempt_disable()
>> entry = get_callchain_entry()
>> preempt_enable()
>> __get_perf_callchain(entry)
>> put_callchain_entry(entry)
> 
> That makes no sense, this means any other task on that CPU is getting
> screwed over.
> 
> Why are you worried about the preempt_disable() here? If this were an
> interrupt context we'd still do that unwind -- but then with IRQs
> disabled.

Hi Peter,

Right now, obtaining stack information in BPF takes two steps:
1. get the callchain
2. store the callchain in a BPF map or copy it to a buffer

BPF does not disable preemption across these two steps at the moment.
While we are obtaining the stack information of Process A, Process A may
be preempted by Process B, which then acquires its own stack information
through the same path. When execution resumes in Process A, the callchain
buffer holds the stack information of Process B, because each context
(task, softirq, irq, NMI) has only one callchain entry.

       taskA                             taskB
1. callchain(A) = get_perf_callchain
		<-- preempted by B   callchain(B) = get_perf_callchain
2. stack_map(callchain(B))

So we want to ensure that while task A is using the callchain entry, the
preempting task B cannot use it. The approach is to defer
put_callchain_entry until the stack has been captured and saved in the
stack_map.

       taskA                             taskB
1. callchain(A) = __get_perf_callchain
		<-- preempted by B   callchain(B) = __get_perf_callchain
2. stack_map(callchain(A))
3. put_callchain_entry()

taskB cannot get the callchain entry because taskA still holds it.
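
To make the two timelines concrete, here is a minimal userspace sketch in C.
It is not the kernel code: sim_entry, sim_get_entry, sim_put_entry,
sim_callchain_old, sim_callchain_new, sim_race_old and sim_race_new are all
hypothetical names. A single global entry stands in for the per-context
callchain entry, and "preemption by B" is modeled as an ordinary function
call made between A's capture and A's store.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one callchain entry for the task context. */
struct sim_entry {
	char owner;	/* whose stack is currently stored in the entry */
	int in_use;	/* stands in for the recursion-context counter */
};

static struct sim_entry task_ctx_entry;

/* Return the entry, or NULL if it is already held. */
static struct sim_entry *sim_get_entry(void)
{
	if (task_ctx_entry.in_use)
		return NULL;
	task_ctx_entry.in_use = 1;
	return &task_ctx_entry;
}

static void sim_put_entry(struct sim_entry *e)
{
	assert(e && e->in_use);
	e->in_use = 0;
}

/* Old flow: the entry is released before the caller has consumed it. */
static struct sim_entry *sim_callchain_old(char task)
{
	struct sim_entry *e = sim_get_entry();

	if (!e)
		return NULL;
	e->owner = task;	/* "unwind" into the shared entry */
	sim_put_entry(e);
	return e;
}

/* New flow: the caller keeps the entry until it calls sim_put_entry(). */
static struct sim_entry *sim_callchain_new(char task)
{
	struct sim_entry *e = sim_get_entry();

	if (!e)
		return NULL;
	e->owner = task;
	return e;
}

/* First timeline: returns the owner of the stack that task A ends up storing. */
char sim_race_old(void)
{
	struct sim_entry *a = sim_callchain_old('A');

	sim_callchain_old('B');		/* B preempts A and reuses the entry */
	return a->owner;		/* A now stores B's stack */
}

/* Second timeline: B's attempt fails while A still holds the entry. */
char sim_race_new(int *b_got_entry)
{
	struct sim_entry *a = sim_callchain_new('A');
	struct sim_entry *b = sim_callchain_new('B');	/* B preempts A */
	char stored;

	*b_got_entry = (b != NULL);
	stored = a->owner;		/* step 2: store into the stack map */
	sim_put_entry(a);		/* step 3: release only after the store */
	return stored;
}
```

In this model sim_race_old() returns 'B', which is the overwrite shown in
the first timeline, while sim_race_new() returns 'A' and reports that task B
did not get an entry, which matches the second timeline.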

The preempt_disable() around get_callchain_entry was suggested by
Yonghong in v4:
https://lore.kernel.org/bpf/c352f357-1417-47b5-9d8c-28d99f20f5a6@linux.dev/

Please correct me if I'm mistaken. Thanks.

-- 
Best Regards
Tao Chen
