* [PATCH bpf-next v5 0/3] Pass external callchain entry to get_perf_callchain
From: Tao Chen @ 2025-11-09 16:35 UTC
  To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, kan.liang
  Cc: linux-perf-users, linux-kernel, bpf, Tao Chen

Background
==========
Alexei noted that we should use preempt_disable to protect
get_perf_callchain in the bpf stackmap.
https://lore.kernel.org/bpf/CAADnVQ+s8B7-fvR1TNO-bniSyKv57cH_ihRszmZV7pQDyV=VDQ@mail.gmail.com

A previous patch attempted to fix this issue, and Andrii suggested
teaching get_perf_callchain to accept that buffer directly, which avoids
the unnecessary copy.
https://lore.kernel.org/bpf/20250926153952.1661146-1-chen.dylane@linux.dev

Proposed Solution
=================
Add an external perf_callchain_entry parameter to get_perf_callchain so
that the BPF side can supply its own buffer. The main advantage is that
this avoids unnecessary copies.
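
To illustrate the idea, here is a minimal userspace sketch of the two
patterns. It is not the kernel code: struct callchain_entry, fake_unwind,
get_callchain, collect_with_copy and collect_direct are hypothetical names
made up for this example, and the real get_perf_callchain signature is not
reproduced here. It only assumes that a NULL external entry means "use the
internal buffer".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STACK_DEPTH 8

/* Simplified stand-in for the kernel's struct perf_callchain_entry. */
struct callchain_entry {
	uint64_t nr;
	uint64_t ip[MAX_STACK_DEPTH];
};

/* Stand-in for the perf-internal buffer used when no external one is given. */
static struct callchain_entry internal_entry;

/* Hypothetical unwinder: fills the entry with fake return addresses. */
static void fake_unwind(struct callchain_entry *entry, uint32_t max_stack)
{
	entry->nr = 0;
	for (uint32_t i = 0; i < max_stack && i < MAX_STACK_DEPTH; i++)
		entry->ip[entry->nr++] = 0x1000 + i;
}

/*
 * Model of the proposed interface (assumption): unwind into the external
 * entry when one is passed, otherwise fall back to the internal buffer.
 */
static struct callchain_entry *
get_callchain(struct callchain_entry *external, uint32_t max_stack)
{
	struct callchain_entry *entry = external ? external : &internal_entry;

	fake_unwind(entry, max_stack);
	return entry;
}

/* Old pattern: unwind into the internal buffer, then copy the result out. */
static void collect_with_copy(struct callchain_entry *dst, uint32_t max_stack)
{
	struct callchain_entry *src = get_callchain(NULL, max_stack);

	memcpy(dst, src, sizeof(*dst));
}

/* New pattern: unwind straight into the caller's buffer, no copy needed. */
static void collect_direct(struct callchain_entry *dst, uint32_t max_stack)
{
	struct callchain_entry *entry = get_callchain(dst, max_stack);

	assert(entry == dst);
}
```

Both helpers leave the same frames in dst; the difference is that
collect_direct never touches the shared internal buffer, so there is
nothing to copy and nothing that a nested caller could overwrite.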

Todo
====
I'm not sure whether this modification is appropriate. After all, the
implementation of get_callchain_entry in the perf subsystem seems much more
complex than directly using an external buffer.

Comments and suggestions are always welcome.

Change list:
 - v1 -> v2:
   From Jiri
   - rebase code, fix conflict
 - v1: https://lore.kernel.org/bpf/20251013174721.2681091-1-chen.dylane@linux.dev
 
 - v2 -> v3:
   From Andrii
   - entries per CPU used in a stack-like fashion
 - v2: https://lore.kernel.org/bpf/20251014100128.2721104-1-chen.dylane@linux.dev

 - v3 -> v4:
   From Peter
   - refactor get_perf_callchain and add three new APIs to make the perf
     callchain easier to use.
   From Andrii
   - reuse the perf callchain management.

   - rename patch1 and patch2.
 - v3: https://lore.kernel.org/bpf/20251019170118.2955346-1-chen.dylane@linux.dev
 
 - v4 -> v5:
   From Yonghong
   - keep add_mark false in stackmap when refactoring get_perf_callchain
     in patch1.
   - add atomic operation in get_recursion_context in patch2.
   - rename bpf_put_callchain_entry to bpf_put_perf_callchain in
     patch3.
   - rebase onto bpf-next master.
 - v4: https://lore.kernel.org/bpf/20251028162502.3418817-1-chen.dylane@linux.dev

Tao Chen (3):
  perf: Refactor get_perf_callchain
  perf: Add atomic operation in get_recursion_context
  bpf: Hold the perf callchain entry until used completely

 include/linux/perf_event.h |  9 +++++
 kernel/bpf/stackmap.c      | 62 +++++++++++++++++++++++++-------
 kernel/events/callchain.c  | 73 ++++++++++++++++++++++++--------------
 kernel/events/internal.h   |  5 +--
 4 files changed, 107 insertions(+), 42 deletions(-)

-- 
2.48.1

