From: Steven Rostedt <rostedt@kernel.org>
To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
bpf@vger.kernel.org, x86@kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Josh Poimboeuf <jpoimboe@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Andrii Nakryiko <andrii@kernel.org>,
Indu Bhagat <indu.bhagat@oracle.com>,
"Jose E. Marchesi" <jemarch@gnu.org>,
Beau Belgrave <beaub@linux.microsoft.com>,
Jens Remus <jremus@linux.ibm.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Jens Axboe <axboe@kernel.dk>, Florian Weimer <fweimer@redhat.com>,
Sam James <sam@gentoo.org>
Subject: [PATCH v15 0/8] perf: Support the deferred unwinding infrastructure
Date: Mon, 25 Aug 2025 14:06:38 -0400
Message-ID: <20250825180638.877627656@kernel.org>
This series is based on: https://lore.kernel.org/linux-trace-kernel/20250820180338.701352023@kernel.org
And requires these patches to be enabled: https://lore.kernel.org/linux-trace-kernel/20250820190546.172023727@kernel.org/
To run this series, you can check out this repo, which has this series as well as the above:
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git unwind/perf-test
This series implements the perf interface to use deferred user space stack
tracing.
The first 4 patches implement the kernel side of perf to do the deferred stack
tracing, and the last 4 patches implement the perf user space side to read
this new interface.
Patch 1 adds a new API to the user unwinder logic to allow perf to get the
current context cookie for its task event tracing. Perf's task event tracing
maps a single task per perf event buffer and follows the task around, so it
only needs to implement its own task_work to do the deferred stack trace.
Because dropped events can still leave it not knowing which user stack trace
belongs to which kernel stack, the cookie is useful as a unique identifier
for each user space stack trace, telling which kernel stack to append it to.
Patch 2 adds the per task deferred stack traces to perf. It adds a new event
type called PERF_RECORD_CALLCHAIN_DEFERRED that is recorded when a task is
about to go back to user space, at a point where pages may be faulted in.
It also adds a new callchain context called PERF_CONTEXT_USER_DEFERRED that
is used as a placeholder in a kernel callchain, marking where the deferred
user space stack trace is to be appended.
Patch 3 adds the user stack trace context cookie in the kernel callchain right
after the PERF_CONTEXT_USER_DEFERRED context so that the user space side can
map the request to the deferred user space stack trace.
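
As a rough illustration of the layout described for patches 2 and 3 (this is
not code from the series), a reader of the sample could locate the marker and
pick up the cookie that follows it. The marker value below is a placeholder
for the sketch; the real PERF_CONTEXT_USER_DEFERRED value is whatever the
series defines in include/uapi/linux/perf_event.h. The helper name and the
assumption that the cookie occupies exactly one 64-bit callchain entry are
also mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Placeholder marker value for this sketch only. The real
 * PERF_CONTEXT_USER_DEFERRED is defined by the series in the uapi
 * header and may differ.
 */
#define SKETCH_CONTEXT_USER_DEFERRED ((uint64_t)-640)

/*
 * Scan a kernel callchain for the deferred-user marker (hypothetical
 * helper). On success, store the entry that follows the marker in
 * *cookie and return the marker's index. Return -1 if there is no
 * marker, or if the marker is the last entry and has no cookie slot.
 */
static long find_deferred_cookie(const uint64_t *ips, size_t nr,
                                 uint64_t *cookie)
{
    assert(cookie != NULL);

    for (size_t i = 0; i + 1 < nr; i++) {
        if (ips[i] == SKETCH_CONTEXT_USER_DEFERRED) {
            *cookie = ips[i + 1];
            return (long)i;
        }
    }
    return -1;
}
```

The returned index is where the user space frames would later be spliced in,
replacing the marker and its cookie.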
Patch 4 adds support for the per CPU perf events that will allow the kernel to
associate each of the per CPU perf event buffers with a single application.
This is needed so that when a request for a deferred stack trace happens on a
task that then migrates to another CPU, it will know which CPU buffer to
record the stack trace on. It is possible to have more than one perf user tool
running, and a request made by one perf tool should have the deferred trace go
to the same perf tool's perf CPU event buffer. A global list of all the
descriptors representing each perf tool that is using deferred stack tracing
is created to manage this.
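
The routing idea can be sketched in user space C as below. This is only a
model of the description above, not the data structures used by the series:
the struct names, the fixed CPU count, and the integer tool id are all
hypothetical. It also assumes the trace is written to the buffer of the CPU
the task is on when the deferred trace is taken, which the text above does
not spell out.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4    /* arbitrary CPU count for the sketch */

/* Stand-in for one per CPU event buffer; only counts records here. */
struct cpu_buffer {
    unsigned long records;
};

/* Hypothetical descriptor: one per perf tool using deferred unwinding. */
struct tool_desc {
    int id;
    struct cpu_buffer bufs[SKETCH_NR_CPUS];
    struct tool_desc *next;     /* link in the global descriptor list */
};

/*
 * Find the buffer that should receive a deferred trace requested by
 * tool_id, given the CPU the task is on now. Returns NULL if the tool
 * has no descriptor on the list.
 */
static struct cpu_buffer *route_deferred(struct tool_desc *head,
                                         int tool_id, int cpu)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

    for (struct tool_desc *d = head; d; d = d->next) {
        if (d->id == tool_id)
            return &d->bufs[cpu];
    }
    return NULL;
}
```

The point of keying on the descriptor first is that a task migrating between
CPUs still lands in a buffer owned by the tool that made the request, never
in another tool's buffer.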
The last 4 patches implement the perf user space tooling side of this.
Changes since v14: https://lore.kernel.org/linux-trace-kernel/20250718164119.089692174@kernel.org/
- Moved the clean up patches into their own series (mentioned at the beginning)
- Added unwind_user_get_cookie() API to allow the task events to add cookies
to differentiate which user stack belongs to which kernel stack in the event
of dropped events.
- Save the cookie in the kernel callchain right after the PERF_CONTEXT_USER_DEFERRED
context marker
- Have the perf user space tooling match the cookies as well as the TID
between the request and the user stack recording, to know which kernel stack
gets the user space trace appended to it based on its context cookie.
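
The matching described in the last item can be sketched as follows. This is
a simplified model and not the perf tooling code: the struct layouts, the
fixed frame limit, and the function name are hypothetical. It also assumes
that several samples taken before the same return to user space share one
cookie, so a single deferred record may complete more than one sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FRAMES 64   /* arbitrary limit for the sketch */

/* A sample whose kernel callchain carried a deferred-user request. */
struct pending_sample {
    uint32_t tid;
    uint64_t cookie;            /* cookie saved after the marker */
    size_t   nr;                /* number of frames stored in ips[] */
    uint64_t ips[MAX_FRAMES];   /* kernel frames, then merged user frames */
    int      merged;            /* nonzero once user frames were appended */
};

/* Contents of one deferred user callchain record (hypothetical layout). */
struct deferred_record {
    uint32_t tid;
    uint64_t cookie;
    size_t   nr;
    const uint64_t *ips;
};

/*
 * Append the deferred user frames to every unmerged pending sample whose
 * TID and cookie both match. Frames beyond MAX_FRAMES are dropped.
 * Returns the number of samples that were completed.
 */
static size_t merge_deferred(struct pending_sample *pending, size_t n,
                             const struct deferred_record *rec)
{
    size_t merged = 0;

    assert(rec != NULL);
    assert(pending != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct pending_sample *s = &pending[i];

        if (s->merged || s->tid != rec->tid || s->cookie != rec->cookie)
            continue;

        for (size_t j = 0; j < rec->nr && s->nr < MAX_FRAMES; j++)
            s->ips[s->nr++] = rec->ips[j];

        s->merged = 1;
        merged++;
    }
    return merged;
}
```

Matching on the TID alone would attach a user stack to the wrong kernel stack
if the record for an earlier request was dropped; requiring the cookie to
match as well leaves such a sample unmerged instead.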
Josh Poimboeuf (1):
      perf: Support deferred user callchains

Namhyung Kim (4):
      perf tools: Minimal CALLCHAIN_DEFERRED support
      perf record: Enable defer_callchain for user callchains
      perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED
      perf tools: Merge deferred user callchains

Steven Rostedt (3):
      unwind deferred: Add unwind_user_get_cookie() API
      perf: Have the deferred request record the user context cookie
      perf: Support deferred user callchains for per CPU events

----
include/linux/perf_event.h | 11 +-
include/linux/unwind_deferred.h | 5 +
include/uapi/linux/perf_event.h | 25 +-
kernel/bpf/stackmap.c | 4 +-
kernel/events/callchain.c | 14 +-
kernel/events/core.c | 421 +++++++++++++++++++++++++++++-
kernel/unwind/deferred.c | 21 ++
tools/include/uapi/linux/perf_event.h | 25 +-
tools/lib/perf/include/perf/event.h | 8 +
tools/perf/Documentation/perf-script.txt | 5 +
tools/perf/builtin-script.c | 92 +++++++
tools/perf/util/callchain.c | 24 ++
tools/perf/util/callchain.h | 3 +
tools/perf/util/event.c | 1 +
tools/perf/util/evlist.c | 1 +
tools/perf/util/evlist.h | 1 +
tools/perf/util/evsel.c | 42 +++
tools/perf/util/evsel.h | 1 +
tools/perf/util/machine.c | 1 +
tools/perf/util/perf_event_attr_fprintf.c | 1 +
tools/perf/util/sample.h | 4 +-
tools/perf/util/session.c | 80 ++++++
tools/perf/util/tool.c | 2 +
tools/perf/util/tool.h | 4 +-
24 files changed, 786 insertions(+), 10 deletions(-)