From: Steven Rostedt <rostedt@kernel.org>
To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
bpf@vger.kernel.org, x86@kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Josh Poimboeuf <jpoimboe@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Andrii Nakryiko <andrii@kernel.org>,
Indu Bhagat <indu.bhagat@oracle.com>,
"Jose E. Marchesi" <jemarch@gnu.org>,
Beau Belgrave <beaub@linux.microsoft.com>,
Jens Remus <jremus@linux.ibm.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Florian Weimer <fweimer@redhat.com>, Sam James <sam@gentoo.org>,
Kees Cook <kees@kernel.org>,
"Carlos O'Donell" <codonell@redhat.com>
Subject: [PATCH v10 00/11] unwind_deferred: Implement sframe handling
Date: Wed, 27 Aug 2025 16:15:48 -0400
Message-ID: <20250827201548.448472904@kernel.org>
[
This version is simply a rebase of v9 on top of v6.17-rc3.
It needs to be updated to work with the latest SFrame specification.
Indu said she'll be able to make those changes, but I needed to
forward port the latest code.
You can test this code with the x86 and perf changes applied at:
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
unwind/sframe-test
]
This is the implementation of parsing the SFrame section in an ELF file.
It's a continuation of Josh's last work that can be found here:
https://lore.kernel.org/all/cover.1737511963.git.jpoimboe@kernel.org/
Currently the only way to get a user space stack trace from a stack
walk (and not just by copying a large amount of the user stack into the
kernel ring buffer) is to use frame pointers. This has a few issues. The
biggest one is that compiling frame pointers into every application and
library has been shown to cause performance overhead.
Another issue is that the format of the frames may not always be consistent
between different compilers, and some architectures (s390) have no defined
format to do a reliable stack walk. The only way to perform user space
profiling on these architectures is to copy the user stack into the kernel
buffer.
SFrames[1] is now supported in gcc binutils and will soon also be supported
by LLVM. SFrames acts more like ORC, and lives in the ELF executable
file as its own section. Like ORC it has two tables, where the first table
is sorted by instruction pointer (IP). Looking up the current IP in the
first table finds its entry, which takes you to the second table. That
table tells you where the return address of the current function is
located. You can then look that return address up in the first table to
find the return address of the calling function, and so on. This performs
a user space stack walk.
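The two-table walk described above can be sketched in plain C. This is
only an illustration of the lookup loop, not the code in this series and
not the real SFrame layout: the toy_* names, the struct fields, the single
second-table entry per function, and the sample addresses are all
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified tables; NOT the real SFrame on-disk format. */
struct toy_fde {		/* first table: one entry per function */
	uint64_t start;		/* first IP of the function (sort key) */
	uint64_t size;		/* length of the function in bytes */
	size_t fre_idx;		/* index of its entry in the second table */
};

struct toy_fre {		/* second table: how to find the caller */
	int64_t cfa_off;	/* CFA = SP + cfa_off */
	int64_t ra_off;		/* return address is stored at CFA + ra_off */
};

/* Binary search of the IP-sorted first table. */
const struct toy_fde *toy_find_fde(const struct toy_fde *fdes, size_t nr,
				   uint64_t ip)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip < fdes[mid].start)
			hi = mid;
		else if (ip >= fdes[mid].start + fdes[mid].size)
			lo = mid + 1;
		else
			return &fdes[mid];
	}
	return NULL;
}

/* Walk a simulated stack; returns the number of IPs stored in trace[]. */
size_t toy_unwind(const struct toy_fde *fdes, size_t nr_fdes,
		  const struct toy_fre *fres,
		  const uint64_t *stack, uint64_t stack_base, size_t words,
		  uint64_t ip, uint64_t sp, uint64_t *trace, size_t max)
{
	size_t n = 0;

	assert(trace != NULL);
	while (n < max) {
		const struct toy_fde *fde = toy_find_fde(fdes, nr_fdes, ip);

		if (!fde)
			break;		/* IP not covered by the tables: stop */
		trace[n++] = ip;

		const struct toy_fre *fre = &fres[fde->fre_idx];
		uint64_t cfa = sp + fre->cfa_off;
		uint64_t ra_addr = cfa + fre->ra_off;

		if (ra_addr < stack_base || (ra_addr - stack_base) / 8 >= words)
			break;		/* return address slot is off the stack */
		ip = stack[(ra_addr - stack_base) / 8];	/* caller's IP */
		sp = cfa;				/* caller's SP */
	}
	return n;
}

/* Made-up sample data: bar() called from foo() called from main(). */
static const struct toy_fde demo_fdes[] = {
	{ 0x1000, 0x100, 0 },	/* main */
	{ 0x2000, 0x080, 1 },	/* foo  */
	{ 0x3000, 0x040, 2 },	/* bar  */
};
static const struct toy_fre demo_fres[] = {
	{  8, -8 },		/* main */
	{ 24, -8 },		/* foo: 16 bytes of locals below its RA */
	{  8, -8 },		/* bar  */
};
/* Stack words starting at address 0x7000 (lowest address first). */
static const uint64_t demo_stack[] = { 0x2010, 0, 0, 0x1020, 0 };

size_t toy_demo_trace(uint64_t *trace, size_t max)
{
	return toy_unwind(demo_fdes, 3, demo_fres, demo_stack, 0x7000, 5,
			  0x3004, 0x7000, trace, max);
}
```

Calling toy_demo_trace() with a buffer of eight entries fills in three
IPs, one each inside bar, foo and main, and then stops because the next
return address (0) is not covered by the first table. The real format
keeps several second-table entries per function, selected by the offset
of the IP within the function; the sketch collapses that to one entry to
keep the loop visible.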
Now because the SFrame section lives in the ELF file, it needs to be faulted
into memory when it is used. This means that walking the user space stack
requires being in a faultable context. As profilers like perf request a stack
trace in interrupt or NMI context, the walk cannot be done when it is
requested. Instead it must be deferred until it is safe to fault in user
space. One place this is known to be safe is when the task is about to return
to user space.
This series makes the deferred unwind code implement SFrames.
[1] https://sourceware.org/binutils/wiki/sframe
Changes since v9: https://lore.kernel.org/linux-trace-kernel/20250717012848.927473176@kernel.org/
- Rebased on v6.17-rc3
- Update the changes to unwind/user.c to handle passing a const
unwind_user_frame pointer.
Josh Poimboeuf (11):
unwind_user/sframe: Add support for reading .sframe headers
unwind_user/sframe: Store sframe section data in per-mm maple tree
x86/uaccess: Add unsafe_copy_from_user() implementation
unwind_user/sframe: Add support for reading .sframe contents
unwind_user/sframe: Detect .sframe sections in executables
unwind_user/sframe: Wire up unwind_user to sframe
unwind_user/sframe/x86: Enable sframe unwinding on x86
unwind_user/sframe: Remove .sframe section on detected corruption
unwind_user/sframe: Show file name in debug output
unwind_user/sframe: Add .sframe validation option
unwind_user/sframe: Add prctl() interface for registering .sframe sections
----
MAINTAINERS | 1 +
arch/Kconfig | 23 ++
arch/x86/Kconfig | 1 +
arch/x86/include/asm/mmu.h | 2 +-
arch/x86/include/asm/uaccess.h | 39 ++-
fs/binfmt_elf.c | 49 +++-
include/linux/mm_types.h | 3 +
include/linux/sframe.h | 60 ++++
include/linux/unwind_user_types.h | 4 +-
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 6 +-
kernel/fork.c | 10 +
kernel/sys.c | 9 +
kernel/unwind/Makefile | 3 +-
kernel/unwind/sframe.c | 593 ++++++++++++++++++++++++++++++++++++++
kernel/unwind/sframe.h | 71 +++++
kernel/unwind/sframe_debug.h | 68 +++++
kernel/unwind/user.c | 41 ++-
mm/init-mm.c | 2 +
19 files changed, 967 insertions(+), 19 deletions(-)
create mode 100644 include/linux/sframe.h
create mode 100644 kernel/unwind/sframe.c
create mode 100644 kernel/unwind/sframe.h
create mode 100644 kernel/unwind/sframe_debug.h