* [LSF/MM/BPF TOPIC]  sframe: An orc like stack unwinder for the kernel to get a user space stacktrace
@ 2023-02-06 15:38 Steven Rostedt
From: Steven Rostedt @ 2023-02-06 15:38 UTC
  To: lsf-pc; +Cc: linux-fsdevel, linux-mm, bpf, Ross Zwisler, Jose E. Marchesi


Title: sframe: An orc like stack unwinder for the kernel to get a user space stacktrace 

For performance reasons, most applications do not enable frame
pointers (Fedora has announced that it has done so[1], although some
of the maintainers who did would prefer another solution[2]). Thus
getting a reliable user space stack trace for profiling or tracing can
be difficult. One method that perf uses is to grab a large section of
the user space stack and save it into the ring buffer, and then use
DWARF to parse this information later. This is slow and wastes a lot
of ring buffer real estate, leading to lost events.

It also requires post-processing after the trace to find where these
locations exist (the kernel can at least figure out where in the file
the addresses are by using the /proc/$$/maps data).
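To make the address-to-file step concrete, here is a minimal user
space sketch in C. It parses one line in the /proc/<pid>/maps format
and turns a sampled address into an offset within the backing file.
The struct and function names (map_entry, parse_maps_line,
addr_to_file_offset) are made up for illustration and are not kernel
or perf APIs.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One parsed line of /proc/<pid>/maps (illustrative, not a kernel type). */
struct map_entry {
	uint64_t start;     /* first address of the mapping */
	uint64_t end;       /* one past the last address */
	uint64_t file_off;  /* offset of the mapping within the file */
	char perms[5];      /* e.g. "r-xp" */
	char path[256];     /* empty for anonymous mappings */
};

/* Parse "start-end perms offset dev inode [path]"; returns 0 on success. */
static int parse_maps_line(const char *line, struct map_entry *m)
{
	int n;

	m->path[0] = '\0';
	n = sscanf(line,
		   "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %*s %*s %255s",
		   &m->start, &m->end, m->perms, &m->file_off, m->path);
	/* The path is optional, so four converted fields are enough. */
	return n >= 4 ? 0 : -1;
}

/*
 * Translate a sampled address into an offset in the backing file.
 * Returns 0 on success, -1 if the address is outside this mapping.
 */
static int addr_to_file_offset(const struct map_entry *m, uint64_t addr,
			       uint64_t *off)
{
	if (addr < m->start || addr >= m->end)
		return -1;
	*off = addr - m->start + m->file_off;
	return 0;
}
```

A post-processing tool would do this lookup for every recorded
address, which is part of the cost being described here.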

With the new sframe section that has been introduced by binutils[3],
the kernel will be able to get an accurate stack trace from user
space. This may even be extended to allow for symbol lookup from user
space as well.
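To illustrate what a table-driven (ORC-like) unwinder does, here is a
rough sketch in C. It is not the SFrame format: the unwind_entry
layout, the fake_stack helper and all function names are assumptions
invented for this example. Each entry covers a PC range and says how
to find the caller's frame from the stack pointer alone, so no frame
pointer is needed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified unwind entry; the real SFrame encoding differs. */
struct unwind_entry {
	uint64_t pc_start;  /* first PC covered by this entry */
	uint64_t pc_end;    /* one past the last PC covered */
	int32_t cfa_off;    /* CFA = SP + cfa_off */
	int32_t ra_off;     /* return address is stored at CFA + ra_off */
};

/* Simulated user stack: words[i] lives at address base + 8 * i. */
struct fake_stack {
	uint64_t base;
	const uint64_t *words;
	size_t nwords;
};

/* Bounds-checked read of one 8-byte word from the simulated stack. */
static int read_word(const struct fake_stack *s, uint64_t addr, uint64_t *val)
{
	size_t idx;

	if (addr < s->base || (addr - s->base) % 8 != 0)
		return -1;
	idx = (addr - s->base) / 8;
	if (idx >= s->nwords)
		return -1;
	*val = s->words[idx];
	return 0;
}

/* Binary search the table (sorted by pc_start) for the entry covering pc. */
static const struct unwind_entry *find_entry(const struct unwind_entry *tab,
					     size_t n, uint64_t pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pc < tab[mid].pc_start)
			hi = mid;
		else if (pc >= tab[mid].pc_end)
			lo = mid + 1;
		else
			return &tab[mid];
	}
	return NULL;
}

/* Walk the stack from (pc, sp); returns the number of PCs stored in out[]. */
static size_t unwind(const struct unwind_entry *tab, size_t n,
		     const struct fake_stack *s, uint64_t pc, uint64_t sp,
		     uint64_t *out, size_t max)
{
	size_t depth = 0;

	while (depth < max) {
		const struct unwind_entry *e;
		uint64_t cfa, ra;

		out[depth++] = pc;
		e = find_entry(tab, n, pc);
		if (!e)
			break;  /* no unwind info for this PC: stop here */
		cfa = sp + (int64_t)e->cfa_off;
		if (read_word(s, cfa + (int64_t)e->ra_off, &ra) != 0)
			break;  /* stack read failed: stop here */
		pc = ra;
		sp = cfa;
	}
	return depth;
}
```

The point of the sketch is only that each step is a table lookup plus
one stack read, with no DWARF interpreter and no large stack copy.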

The idea is the following:

1. On exec (binfmt_elf.c) the kernel could see that an sframe section
exists in the binary, flag the task struct, and record some
information about where it exists.

2. perf/ftrace/bpf in any context (NMI, interrupt, etc.) wants to take
a user space stack trace and requests one. This sets a flag in the
task struct to take the ptrace path before returning to user space. A
callback function would need to be registered to handle this as well.

3. On the ptrace path (where it's guaranteed to be back in normal
context and is allowed to fault) it would read the sframe section to
extract the stack trace (and possibly another section to retrieve a
symbol table if necessary). It would then call a list of callback
functions that were added by perf/ftrace/bpf with the stack trace and
allow them to record it.
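The three steps above can be modeled in plain user space C to show the
control flow. This is a toy model only: fake_task, request_user_stack,
return_to_user and the other names are hypothetical and do not
correspond to existing kernel interfaces, and stub_unwind stands in
for the real sframe walk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CALLBACKS 4
#define MAX_DEPTH 16

typedef void (*stack_cb_t)(const uint64_t *pcs, size_t n, void *data);
typedef size_t (*unwind_fn_t)(uint64_t *pcs, size_t max);

/* Stand-in for the per-task state the proposal would add. */
struct fake_task {
	bool has_sframe;      /* step 1: set at exec if the section exists */
	bool unwind_pending;  /* step 2: set by the requester */
	struct {
		stack_cb_t fn;
		void *data;
	} cbs[MAX_CALLBACKS];
	size_t ncbs;
};

/* perf/ftrace/bpf would register a callback ahead of time. */
static int register_callback(struct fake_task *t, stack_cb_t fn, void *data)
{
	if (t->ncbs >= MAX_CALLBACKS)
		return -1;
	t->cbs[t->ncbs].fn = fn;
	t->cbs[t->ncbs].data = data;
	t->ncbs++;
	return 0;
}

/* Step 2: usable from any context because it only sets a flag. */
static bool request_user_stack(struct fake_task *t)
{
	if (!t->has_sframe)
		return false;
	t->unwind_pending = true;
	return true;
}

/* Step 3: runs on the way back to user space, where faulting is allowed. */
static void return_to_user(struct fake_task *t, unwind_fn_t do_unwind)
{
	uint64_t pcs[MAX_DEPTH];
	size_t i, n;

	if (!t->unwind_pending)
		return;
	t->unwind_pending = false;
	n = do_unwind(pcs, MAX_DEPTH);
	for (i = 0; i < t->ncbs; i++)
		t->cbs[i].fn(pcs, n, t->cbs[i].data);
}

/* Example callback: adds the number of frames to a counter. */
static void count_frames_cb(const uint64_t *pcs, size_t n, void *data)
{
	(void)pcs;
	*(size_t *)data += n;
}

/* Example unwinder stub: pretends the stack has two frames. */
static size_t stub_unwind(uint64_t *pcs, size_t max)
{
	if (max < 2)
		return 0;
	pcs[0] = 0x401000;
	pcs[1] = 0x402000;
	return 2;
}
```

Note that several requests made before the task returns to user space
collapse into a single unwind in this model. Whether that is the
desired behavior is one of the things to discuss.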

We would probably need a system call of some kind to allow the dynamic
linker to notify the kernel of sframe sections that exist in libraries
that are loaded after the initial exec as well.
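No such system call exists yet, so it is not sketched here. What can
be shown is the lookup a loader (or the exec path) would have to do
first: find the section by name in an ELF64 image and get its offset
and size. The sketch below uses the definitions from <elf.h>; the
function name find_elf64_section is made up, and the ".sframe" name
in the usage note is the section name I understand binutils to emit.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Look up a section by name in an in-memory ELF64 image of native
 * byte order. Returns 0 and fills *off / *size on success, -1 otherwise.
 * Extended section numbering (SHN_XINDEX) is not handled in this sketch.
 */
static int find_elf64_section(const unsigned char *img, size_t len,
			      const char *name, uint64_t *off, uint64_t *size)
{
	Elf64_Ehdr eh;
	Elf64_Shdr str;
	const char *names;
	size_t want = strlen(name) + 1;  /* compare including the NUL */
	unsigned int i;

	if (len < sizeof(eh))
		return -1;
	memcpy(&eh, img, sizeof(eh));  /* memcpy avoids alignment issues */
	if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64)
		return -1;
	if (eh.e_shentsize != sizeof(Elf64_Shdr) ||
	    eh.e_shstrndx >= eh.e_shnum)
		return -1;
	if (eh.e_shoff > len ||
	    (uint64_t)eh.e_shnum * sizeof(Elf64_Shdr) > len - eh.e_shoff)
		return -1;

	/* The section header string table holds all section names. */
	memcpy(&str, img + eh.e_shoff + (size_t)eh.e_shstrndx * sizeof(str),
	       sizeof(str));
	if (str.sh_offset > len || str.sh_size > len - str.sh_offset)
		return -1;
	names = (const char *)img + str.sh_offset;

	for (i = 0; i < eh.e_shnum; i++) {
		Elf64_Shdr sh;

		memcpy(&sh, img + eh.e_shoff + (size_t)i * sizeof(sh),
		       sizeof(sh));
		if (sh.sh_name >= str.sh_size ||
		    want > str.sh_size - sh.sh_name)
			continue;  /* name would run past the table */
		if (memcmp(names + sh.sh_name, name, want) == 0) {
			*off = sh.sh_offset;
			*size = sh.sh_size;
			return 0;
		}
	}
	return -1;
}
```

A caller would use it as find_elf64_section(img, len, ".sframe",
&off, &size) and, under this proposal, hand the resulting location to
the kernel through whatever interface ends up being chosen.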

I'd like to discuss the above ideas about getting this implemented.

-- Steve


[1] https://lwn.net/Articles/919940/
[2] https://social.kernel.org/notice/ARHY3JdFu2WMYDb888
[3] https://www.phoronix.com/news/GNU-Binutils-SFrame
