public inbox for bpf@vger.kernel.org
From: Steven Rostedt <rostedt@goodmis.org>
To: lsf-pc@lists.linux-foundation.org
Cc: bpf@vger.kernel.org, Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Blake Jones <blakejones@google.com>,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	Indu Bhagat <indu.bhagat@oracle.com>
Subject: [LSF/MM/BPF TOPIC] BPF deferred stack trace unwinder
Date: Tue, 24 Feb 2026 08:56:43 -0500
Message-ID: <20260224085643.16d9b682@fedora>

Hopefully this isn't too late, but I got sidetracked and never submitted it.

We are currently working on enabling SFrames[1] to get user space stack
traces. This would allow us to retrieve reliable user space stack
traces from inside the kernel without the need for frame pointers. The
issue with SFrames is that they require reading user space memory. That
means reading the SFrame data cannot be performed from interrupt
context, as the read will likely cause a major page fault.

To handle this, a deferred user stack trace unwinder was added to the
kernel[2]. It works by doing the unwinding where it is safe to take
major page faults. The obvious place for that is just before going
back to user space (via task work). A tracer (like BPF) would call
unwind_deferred_init() to register itself with the unwinder and give it
a callback. Just before the task goes back to user space, that callback
is called with the user stack trace as well as a "cookie". BPF would
call unwind_deferred_request() at the time it wants the user space
stack trace (this could be in any context, like an interrupt or even an
NMI). That function returns a unique "cookie" that identifies the user
space stack trace its callback will later receive (the callback gets
the same cookie as a parameter). BPF could then record this cookie, and
user space could use it to tie the stack trace together with whatever
else was recorded at the time (like the kernel stack trace).
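
To make the request/callback flow concrete, here is a small user space
model of it. This is only a sketch, not kernel code: every name in it
(model_init(), model_request(), model_return_to_user(), struct
model_task, struct model_trace) is hypothetical and merely stands in
for the unwinder described above. It also assumes that repeated
requests made before the task returns to user space share one cookie.

```c
/* User space model of the deferred-unwind flow. NOT kernel code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the user stack trace given to the callback. */
struct model_trace {
	size_t nr;
	uint64_t entries[8];
};

typedef void (*model_callback_t)(const struct model_trace *trace,
				 uint64_t cookie);

/* Hypothetical per-task state for this model. */
struct model_task {
	uint64_t cookie;	/* cookie of the outstanding request */
	int pending;		/* non-zero if a callback is owed */
	model_callback_t cb;
};

static uint64_t next_cookie = 1;

/* Analogue of registering the tracer's callback with the unwinder. */
static void model_init(struct model_task *t, model_callback_t cb)
{
	t->cookie = 0;
	t->pending = 0;
	t->cb = cb;
}

/*
 * Analogue of a request made from interrupt or NMI context. No user
 * memory is touched here; only a cookie is handed out. Assumption:
 * requests made before the return to user space share one cookie.
 */
static uint64_t model_request(struct model_task *t)
{
	if (!t->pending) {
		t->cookie = next_cookie++;
		t->pending = 1;
	}
	return t->cookie;
}

/*
 * Analogue of the task-work step just before returning to user space,
 * where faulting is safe. The callback gets the trace and the cookie.
 */
static void model_return_to_user(struct model_task *t,
				 const struct model_trace *user_stack)
{
	if (!t->pending)
		return;
	t->pending = 0;
	t->cb(user_stack, t->cookie);
}

/* Example tracer callback: remember what was delivered. */
static uint64_t seen_cookie;
static size_t seen_nr;

static void demo_callback(const struct model_trace *trace, uint64_t cookie)
{
	seen_cookie = cookie;
	seen_nr = trace->nr;
}
```

The tracer records the value returned by model_request() next to
whatever it captured at that moment, and later matches it against the
cookie passed to its callback.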

Blake Jones brought up an issue that their tooling has with this
approach. Namely, it may be difficult to keep track of all the kernel
stack traces that need to be mapped to a user space stack trace. This
is a problem because the tooling saves the kernel and user stack traces
together in a special hash. This is done after a task is scheduled out
and back in, and it records the time the task was off the CPU with each
kernel/user stack trace pair.

Currently the recording is done when the task schedules back in. But
because of faulting, it is not safe to do SFrame unwinding at the
moment the task schedules in. The tool must wait until the task goes
back to user space, and then tie all the kernel stack traces from where
the task scheduled out to the user space stack trace in order to create
the hash entry and save it. As a system call may have a hundred
different places where it can schedule out, the tool can't cache all
the kernel stack traces waiting for it to schedule back in, as that
would require saving a hundred stack traces for every task. You can
read more about the issue[3].
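
As a rough illustration of the bookkeeping this forces on a tool, the
user space model below parks each off-CPU record under its cookie and
only folds it into the kernel/user hash once the user stack trace
arrives. It is a sketch under assumptions, not the actual tooling: all
names (park_record(), resolve_cookie(), lookup_off_ns(),
pending_count()) and the table sizes are hypothetical, stacks are
assumed to be reduced to integer IDs, and a linear table stands in for
a real hash.

```c
/* User space model of joining parked off-CPU records. NOT real tooling. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 128	/* hypothetical limits for this model */
#define MAX_BUCKETS 64

/* A record waiting for its user stack trace. */
struct pending {
	uint64_t cookie;
	uint32_t kstack_id;
	uint64_t off_ns;
	int used;
};

/* Off-CPU time accumulated per kernel/user stack pair. */
struct bucket {
	uint32_t kstack_id;
	uint32_t ustack_id;
	uint64_t total_off_ns;
	int used;
};

static struct pending pending_tbl[MAX_PENDING];
static struct bucket offcpu_tbl[MAX_BUCKETS];

/* At sched-in: no user stack yet, so park the record under the cookie. */
static int park_record(uint64_t cookie, uint32_t kstack_id, uint64_t off_ns)
{
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (!pending_tbl[i].used) {
			pending_tbl[i] = (struct pending){
				.cookie = cookie, .kstack_id = kstack_id,
				.off_ns = off_ns, .used = 1,
			};
			return 0;
		}
	}
	return -1;	/* table full: this is the cost being discussed */
}

/* Add off-CPU time to the matching pair; drops the sample if full. */
static void account(uint32_t k, uint32_t u, uint64_t off_ns)
{
	for (size_t i = 0; i < MAX_BUCKETS; i++) {
		struct bucket *b = &offcpu_tbl[i];

		if (!b->used) {
			*b = (struct bucket){ .kstack_id = k, .ustack_id = u,
					      .total_off_ns = off_ns, .used = 1 };
			return;
		}
		if (b->kstack_id == k && b->ustack_id == u) {
			b->total_off_ns += off_ns;
			return;
		}
	}
}

/* When the user stack arrives: join every record parked under cookie. */
static size_t resolve_cookie(uint64_t cookie, uint32_t ustack_id)
{
	size_t joined = 0;

	for (size_t i = 0; i < MAX_PENDING; i++) {
		struct pending *p = &pending_tbl[i];

		if (p->used && p->cookie == cookie) {
			account(p->kstack_id, ustack_id, p->off_ns);
			p->used = 0;
			joined++;
		}
	}
	return joined;
}

/* Number of records still waiting for a user stack trace. */
static size_t pending_count(void)
{
	size_t n = 0;

	for (size_t i = 0; i < MAX_PENDING; i++)
		n += pending_tbl[i].used ? 1 : 0;
	return n;
}

/* Total off-CPU time recorded for one kernel/user stack pair. */
static uint64_t lookup_off_ns(uint32_t k, uint32_t u)
{
	for (size_t i = 0; i < MAX_BUCKETS; i++) {
		const struct bucket *b = &offcpu_tbl[i];

		if (b->used && b->kstack_id == k && b->ustack_id == u)
			return b->total_off_ns;
	}
	return 0;
}
```

In this model the parked table grows with every schedule-out that
happens before the task returns to user space, which is the storage
cost described above.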

I would like to bring this topic up at LSF/MM/BPF.

-- Steve


[1] https://sourceware.org/binutils/docs/sframe-spec.html
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6439bfaabf25b736154ac5640c677da2c085db4
[3] https://lore.kernel.org/all/20260126142118.2ea3cf13@gandalf.local.home/
