From: Mark Rutland <mark.rutland@arm.com>
To: Dylan Hatch <dylanbhatch@google.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>,
Weinan Liu <wnliu@google.com>, Will Deacon <will@kernel.org>,
Josh Poimboeuf <jpoimboe@kernel.org>,
Indu Bhagat <ibhagatgnu@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Steven Rostedt <rostedt@goodmis.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Jiri Kosina <jikos@kernel.org>, Jens Remus <jremus@linux.ibm.com>,
Prasanna Kumar T S M <ptsm@linux.microsoft.com>,
Puranjay Mohan <puranjay@kernel.org>, Song Liu <song@kernel.org>,
joe.lawrence@redhat.com, linux-toolchains@vger.kernel.org,
linux-kernel@vger.kernel.org, live-patching@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
Randy Dunlap <rdunlap@infradead.org>
Subject: Re: [PATCH v5 8/8] unwind: arm64: Use sframe to unwind interrupt frames
Date: Tue, 12 May 2026 11:07:59 +0100 [thread overview]
Message-ID: <agL7_3RD8RrfiucX@J2N7QTR9R3> (raw)
In-Reply-To: <CADBMgpx9YxNUO6wLP7mYKxWW8L78Hk9gPwHrMjXUwPyUmGEu9w@mail.gmail.com>
On Mon, May 11, 2026 at 08:00:21PM -0700, Dylan Hatch wrote:
> Hi Mark,
>
> Thanks for all the feedback and help on this. I'm planning on getting
> your comments addressed in the coming days, but I have some initial
> clarifying questions.
>
> On Fri, May 1, 2026 at 9:46 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > Hi Dylan,
> >
> > Thanks for putting this together. I think this is looking pretty good.
> > However, there are some things that aren't quite right and need some
> > work, which I've commented on below.
> >
> > More generally, there are a few things that aren't addressed by this
> > series that we will also need to address. Importantly:
> >
> > (1) For correctness, we'll need to address a latent issue with unwinding
> > across an fgraph return trampoline, where the return address is
> > transiently unrecoverable.
> >
> > Before this series, that doesn't matter for livepatching because the
> > livepatching code isn't called synchronously within the fgraph
> > handler, and unwinds which cross an exception boundary are marked as
> > unreliable.
> >
> > After this series, that does matter as we can unwind across an
> > exception boundary, and might happen to interrupt that transient
> > window.
> >
> > I think we can solve that with some restructuring of that code,
> > restoring the original address *before* removing that from the
> > fgraph return stack, and ensuring that the unwinder can find it.
>
> If my understanding is correct, the issue arrises in return_to_handler
> as the return address is recovered:
>
> mov x0, sp
> bl ftrace_return_to_handler // addr = ftrace_return_to_hander(fregs);
> mov x30, x0 // restore the original return address
>
> Because ftrace_return_to_handler pops the return address from the
> return stack before it can be restored into the LR, it cannot be
> recovered.
Yes.
To be clear, please don't worry about solving that for the next version
of this series; let's get the SFrame bits into shape first. I just
wanted to highlight that there's some more general work that we'll need
to do.
I think we can *detect* this case (and mark the unwind as unreliable)
with some tiny changes to the arm64 code. I'm happy to put that
together.
> Based on this, I believe you are suggesting to restructure this code
> path such that the return address is removed from the return stack
> only after it has been restored to LR. But since kernel/trace/fgraph.c
> is core kernel code, will this end up requiring either (1) a similar
> restructuring of other arches supporting ftrace, or (2) an
> arm64-specific implementation of this recovery logic?
Yes, I am say that to *recover* the address we'd need to make changes to
core code.
In the mean time we can *detect* this case with some minimal changes to
arm64 code, and abort. As above, I'm happy to go put that together.
> It looks to me like there is essentially the same recovery pattern on
> other arches; is there a reason this transient unrecoverability isn't
> an issue for reliable unwind on other platforms?
Yep; on all architectures there's a transient period where the address
cannot be recovered. It's not a correctness issue so long as the
architecture detects this case and marks the unwind as unreliable.
IIUC x86 will mark the unwind as unreliable in this case.
I don't know whether other architectures detect this reliably. That's a
question for loonarch, parisc, powerpc, and s390 folk.
> > I'm not immediately sure whether kretprobes has a similar issue.
> >
> > (2) To make unwinding generally possible, we'll need to annotate some
> > assembly functions as unwindable. We'll need to do that for string
> > routines under lib/, and probably some crypto code, but we don't
> > need to do that for most code in head.S, entry.S, etc.
> >
> > The vast majority of relevant assembly functions are leaf functions
> > (where the return address is never moved out of the LR), so we can
> > probably get away with a simple annotation for those that avoids the
> > need for open-coded CFI directives everywhere.
>
> Are you suggesting something like a SYM_LEAF_FUNC_(START|END), that
> wraps CFI directives for leaf functions?
Yep; that's exactly the sort of thing I was thinking of.
That or have a seaprate annotation we can add, e.g.
SYM_FUNC_START(foo)
SYM_FUN_END(foo)
SYM_FUNC_IS_LEAF_AND_DOES_NOT_TOUCH_LR(foo)
> > I've pushed some reliable stacktrace tests to:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git stacktrace/tests
> >
> > That finds the fgraph issue (regardless of this series). When merged
> > with this series triggers a warning in kunwind_next_frame_record_meta(),
> > where unwind_next_frame_sframe() calls that erroneously as a fallback.
>
> Thanks for the pointer on these tests, they're super useful! I've been
> able to reproduce the fgraph failure you mentioned.
Great!
Mark.
prev parent reply other threads:[~2026-05-12 10:08 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260428183643.3796063-1-dylanbhatch@google.com>
[not found] ` <549d10b6-ba2b-4ae9-86ef-6157e13b6ee3@linux.ibm.com>
2026-05-12 1:10 ` [PATCH v5 0/8] unwind, arm64: add sframe unwinder for kernel Dylan Hatch
[not found] ` <20260428183643.3796063-9-dylanbhatch@google.com>
[not found] ` <afTYzAF_x41pyilu@J2N7QTR9R3>
[not found] ` <bc3fb59b-9d80-4957-af51-20db38e3487e@linux.ibm.com>
2026-05-05 10:29 ` [PATCH v5 8/8] unwind: arm64: Use sframe to unwind interrupt frames Mark Rutland
2026-05-05 15:52 ` Jens Remus
2026-05-12 3:00 ` Dylan Hatch
2026-05-12 8:55 ` Jens Remus
2026-05-12 10:18 ` Mark Rutland
2026-05-12 10:07 ` Mark Rutland [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=agL7_3RD8RrfiucX@J2N7QTR9R3 \
--to=mark.rutland@arm.com \
--cc=catalin.marinas@arm.com \
--cc=dylanbhatch@google.com \
--cc=ibhagatgnu@gmail.com \
--cc=jikos@kernel.org \
--cc=joe.lawrence@redhat.com \
--cc=jpoimboe@kernel.org \
--cc=jremus@linux.ibm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-toolchains@vger.kernel.org \
--cc=live-patching@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=ptsm@linux.microsoft.com \
--cc=puranjay@kernel.org \
--cc=rdunlap@infradead.org \
--cc=roman.gushchin@linux.dev \
--cc=rostedt@goodmis.org \
--cc=song@kernel.org \
--cc=will@kernel.org \
--cc=wnliu@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox