From: Nicolai Stange <nstange@suse.de>
To: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>,
linux-kernel@vger.kernel.org, Torsten Duwe <duwe@lst.de>,
live-patching@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: ppc64le reliable stack unwinder and scheduled tasks
Date: Fri, 11 Jan 2019 01:00:38 +0100 [thread overview]
Message-ID: <87bm4ocbbt.fsf@suse.de> (raw)
In-Reply-To: <7f468285-b149-37e2-e782-c9e538b997a9@redhat.com> (Joe Lawrence's message of "Thu, 10 Jan 2019 16:14:00 -0500")
Hi Joe,
Joe Lawrence <joe.lawrence@redhat.com> writes:
> tl;dr: On ppc64le, what is top-most stack frame for scheduled tasks
> about?
If I'm reading the code in _switch() correctly, the first frame is
completely uninitialized except for the pointer back to the caller's
stack frame.
For completeness: _switch() saves the return address, i.e. the link
register into its parent's stack frame, as is mandated by the ABI and
consistent with your findings below: it's always the second stack frame
where the return address into __switch_to() is kept.
<snip>
>
>
> Example 1 (RHEL-7)
> ==================
>
> crash> struct task_struct.thread c00000022fd015c0 | grep ksp
> ksp = 0xc0000000288af9c0
>
> crash> rd 0xc0000000288af9c0 -e 0xc0000000288b0000
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>
> sp[0]:
>
> c0000000288af9c0: c0000000288afb90 0000000000dd0000 ...(............
> c0000000288af9d0: c000000000002a94 c000000001c60a00 .*..............
>
> crash> sym c000000000002a94
> c000000000002a94 (T) hardware_interrupt_common+0x114
So that c000000000002a94 certainly wasn't stored by _switch(). I think
what might have happened is that the switching frame aliased with some
prior interrupt frame as setup by hardware_interrupt_common().
The interrupt and switching frames seem to share a common layout as far
as the lower STACK_FRAME_OVERHEAD + sizeof(struct pt_regs) bytes are
concerned.
That address into hardware_interrupt_common() could have been written by
the do_IRQ() called from there.
> c0000000288af9e0: c000000001c60a80 0000000000000000 ................
> c0000000288af9f0: c0000000288afbc0 0000000000dd0000 ...(............
> c0000000288afa00: c0000000014322e0 c000000001c60a00 ."C.............
> c0000000288afa10: c0000002303ae380 c0000002303ae380 ..:0......:0....
> c0000000288afa20: 7265677368657265 0000000000002200 erehsger."......
>
> Uh-oh...
>
> /* Mark stacktraces with exception frames as unreliable. */
> stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER
Aliasing of the switching stack frame with some prior interrupt stack
frame would explain why that STACK_FRAME_REGS_MARKER is still found on
the stack, i.e. it's a leftover.
For testing, could you try whether clearing the word at STACK_FRAME_MARKER
from _switch() helps?
I.e. something like (completely untested):
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 435927f549c4..b747d0647ec4 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -596,6 +596,10 @@ _GLOBAL(_switch)
SAVE_8GPRS(14, r1)
SAVE_10GPRS(22, r1)
std r0,_NIP(r1) /* Return to switch caller */
+
+ li r23,0
+ std r23,96(r1) /* 96 == STACK_FRAME_MARKER * sizeof(long) */
+
mfcr r23
std r23,_CCR(r1)
std r1,KSP(r3) /* Set old stack pointer */
<snap>
>
> save_stack_trace_tsk_reliable
> =============================
>
> arch/powerpc/kernel/stacktrace.c :: save_stack_trace_tsk_reliable() does
> take into account the first stackframe, but only to verify that the link
> register is indeed pointing at kernel code address.
It's actually the other way around:
if (!firstframe && !__kernel_text_address(ip))
return 1;
So the address gets sanitized only if it's _not_ coming from the first
frame.
Thanks,
Nicolai
--
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)
next prev parent reply other threads:[~2019-01-11 0:04 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-10 21:14 ppc64le reliable stack unwinder and scheduled tasks Joe Lawrence
2019-01-11 0:00 ` Nicolai Stange [this message]
2019-01-11 1:08 ` Joe Lawrence
2019-01-11 7:51 ` Nicolai Stange
2019-01-14 4:09 ` Joe Lawrence
2019-01-14 7:21 ` Nicolai Stange
2019-01-14 16:46 ` Joe Lawrence
2019-01-14 17:09 ` Josh Poimboeuf
2019-01-14 17:54 ` Joe Lawrence
2019-01-12 1:09 ` Balbir Singh
2019-01-12 8:45 ` Segher Boessenkool
2019-01-13 12:33 ` Balbir Singh
2019-01-13 13:05 ` Torsten Duwe
2019-01-17 14:52 ` Segher Boessenkool
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87bm4ocbbt.fsf@suse.de \
--to=nstange@suse.de \
--cc=duwe@lst.de \
--cc=jkosina@suse.cz \
--cc=joe.lawrence@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=live-patching@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).