linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 1/1] powerpc: update ppc_save_regs to save current r1 in pt_regs
@ 2023-06-15  9:10 Aditya Gupta
  2023-06-15 12:10 ` Nicholas Piggin
  2023-07-03  5:26 ` Michael Ellerman
  0 siblings, 2 replies; 4+ messages in thread
From: Aditya Gupta @ 2023-06-15  9:10 UTC (permalink / raw)
  To: linuxppc-dev, mpe
  Cc: Sourabh Jain, Naveen N. Rao, Nicholas Piggin, Mahesh J Salgaonkar,
	Hari Bathini

ppc_save_regs() skips one stack frame while saving the CPU register states.
Instead of saving current R1, it pulls the previous stack frame pointer.

When vmcores caused by direct panic call (such as `echo c >
/proc/sysrq-trigger`), are debugged with gdb, gdb fails to show the
backtrace correctly. On further analysis, it was found that it was because
of mismatch between r1 and NIP.

GDB uses NIP to get current function symbol and uses corresponding debug
info of that function to unwind previous frames, but due to the
mismatching r1 and NIP, the unwinding does not work, and it fails to
unwind to the 2nd frame and hence does not show the backtrace.

GDB backtrace with vmcore of kernel without this patch:

---------
(gdb) bt
 #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>,
    newregs=0xc000000004f8f8d8) at ./arch/powerpc/include/asm/kexec.h:69
 #1  __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
 #2  0x0000000000000063 in ?? ()
 #3  0xc000000003579320 in ?? ()
---------

Further analysis revealed that the mismatch occurred because
"ppc_save_regs" was saving the previous stack's SP instead of the current
r1. This patch fixes this by storing current r1 in the saved pt_regs.

GDB backtrace with vmcore of patched kernel:

--------
(gdb) bt
 #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=0x0, newregs=0xc00000000670b8d8)
    at ./arch/powerpc/include/asm/kexec.h:69
 #1  __crash_kexec (regs=regs@entry=0x0) at kernel/kexec_core.c:974
 #2  0xc000000000168918 in panic (fmt=fmt@entry=0xc000000001654a60 "sysrq triggered crash\n")
    at kernel/panic.c:358
 #3  0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
 #4  0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false)
    at drivers/tty/sysrq.c:602
 #5  0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>,
    count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
 #6  0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>,
    buf=<optimized out>, file=<optimized out>, pde=0xc00000000362cb40) at fs/proc/inode.c:340
 #7  proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>,
    ppos=<optimized out>) at fs/proc/inode.c:352
 #8  0xc0000000005b3bbc in vfs_write (file=file@entry=0xc000000006aa6b00,
    buf=buf@entry=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>,
    count=count@entry=2, pos=pos@entry=0xc00000000670bda0) at fs/read_write.c:582
 #9  0xc0000000005b4264 in ksys_write (fd=<optimized out>,
    buf=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=2)
    at fs/read_write.c:637
 #10 0xc00000000002ea2c in system_call_exception (regs=0xc00000000670be80, r0=<optimized out>)
    at arch/powerpc/kernel/syscall.c:171
 #11 0xc00000000000c270 in system_call_vectored_common ()
    at arch/powerpc/kernel/interrupt_64.S:192
--------

Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
---

More information:

This problem with gdb backtrace was discovered while working on a crash
tool enhancement to improve crash analysis using gdb passthrough to be
able print function arguments and local variables inside crash tool. gdb
passthrough simply asks gdb to handle the backtrace printing, where it
was noticed that it could not print correct backtrace in some vmcores.

The changes introduced here has an implication on xmon, that it might show
one extra `xmon` frame in backtrace. By looking at older commits it seems
that originally the ppc_save_regs function was introduced as
xmon_save_regs(). But now the same function has been renamed to
ppc_save_regs() and been used in few other places as well.

Tested this patch with multiple ways of crashing:
1. direct panic call (`echo c > /proc/sysrq-trigger`)
2. null dereference/oops path (the earlier implementation of `sysrq_handle_crash`)
3. sys reset
4. sys reset inside qemu (to test for any regressions, that were fixed by
   commit d16a58f8854b19)

Changelog
- V2:
  - fixed bogus LR by storing caller's LR area as pointed out by Naveen and
  Nick

---
 arch/powerpc/kernel/ppc_save_regs.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/ppc_save_regs.S b/arch/powerpc/kernel/ppc_save_regs.S
index 49813f982468..a9b9c32d0c1f 100644
--- a/arch/powerpc/kernel/ppc_save_regs.S
+++ b/arch/powerpc/kernel/ppc_save_regs.S
@@ -31,10 +31,10 @@ _GLOBAL(ppc_save_regs)
 	lbz	r0,PACAIRQSOFTMASK(r13)
 	PPC_STL	r0,SOFTE(r3)
 #endif
-	/* go up one stack frame for SP */
-	PPC_LL	r4,0(r1)
-	PPC_STL	r4,GPR1(r3)
+	/* store current SP */
+	PPC_STL	r1,GPR1(r3)
 	/* get caller's LR */
+	PPC_LL	r4,0(r1)
 	PPC_LL	r0,LRSAVE(r4)
 	PPC_STL	r0,_LINK(r3)
 	mflr	r0
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH v2 1/1] powerpc: update ppc_save_regs to save current r1 in pt_regs
  2023-06-15  9:10 [RFC PATCH v2 1/1] powerpc: update ppc_save_regs to save current r1 in pt_regs Aditya Gupta
@ 2023-06-15 12:10 ` Nicholas Piggin
  2023-06-19  3:48   ` Aditya Gupta
  2023-07-03  5:26 ` Michael Ellerman
  1 sibling, 1 reply; 4+ messages in thread
From: Nicholas Piggin @ 2023-06-15 12:10 UTC (permalink / raw)
  To: Aditya Gupta, linuxppc-dev, mpe
  Cc: Sourabh Jain, Naveen N. Rao, Mahesh J Salgaonkar, Hari Bathini

On Thu Jun 15, 2023 at 7:10 PM AEST, Aditya Gupta wrote:
> ppc_save_regs() skips one stack frame while saving the CPU register states.
> Instead of saving current R1, it pulls the previous stack frame pointer.
>
> When vmcores caused by direct panic call (such as `echo c >
> /proc/sysrq-trigger`), are debugged with gdb, gdb fails to show the
> backtrace correctly. On further analysis, it was found that it was because
> of mismatch between r1 and NIP.
>
> GDB uses NIP to get current function symbol and uses corresponding debug
> info of that function to unwind previous frames, but due to the
> mismatching r1 and NIP, the unwinding does not work, and it fails to
> unwind to the 2nd frame and hence does not show the backtrace.
>
> GDB backtrace with vmcore of kernel without this patch:
>
> ---------
> (gdb) bt
>  #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>,
>     newregs=0xc000000004f8f8d8) at ./arch/powerpc/include/asm/kexec.h:69
>  #1  __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
>  #2  0x0000000000000063 in ?? ()
>  #3  0xc000000003579320 in ?? ()
> ---------
>
> Further analysis revealed that the mismatch occurred because
> "ppc_save_regs" was saving the previous stack's SP instead of the current
> r1. This patch fixes this by storing current r1 in the saved pt_regs.
>
> GDB backtrace with vmcore of patched kernel:
>
> --------
> (gdb) bt
>  #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=0x0, newregs=0xc00000000670b8d8)
>     at ./arch/powerpc/include/asm/kexec.h:69
>  #1  __crash_kexec (regs=regs@entry=0x0) at kernel/kexec_core.c:974
>  #2  0xc000000000168918 in panic (fmt=fmt@entry=0xc000000001654a60 "sysrq triggered crash\n")
>     at kernel/panic.c:358
>  #3  0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
>  #4  0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false)
>     at drivers/tty/sysrq.c:602
>  #5  0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>,
>     count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
>  #6  0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>,
>     buf=<optimized out>, file=<optimized out>, pde=0xc00000000362cb40) at fs/proc/inode.c:340
>  #7  proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>,
>     ppos=<optimized out>) at fs/proc/inode.c:352
>  #8  0xc0000000005b3bbc in vfs_write (file=file@entry=0xc000000006aa6b00,
>     buf=buf@entry=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>,
>     count=count@entry=2, pos=pos@entry=0xc00000000670bda0) at fs/read_write.c:582
>  #9  0xc0000000005b4264 in ksys_write (fd=<optimized out>,
>     buf=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=2)
>     at fs/read_write.c:637
>  #10 0xc00000000002ea2c in system_call_exception (regs=0xc00000000670be80, r0=<optimized out>)
>     at arch/powerpc/kernel/syscall.c:171
>  #11 0xc00000000000c270 in system_call_vectored_common ()
>     at arch/powerpc/kernel/interrupt_64.S:192
> --------
>
> Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>

So this now saves regs as though it was an interrupt taken in the
caller, at the instruction after the call to ppc_save_regs, whereas
previously the NIP was there, but R1 came from the caller's caller
and that mismatch is what causes gdb's dwarf unwinder to go haywire.

Nice catch, and I think I follow the fix and I think I agree with it.
Before the bug was introduced, NIP also came from the grandparent.
Which is what xmon presumably wanted, but since then I think maybe it
makes more sense to just have the parent caller.

Reivewed-by: Nicholas Piggin <npiggin@gmail.com>
Fixes: d16a58f8854b1 ("powerpc: Improve ppc_save_regs()")

Thanks,
Nick

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH v2 1/1] powerpc: update ppc_save_regs to save current r1 in pt_regs
  2023-06-15 12:10 ` Nicholas Piggin
@ 2023-06-19  3:48   ` Aditya Gupta
  0 siblings, 0 replies; 4+ messages in thread
From: Aditya Gupta @ 2023-06-19  3:48 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev, mpe
  Cc: Hari Bathini, Naveen N. Rao, Mahesh J Salgaonkar, Sourabh Jain

On 15/06/23 17:40, Nicholas Piggin wrote:
> On Thu Jun 15, 2023 at 7:10 PM AEST, Aditya Gupta wrote:
>> ppc_save_regs() skips one stack frame while saving the CPU register states.
>> Instead of saving current R1, it pulls the previous stack frame pointer.
>>
>> ...
>>
>> So this now saves regs as though it was an interrupt taken in the
>> caller, at the instruction after the call to ppc_save_regs, whereas
>> previously the NIP was there, but R1 came from the caller's caller
>> and that mismatch is what causes gdb's dwarf unwinder to go haywire.
>>
>> Nice catch, and I think I follow the fix and I think I agree with it.
>> Before the bug was introduced, NIP also came from the grandparent.
>> Which is what xmon presumably wanted, but since then I think maybe it
>> makes more sense to just have the parent caller.
>>
>> Reivewed-by: Nicholas Piggin <npiggin@gmail.com>
>> Fixes: d16a58f8854b1 ("powerpc: Improve ppc_save_regs()")

Thanks for reviewing this, and providing a Fixes tag too.

Thanks
- Aditya


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH v2 1/1] powerpc: update ppc_save_regs to save current r1 in pt_regs
  2023-06-15  9:10 [RFC PATCH v2 1/1] powerpc: update ppc_save_regs to save current r1 in pt_regs Aditya Gupta
  2023-06-15 12:10 ` Nicholas Piggin
@ 2023-07-03  5:26 ` Michael Ellerman
  1 sibling, 0 replies; 4+ messages in thread
From: Michael Ellerman @ 2023-07-03  5:26 UTC (permalink / raw)
  To: linuxppc-dev, Aditya Gupta
  Cc: Hari Bathini, Naveen N. Rao, Nicholas Piggin, Sourabh Jain,
	Mahesh J Salgaonkar

On Thu, 15 Jun 2023 14:40:47 +0530, Aditya Gupta wrote:
> ppc_save_regs() skips one stack frame while saving the CPU register states.
> Instead of saving current R1, it pulls the previous stack frame pointer.
> 
> When vmcores caused by direct panic call (such as `echo c >
> /proc/sysrq-trigger`), are debugged with gdb, gdb fails to show the
> backtrace correctly. On further analysis, it was found that it was because
> of mismatch between r1 and NIP.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: update ppc_save_regs to save current r1 in pt_regs
      https://git.kernel.org/powerpc/c/b684c09f09e7a6af3794d4233ef785819e72db79

cheers

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-07-03  5:34 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-15  9:10 [RFC PATCH v2 1/1] powerpc: update ppc_save_regs to save current r1 in pt_regs Aditya Gupta
2023-06-15 12:10 ` Nicholas Piggin
2023-06-19  3:48   ` Aditya Gupta
2023-07-03  5:26 ` Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).