linux-arm-kernel.lists.infradead.org archive mirror
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>,
	Linus Walleij <linus.walleij@linaro.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: boot flooded with unwind: Index not found
Date: Wed, 9 Mar 2022 00:01:26 +0000
Message-ID: <YifuVmkcb1ie7bzk@shell.armlinux.org.uk>
In-Reply-To: <Yh9TdbWwHX/5Bhmt@shell.armlinux.org.uk>

On Wed, Mar 02, 2022 at 11:22:29AM +0000, Russell King (Oracle) wrote:
> On Wed, Mar 02, 2022 at 12:19:40PM +0100, Ard Biesheuvel wrote:
> > On Wed, 2 Mar 2022 at 12:12, Russell King (Oracle)
> > <linux@armlinux.org.uk> wrote:
> > >
> > > On Wed, Mar 02, 2022 at 11:09:49AM +0100, Corentin Labbe wrote:
> > > > The crash disappeared (but the suspicious RCU usage is still here).
> > >
> > > As the trace on those is:
> > >
> > > [    0.239629]  unwind_backtrace from show_stack+0x10/0x14
> > > [    0.239654]  show_stack from init_stack+0x1c54/0x2000
> > >
> > > unwind_backtrace() and show_stack() are both C code, the compiler will
> > > emit the unwind information for it. show_stack() isn't called from
> > > assembly code, only from C code, so the next function's unwind
> > > information should also be generated by the compiler.
> > >
> > > However, init_stack is not a function - it's an array of unsigned long.
> > > There is no way this should appear in the trace, and this suggests that
> > > the unwind of show_stack() has gone wrong.
> > >
> > > I don't see anything obvious in Ard's changes that would cause that
> > > though.
> > >
> > > Did it used to work fine with previous versions of linux-next - those
> > > versions where we had Ard's "arm-vmap-stacks-v6" tag merged in
> > > (commit 2fa394824493) and did this only appear when I merged
> > > "arm-ftrace-for-rmk" (commit 74aaaa1e9bba) ? Did merging
> > > "arm-ftrace-for-rmk" cause any change in your .config?
> > >
> > 
> > I can reproduce the RCU warnings, and I have tracked this down to the
> > change I made to return_address() for the graph tracer, which I
> > thought was justified after removing the call to
> > kernel_text_address():
> > 
> > --- a/arch/arm/include/asm/ftrace.h
> > +++ b/arch/arm/include/asm/ftrace.h
> > @@ -35,26 +35,8 @@ static inline unsigned long
> > ftrace_call_adjust(unsigned long addr)
> > 
> >  #ifndef __ASSEMBLY__
> > 
> > -#if defined(CONFIG_FRAME_POINTER) && !defined(CONFIG_ARM_UNWIND)
> > -/*
> > - * return_address uses walk_stackframe to do it's work.  If both
> > - * CONFIG_FRAME_POINTER=y and CONFIG_ARM_UNWIND=y walk_stackframe uses unwind
> > - * information.  For this to work in the function tracer many functions would
> > - * have to be marked with __notrace.  So for now just depend on
> > - * !CONFIG_ARM_UNWIND.
> > - */
> > -
> >  void *return_address(unsigned int);
> > 
> > -#else
> > -
> > -static inline void *return_address(unsigned int level)
> > -{
> > -       return NULL;
> > -}
> > -
> > -#endif
> > -
> >  #define ftrace_return_address(n) return_address(n)
> > 
> >  #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
> > 
> > However, the function graph tracer works happily with this bit
> > reverted, and so that is probably the best course of action here.
> > 
> > I have already sent the patch that reintroduces the
> > kernel_text_address() check - would you prefer a v2 of that one with
> > this change incorporated? Or a second patch that just reverts the
> > above? (Given that the bogus dereference was invoked from
> > return_address() as well, I suspect that this change would make the
> > get_kernel_nofault() change I proposed in this thread redundant)
> 
> I'd prefer patches on top of my devel-stable branch, thanks.

To reiterate what I've just put on IRC - we have not got to the bottom
of this problem yet - it still very much exists.

There seems to be something of a fundamental issue with the unwinder:
it now appears to be going wrong, failing to unwind beyond a couple of
functions, and the address it comes out with appears to be incorrect.
I've only just discovered this because I created my very own bug, and
yet again, the timing sucks with the proximity of the merge window.

I'm getting:

[   13.198803] [<c0017728>] (unwind_backtrace) from [<c0012828>] (show_stack+0x10/0x14)
[   13.198820] [<c0012828>] (show_stack) from [<c2be78d4>] (0xc2be78d4)

for the WARN_ON() stacktrace, and the address that apparently called
show_stack() is most definitely rubbish and incorrect. This makes any
WARN_ON() condition undebuggable.

This is with both 9183/1 and 9184/1 applied on top of pulling your
"arm-ftrace-for-rmk" tag and also with just the "arm-vmap-stacks-v6"
tag. This seems to point at one of these patches breaking the
unwinder:

a1c510d0adc6 ARM: implement support for vmap'ed stacks
532319b9c418 ARM: unwind: disregard unwind info before stack frame is set up
4ab6827081c6 ARM: unwind: dump exception stack from calling frame
b6506981f880 ARM: unwind: support unwinding across multiple stacks

Given that the unwinder is broken, I wonder whether 9183/1 and 9184/1
are actually required.

I did try to point this problem out a few emails back:

"As the trace on those is:

[    0.239629]  unwind_backtrace from show_stack+0x10/0x14
[    0.239654]  show_stack from init_stack+0x1c54/0x2000

unwind_backtrace() and show_stack() are both C code, the compiler will
emit the unwind information for it. show_stack() isn't called from
assembly code, only from C code, so the next function's unwind
information should also be generated by the compiler.

However, init_stack is not a function - it's an array of unsigned long.
There is no way this should appear in the trace, and this suggests that
the unwind of show_stack() has gone wrong."

In Corentin's case, there is no way init_stack should ever appear in
the stack trace. In my case, it's not init_stack, but 0xc2be78d4.
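
A rough way to flag the second symptom when scanning a boot log, as a
userspace sketch and not anything taken from the kernel tree: when the
caller is printed as a bare address such as 0xc2be78d4 rather than as
symbol+offset/size, the symbol lookup for that address failed. The helper
name caller_unresolved() is made up for illustration. It only catches
unresolved callers; a bogus caller that happens to resolve to a data
symbol, as with init_stack, would need symbol type information (for
example from System.map) to detect.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if the caller in an ARM backtrace
 * line was printed as a bare address, i.e. no symbol was found for it.
 * Handles both line formats quoted in this thread:
 *   "[<c0012828>] (show_stack) from [<c2be78d4>] (0xc2be78d4)"
 *   "show_stack from init_stack+0x1c54/0x2000"
 */
bool caller_unresolved(const char *line)
{
	const char *p = strstr(line, " from ");

	if (!p)
		return false;
	p += strlen(" from ");

	/* Older format: skip the "[<address>] (" prefix before the symbol. */
	if (p[0] == '[' && p[1] == '<') {
		p = strstr(p, "] (");
		if (!p)
			return false;
		p += strlen("] (");
	}

	/* A resolved caller starts with a symbol name, not with "0x". */
	return strncmp(p, "0x", 2) == 0;
}
```

Fed the two lines from my trace, this flags only the show_stack frame;
the init_stack line from Corentin's log passes, which is exactly why
that case needs a symbol type check as well.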

Can you try testing out a dummy WARN_ON(1) test in your kernel please?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


