Re: [PATCH] kgdb: Flush console before entering kgdb on panic

public inbox for linux-kernel@vger.kernel.org

From: Daniel Thompson <daniel.thompson@linaro.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Doug Anderson <dianders@chromium.org>,
	Jason Wessel <jason.wessel@windriver.com>,
	kgdb-bugreport@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] kgdb: Flush console before entering kgdb on panic
Date: Mon, 16 Oct 2023 15:00:18 +0100
Message-ID: <20231016140018.GA35190@aspen.lan>
In-Reply-To: <ZQrfGho-M9ZHQe9F@alley>

On Wed, Sep 20, 2023 at 02:01:30PM +0200, Petr Mladek wrote:
> On Wed 2023-08-30 10:53:59, Daniel Thompson wrote:
> > On Fri, Aug 25, 2023 at 07:18:44AM -0700, Doug Anderson wrote:
> > > Hi,
> > >
> > > On Fri, Aug 25, 2023 at 3:09 AM Daniel Thompson
> > > <daniel.thompson@linaro.org> wrote:
> > > >
> > > > On Tue, Aug 22, 2023 at 01:19:46PM -0700, Douglas Anderson wrote:
> > > > > > When entering kdb/kgdb on a kernel panic, it was observed that the
> > > > > console isn't flushed before the `kdb` prompt came up. Specifically,
> > > > > when using the buddy lockup detector on arm64 and running:
> > > > >   echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
> > > > >
> > > > > I could see:
> > > > >   [   26.161099] lkdtm: Performing direct entry HARDLOCKUP
> > > > >   [   32.499881] watchdog: Watchdog detected hard LOCKUP on cpu 6
> > > > >   [   32.552865] Sending NMI from CPU 5 to CPUs 6:
> > > > >   [   32.557359] NMI backtrace for cpu 6
> > > > >   ... [backtrace for cpu 6] ...
> > > > >   [   32.558353] NMI backtrace for cpu 5
> > > > >   ... [backtrace for cpu 5] ...
> > > > >   [   32.867471] Sending NMI from CPU 5 to CPUs 0-4,7:
> > > > >   [   32.872321] NMI backtrace forP cpuANC: Hard LOCKUP
> > > > >
> > > > >   Entering kdb (current=..., pid 0) on processor 5 due to Keyboard Entry
> > > > >   [5]kdb>
> > > > >
> > > > > As you can see, backtraces for the other CPUs start printing and get
> > > > > interleaved with the kdb PANIC print.
> > > > >
> > > > > Let's replicate the commands to flush the console in the kdb panic
> > > > > entry point to avoid this.
> > > > >
> > > > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > > > ---
> > > > >
> > > > >  kernel/debug/debug_core.c | 3 +++
> > > > >  1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > > > > index d5e9ccde3ab8..3a904d8697c8 100644
> > > > > --- a/kernel/debug/debug_core.c
> > > > > +++ b/kernel/debug/debug_core.c
> > > > > @@ -1006,6 +1006,9 @@ void kgdb_panic(const char *msg)
> > > > >       if (panic_timeout)
> > > > >               return;
> > > > >
> > > > > +     debug_locks_off();
> > > > > +     console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> > > > > +
> > > > >       if (dbg_kdb_mode)
> > > > >               kdb_printf("PANIC: %s\n", msg);
> > > >
> > > > I'm somewhat inclined to say *this* (calling kdb_printf() when not
> > > > actually in the debugger) is the cause of the problem. kdb_printf()
> > > > does some pretty horrid things to the console and isn't intended to
> > > > run while the system is active.
> > > >
> > > > I'd therefore be more tempted to defer the print to the b.p. trap
> > > > handler itself and make this part of kgdb_panic() look more like:
> > > >
> > > >         kgdb_panic_msg = msg;
> > > >         kgdb_breakpoint();
> > > >         kgdb_panic_msg = NULL;
> > >
> > > Unfortunately I think that only solves half the problem. As a quick
> > > test, I tried simply commenting out the "kdb_printf" line in
> > > kgdb_panic(). While that avoids the interleaved panic message and
> > > backtrace, it does nothing to actually get the backtraces printed out
> > > before you end up in kdb. As an example, this is what happened when I
> > > used `echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT` and
> > > had the "kdb_printf" in kgdb_panic() commented out:
> > >
> > > [   72.658424] lkdtm: Performing direct entry HARDLOCKUP
> > > [   82.181857] watchdog: Watchdog detected hard LOCKUP on cpu 6
> > > ...
> > > [   82.234801] Sending NMI from CPU 5 to CPUs 6:
> > > [   82.239296] NMI backtrace for cpu 6
> > > ... [ stack trace for CPU 6 ] ...
> > > [   82.240294] NMI backtrace for cpu 5
> > > ... [ stack trace for CPU 5 ] ...
> > > [   82.576443] Sending NMI from CPU 5 to CPUs 0-4,7:
> > > [   82.581291] NMI backtrace
> > > Entering kdb (current=0xffffff80da5a1080, pid 6978) on processor 5 due
> > > to Keyboard Entry
> > > [5]kdb>
> > >
> > > As you can see, I don't see the traces for CPUs 0-4 and 7. Those do
> > > show up if I use the "dmesg" command but it's a bit of a hassle to run
> > > "dmesg" to look for any extra debug messages every time I drop in kdb.
> > >
> > > I guess perhaps that part isn't obvious from the commit message?
> >
> > I figured it was a risk.
> >
> > In fact it's an area where my instinct to honour console messages and my
> > instinct to get into the kernel as soon as possible after the decision
> > to invoke it has been made come into conflict.
> >
> > In other words does it matter that the console buffers are not flushed
> > before entering kgdb? However having thought about it for a little while
> > (and knowing the console code tends to be written to be decently robust)
> > I have come to the view that flushing is best.
>
> Just for the record: I wondered if there was a bug in printk() that
> kept the backtraces from being flushed. I think it is because they
> are printed in NMI context, where the console handling is deferred.
>
> From this POV, the explicit flush makes perfect sense.

Agreed... and (with apologies for the length of time to get this
done):
Applied!


Thanks

Daniel.
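
For anyone reading this thread later, here is a small userspace sketch of
the ordering problem discussed above. It is a toy model and not kernel
code: every name in it (model_console, model_printk_nmi,
model_console_flush, model_kgdb_panic) is hypothetical and only stands in
for the behaviour described in the thread, namely that messages printed
from NMI context are stored but their console output is deferred, while
the debugger writes to the console directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Userspace model only: none of these names exist in the kernel. */
#define MODEL_BUF_SIZE 256

struct model_console {
	char pending[MODEL_BUF_SIZE];	/* logged, but not yet on the console */
	char output[MODEL_BUF_SIZE];	/* what the user actually sees */
};

/* Append src to a fixed-size, NUL-terminated buffer, truncating if needed. */
static void model_append(char *dst, const char *src)
{
	size_t used = strlen(dst);

	assert(used < MODEL_BUF_SIZE);
	snprintf(dst + used, MODEL_BUF_SIZE - used, "%s", src);
}

/* Models printing from NMI context: the record is stored, output is deferred. */
static void model_printk_nmi(struct model_console *con, const char *msg)
{
	model_append(con->pending, msg);
}

/* Models an explicit flush: push everything pending out to the console. */
static void model_console_flush(struct model_console *con)
{
	model_append(con->output, con->pending);
	con->pending[0] = '\0';
}

/* Models debugger entry on panic; the debugger writes to the console directly. */
static void model_kgdb_panic(struct model_console *con, int flush_first)
{
	if (flush_first)
		model_console_flush(con);
	model_append(con->output, "PANIC\nkdb> ");
}
```

Starting from a zero-initialised struct model_console, calling
model_printk_nmi() and then model_kgdb_panic() with flush_first == 0
leaves the backtrace text stuck in pending while the prompt is already in
output, which corresponds to the "only visible via dmesg" case Doug
described. With flush_first != 0 the backtrace reaches output ahead of
the PANIC line.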

Thread overview: 6+ messages
2023-08-22 20:19 [PATCH] kgdb: Flush console before entering kgdb on panic Douglas Anderson
2023-08-25 10:08 ` Daniel Thompson
2023-08-25 14:18   ` Doug Anderson
2023-08-30  9:53     ` Daniel Thompson
2023-09-20 12:01       ` Petr Mladek
2023-10-16 14:00         ` Daniel Thompson [this message]
