From: Baoquan He <bhe@redhat.com>
To: kexec@lists.infradead.org
Subject: [PATCH V3] panic: Move panic_print before kmsg dumpers
Date: Sat, 29 Jan 2022 16:00:27 +0800 [thread overview]
Message-ID: <20220129080027.GC17613@MiWiFi-R3L-srv> (raw)
In-Reply-To: <YfE1zuhB2Qz73wqF@alley>
On 01/26/22 at 12:51pm, Petr Mladek wrote:
> On Mon 2022-01-24 16:57:17, Michael Kelley (LINUX) wrote:
> > From: Baoquan He <bhe@redhat.com> Sent: Friday, January 21, 2022 8:34 PM
> > >
> > > On 01/21/22 at 03:00pm, Michael Kelley (LINUX) wrote:
> > > > From: Baoquan He <bhe@redhat.com> Sent: Thursday, January 20, 2022 6:31 PM
> > > > >
> > > > > On 01/20/22 at 06:36pm, Guilherme G. Piccoli wrote:
> > > > > > Hi Baoquan, some comments inline below:
> > > > > >
> > > > > > On 20/01/2022 05:51, Baoquan He wrote:
> >
> > [snip]
> >
> > > > > > Do you think it should be necessary?
> > > > > > How about if we allow users to just "panic_print" with or without the
> > > > > > "crash_kexec_post_notifiers", then we pursue Petr suggestion of
> > > > > > refactoring the panic notifiers? So, after this future refactor, we
> > > > > > might have a much clear code.
> > > > >
> > > > > I haven't read Petr's reply in another panic notifier filter thread. For
> > > > > panic notifier, it's only enforced to use on HyperV platform, excepto of
> > > > > that, users need to explicitly add "crash_kexec_post_notifiers=1" to enable
> > > > > it. And we got bug report on the HyperV issue. In our internal discussion,
> > > > > we strongly suggest HyperV dev to change the default enablement, instead
> > > > > leave it to user to decide.
> > > > >
> > > >
> > > > Regarding Hyper-V: Invoking the Hyper-V notifier prior to running the
> > > > kdump kernel is necessary for correctness. During initial boot of the
> > > > main kernel, the Hyper-V and VMbus code in Linux sets up several guest
> > > > physical memory pages that are shared with Hyper-V, and that Hyper-V
> > > > may write to. A VMbus connection is also established. Before kexec'ing
> > > > into the kdump kernel, the sharing of these pages must be rescinded
> > > > and the VMbus connection must be terminated. If this isn't done, the
> > > > kdump kernel will see strange memory overwrites if these shared guest
> > > > physical memory pages get used for something else.
> >
> > In the Azure cloud, collecting data before crash dumps is a motivation
> > as well for setting crash_kexec_post_notifiers to true. That way as
> > cloud operator we can see broad failure trends, and in specific cases
> > customers often expect the cloud operator to be able to provide info
> > about a problem even if they have taken a kdump. Where did you
> > envision adding a comment in the code to help clarify these intentions?
> >
> > I looked at the code again, and should revise my previous comments
> > somewhat. The Hyper-V resets that I described indeed must be done
> > prior to kexec'ing the kdump kernel. Most such resets are actually
> > done via __crash_kexec() -> machine_crash_shutdown(), not via the
> > panic notifier. However, the Hyper-V panic notifier must terminate the
> > VMbus connection, because that must be done even if kdump is not
> > being invoked. See commit 74347a99e73.
> >
> > Most of the hangs seen in getting into the kdump kernel on Hyper-V/Azure
> > were probably due to the machine_crash_shutdown() path, and not due
> > to running the panic notifiers prior to kexec'ing the kdump kernel. The
> > exception is terminating the VMbus connection, which had problems that
> > are hopefully now fixed because of adding a timeout.
>
> My undestanding is that we could split the actions into three groups:
>
> 1. Actions that has to be before kexec'ing kdump kernel, like
> resetting physicall memory shared with Hyper-V.
>
> These operation(s) are needed only for kexec and can be done
> in kexec.
>
>
> 2. Notify Hyper-V so that, for example, Azure cloud, could collect
> data before crash dump.
>
> It is nice to have.
>
> It should be configurable if it is not completely safe. I mean
> that there should be a way to disable it when it might increase
> the risk that kexec'ing kdump kernel might fail.
>
>
> 3. Some actions are needed only when panic() ends up in the
> infinite loop.
>
> For example, unloading vmbus channel. At least the commit
> 74347a99e73ae00b8385f ("x86/Hyper-V: Unload vmbus channel in
> hv panic callback") says that it is done in kdump path
> out of box.
>
> All these operations are needed and used only when the kernel is
> running under Hyper-V.
>
> My mine intention is to understand if we need 2 or 3 notifier lists
> or the current one is enough.
>
> The 3 notifier lists would be:
>
> + always do (even before kdump)
> + optionally do before or after kdump
> + needed only when kdump is not called
Totally agree with above suggesitons for Hyper-V. Cleanup as ablove
seems necesary. Stuffing them into panic_notifiers package is not
appropriate.
next prev parent reply other threads:[~2022-01-29 8:00 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-14 18:30 [PATCH V3] panic: Move panic_print before kmsg dumpers Guilherme G. Piccoli
2022-01-17 3:33 ` Baoquan He
2022-01-17 6:13 ` Baoquan He
2022-01-17 12:58 ` Guilherme G. Piccoli
2022-01-19 7:13 ` Baoquan He
2022-01-19 12:57 ` Guilherme G. Piccoli
2022-01-19 15:48 ` Petr Mladek
2022-01-19 16:03 ` Guilherme G. Piccoli
2022-01-20 9:39 ` Petr Mladek
2022-01-20 15:51 ` Guilherme G. Piccoli
2022-01-20 8:51 ` Baoquan He
2022-01-20 21:36 ` Guilherme G. Piccoli
2022-01-21 2:31 ` Baoquan He
2022-01-21 13:17 ` Guilherme G. Piccoli
2022-01-22 10:31 ` Baoquan He
2022-01-22 13:49 ` Guilherme G. Piccoli
2022-01-26 3:29 ` Baoquan He
2022-01-21 15:00 ` Michael Kelley
2022-01-22 4:33 ` Baoquan He
2022-01-24 16:57 ` Michael Kelley
2022-01-26 11:51 ` Petr Mladek
2022-01-29 8:00 ` Baoquan He [this message]
2022-02-02 17:43 ` Michael Kelley
2022-02-07 8:33 ` Baoquan He
2022-01-28 9:03 ` Baoquan He
2022-01-28 18:24 ` Michael Kelley
2022-01-29 7:42 ` Baoquan He
2022-01-19 18:38 ` Petr Mladek
2022-01-19 19:51 ` Guilherme G. Piccoli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220129080027.GC17613@MiWiFi-R3L-srv \
--to=bhe@redhat.com \
--cc=kexec@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).