public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Gowans, James" <jgowans@amazon.com>
To: "pbonzini@redhat.com" <pbonzini@redhat.com>, "Graf (AWS),
	Alexander" <graf@amazon.de>,
	"seanjc@google.com" <seanjc@google.com>,
	"Schönherr, Jan H." <jschoenh@amazon.de>,
	"ebiederm@xmission.com" <ebiederm@xmission.com>
Cc: "yuzenghui@huawei.com" <yuzenghui@huawei.com>,
	"kvm-riscv@lists.infradead.org" <kvm-riscv@lists.infradead.org>,
	"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"oliver.upton@linux.dev" <oliver.upton@linux.dev>,
	"suzuki.poulose@arm.com" <suzuki.poulose@arm.com>,
	"chenhuacai@kernel.org" <chenhuacai@kernel.org>,
	"atishp@atishpatra.org" <atishp@atishpatra.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"maz@kernel.org" <maz@kernel.org>,
	"kvmarm@lists.linux.dev" <kvmarm@lists.linux.dev>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"aleksandar.qemu.devel@gmail.com"
	<aleksandar.qemu.devel@gmail.com>,
	"anup@brainfault.org" <anup@brainfault.org>
Subject: Re: [PATCH v2 1/2] KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown
Date: Mon, 11 Dec 2023 18:47:50 +0000	[thread overview]
Message-ID: <55ef5cbf44bb73030d6387aa181393b8cb044f77.camel@amazon.com> (raw)
In-Reply-To: <ZXdIIBUXcCIK28ys@google.com>

On Mon, 2023-12-11 at 09:34 -0800, Sean Christopherson wrote:
> On Sat, Dec 09, 2023, James Gowans wrote:
> > Thoughts on possible ways to fix this:
> > a) go back to reboot notifiers
> > b) get kexec to call syscore_shutdown() to invoke all of these callbacks
> > c) Add a KVM-specific callback to native_machine_shutdown(); we only
> > need this for Intel x86, right?
> 
> I don't like (c).  VMX is the most sensitive/problematic, e.g. the whole blocking
> of INIT thing, but SVM can also run afoul of EFER.SVME being cleared, and KVM really
> should leave virtualization enabled across kexec(), even if leaving virtualization
> enabled is relatively benign on other architectures.

Good to know. Agreed that clean shutdown in all cases is best and we
discard (c).
> 
> One more option would be:
> 
>  d) Add another sycore hook, e.g. syscore_kexec() specifically for this path.
> 
> > My slight preference is towards adding syscore_shutdown() to kexec, but
> > I'm not sure that's feasible. Adding kexec maintainers for input.
> 
> In a vacuum, that'd be my preference too.  It's the obvious choice IMO, e.g. the
> kexec_image->preserve_context path does syscore_suspend() (and then resume(), so
> it's not completely uncharted territory.
> 
> However, there's a rather big wrinkle in that not all of the existing .shutdown()
> implementations are obviously ok to call during kexec.  Luckily, AFAICT there are
> very few users of the syscore .shutdown hook, so it's at least feasible to go that
> route.
> 
> x86's mce_syscore_shutdown() should be ok, and arguably is correct, e.g. I don't
> see how leaving #MC reporting enabled across kexec can work.
> 
> ledtrig_cpu_syscore_shutdown() is also likely ok and arguably correct.

I like your observation here that we probably have other misses like MCE
which should be shut down too - that's a hint that adding
syscore_shutdown() to kexec is the way to go.

> 
> The interrupt controllers though?  x86 disables IRQs at the very beginning of
> machine_kexec(), so it's likely fine.  But every other architecture?  No clue.
> E.g. PPC's default_machine_kexec() sends IPIs to shutdown other CPUs, though I
> have no idea if that can run afoul of any of the paths below.
> 
>         arch/powerpc/platforms/cell/spu_base.c  .shutdown = spu_shutdown,
>         arch/x86/kernel/cpu/mce/core.c          .shutdown = mce_syscore_shutdown,
>         arch/x86/kernel/i8259.c                 .shutdown = i8259A_shutdown,
>         drivers/irqchip/irq-i8259.c             .shutdown = i8259A_shutdown,
>         drivers/irqchip/irq-sun6i-r.c           .shutdown = sun6i_r_intc_shutdown,
>         drivers/leds/trigger/ledtrig-cpu.c      .shutdown = ledtrig_cpu_syscore_shutdown,
>         drivers/power/reset/sc27xx-poweroff.c   .shutdown = sc27xx_poweroff_shutdown,
>         kernel/irq/generic-chip.c               .shutdown = irq_gc_shutdown,
>         virt/kvm/kvm_main.c                     .shutdown = kvm_shutdown,
> 
> The whole thing is a bit of a mess.  E.g. x86 treats machine_shutdown() from
> kexec pretty much the same as shutdown for reboot, but other architectures have
> what appear to be unique paths for handling kexec.
> 
> FWIW, if we want to go with option (b), syscore_shutdown() hooks could use
> kexec_in_progress to differentiate between "regular" shutdown/reboot and kexec.

Yeah, perhaps that's the best: add syscore_shutdown to kexec and get the
callers to handle both cases if necessary. We could get maintainers for
all of these drivers to sign off on the change and say whether they need
to differentiate between kexec and reboot.

Eric, what are your thoughts on this approach? I can try to whip up a
patch for this and add the maintainers for all of the drivers.

JG


  parent reply	other threads:[~2023-12-11 18:48 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-12 23:31 [PATCH v2 0/2] KVM: Fix race between reboot and hardware enabling Sean Christopherson
2023-05-12 23:31 ` [PATCH v2 1/2] KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown Sean Christopherson
2023-12-09  7:26   ` Gowans, James
2023-12-10  4:53     ` Eric W. Biederman
2023-12-11  7:54       ` Gowans, James
2023-12-11 10:27         ` Gowans, James
2023-12-11 23:50           ` Eric W. Biederman
2023-12-12  8:50             ` Gowans, James
2023-12-12 13:38               ` Paolo Bonzini
2023-12-13  6:47               ` Gowans, James
2023-12-11 17:34     ` Sean Christopherson
2023-12-11 17:51       ` Sean Christopherson
2023-12-11 18:47       ` Gowans, James [this message]
2023-05-12 23:31 ` [PATCH v2 2/2] KVM: Don't enable hardware after a restart/shutdown is initiated Sean Christopherson
2023-05-18  7:38 ` [PATCH v2 0/2] KVM: Fix race between reboot and hardware enabling Marc Zyngier
2023-05-19 17:41 ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55ef5cbf44bb73030d6387aa181393b8cb044f77.camel@amazon.com \
    --to=jgowans@amazon.com \
    --cc=aleksandar.qemu.devel@gmail.com \
    --cc=anup@brainfault.org \
    --cc=atishp@atishpatra.org \
    --cc=chenhuacai@kernel.org \
    --cc=ebiederm@xmission.com \
    --cc=graf@amazon.de \
    --cc=james.morse@arm.com \
    --cc=jschoenh@amazon.de \
    --cc=kexec@lists.infradead.org \
    --cc=kvm-riscv@lists.infradead.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=suzuki.poulose@arm.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox