From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Liran Alon <liran.alon@oracle.com>
Cc: Jinpu Wang <jinpu.wang@cloud.ionos.com>, kvm@vger.kernel.org
Subject: Re: Broadwell server reboot with vmx: unexpected exit reason 0x3
Date: Wed, 2 Oct 2019 10:29:44 -0700
Message-ID: <20191002172943.GG9615@linux.intel.com>
In-Reply-To: <DDC3DE27-46A3-4CB4-9AB8-C3C2F1D54777@oracle.com>

On Mon, Sep 30, 2019 at 01:48:15PM +0300, Liran Alon wrote:
>
> > On 30 Sep 2019, at 11:43, Jinpu Wang <jinpu.wang@cloud.ionos.com> wrote:
> >
> > Dear KVM experts,
> >
> > We had a Broadwell server reboot itself recently. Before the reboot,
> > there were error messages from KVM in netconsole:
> > [5599380.317055] kvm [9046]: vcpu1, guest rIP: 0xffffffff816ad716 vmx:
> > unexpected exit reason 0x3
> > [5599380.317060] kvm [49626]: vcpu0, guest rIP: 0xffffffff81060fe6
> > vmx: unexpected exit reason 0x3
> > [5599380.317062] kvm [36632]: vcpu0, guest rIP: 0xffffffff8103970d
> > vmx: unexpected exit reason 0x3
> > [5599380.317064] kvm [9620]: vcpu1, guest rIP: 0xffffffffb6c1b08e vmx:
> > unexpected exit reason 0x3
> > [5599380.317067] kvm [49925]: vcpu5, guest rIP: 0xffffffff9b406ea2
> > vmx: unexpected exit reason 0x3
> > [5599380.317068] kvm [49925]: vcpu3, guest rIP: 0xffffffff9b406ea2
> > vmx: unexpected exit reason 0x3
> > [5599380.317070] kvm [33871]: vcpu2, guest rIP: 0xffffffff81060fe6
> > vmx: unexpected exit reason 0x3
> > [5599380.317072] kvm [49925]: vcpu4, guest rIP: 0xffffffff9b406ea2
> > vmx: unexpected exit reason 0x3
> > [5599380.317074] kvm [48505]: vcpu1, guest rIP: 0xffffffffaf36bf9b
> > vmx: unexpected exit reason 0x3
> > [5599380.317076] kvm [21880]: vcpu1, guest rIP: 0xffffffff8103970d
> > vmx: unexpected exit reason 0x3
>
> The only way a CPU will raise this exit-reason (3 == EXIT_REASON_INIT_SIGNAL)
> is if CPU is in VMX non-root mode while it has a pending INIT signal in LAPIC.
>
> In simple terms, it means that one CPU was running inside a guest while
> another CPU sent it a signal to reset itself.
>
> I see in the code that kvm_init() does register_reboot_notifier(&kvm_reboot_notifier).
> kvm_reboot() runs hardware_disable_nolock() on each CPU before reboot,
> which should result in every CPU running VMX’s hardware_disable(), which should
> exit VMX operation (VMXOFF) and disable VMX (clear CR4.VMXE).
>
> Therefore, I’m quite puzzled as to how a server reboot triggers the scenario you
> present here. Can you send your full kernel log?
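
For anyone decoding the log lines by hand, here is a minimal userspace
sketch of how 0x3 maps to the INIT signal exit. It is not KVM code: the
helpers basic_exit_reason() and exit_reason_name() are made up for
illustration, and only the first four basic exit reasons are listed. The
constant names follow the one quoted above.

```c
#include <stdint.h>

/* Basic exit reasons 0-3; names follow the EXIT_REASON_* style used above. */
enum {
	EXIT_REASON_EXCEPTION_NMI	= 0,
	EXIT_REASON_EXTERNAL_INTERRUPT	= 1,
	EXIT_REASON_TRIPLE_FAULT	= 2,
	EXIT_REASON_INIT_SIGNAL		= 3,
};

/*
 * Hypothetical helper: the basic exit reason is bits 15:0 of the exit
 * reason field; upper bits carry flags such as VM-entry failure (bit 31).
 */
static uint16_t basic_exit_reason(uint32_t exit_reason)
{
	return (uint16_t)(exit_reason & 0xffff);
}

/* Hypothetical helper: name the few reasons listed above. */
static const char *exit_reason_name(uint32_t exit_reason)
{
	switch (basic_exit_reason(exit_reason)) {
	case EXIT_REASON_EXCEPTION_NMI:
		return "EXIT_REASON_EXCEPTION_NMI";
	case EXIT_REASON_EXTERNAL_INTERRUPT:
		return "EXIT_REASON_EXTERNAL_INTERRUPT";
	case EXIT_REASON_TRIPLE_FAULT:
		return "EXIT_REASON_TRIPLE_FAULT";
	case EXIT_REASON_INIT_SIGNAL:
		return "EXIT_REASON_INIT_SIGNAL";
	default:
		return "OTHER";
	}
}
```

Calling exit_reason_name(0x3) yields the INIT signal name, which matches
the value reported for every vCPU in the log.
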

My guess is that the system triggered an emergency reboot and one of three
things happened: it was unable to force CPUs out of VMX non-root with NMIs,
it hit a triple fault shutdown and auto-generated INITs before it could
shoot down the other CPUs, or it didn't even attempt the NMI because VMX
wasn't enabled on the CPU that triggered the reboot.

In arch/x86/kernel/reboot.c:

/* Use NMIs as IPIs to tell all CPUs to disable virtualization */
static void emergency_vmx_disable_all(void)
{
	/* Just make sure we won't change CPUs while doing this */
	local_irq_disable();

	/*
	 * We need to disable VMX on all CPUs before rebooting, otherwise
	 * we risk hanging up the machine, because the CPU ignore INIT
	 * signals when VMX is enabled.
	 *
	 * We can't take any locks and we may be on an inconsistent
	 * state, so we use NMIs as IPIs to tell the other CPUs to disable
	 * VMX and halt.
	 *
	 * For safety, we will avoid running the nmi_shootdown_cpus()
	 * stuff unnecessarily, but we don't have a way to check
	 * if other CPUs have VMX enabled. So we will call it only if the
	 * CPU we are running on has VMX enabled.
	 *
	 * We will miss cases where VMX is not enabled on all CPUs. This
	 * shouldn't do much harm because KVM always enable VMX on all
	 * CPUs anyway. But we can miss it on the small window where KVM
	 * is still enabling VMX.
	 */
	if (cpu_has_vmx() && cpu_vmx_enabled()) {
		/* Disable VMX on this CPU. */
		cpu_vmxoff();

		/* Halt and disable VMX on the other CPUs */
		nmi_shootdown_cpus(vmxoff_nmi);
	}
}

static void native_machine_emergency_restart(void)
{
	...

	if (reboot_emergency)
		emergency_vmx_disable_all();
}
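
To make the last of those guesses concrete, here is a rough userspace
model of the gating above. It is only a sketch, not kernel code: struct
model_cpu and model_emergency_vmx_disable_all() are made-up names, and the
model assumes the NMI shootdown always succeeds when it is attempted, which
is exactly what the first guess calls into question.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state for this model; not a kernel structure. */
struct model_cpu {
	bool vmx_enabled;	/* VMX is enabled on this CPU */
	bool in_guest;		/* CPU is currently in VMX non-root mode */
};

/*
 * Model of the check in emergency_vmx_disable_all(): the shootdown is
 * attempted only if the CPU doing the reboot ("self") has VMX enabled.
 * Assumption: an attempted shootdown takes every CPU out of VMX.
 *
 * Returns the number of CPUs left in VMX non-root mode, i.e. the CPUs
 * that would take an INIT as a VM-exit instead of resetting.
 */
static size_t model_emergency_vmx_disable_all(struct model_cpu *cpus,
					      size_t nr_cpus, size_t self)
{
	size_t i, stuck = 0;

	if (cpus[self].vmx_enabled) {
		for (i = 0; i < nr_cpus; i++) {
			cpus[i].vmx_enabled = false;
			cpus[i].in_guest = false;
		}
	}

	for (i = 0; i < nr_cpus; i++) {
		if (cpus[i].in_guest)
			stuck++;
	}

	return stuck;
}
```

With two CPUs where only the second has VMX enabled and is running a guest,
a reboot triggered from the first CPU skips the shootdown and leaves one
CPU in the guest, while a reboot triggered from the second CPU leaves none.
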