Re: [qemu-devel] Bug Report: VM crashed for some kinds of vCPU in nested virtualization


From: Jan Kiszka <jan.kiszka@web.de>
To: "\"李春奇 <Arthur Chunqi Li>\"" <yzt356@gmail.com>
Cc: qemu-devel@nongnu.org, kvm <kvm@vger.kernel.org>
Subject: Re: [qemu-devel] Bug Report: VM crashed for some kinds of vCPU in nested virtualization
Date: Tue, 16 Apr 2013 09:03:07 +0200	[thread overview]
Message-ID: <516CF7AB.9090107@web.de> (raw)
In-Reply-To: <CABpY8MK943vsm2m=axp2jdEkEafNzAi3N-t3LSKPODbfptH=5Q@mail.gmail.com>


On 2013-04-16 05:49, 李春奇 <Arthur Chunqi Li> wrote:
> I switched to the latest KVM kernel, but the bug still occurs.
> 
> On the startup of L1 VM on the host, the host kern.log will output:
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458090] kvm [2808]: vcpu0
> unhandled rdmsr: 0x345
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458166] kvm_set_msr_common: 22
> callbacks suppressed
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458169] kvm [2808]: vcpu0
> unhandled wrmsr: 0x40 data 0
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458176] kvm [2808]: vcpu0
> unhandled wrmsr: 0x60 data 0
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458182] kvm [2808]: vcpu0
> unhandled wrmsr: 0x41 data 0
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458188] kvm [2808]: vcpu0
> unhandled wrmsr: 0x61 data 0
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458194] kvm [2808]: vcpu0
> unhandled wrmsr: 0x42 data 0
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458200] kvm [2808]: vcpu0
> unhandled wrmsr: 0x62 data 0
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458206] kvm [2808]: vcpu0
> unhandled wrmsr: 0x43 data 0
> Apr 16 11:28:22 Blade1-02 kernel: [ 4908.458211] kvm [2808]: vcpu0
> unhandled wrmsr: 0x63 data 0
> Apr 16 11:28:23 Blade1-02 kernel: [ 4908.471014] kvm [2808]: vcpu1
> unhandled wrmsr: 0x40 data 0
> Apr 16 11:28:23 Blade1-02 kernel: [ 4908.471024] kvm [2808]: vcpu1
> unhandled wrmsr: 0x60 data 0
> 
> When L1 VM starts and crashes, its kern.log will output:
> Apr 16 11:28:55 kvm1 kernel: [   33.590101] device tap0 entered promiscuous
> mode
> Apr 16 11:28:55 kvm1 kernel: [   33.590140] br0: port 2(tap0) entered
> forwarding state
> Apr 16 11:28:55 kvm1 kernel: [   33.590146] br0: port 2(tap0) entered
> forwarding state
> Apr 16 11:29:04 kvm1 kernel: [   42.592103] br0: port 2(tap0) entered
> forwarding state
> Apr 16 11:29:19 kvm1 kernel: [   57.752731] kvm [1673]: vcpu0 unhandled
> rdmsr: 0x345
> Apr 16 11:29:19 kvm1 kernel: [   57.797261] kvm [1673]: vcpu0 unhandled
> wrmsr: 0x40 data 0
> Apr 16 11:29:19 kvm1 kernel: [   57.797315] kvm [1673]: vcpu0 unhandled
> wrmsr: 0x60 data 0
> Apr 16 11:29:19 kvm1 kernel: [   57.797366] kvm [1673]: vcpu0 unhandled
> wrmsr: 0x41 data 0
> Apr 16 11:29:19 kvm1 kernel: [   57.797416] kvm [1673]: vcpu0 unhandled
> wrmsr: 0x61 data 0
> Apr 16 11:29:19 kvm1 kernel: [   57.797466] kvm [1673]: vcpu0 unhandled
> wrmsr: 0x42 data 0
> Apr 16 11:29:19 kvm1 kernel: [   57.797516] kvm [1673]: vcpu0 unhandled
> wrmsr: 0x62 data 0
> Apr 16 11:29:19 kvm1 kernel: [   57.797566] kvm [1673]: vcpu0 unhandled
> wrmsr: 0x43 data 0
> Apr 16 11:29:19 kvm1 kernel: [   57.797616] kvm [1673]: vcpu0 unhandled
> wrmsr: 0x63 data 0
> 
> The host will output simultaneously:
> Apr 16 11:29:20 Blade1-02 kernel: [ 4966.314742] nested_vmx_run: VMCS
> MSR_{LOAD,STORE} unsupported

That's important information. KVM does not yet implement this
feature, but L1 is using it, so the nested entry is doomed to fail. This
feature gap in nested VMX needs to be closed at some point.
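
To make the gap concrete, here is a rough sketch of the kind of check that would produce the L0 message above. It is not copied from KVM: the struct and function name are hypothetical, and the three fields merely mirror the VMCS MSR count fields (VM-entry MSR-load count, VM-exit MSR-load count, VM-exit MSR-store count). The assumption is that any non-zero count makes the nested entry fail with VM-instruction error 7.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the MSR-area counts L1 sets in its VMCS. */
struct msr_area_counts {
	uint32_t vm_entry_msr_load_count;
	uint32_t vm_exit_msr_load_count;
	uint32_t vm_exit_msr_store_count;
};

/* VM-instruction error 7 in the Intel SDM: "VM entry with invalid control field(s)". */
#define VMXERR_ENTRY_INVALID_CONTROL_FIELD 7u

/*
 * Returns 0 if the nested entry may proceed, or the VM-instruction error
 * to report to L1 when it asks for MSR switching that L0 cannot emulate.
 */
static uint32_t check_msr_switch_support(const struct msr_area_counts *c)
{
	if (c->vm_entry_msr_load_count > 0 ||
	    c->vm_exit_msr_load_count > 0 ||
	    c->vm_exit_msr_store_count > 0)
		return VMXERR_ENTRY_INVALID_CONTROL_FIELD;

	return 0;
}
```

With all three counts at zero the sketch lets the entry through. As soon as L1 configures a single MSR to load or store, it returns 7, which would match the "hardware error 0x7" reported further down in this mail.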

> 
> And the call trace displayed on the console is the same as in the previous
> mail.
> 
> Besides, the L1 and L2 guests sometimes crash without any output, and
> sometimes produce the output above.
> 
> 
> So this indicates that the MSR controls may fail for the core2duo CPU model.
> 

Maybe varying the CPU type (e.g. try -cpu kvm64,+vmx) reduces the
likelihood of this scenario with KVM as guest.

> 
> For Jan,
> I have traced the QEMU and KVM code and found the code behind the error
> "KVM: entry failed, hardware error 0x7". It is in the kernel, in
> arch/x86/kvm/vmx.c, function vmx_handle_exit():
> 
> if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
> 	vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> 	vcpu->run->fail_entry.hardware_entry_failure_reason
> 		= exit_reason;
> 	return 0;
> }
> 
> if (unlikely(vmx->fail)) {
> 	vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> 	vcpu->run->fail_entry.hardware_entry_failure_reason
> 		= vmcs_read32(VM_INSTRUCTION_ERROR);
> 	return 0;
> }
> 
> The "entry failed" hardware error can come from either of these two places,
> and both report a failed VM entry. Since VMX_EXIT_REASONS_FAILED_VMENTRY is
> 0x80000000 and the reported error is 0x7, which does not have that bit set,
> the error must come from the second branch. I'm not sure what the value
> returned by vmcs_read32(VM_INSTRUCTION_ERROR) means.

Try to look this up in the Intel manual. It explains what instruction
error 7 means. You will also find it when tracing down the error message
of L0.
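
As a small illustration of the reasoning in the quoted paragraph, the sketch below classifies a hardware_entry_failure_reason value the way it was done by hand there: if bit 31 is set, the value is an exit reason from the first branch, otherwise it is treated as a VM-instruction error number. The helper name is hypothetical, and only two error numbers are spelled out, with wording as I recall it from the VM-instruction error table in the Intel SDM.

```c
#include <stdint.h>

#define VMX_EXIT_REASONS_FAILED_VMENTRY 0x80000000u

/* Hypothetical helper: turn a hardware_entry_failure_reason into a short description. */
static const char *describe_entry_failure(uint32_t reason)
{
	/* First branch of vmx_handle_exit(): exit reason with bit 31 set. */
	if (reason & VMX_EXIT_REASONS_FAILED_VMENTRY)
		return "VM exit caused by a failed VM entry";

	/* Second branch: a VM-instruction error number. */
	switch (reason) {
	case 7:
		return "VM entry with invalid control field(s)";
	case 8:
		return "VM entry with invalid host-state field(s)";
	default:
		return "other VM-instruction error, see the Intel SDM";
	}
}
```

Calling describe_entry_failure(0x7) takes the second path and yields the "invalid control field(s)" text, which fits the unsupported MSR load/store controls that L0 complained about.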

Jan




