kvm.vger.kernel.org archive mirror
* nVMX regression v3.13+, bisected
@ 2014-02-26 19:43 Stefan Bader
  2014-02-26 20:25 ` Paolo Bonzini
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Bader @ 2014-02-26 19:43 UTC
  To: kvm; +Cc: Anthoine Bourgeois, Paolo Bonzini

Hi,

I was looking at a bug report[1] about a regression in nested VMX that started
with kernel v3.13 (the same issue still exists in v3.14-rc4). The problem shows
up when running a v3.13 kernel in L0 and then trying to launch an L2 (L1 was
either a v3.2 or a v3.13 kernel, so it seems to have no immediate influence). L2
tries to boot an ISO image and hangs before the isolinux boot loader displays
anything. A preinstalled HD image fails to boot, too.

I bisected this and ended up at the following commit, which, when reverted, makes
the launch work again:

Author: Anthoine Bourgeois <bourgeois@bertin.fr>
Date:   Wed Nov 13 11:45:37 2013 +0100

    kvm, vmx: Fix lazy FPU on nested guest

    If a nested guest does a NM fault but its CR0 doesn't contain the TS
    flag (because it was already cleared by the guest with L1 aid) then we
    have to activate FPU ourselves in L0 and then continue to L2. If TS flag
    is set then we fallback on the previous behavior, forward the fault to
    L1 if it asked for.

    Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

The condition for exiting to L0 seems to match what the description says.
Could it be that the handling in L0 is doing something wrong?

-Stefan



* Re: nVMX regression v3.13+, bisected
  2014-02-26 19:43 nVMX regression v3.13+, bisected Stefan Bader
@ 2014-02-26 20:25 ` Paolo Bonzini
  2014-02-26 20:27   ` Stefan Bader
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2014-02-26 20:25 UTC
  To: Stefan Bader, kvm; +Cc: Anthoine Bourgeois

On 26/02/2014 20:43, Stefan Bader wrote:
> Hi,
>
> I was looking at a bug report[1] about a regression on nested VMX that started
> with kernel v3.13 (same issue still existed with v3.14-rc4). The problem shows
> up when running a v3.13 kernel in L0 and then trying to launch a L2 (L1 was
> either a v3.2 kernel or v3.13, so seemed to have no immediate influence). L2 is
> trying to boot a iso image and hangs before the isolinux boot loader displays
> anything. A preinstalled hd image fails to boot, too.
>
> I bisected this and ended up on the following commit which, when reverted made
> the launch work again:
>
> Author: Anthoine Bourgeois <bourgeois@bertin.fr>
> Date:   Wed Nov 13 11:45:37 2013 +0100
>
>     kvm, vmx: Fix lazy FPU on nested guest
>
>     If a nested guest does a NM fault but its CR0 doesn't contain the TS
>     flag (because it was already cleared by the guest with L1 aid) then we
>     have to activate FPU ourselves in L0 and then continue to L2. If TS flag
>     is set then we fallback on the previous behavior, forward the fault to
>     L1 if it asked for.
>
>     Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>
> The condition to exit to L0 seems to be according to what the description says.
> Could it be that the handling in L0 is doing something wrong?

Thanks, I'll look at it tomorrow or Friday.

Paolo



* Re: nVMX regression v3.13+, bisected
  2014-02-26 20:25 ` Paolo Bonzini
@ 2014-02-26 20:27   ` Stefan Bader
  2014-02-26 20:44     ` Kashyap Chamarthy
  2014-02-27 10:51     ` Paolo Bonzini
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Bader @ 2014-02-26 20:27 UTC
  To: Paolo Bonzini, kvm; +Cc: Anthoine Bourgeois

On 26.02.2014 21:25, Paolo Bonzini wrote:
> On 26/02/2014 20:43, Stefan Bader wrote:
>> Hi,
>>
>> I was looking at a bug report[1] about a regression on nested VMX that started
>> with kernel v3.13 (same issue still existed with v3.14-rc4). The problem shows
>> up when running a v3.13 kernel in L0 and then trying to launch a L2 (L1 was
>> either a v3.2 kernel or v3.13, so seemed to have no immediate influence). L2 is
>> trying to boot a iso image and hangs before the isolinux boot loader displays
>> anything. A preinstalled hd image fails to boot, too.
>>
>> I bisected this and ended up on the following commit which, when reverted made
>> the launch work again:
>>
>> Author: Anthoine Bourgeois <bourgeois@bertin.fr>
>> Date:   Wed Nov 13 11:45:37 2013 +0100
>>
>>     kvm, vmx: Fix lazy FPU on nested guest
>>
>>     If a nested guest does a NM fault but its CR0 doesn't contain the TS
>>     flag (because it was already cleared by the guest with L1 aid) then we
>>     have to activate FPU ourselves in L0 and then continue to L2. If TS flag
>>     is set then we fallback on the previous behavior, forward the fault to
>>     L1 if it asked for.
>>
>>     Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
>>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>
>> The condition to exit to L0 seems to be according to what the description says.
>> Could it be that the handling in L0 is doing something wrong?
> 
> Thanks, I'll look at it tomorrow or Friday.
> 
> Paolo
> 
Great, thanks. And maybe it helps if I actually add the link to the bug report as
I had intended... :-P

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1278531



* Re: nVMX regression v3.13+, bisected
  2014-02-26 20:27   ` Stefan Bader
@ 2014-02-26 20:44     ` Kashyap Chamarthy
  2014-02-27 12:10       ` Kashyap Chamarthy
  2014-02-27 10:51     ` Paolo Bonzini
  1 sibling, 1 reply; 11+ messages in thread
From: Kashyap Chamarthy @ 2014-02-26 20:44 UTC
  To: Stefan Bader; +Cc: Paolo Bonzini, kvm, Anthoine Bourgeois

On Wed, Feb 26, 2014 at 09:27:17PM +0100, Stefan Bader wrote:
> On 26.02.2014 21:25, Paolo Bonzini wrote:

[. . .]

> >>
> >> I bisected this and ended up on the following commit which, when reverted made
> >> the launch work again:
> >>
> >> Author: Anthoine Bourgeois <bourgeois@bertin.fr>
> >> Date:   Wed Nov 13 11:45:37 2013 +0100
> >>
> >>     kvm, vmx: Fix lazy FPU on nested guest
> >>
> >>     If a nested guest does a NM fault but its CR0 doesn't contain the TS
> >>     flag (because it was already cleared by the guest with L1 aid) then we
> >>     have to activate FPU ourselves in L0 and then continue to L2. If TS flag
> >>     is set then we fallback on the previous behavior, forward the fault to
> >>     L1 if it asked for.
> >>
> >>     Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
> >>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >>
> >> The condition to exit to L0 seems to be according to what the description says.
> >> Could it be that the handling in L0 is doing something wrong?
> > 
> > Thanks, I'll look at it tomorrow or Friday.
> > 
> > Paolo
> > 
> Great thanks. And maybe it helps if I actually add the link to the bug report as
> I had intended... :-P
> 
> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1278531

Yes, I'm seeing something similar[*] consistently with minimal
Fedora installs on L0, L1 and L2, but couldn't find the time to do the
bisecting. I thought this would be my first bisecting exercise, but you
beat me to it.


  [*] https://bugzilla.kernel.org/show_bug.cgi?id=69491#c7



-- 
/kashyap


* Re: nVMX regression v3.13+, bisected
  2014-02-26 20:27   ` Stefan Bader
  2014-02-26 20:44     ` Kashyap Chamarthy
@ 2014-02-27 10:51     ` Paolo Bonzini
  2014-02-27 13:41       ` anthoine.bourgeois
  2014-02-27 17:01       ` anthoine.bourgeois
  1 sibling, 2 replies; 11+ messages in thread
From: Paolo Bonzini @ 2014-02-27 10:51 UTC
  To: Stefan Bader, kvm; +Cc: Anthoine Bourgeois

On 26/02/2014 21:27, Stefan Bader wrote:
> On 26.02.2014 21:25, Paolo Bonzini wrote:
>> On 26/02/2014 20:43, Stefan Bader wrote:
>>> Hi,
>>>
>>> I was looking at a bug report[1] about a regression on nested VMX that started
>>> with kernel v3.13 (same issue still existed with v3.14-rc4). The problem shows
>>> up when running a v3.13 kernel in L0 and then trying to launch a L2 (L1 was
>>> either a v3.2 kernel or v3.13, so seemed to have no immediate influence). L2 is
>>> trying to boot a iso image and hangs before the isolinux boot loader displays
>>> anything. A preinstalled hd image fails to boot, too.
>>>
>>> I bisected this and ended up on the following commit which, when reverted made
>>> the launch work again:
>>>
>>> Author: Anthoine Bourgeois <bourgeois@bertin.fr>
>>> Date:   Wed Nov 13 11:45:37 2013 +0100
>>>
>>>     kvm, vmx: Fix lazy FPU on nested guest
>>>
>>>     If a nested guest does a NM fault but its CR0 doesn't contain the TS
>>>     flag (because it was already cleared by the guest with L1 aid) then we
>>>     have to activate FPU ourselves in L0 and then continue to L2. If TS flag
>>>     is set then we fallback on the previous behavior, forward the fault to
>>>     L1 if it asked for.
>>>
>>>     Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
>>>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> The condition to exit to L0 seems to be according to what the description says.
>>> Could it be that the handling in L0 is doing something wrong?
>>
>> Thanks, I'll look at it tomorrow or Friday.
>>
>> Paolo
>>
> Great thanks. And maybe it helps if I actually add the link to the bug report as
> I had intended... :-P

I don't have my usual test machine available, but here is a possible guess.
nested_read_cr0 is the CR0 as read by L2, but here we want to look at the
CR0 value reflecting L1's setup.  This would suggest the following untested
patch:

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a06f101ef64b..0d90601a2681 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6688,7 +6688,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 		else if (is_page_fault(intr_info))
 			return enable_ept;
 		else if (is_no_device(intr_info) &&
-			 !(nested_read_cr0(vmcs12) & X86_CR0_TS))
+			 !(vmcs12->guest_cr0 & X86_CR0_TS))
 			return 0;
 		return vmcs12->exception_bitmap &
 				(1u << (intr_info & INTR_INFO_VECTOR_MASK));
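
To illustrate the guess above, here is a minimal user-space sketch (not
kernel code) of the two CR0 views and of the #NM check before and after the
patch. The struct vmcs12_model and the model_*/nm_reflected_* helpers are
hypothetical names made up for this example; only the three field names,
X86_CR0_TS and the #NM vector number are taken from the real code. The
mask/shadow combination reflects my understanding of nested_read_cr0, so
treat it as an assumption rather than a copy of vmx.c.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_TS (1UL << 3) /* CR0.TS, Task Switched (bit 3) */
#define NM_VECTOR  7          /* Device Not Available (#NM) */

/* Hypothetical, cut-down stand-in for struct vmcs12. */
struct vmcs12_model {
	unsigned long guest_cr0;           /* CR0 that L1 programmed for L2 */
	unsigned long cr0_guest_host_mask; /* CR0 bits owned by L1 */
	unsigned long cr0_read_shadow;     /* value L2 reads for L1-owned bits */
	uint32_t exception_bitmap;         /* exceptions L1 wants to intercept */
};

/* CR0 as L2 reads it: read shadow for L1-owned bits, guest_cr0 otherwise. */
static unsigned long model_nested_read_cr0(const struct vmcs12_model *v)
{
	return (v->guest_cr0 & ~v->cr0_guest_host_mask) |
	       (v->cr0_read_shadow & v->cr0_guest_host_mask);
}

/* Common tail: reflect to L1 only if L1 asked to intercept #NM. */
static bool model_l1_wants_nm(const struct vmcs12_model *v)
{
	return (v->exception_bitmap & (1u << NM_VECTOR)) != 0;
}

/* Pre-patch check: tests TS in the CR0 value that L2 sees. */
static bool nm_reflected_to_l1_old(const struct vmcs12_model *v)
{
	if (!(model_nested_read_cr0(v) & X86_CR0_TS))
		return false; /* handled in L0 */
	return model_l1_wants_nm(v);
}

/* Post-patch check: tests TS in the CR0 value that L1 set up. */
static bool nm_reflected_to_l1_new(const struct vmcs12_model *v)
{
	if (!(v->guest_cr0 & X86_CR0_TS))
		return false; /* handled in L0 */
	return model_l1_wants_nm(v);
}
```

Take a state that lazy FPU handling in L1 could plausibly produce (this
particular state is my assumption, not something observed in a trace): TS set
in guest_cr0, TS owned by L1 in the mask, TS clear in the read shadow, and
#NM set in the exception bitmap. The old check then keeps the fault in L0,
while the new check reflects it to L1.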


Paolo


* Re: nVMX regression v3.13+, bisected
  2014-02-26 20:44     ` Kashyap Chamarthy
@ 2014-02-27 12:10       ` Kashyap Chamarthy
  2014-02-27 15:55         ` Kashyap Chamarthy
  0 siblings, 1 reply; 11+ messages in thread
From: Kashyap Chamarthy @ 2014-02-27 12:10 UTC
  To: Stefan Bader; +Cc: Paolo Bonzini, kvm, Anthoine Bourgeois

On Thu, Feb 27, 2014 at 02:14:23AM +0530, Kashyap Chamarthy wrote:
> On Wed, Feb 26, 2014 at 09:27:17PM +0100, Stefan Bader wrote:
> > On 26.02.2014 21:25, Paolo Bonzini wrote:
> 
> [. . .]
> 
> > >>
> > >> I bisected this and ended up on the following commit which, when reverted made
> > >> the launch work again:
> > >>
> > >> Author: Anthoine Bourgeois <bourgeois@bertin.fr>
> > >> Date:   Wed Nov 13 11:45:37 2013 +0100
> > >>
> > >>     kvm, vmx: Fix lazy FPU on nested guest
> > >>
> > >>     If a nested guest does a NM fault but its CR0 doesn't contain the TS
> > >>     flag (because it was already cleared by the guest with L1 aid) then we
> > >>     have to activate FPU ourselves in L0 and then continue to L2. If TS flag
> > >>     is set then we fallback on the previous behavior, forward the fault to
> > >>     L1 if it asked for.
> > >>
> > >>     Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
> > >>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > >>
> > >> The condition to exit to L0 seems to be according to what the description says.
> > >> Could it be that the handling in L0 is doing something wrong?
> > > 
> > > Thanks, I'll look at it tomorrow or Friday.
> > > 
> > > Paolo
> > > 
> > Great thanks. And maybe it helps if I actually add the link to the bug report as
> > I had intended... :-P
> > 
> > [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1278531
> 
> Yes, I'm seeing something similar[*] in a consistent manner with minimal
> Fedora installs on L0, L1 and L2
 
Ok, I just tried to debug an L2 guest (a libguestfs appliance) via
gdb following this method[1]. This is how far I got:


From a shell on L1, launch the libguestfs appliance (note: here libguestfs is
compiled with gdb debugging enabled, so QEMU won't start running the
appliance):

    $ ./run libguestfs-test-tool
    [. . .]
    checking modpath /lib/modules/3.14.0-0.rc2.git0.1.fc21.x86_64 is a directory
    picked kernel vmlinuz-3.14.0-0.rc2.git0.1.fc21.x86_64
    supermin helper [00000ms] finished creating kernel
    [. . .]
    libguestfs: warning: qemu debugging is enabled, connect gdb to tcp::1234 to begin
    [. . .]


From a different shell, I invoke gdb like this:


    (gdb) symbol-file  /usr/lib/debug/lib/modules/3.14.0-0.rc4.git0.1.fc21.x86_64/vmlinux 
    Reading symbols from /usr/lib/debug/lib/modules/3.14.0-0.rc4.git0.1.fc21.x86_64/vmlinux...done.
    (gdb) target remote tcp::1234
    Remote debugging using tcp::1234
    0x0000fff0 in ftrace_stack ()
    (gdb) bt
    #0  0x00000997 in irq_stack_union ()
    #1  0x00000000 in ?? ()
    (gdb) 
    (gdb) c
    Continuing.


Back in libguestfs-test-tool, it just hangs while attempting to boot from ROM:

    [. . .]
    SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $ (mockbuild@) Wed Aug 14 23:57:08 UTC 2013
    Term: 80x24
    4 0
    SeaBIOS (version 1.7.4-20140106_154858-)
    Booting from ROM...


Back in gdb, to find out which file the above function is being
executed from:


    (gdb) c
    Continuing.
    ^C
    Program received signal SIGINT, Interrupt.
    0x00000997 in irq_stack_union ()
    (gdb) bt
    #0  0x00000997 in irq_stack_union ()
    #1  0x00000000 in ?? ()
    (gdb) list
    1       /*
    2        * Copyright 2002, 2003 Andi Kleen, SuSE Labs.
    3        *
    4        * This file is subject to the terms and conditions of the GNU General Public
    5        * License.  See the file COPYING in the main directory of this archive
    6        * for more details. No warranty for anything given at all.
    7        */
    8       #include <linux/linkage.h>
    9       #include <asm/dwarf2.h>
    10      #include <asm/errno.h>
    (gdb) 
    [. . .]
    (gdb) 
    241     ENDPROC(csum_partial_copy_generic)
    (gdb) 
    Line number 242 out of range; arch/x86/lib/csum-copy_64.S has 241 lines.
    (gdb)


PS: Paolo, I'll try to test with your new patch soon.

Thanks.

  [1] https://github.com/libguestfs/libguestfs/blob/master/src/launch-direct.c#L404


> 
>   [*] https://bugzilla.kernel.org/show_bug.cgi?id=69491#c7
> 
> 

-- 
/kashyap


* Re: nVMX regression v3.13+, bisected
  2014-02-27 10:51     ` Paolo Bonzini
@ 2014-02-27 13:41       ` anthoine.bourgeois
  2014-02-27 17:01       ` anthoine.bourgeois
  1 sibling, 0 replies; 11+ messages in thread
From: anthoine.bourgeois @ 2014-02-27 13:41 UTC
  To: Paolo Bonzini; +Cc: Anthoine Bourgeois, kvm, Stefan Bader



-----Paolo Bonzini <paolo.bonzini@gmail.com> wrote: -----
On 26/02/2014 21:27, Stefan Bader wrote:
> On 26.02.2014 21:25, Paolo Bonzini wrote:
>> On 26/02/2014 20:43, Stefan Bader wrote:
>>> Hi,
>>>
>>> I was looking at a bug report[1] about a regression on nested VMX that started
>>> with kernel v3.13 (same issue still existed with v3.14-rc4). The problem shows
>>> up when running a v3.13 kernel in L0 and then trying to launch a L2 (L1 was
>>> either a v3.2 kernel or v3.13, so seemed to have no immediate influence). L2 is
>>> trying to boot a iso image and hangs before the isolinux boot loader displays
>>> anything. A preinstalled hd image fails to boot, too.
>>>
>>> I bisected this and ended up on the following commit which, when reverted made
>>> the launch work again:
>>>
>>> Author: Anthoine Bourgeois <bourgeois@bertin.fr>
>>> Date:   Wed Nov 13 11:45:37 2013 +0100
>>>
>>>     kvm, vmx: Fix lazy FPU on nested guest
>>>
>>>     If a nested guest does a NM fault but its CR0 doesn't contain the TS
>>>     flag (because it was already cleared by the guest with L1 aid) then we
>>>     have to activate FPU ourselves in L0 and then continue to L2. If TS flag
>>>     is set then we fallback on the previous behavior, forward the fault to
>>>     L1 if it asked for.
>>>
>>>     Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
>>>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> The condition to exit to L0 seems to be according to what the description says.
>>> Could it be that the handling in L0 is doing something wrong?
>>
>> Thanks, I'll look at it tomorrow or Friday.
>>
>> Paolo
>>
> Great thanks. And maybe it helps if I actually add the link to the bug report as
> I had intended... :-P

I don't have my usual test machine available, but here is a possible guess.
nested_read_cr0 is the CR0 as read by L2, but here we want to look at the
CR0 value reflecting L1's setup.  This would suggest the following untested
patch:

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a06f101ef64b..0d90601a2681 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6688,7 +6688,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 		else if (is_page_fault(intr_info))
 			return enable_ept;
 		else if (is_no_device(intr_info) &&
-			 !(nested_read_cr0(vmcs12) & X86_CR0_TS))
+			 !(vmcs12->guest_cr0 & X86_CR0_TS))
 			return 0;
 		return vmcs12->exception_bitmap &
 				(1u << (intr_info & INTR_INFO_VECTOR_MASK));

Hi,

I am installing a new test machine and will test the patch. I'll report back as soon as possible.

Regards,
Anthoine


* Re: nVMX regression v3.13+, bisected
  2014-02-27 12:10       ` Kashyap Chamarthy
@ 2014-02-27 15:55         ` Kashyap Chamarthy
  0 siblings, 0 replies; 11+ messages in thread
From: Kashyap Chamarthy @ 2014-02-27 15:55 UTC
  To: Stefan Bader; +Cc: Paolo Bonzini, kvm, Anthoine Bourgeois

On Thu, Feb 27, 2014 at 05:40:56PM +0530, Kashyap Chamarthy wrote:
> On Thu, Feb 27, 2014 at 02:14:23AM +0530, Kashyap Chamarthy wrote:
> > On Wed, Feb 26, 2014 at 09:27:17PM +0100, Stefan Bader wrote:
> > > On 26.02.2014 21:25, Paolo Bonzini wrote:
> appliance):
> 
>     $ ./run libguestfs-test-tool
>     [. . .]
>     checking modpath /lib/modules/3.14.0-0.rc2.git0.1.fc21.x86_64 is a directory
>     picked kernel vmlinuz-3.14.0-0.rc2.git0.1.fc21.x86_64
>     supermin helper [00000ms] finished creating kernel
>     [. . .]
>     libguestfs: warning: qemu debugging is enabled, connect gdb to tcp::1234 to begin
>     [. . .]
> 
> 
> From a different shell, I invoke gdb like that:
> 
> 
>     (gdb) symbol-file  /usr/lib/debug/lib/modules/3.14.0-0.rc4.git0.1.fc21.x86_64/vmlinux 

Disregard this. I loaded the wrong symbol file (thanks to David Gilbert
for spotting that on IRC) :-(

Just issued a Fedora Kernel test build[1] with Paolo's patch, will see
how it goes.

  [1] http://koji.fedoraproject.org/koji/taskinfo?taskID=6577135
     
-- 
/kashyap


* Re: nVMX regression v3.13+, bisected
  2014-02-27 17:01       ` anthoine.bourgeois
@ 2014-02-27 16:58         ` Paolo Bonzini
  2014-02-27 21:34           ` Kashyap Chamarthy
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2014-02-27 16:58 UTC
  To: anthoine.bourgeois
  Cc: Anthoine Bourgeois, kvm, Stefan Bader, Kashyap Chamarthy

On 27/02/2014 18:01, anthoine.bourgeois@bertin.fr wrote:
> OK, so your patch works perfectly well with both of my test machines (a Ubuntu guest or
> a ChorusOS guest).
> I join the patch, can you signof it ?

I'll post it tonight or tomorrow, thanks.

Paolo


* Re: nVMX regression v3.13+, bisected
  2014-02-27 10:51     ` Paolo Bonzini
  2014-02-27 13:41       ` anthoine.bourgeois
@ 2014-02-27 17:01       ` anthoine.bourgeois
  2014-02-27 16:58         ` Paolo Bonzini
  1 sibling, 1 reply; 11+ messages in thread
From: anthoine.bourgeois @ 2014-02-27 17:01 UTC
  To: Paolo Bonzini; +Cc: Anthoine Bourgeois, kvm, Stefan Bader, Kashyap Chamarthy

[-- Attachment #1: Type: text/plain, Size: 2979 bytes --]



-----Paolo Bonzini <paolo.bonzini@gmail.com> wrote: -----

>To: Stefan Bader <stefan.bader@canonical.com>, kvm@vger.kernel.org
>From: Paolo Bonzini
>Sent by: Paolo Bonzini
>Date: 27/02/2014 11:51
>Cc: Anthoine Bourgeois <bourgeois@bertin.fr>
>Subject: Re: nVMX regression v3.13+, bisected
>
>On 26/02/2014 21:27, Stefan Bader wrote:
>> On 26.02.2014 21:25, Paolo Bonzini wrote:
>>> On 26/02/2014 20:43, Stefan Bader wrote:
>>>> Hi,
>>>>
>>>> I was looking at a bug report[1] about a regression on nested VMX that started
>>>> with kernel v3.13 (same issue still existed with v3.14-rc4). The problem shows
>>>> up when running a v3.13 kernel in L0 and then trying to launch a L2 (L1 was
>>>> either a v3.2 kernel or v3.13, so seemed to have no immediate influence). L2 is
>>>> trying to boot a iso image and hangs before the isolinux boot loader displays
>>>> anything. A preinstalled hd image fails to boot, too.
>>>>
>>>> I bisected this and ended up on the following commit which, when reverted made
>>>> the launch work again:
>>>>
>>>> Author: Anthoine Bourgeois <bourgeois@bertin.fr>
>>>> Date:   Wed Nov 13 11:45:37 2013 +0100
>>>>
>>>>     kvm, vmx: Fix lazy FPU on nested guest
>>>>
>>>>     If a nested guest does a NM fault but its CR0 doesn't contain the TS
>>>>     flag (because it was already cleared by the guest with L1 aid) then we
>>>>     have to activate FPU ourselves in L0 and then continue to L2. If TS flag
>>>>     is set then we fallback on the previous behavior, forward the fault to
>>>>     L1 if it asked for.
>>>>
>>>>     Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
>>>>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>>
>>>> The condition to exit to L0 seems to be according to what the description says.
>>>> Could it be that the handling in L0 is doing something wrong?
>>>
>>> Thanks, I'll look at it tomorrow or Friday.
>>>
>>> Paolo
>>>
>> Great thanks. And maybe it helps if I actually add the link to the bug report as
>> I had intended... :-P
>
>I don't have my usual test machine available, but here is a possible guess.
>nested_read_cr0 is the CR0 as read by L2, but here we want to look at the
>CR0 value reflecting L1's setup. This would suggest the following untested
>patch:
>
>diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>index a06f101ef64b..0d90601a2681 100644
>--- a/arch/x86/kvm/vmx.c
>+++ b/arch/x86/kvm/vmx.c
>@@ -6688,7 +6688,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
> 		else if (is_page_fault(intr_info))
> 			return enable_ept;
> 		else if (is_no_device(intr_info) &&
>-			 !(nested_read_cr0(vmcs12) & X86_CR0_TS))
>+			 !(vmcs12->guest_cr0 & X86_CR0_TS))
> 			return 0;
> 		return vmcs12->exception_bitmap &
> 				(1u << (intr_info & INTR_INFO_VECTOR_MASK));
>

OK, so your patch works perfectly well with both of my test machines (an Ubuntu guest and
a ChorusOS guest).
I have attached the patch; can you sign off on it?

Regards,
Anthoine

PS: Sorry for my bad Lotus Notes mailer behaviour :-/

[-- Attachment #2: 0001-kvm-vmx-Fix-a-nested-cr0-read-on-NM-fault.patch --]
[-- Type: application/octet-stream, Size: 1043 bytes --]

From 7f3102b23a2eb8fc070345675a60faaa45d6dd7c Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu, 27 Feb 2014 17:46:43 +0100
Subject: [PATCH] kvm,vmx: Fix a nested cr0 read on NM fault

nested_read_cr0 is the CR0 as read by L2, but here we want to look at
the CR0 value reflecting L1's setup.
Fix bugzilla ID#69491

Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
Tested-by: Anthoine Bourgeois <bourgeois@bertin.fr>
---
 arch/x86/kvm/vmx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index da7837e..dcc4de3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6644,7 +6644,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 		else if (is_page_fault(intr_info))
 			return enable_ept;
 		else if (is_no_device(intr_info) &&
-			 !(nested_read_cr0(vmcs12) & X86_CR0_TS))
+			 !(vmcs12->guest_cr0 & X86_CR0_TS))
 			return 0;
 		return vmcs12->exception_bitmap &
 				(1u << (intr_info & INTR_INFO_VECTOR_MASK));
-- 
1.7.9.5



* Re: nVMX regression v3.13+, bisected
  2014-02-27 16:58         ` Paolo Bonzini
@ 2014-02-27 21:34           ` Kashyap Chamarthy
  0 siblings, 0 replies; 11+ messages in thread
From: Kashyap Chamarthy @ 2014-02-27 21:34 UTC
  To: Paolo Bonzini; +Cc: anthoine.bourgeois, Anthoine Bourgeois, kvm, Stefan Bader

On Thu, Feb 27, 2014 at 05:58:46PM +0100, Paolo Bonzini wrote:
> On 27/02/2014 18:01, anthoine.bourgeois@bertin.fr wrote:
> >OK, so your patch works perfectly well with both of my test machines (a Ubuntu guest or
> >a ChorusOS guest).
> >I join the patch, can you signof it ?
> 
> I'll post it tonight or tomorrow, thanks.

I tested two cases:

  1. Built a Fedora Kernel[a], installed it on both L0 and L1, and booted an
     L2 guest successfully[b].
       - I tried two things as the L2 guest: (1) running 'libguestfs-test-tool',
         which boots a minimal Kernel/initrd via the 'libvirt' backend; (2) a
         regular Fedora-20 guest with a QEMU command line produced by
         libvirt.
         

  2. Built KVM git with Paolo's patch, installed the Kernel on L0 and L1,
     and rebooted both:

        $ git log | head -1
        commit 404381c5839d67aa0c275ad1da96ef3d3928ca2c
    
    In this case, booting L1 (the guest hypervisor) itself results in the
    stack trace below for me:


    [. . .]
    [    0.039000] task: ffff880216078000 ti: ffff880216056000 task.ti: ffff880216056000
    [    0.039000] RIP: 0010:[<ffffffff81ceab25>]  [<ffffffff81ceab25>] intel_pmu_init+0x2d9/0x8e6
    [    0.039000] RSP: 0000:ffff880216057e40  EFLAGS: 00000202
    [    0.039000] RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000345
    [    0.039000] RDX: 0000000000000003 RSI: 0000000000000730 RDI: 0000ffffffffffff
    [    0.039000] RBP: ffff880216057e48 R08: 0000000000000001 R09: 0000000000000007
    [    0.039000] R10: ffffffff81cc0c60 R11: ffff880216057c16 R12: ffffffff81ce9a4f
    [    0.039000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    [    0.039000] FS:  0000000000000000(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000
    [    0.039000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    0.039000] CR2: ffff88021ffff000 CR3: 0000000001c0b000 CR4: 00000000001406f0
    [    0.039000] Stack:
    [    0.039000]  0000000000000000 ffff880216057e98 ffffffff81ce9a88 0000000000000000
    [    0.039000]  ffff88000009b000 ffff880216057e98 0000000000000000 ffffffff81ce9a4f
    [    0.039000]  0000000000000000 0000000000000000 0000000000000000 ffff880216057f08
    [    0.039000] Call Trace:
    [    0.039000]  [<ffffffff81ce9a88>] init_hw_perf_events+0x39/0x51a
    [    0.039000]  [<ffffffff81ce9a4f>] ? check_bugs+0x2d/0x2d
    [    0.039000]  [<ffffffff81000332>] do_one_initcall+0xf2/0x140
    [    0.039000]  [<ffffffff81cef22f>] ? native_smp_prepare_cpus+0x30c/0x32a
    [    0.039000]  [<ffffffff81ce2f19>] kernel_init_freeable+0xc4/0x1ec
    [    0.039000]  [<ffffffff817aa230>] ? rest_init+0x80/0x80
    [    0.039000]  [<ffffffff817aa239>] kernel_init+0x9/0xf0
    [    0.039000]  [<ffffffff817c307c>] ret_from_fork+0x7c/0xb0
    [    0.039000]  [<ffffffff817aa230>] ? rest_init+0x80/0x80
    [    0.039000] Code: 61 fd ff 44 89 0d e4 61 fd ff 89 0d 9e 62 fd ff 7e 2b 83 e2 1f b8 03 00 00 00 b9 45 03 00 00 83 fa 02 0f 4f c2 89 05 ab 61 fd ff <0f> 32 48 c1 e2 20 89 c0 48 09 c2 48 89 15 49 62 fd ff e8 64 10 
    [    0.039000] RIP  [<ffffffff81ceab25>] intel_pmu_init+0x2d9/0x8e6
    [    0.039000]  RSP <ffff880216057e40>
    [    0.039013] ---[ end trace 6a1e6dd839222f3e ]---
    [    0.040007] swapper/0 (1) used greatest stack depth: 5720 bytes left
    [    0.041008] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b


Thanks Paolo, for the fix.


  [a] http://koji.fedoraproject.org/koji/taskinfo?taskID=6577700
  [b] http://kashyapc.fedorapeople.org/temp/stdout-libguestfs-test-tool-in-L1-28FEB2014.txt

-- 
/kashyap
