* 2nd level lockups using VMX nesting on 3.11 based host kernel
@ 2013-09-03 13:19 Stefan Bader
From: Stefan Bader @ 2013-09-03 13:19 UTC (permalink / raw)
To: kvm
With current 3.11 kernels we got reports of nested qemu failing in weird ways. I
believe 3.10 had issues before as well, though I am not sure whether those were
the same. With 3.8 based kernels (close to current stable) I found no such issues.
It is possible to reproduce things with the following setup:

Host:      64-bit user-space, kernel 3.8 or 3.11 based, 64-bit hypervisor,
           Haswell CPU masked to core2duo for the 1st level, 4 cores, 8G memory
1st level: 32- or 64-bit user-space, kernel 3.8 or 3.11 based,
           virtio net and block devices, swap, 64-bit hypervisor,
           2 vcpus, 2G memory
2nd level: 32-bit user-space, kernel 3.8 based,
           user-mode networking (virtio), virtio block device,
           2 vcpus, 1G memory
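For reference, a setup along these lines could be launched roughly as follows.
These are not the actual commands used; image names and network details are
placeholders, and only the options that matter for the reproduction are shown:

```shell
# On the host: make sure nested VMX is enabled in kvm_intel.
sudo modprobe -r kvm_intel
sudo modprobe kvm_intel nested=1

# Host -> 1st level guest: mask the Haswell CPU to core2duo but keep
# VMX exposed so the guest can itself run KVM.
qemu-system-x86_64 -enable-kvm \
    -cpu core2duo,+vmx \
    -smp 2 -m 2048 \
    -netdev user,id=n0 -device virtio-net-pci,netdev=n0 \
    -drive file=l1-guest.img,if=virtio

# Inside the 1st level guest -> 2nd level guest, from the raw base image:
qemu-system-x86_64 -enable-kvm \
    -smp 2 -m 1024 \
    -netdev user,id=n0,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=n0 \
    -drive file=l2-base.img,format=raw,if=virtio
```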
The test is basically to start the 2nd level guest from a base raw image file and
perform some package updates and install some new packages through ssh inside it.
With a 3.8 kernel running on the host, the host logs some attempted (and likely
ignored) MSR accesses (caused by masking the vcpu to core2duo), but the install
in the 2nd level succeeds, except when the 1st level runs a 32-bit user-space.
In that case I could observe the 1st level qemu process using a lot of cpu
time while the 2nd level showed signs of soft lockup. Maybe at least one of the
second level vcpus was not getting scheduled anymore?
Switching the host kernel to 3.11 (around -rc4), the 2nd level install fails with
various symptoms. For a 32-bit user-space it looks like the previously described
lockup, though I observed NMI reason 21 and 31 messages as well.
With a 3.8 kernel and 64-bit user-space in the 1st level there were the NMI
messages again, but also a double fault crash in the 2nd level which seemed to
have the cmos_interrupt function at the top of the stack.
Using a 3.11-rc4 64-bit user-space in the 1st level only produced the double
fault, without the NMI messages.
The symptoms could vary, but the ones described above were the most likely with a
given combination. I also tried 3.11 with 64-bit user-space on the host and in the
1st level while not doing any cpu masking (so the 1st level sees a qemu 64-bit
vcpu). This got rid of the MSR messages but otherwise would make the 2nd level
get stuck without any messages, sometimes with the 1st level busy, sometimes not.
Though, except for very rare cases where things really went bad and in one case
took down the host, the 2nd level guest can be killed by ctrl-c from the 1st
level guest.
Now I am not sure how best to debug this. Has anybody seen similar
things, or can anyone help me with some advice on how to get more information?
Thanks,
Stefan
Please cc me on replies as I am not subscribed.
* Re: 2nd level lockups using VMX nesting on 3.11 based host kernel
From: Gleb Natapov @ 2013-09-03 18:13 UTC (permalink / raw)
To: Stefan Bader; +Cc: kvm
On Tue, Sep 03, 2013 at 03:19:27PM +0200, Stefan Bader wrote:
> With current 3.11 kernels we got reports of nested qemu failing in weird ways. I
> believe 3.10 also had issues before. Not sure whether those were the same.
> With 3.8 based kernels (close to current stable) I found no such issues.
Try to bisect it.
--
Gleb.
* Re: 2nd level lockups using VMX nesting on 3.11 based host kernel
From: Stefan Bader @ 2013-09-10 7:52 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm
On 03.09.2013 20:13, Gleb Natapov wrote:
> On Tue, Sep 03, 2013 at 03:19:27PM +0200, Stefan Bader wrote:
>> With current 3.11 kernels we got reports of nested qemu failing in weird ways. I
>> believe 3.10 also had issues before. Not sure whether those were the same.
>> With 3.8 based kernels (close to current stable) I found no such issues.
> Try to bisect it.
It took a while to bisect. Though I am not sure this helps much. Starting from
v3.9, the first broken commit is:
commit 5f3d5799974b89100268ba813cec8db7bd0693fb
KVM: nVMX: Rework event injection and recovery
This sounds plausible, as that commit changes event injection between nested
levels. However, starting with this patch I am unable to start any second level
guest at all. Very soon after the second level guest starts, the first level (and
with it the second level as well) locks up completely without any visible messages.
This goes on until
commit 5a2892ce72e010e3cb96b438d7cdddce0c88e0e6
KVM: nVMX: Skip PF interception check when queuing during nested run
In between there was also a period where the first level did not lock up but
would either seem not to schedule the second level guest, or displayed internal
error messages when starting the second level.
Given that, it sounds like the current double faults in the second level might be
one of the issues introduced by the injection rework that remains to this day,
while other issues were fixed from the second commit on.
I am not really deeply familiar with the nVMX code, just trying to make sense of
observations. The double fault always seems to originate from the cmos_interrupt
function in the second level guest. It is not immediate and sometimes took
several repeated runs to trigger (during the bisect I would require 10 successful
test runs before marking a commit good). So could it maybe be some event /
interrupt (cmos related?) that accidentally gets injected into the wrong guest
level? Or maybe the same event taking place at the same time for more than one
level and messing things up?
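Since the failure is intermittent, a bisect run has to repeat the test before
trusting a "good" verdict. A sketch of the kind of driver script this implies,
where run-l2-install-test is a placeholder for whatever boots the 2nd level
guest and performs the install over ssh:

```shell
#!/bin/sh
# Bisect driver: only mark a commit good after 10 consecutive
# successful runs, since the double fault does not trigger every time.
i=1
while [ "$i" -le 10 ]; do
    if ! ./run-l2-install-test; then
        exit 1          # non-zero: git bisect run marks the commit "bad"
    fi
    i=$((i + 1))
done
exit 0                  # 10 clean runs: mark the commit "good"
```

From the kernel tree this would be driven with something like
`git bisect start v3.9 v3.8` followed by `git bisect run ./check.sh`.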
-Stefan
* Re: 2nd level lockups using VMX nesting on 3.11 based host kernel
From: Paolo Bonzini @ 2013-09-11 16:32 UTC (permalink / raw)
To: Stefan Bader; +Cc: Gleb Natapov, kvm
On 10/09/2013 09:52, Stefan Bader wrote:
> On 03.09.2013 20:13, Gleb Natapov wrote:
>> On Tue, Sep 03, 2013 at 03:19:27PM +0200, Stefan Bader wrote:
>>> With current 3.11 kernels we got reports of nested qemu failing
>>> in weird ways. I believe 3.10 also had issues before. Not sure
>>> whether those were the same. With 3.8 based kernels (close to
>>> current stable) I found no such issues.
>> Try to bisect it.
>
> It took a while to bisect. Though I am not sure this helps much.
> Starting from v3.9, the first broken commit is:
>
> commit 5f3d5799974b89100268ba813cec8db7bd0693fb KVM: nVMX: Rework
> event injection and recovery
>
> This sounds reasonable as this changes event injection between
> nested levels. However starting with this patch I am unable to
> start any second level guest. Very soon after the second level
> guest starts, the first (and by that the second level as well) lock
> up completely without any visible messages.
>
> This goes on until
>
> commit 5a2892ce72e010e3cb96b438d7cdddce0c88e0e6 KVM: nVMX: Skip PF
> interception check when queuing during nested run
I'm not sure I'm seeing the same issue as you, but it is similar
enough to point out.
Nested virtualization is completely broken with shadow paging on the
host even before commit 5f3d5799974b89100268ba813cec8db7bd0693fb.
Whether it works probably depends on the combination of host and guest
kernels; I am constantly using 3.10 in the guest. It is very
reproducible; my testcase is x86/realmode.flat from kvm-unit-tests.
There are several problems, some of which were fixed along the way. While
bisecting I did this:
- apply patch 63fbf59 (nVMX: reset rflags register cache during nested
vmentry., 2013-07-28)
- use the emulate_invalid_guest_state=0 argument to kvm-intel. This
is fixed somewhere between commit 5f3d579 and commit 205befd (KVM:
nVMX: correctly set tr base on nested vmexit emulation, 2013-08-04); I
haven't bisected it fully, but it should no longer be necessary.
The resulting faulty patch is the same as yours. The symptoms are the
same for all three cases:
- commit 5f3d579 + patch 63fbf59 + emulate_invalid_guest_state=0
- commit 205befd + emulate_invalid_guest_state=0
- commit 205befd + emulate_invalid_guest_state=1
My first impression is that a page fault is injected erroneously; I will
look more at it tomorrow.
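For anyone wanting to reproduce this, the setup described above would look
roughly like the following. The module parameters are those of kvm_intel at the
time (ept=0 forces shadow paging), and the runner script path assumes a
kvm-unit-tests checkout, so treat the details as a sketch:

```shell
# On the host: force shadow paging, enable nesting, and disable
# invalid-guest-state emulation as discussed above.
sudo modprobe -r kvm_intel
sudo modprobe kvm_intel ept=0 nested=1 emulate_invalid_guest_state=0

# Inside the 1st level guest, from a kvm-unit-tests checkout
# (the runner script location may differ between versions):
./configure && make
./x86/run x86/realmode.flat
```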
Paolo
> In between there was also a period where first level did not lock
> up but would either seem not to schedule the second level guest or
> displayed internal error messages from starting the second level.
>
> Given that, it sounds like the current double faults in the second
> level might be one of the issues introduced by the injection rework
> that remains to this day, while other issues were fixed from the
> second commit on.
>
> I am not really deeply familiar with the nVMX code, just trying to
> make sense of observations. The double fault always seems to
> originate from the cmos_interrupt function in the second level
> guest. It is not immediate and sometimes took several repeated runs
> to trigger (during bisect I would require 10 successful test runs
> before marking it good). So could it maybe be some event /
> interrupt (cmos related?) that accidentally gets injected into the
> wrong guest level? Or maybe the same event taking place at the same
> time for more than one level and messing up things?
>
> -Stefan
>