[nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.

* [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
@ 2013-05-10 13:00 Kashyap Chamarthy
       [not found] ` <CAOaxAcZ1uyx-RrmDEiZhG2H8H5zTCK9iz1nHJKEJwUfhn=vZHA@mail.gmail.com>
  2013-05-10 15:12 ` Jan Kiszka
  0 siblings, 2 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-10 13:00 UTC (permalink / raw)
  To: kvm

[-- Attachment #1: Type: text/plain, Size: 4689 bytes --]

Heya,

This is on Intel Haswell.

First, some version info:

L0, L1 -- both of them have same versions of kernel, qemu:

=====
$ rpm -q kernel --changelog | head -2
* Thu May 09 2013 Josh Boyer  - 3.10.0-0.rc0.git23.1
- Linux v3.9-11789-ge0fd9af
=====

=====
$ uname -r ; rpm -q qemu-kvm libvirt-daemon-kvm libguestfs
3.10.0-0.rc0.git23.1.fc20.x86_64
qemu-kvm-1.4.1-1.fc19.x86_64
libvirt-daemon-kvm-1.0.5-2.fc19.x86_64
libguestfs-1.21.35-1.fc19.x86_64
=====

Additionally, neither nmi_watchdog nor hpet is enabled on the L0 and L1 kernels:
=====
 $ egrep -i 'nmi|hpet' /etc/grub2.cfg
 $
=====

KVM parameters on L0 :
=====
$ cat /sys/module/kvm_intel/parameters/nested
Y
$ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
Y
$ cat /sys/module/kvm_intel/parameters/enable_apicv
N
$ cat /sys/module/kvm_intel/parameters/ept
Y
=====
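
The four checks above can be collected in one go. This is only a minimal sketch: the function name show_kvm_params and its optional directory argument are assumptions added for illustration (the argument makes it testable outside sysfs); the parameter names and the sysfs path are the ones shown in the report.

```shell
#!/bin/sh
# Print selected kvm_intel module parameters as name=value pairs.
# The optional directory argument is a hypothetical convenience; by default
# the sysfs path from the report is used.
show_kvm_params() {
    dir=${1:-/sys/module/kvm_intel/parameters}
    for name in nested enable_shadow_vmcs enable_apicv ept; do
        if [ -r "$dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        else
            # Parameter not present (module not loaded, or a kernel without it).
            printf '%s=n/a\n' "$name"
        fi
    done
}

show_kvm_params
```

Running it on both L0 and L1 and diffing the output is a quick way to confirm the two levels really are configured the same.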

This is the stack trace I see when I start the L2 guest:
------------------------------------------------
.......
[    2.162235] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    2.163080] Pid: 1, comm: swapper/0 Not tainted 3.8.11-200.fc18.x86_64 #1
[    2.163080] Call Trace:
[    2.163080]  [<ffffffff81649c19>] panic+0xc1/0x1d0
[    2.163080]  [<ffffffff81d010e0>] mount_block_root+0x1fa/0x2ac
[    2.163080]  [<ffffffff81d011e9>] mount_root+0x57/0x5b
[    2.163080]  [<ffffffff81d0132a>] prepare_namespace+0x13d/0x176
[    2.163080]  [<ffffffff81d00e1c>] kernel_init_freeable+0x1cf/0x1da
[    2.163080]  [<ffffffff81d00610>] ? do_early_param+0x8c/0x8c
[    2.163080]  [<ffffffff81637ca0>] ? rest_init+0x80/0x80
[    2.163080]  [<ffffffff81637cae>] kernel_init+0xe/0xf0
[    2.163080]  [<ffffffff8165bd6c>] ret_from_fork+0x7c/0xb0
[    2.163080]  [<ffffffff81637ca0>] ? rest_init+0x80/0x80
[    2.163080] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[    2.163080] Do you have a strange power saving mode enabled?
[    2.163080] Dazed and confused, but trying to continue
[    2.163080] Uhhuh. NMI received for unknown reason 20 on CPU 0.
[    2.163080] Do you have a strange power saving mode enabled?
[    2.163080] Dazed and confused, but trying to continue
[    2.163080] Uhhuh. NMI received for unknown reason 30 on CPU 0.
------------------------------------------------
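
For reference, the "reason" value in these NMI messages is printed in hex. As far as I know, on x86 Linux it is the byte read from system control port B (I/O port 0x61), and only bit 7 (0x80, SERR) and bit 6 (0x40, IOCHK) are treated as known NMI sources, which would explain why 20 and 30 end up on the "unknown reason" path. A minimal sketch of that decoding follows; decode_nmi_reason is a hypothetical helper name, and the bit meanings are stated from memory rather than taken from this thread.

```shell
#!/bin/sh
# Decode the hex "reason" from an "NMI received for unknown reason XX" line.
# Assumption: bit 7 (0x80) = SERR, bit 6 (0x40) = IOCHK; other bits are not
# NMI sources, so a value with neither bit set is reported as "unknown".
decode_nmi_reason() {
    reason=$((0x$1))
    known=""
    if [ $((reason & 0x80)) -ne 0 ]; then
        known="SERR"
    fi
    if [ $((reason & 0x40)) -ne 0 ]; then
        known="${known:+$known }IOCHK"
    fi
    echo "${known:-unknown}"
}

decode_nmi_reason 30
decode_nmi_reason 20
```

Both values in the trace decode to "unknown" here, so the reason byte gives no hint about where the NMIs come from.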

I'm able to reproduce this consistently.

L1 QEMU command-line:
====================
    $ ps -ef | grep -i qemu
    qemu      4962     1 21 15:41 ?        00:00:41
/usr/bin/qemu-system-x86_64 -machine accel=kvm -name regular-guest -S
-machine pc-i440fx-1.4,accel=kvm,usb=off -cpu Haswell,+vmx -m 6144
-smp 4,sockets=4,cores=1,threads=1 -uuid
4ed9ac0b-7f72-dfcf-68b3-e6fe2ac588b2 -nographic -no-user-config
-nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/regular-guest.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-drive file=/home/test/vmimages/regular-guest.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:80:c1:34,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

L2 QEMU command-line:
====================

    $ qemu      2042     1  0 May09 ?        00:05:03
/usr/bin/qemu-system-x86_64 -machine accel=kvm -name nested-guest -S
-machine pc-i440fx-1.4,accel=kvm,usb=off -m 2048 -smp
2,sockets=2,cores=1,threads=1 -uuid
02ea8988-1054-b08b-bafe-cfbe9659976c -nographic -no-user-config
-nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/nested-guest.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-drive file=/home/test/vmimages/nested-guest.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:65:c4:e6,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5


I attached the vmxcap script output.

Before I debug further, does anyone have any hints?

Many thanks in advance.

[1] Notes --  https://github.com/kashyapc/nested-virt-notes-intel-f18

/kashyap

[-- Attachment #2: vmxcap-info.txt --]
[-- Type: text/plain, Size: 4533 bytes --]

Basic VMX Information
  Revision                                 18
  VMCS size                                1024
  VMCS restricted to 32 bit addresses      no
  Dual-monitor support                     yes
  VMCS memory type                         6
  INS/OUTS instruction information         yes
  IA32_VMX_TRUE_*_CTLS support             yes
pin-based controls
  External interrupt exiting               yes
  NMI exiting                              yes
  Virtual NMIs                             yes
  Activate VMX-preemption timer            yes
  Process posted interrupts                no
primary processor-based controls
  Interrupt window exiting                 yes
  Use TSC offsetting                       yes
  HLT exiting                              yes
  INVLPG exiting                           yes
  MWAIT exiting                            yes
  RDPMC exiting                            yes
  RDTSC exiting                            yes
  CR3-load exiting                         default
  CR3-store exiting                        default
  CR8-load exiting                         yes
  CR8-store exiting                        yes
  Use TPR shadow                           yes
  NMI-window exiting                       yes
  MOV-DR exiting                           yes
  Unconditional I/O exiting                yes
  Use I/O bitmaps                          yes
  Monitor trap flag                        yes
  Use MSR bitmaps                          yes
  MONITOR exiting                          yes
  PAUSE exiting                            yes
  Activate secondary control               yes
secondary processor-based controls
  Virtualize APIC accesses                 yes
  Enable EPT                               yes
  Descriptor-table exiting                 yes
  Enable RDTSCP                            yes
  Virtualize x2APIC mode                   yes
  Enable VPID                              yes
  WBINVD exiting                           yes
  Unrestricted guest                       yes
  APIC register emulation                  no
  Virtual interrupt delivery               no
  PAUSE-loop exiting                       yes
  RDRAND exiting                           yes
  Enable INVPCID                           yes
  Enable VM functions                      yes
  VMCS shadowing                           yes
  EPT-violation #VE                        no
VM-Exit controls
  Save debug controls                      default
  Host address-space size                  yes
  Load IA32_PERF_GLOBAL_CTRL               yes
  Acknowledge interrupt on exit            yes
  Save IA32_PAT                            yes
  Load IA32_PAT                            yes
  Save IA32_EFER                           yes
  Load IA32_EFER                           yes
  Save VMX-preemption timer value          yes
VM-Entry controls
  Load debug controls                      default
  IA-64 mode guest                         yes
  Entry to SMM                             yes
  Deactivate dual-monitor treatment        yes
  Load IA32_PERF_GLOBAL_CTRL               yes
  Load IA32_PAT                            yes
  Load IA32_EFER                           yes
Miscellaneous data
  VMX-preemption timer scale (log2)        5
  Store EFER.LMA into IA-32e mode guest control yes
  HLT activity state                       yes
  Shutdown activity state                  yes
  Wait-for-SIPI activity state             yes
  IA32_SMBASE support                      yes
  Number of CR3-target values              4
  MSR-load/store count recommenation       0
  IA32_SMM_MONITOR_CTL[2] can be set to 1  yes
  VMWRITE to VM-exit information fields    yes
  MSEG revision identifier                 0
VPID and EPT capabilities
  Execute-only EPT translations            yes
  Page-walk length 4                       yes
  Paging-structure memory type UC          yes
  Paging-structure memory type WB          yes
  2MB EPT pages                            yes
  1GB EPT pages                            yes
  INVEPT supported                         yes
  EPT accessed and dirty flags             yes
  Single-context INVEPT                    yes
  All-context INVEPT                       yes
  INVVPID supported                        yes
  Individual-address INVVPID               yes
  Single-context INVVPID                   yes
  All-context INVVPID                      yes
  Single-context-retaining-globals INVVPID yes
VM Functions
  EPTP Switching                           yes



* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
       [not found] ` <CAOaxAcZ1uyx-RrmDEiZhG2H8H5zTCK9iz1nHJKEJwUfhn=vZHA@mail.gmail.com>
@ 2013-05-10 14:41   ` Kashyap Chamarthy
  0 siblings, 0 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-10 14:41 UTC (permalink / raw)
  To: kvm

Also,

I can reproduce this consistently when I create an L2 guest:

=========
......
[  463.655031] Dazed and confused, but trying to continue
[  463.975563] Uhhuh. NMI received for unknown reason 20 on CPU 1.
[  463.976040] Do you have a strange power saving mode enabled?
[  463.976040] Dazed and confused, but trying to continue
 29  199M   29 58.7M    0     0   136k      0  0:25:02  0:07:20  0:17:42  153k
[  465.136405] Uhhuh. NMI received for unknown reason 30 on CPU 1.
[  465.137042] Do you have a strange power saving mode enabled?
[  465.137042] Dazed and confused, but trying to continue
[  466.645818] Uhhuh. NMI received for unknown reason 20 on CPU 1.
[  466.646044] Do you have a strange power saving mode enabled?
[  466.646044] Dazed and confused, but trying to continue
[  466.907999] Uhhuh. NMI received for unknown reason 30 on CPU 1.
[  466.908033] Do you have a strange power saving mode enabled?
=========

Like I mentioned in my earlier email, I don't have NMI watchdog
enabled. Am I doing something silly here?


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 13:00 [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted Kashyap Chamarthy
       [not found] ` <CAOaxAcZ1uyx-RrmDEiZhG2H8H5zTCK9iz1nHJKEJwUfhn=vZHA@mail.gmail.com>
@ 2013-05-10 15:12 ` Jan Kiszka
  2013-05-10 15:24   ` Jan Kiszka
  1 sibling, 1 reply; 30+ messages in thread
From: Jan Kiszka @ 2013-05-10 15:12 UTC (permalink / raw)
  To: Kashyap Chamarthy; +Cc: kvm

On 2013-05-10 15:00, Kashyap Chamarthy wrote:
> Heya,
> 
> This is on Intel Haswell.
> 
> First, some version info:
> 
> L0, L1 -- both of them have same versions of kernel, qemu:
> 
> =====
> $ rpm -q kernel --changelog | head -2
> * Thu May 09 2013 Josh Boyer  - 3.10.0-0.rc0.git23.1
> - Linux v3.9-11789-ge0fd9af

Please recheck with kvm.git, next branch.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 15:12 ` Jan Kiszka
@ 2013-05-10 15:24   ` Jan Kiszka
  2013-05-10 15:39     ` Kashyap Chamarthy
  0 siblings, 1 reply; 30+ messages in thread
From: Jan Kiszka @ 2013-05-10 15:24 UTC (permalink / raw)
  To: Kashyap Chamarthy; +Cc: kvm@vger.kernel.org

On 2013-05-10 17:12, Jan Kiszka wrote:
> On 2013-05-10 15:00, Kashyap Chamarthy wrote:
>> Heya,
>>
>> This is on Intel Haswell.
>>
>> First, some version info:
>>
>> L0, L1 -- both of them have same versions of kernel, qemu:
>>
>> =====
>> $ rpm -q kernel --changelog | head -2
>> * Thu May 09 2013 Josh Boyer  - 3.10.0-0.rc0.git23.1
>> - Linux v3.9-11789-ge0fd9af
> 
> Please recheck with kvm.git, next branch.

Hmm, looks like your branch already contains the patch I was thinking of
(03b28f8).

You could try if leaving shadow VMCS off makes a difference, but I bet
that is unrelated. You get that backtrace in L1, correct? I'll have to
see if I can reproduce it.
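
One way to run that test is to reload kvm_intel on L0 with shadow VMCS disabled. The sketch below only prints a modprobe.d "options" line; the helper name kvm_intel_options and the config file name in the comments are assumptions for illustration, while nested and enable_shadow_vmcs are the module parameters already listed in the report.

```shell
#!/bin/sh
# Emit a modprobe.d "options" line for kvm_intel with nested VMX enabled and
# shadow VMCS set to the given value (0 = off, 1 = on; defaults to off).
kvm_intel_options() {
    shadow=${1:-0}
    printf 'options kvm_intel nested=1 enable_shadow_vmcs=%s\n' "$shadow"
}

kvm_intel_options 0

# Assumed usage on L0 (as root, with all guests shut down first):
#   kvm_intel_options 0 > /etc/modprobe.d/kvm-intel-test.conf  # hypothetical file name
#   modprobe -r kvm_intel && modprobe kvm_intel
#   cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs    # should now print N
```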

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 15:24   ` Jan Kiszka
@ 2013-05-10 15:39     ` Kashyap Chamarthy
  2013-05-10 15:46       ` Kashyap Chamarthy
  2013-05-10 16:33       ` Jan Kiszka
  0 siblings, 2 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-10 15:39 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: kvm@vger.kernel.org

On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>
> On 2013-05-10 17:12, Jan Kiszka wrote:
> > On 2013-05-10 15:00, Kashyap Chamarthy wrote:
> >> Heya,
> >>
> >> This is on Intel Haswell.
> >>
> >> First, some version info:
> >>
> >> L0, L1 -- both of them have same versions of kernel, qemu:
> >>
> >> =====
> >> $ rpm -q kernel --changelog | head -2
> >> * Thu May 09 2013 Josh Boyer  - 3.10.0-0.rc0.git23.1
> >> - Linux v3.9-11789-ge0fd9af
> >
> > Please recheck with kvm.git, next branch.
>
> Hmm, looks like your branch already contains the patch I was thinking of
> (03b28f8).

Yes.

>
> You could try if leaving shadow VMCS off makes a difference, but I bet
> that is unrelated.

Right, I could try. But, like you said, does it *really* make a difference?

> You get that backtrace in L1, correct?

Yes.

If you have any further tracing pointers, I could do some debugging.

> I'll have to
> see if I can reproduce it.

Thanks.

If you're looking for a clear reproducer, this is how I conducted my
tests, and here's where I'm capturing all of the related work:

[1] Setup -- https://raw.github.com/kashyapc/nvmx-haswell/master/SETUP-nVMX.rst

[2] Simple scripts used to create L1 and L2 --
https://github.com/kashyapc/nvmx-haswell/tree/master/tests/scripts

[3] Libvirt XMLs I used (for reference) --
https://github.com/kashyapc/nvmx-haswell/tree/master/tests/libvirt-xmls-for-l1-l2


Thanks in advance.

/kashyap


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 15:39     ` Kashyap Chamarthy
@ 2013-05-10 15:46       ` Kashyap Chamarthy
  2013-05-10 16:33       ` Jan Kiszka
  1 sibling, 0 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-10 15:46 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: kvm@vger.kernel.org

> [3] Libvirt XMLs I used (for reference) --
> https://github.com/kashyapc/nvmx-haswell/tree/master/tests/libvirt-xmls-for-l1-l2

Oops, forgot to add, here we go --
https://github.com/kashyapc/nvmx-haswell/tree/master/tests/libvirt-xmls-for-l1-l2

/kashyap


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 15:39     ` Kashyap Chamarthy
  2013-05-10 15:46       ` Kashyap Chamarthy
@ 2013-05-10 16:33       ` Jan Kiszka
  2013-05-10 17:40         ` Nakajima, Jun
  2013-05-10 21:37         ` Kashyap Chamarthy
  1 sibling, 2 replies; 30+ messages in thread
From: Jan Kiszka @ 2013-05-10 16:33 UTC (permalink / raw)
  To: Kashyap Chamarthy; +Cc: kvm@vger.kernel.org

On 2013-05-10 17:39, Kashyap Chamarthy wrote:
> On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>
>> On 2013-05-10 17:12, Jan Kiszka wrote:
>>> On 2013-05-10 15:00, Kashyap Chamarthy wrote:
>>>> Heya,
>>>>
>>>> This is on Intel Haswell.
>>>>
>>>> First, some version info:
>>>>
>>>> L0, L1 -- both of them have same versions of kernel, qemu:
>>>>
>>>> =====
>>>> $ rpm -q kernel --changelog | head -2
>>>> * Thu May 09 2013 Josh Boyer  - 3.10.0-0.rc0.git23.1
>>>> - Linux v3.9-11789-ge0fd9af
>>>
>>> Please recheck with kvm.git, next branch.
>>
>> Hmm, looks like your branch already contains the patch I was thinking of
>> (03b28f8).
> 
> Yes.
> 
>>
>> You could try if leaving shadow VMCS off makes a difference, but I bet
>> that is unrelated.
> 
> Right, I could try. But, like you said, does it *really* make a difference?

We know after you tried. I don't have access to a Haswell box, so we
better exclude this beforehand.

> 
>> You get that backtrace in L1, correct?
> 
> Yes.
> 
> If you have any further tracing pointers, I could do some debugging.

Thanks, I may come back to you if reproduction fails here.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 16:33       ` Jan Kiszka
@ 2013-05-10 17:40         ` Nakajima, Jun
  2013-05-10 18:09           ` Jan Kiszka
  2013-05-12  8:32           ` Gleb Natapov
  2013-05-10 21:37         ` Kashyap Chamarthy
  1 sibling, 2 replies; 30+ messages in thread
From: Nakajima, Jun @ 2013-05-10 17:40 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Kashyap Chamarthy, kvm@vger.kernel.org

On Fri, May 10, 2013 at 9:33 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2013-05-10 17:39, Kashyap Chamarthy wrote:
>> On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>
>>> On 2013-05-10 17:12, Jan Kiszka wrote:
>>>> On 2013-05-10 15:00, Kashyap Chamarthy wrote:
>>>>> Heya,
>>>>>
>>>>> This is on Intel Haswell.
>>>>>
>>>>> First, some version info:
>>>>>
>>>>> L0, L1 -- both of them have same versions of kernel, qemu:
>>>>>

I tried to reproduce such a problem, and I found L2 (Linux) hangs in
SeaBIOS, after line "iPXE (http://ipxe.org) ...". It happens with or
w/o VMCS shadowing (and even without my virtual EPT patches). I didn't
realize this problem until I updated the L1 kernel to the latest (e.g.
3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible
that the L1 kernel exposed a bug with the nested virtualization, as we
saw such cases before.

--
Jun
Intel Open Source Technology Center


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 17:40         ` Nakajima, Jun
@ 2013-05-10 18:09           ` Jan Kiszka
  2013-05-10 21:37             ` Kashyap Chamarthy
  2013-05-12  8:32           ` Gleb Natapov
  1 sibling, 1 reply; 30+ messages in thread
From: Jan Kiszka @ 2013-05-10 18:09 UTC (permalink / raw)
  To: Nakajima, Jun; +Cc: Kashyap Chamarthy, kvm@vger.kernel.org

On 2013-05-10 19:40, Nakajima, Jun wrote:
> On Fri, May 10, 2013 at 9:33 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2013-05-10 17:39, Kashyap Chamarthy wrote:
>>> On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>
>>>> On 2013-05-10 17:12, Jan Kiszka wrote:
>>>>> On 2013-05-10 15:00, Kashyap Chamarthy wrote:
>>>>>> Heya,
>>>>>>
>>>>>> This is on Intel Haswell.
>>>>>>
>>>>>> First, some version info:
>>>>>>
>>>>>> L0, L1 -- both of them have same versions of kernel, qemu:
>>>>>>
> 
> I tried to reproduce such a problem, and I found L2 (Linux) hangs in
> SeaBIOS, after line "iPXE (http://ipxe.org) ...". It happens with or
> w/o VMCS shadowing (and even without my virtual EPT patches). I didn't
> realize this problem until I updated the L1 kernel to the latest (e.g.
> 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible
> that the L1 kernel exposed a bug with the nested virtualization, as we
> saw such cases before.

Hmm, no such issues here ATM although I'm on 3.9 for L1 as well.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 16:33       ` Jan Kiszka
  2013-05-10 17:40         ` Nakajima, Jun
@ 2013-05-10 21:37         ` Kashyap Chamarthy
  1 sibling, 0 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-10 21:37 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: kvm@vger.kernel.org

>
> We know after you tried. I don't have access to a Haswell box, so we
> better exclude this beforehand.

Fair enough. I'll try that too, and let you know.

>
>>
>>> You get that backtrace in L1, correct?
>>
>> Yes.
>>
>> If you have any further tracing pointers, I could do some debugging.
>
> Thanks, I may come back to you if reproduction fails here.


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 18:09           ` Jan Kiszka
@ 2013-05-10 21:37             ` Kashyap Chamarthy
  2013-05-11  6:55               ` Kashyap Chamarthy
  0 siblings, 1 reply; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-10 21:37 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Nakajima, Jun, kvm@vger.kernel.org

On Fri, May 10, 2013 at 11:39 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2013-05-10 19:40, Nakajima, Jun wrote:
>> On Fri, May 10, 2013 at 9:33 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>> On 2013-05-10 17:39, Kashyap Chamarthy wrote:
>>>> On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>
>>>>> On 2013-05-10 17:12, Jan Kiszka wrote:
>>>>>> On 2013-05-10 15:00, Kashyap Chamarthy wrote:
>>>>>>> Heya,
>>>>>>>
>>>>>>> This is on Intel Haswell.
>>>>>>>
>>>>>>> First, some version info:
>>>>>>>
>>>>>>> L0, L1 -- both of them have same versions of kernel, qemu:
>>>>>>>
>>
>> I tried to reproduce such a problem, and I found L2 (Linux) hangs in
>> SeaBIOS, after line "iPXE (http://ipxe.org) ...". It happens with or
>> w/o VMCS shadowing (and even without my virtual EPT patches). I didn't
>> realize this problem until I updated the L1 kernel to the latest (e.g.
>> 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible
>> that the L1 kernel exposed a bug with the nested virtualization, as we
>> saw such cases before.
>
> Hmm, no such issues here ATM although I'm on 3.9 for L1 as well.

Interesting. But Jan, you're not using a Haswell machine, right?

>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 21:37             ` Kashyap Chamarthy
@ 2013-05-11  6:55               ` Kashyap Chamarthy
  0 siblings, 0 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-11  6:55 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Nakajima, Jun, kvm@vger.kernel.org

Side note: While testing nVMX, I was hitting a libvirt bug, and filed this one:
-- https://bugzilla.redhat.com/show_bug.cgi?id=961665 -- [virsh]
Attempt to force destroy a guest fails due to 'unknown' reason, leaves
a defunct qemu process

which I was told is possibly a Kernel/KVM bug.

Any further insights here? Also, are others testing with libvirt
(versions mentioned in the above bug) seeing this?

Thanks.

/kashyap


* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-10 17:40         ` Nakajima, Jun
  2013-05-10 18:09           ` Jan Kiszka
@ 2013-05-12  8:32           ` Gleb Natapov
  2013-05-12 12:30             ` Kashyap Chamarthy
  1 sibling, 1 reply; 30+ messages in thread
From: Gleb Natapov @ 2013-05-12  8:32 UTC (permalink / raw)
  To: Nakajima, Jun; +Cc: Jan Kiszka, Kashyap Chamarthy, kvm@vger.kernel.org

On Fri, May 10, 2013 at 10:40:22AM -0700, Nakajima, Jun wrote:
> On Fri, May 10, 2013 at 9:33 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> > On 2013-05-10 17:39, Kashyap Chamarthy wrote:
> >> On Fri, May 10, 2013 at 8:54 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> >>>
> >>> On 2013-05-10 17:12, Jan Kiszka wrote:
> >>>> On 2013-05-10 15:00, Kashyap Chamarthy wrote:
> >>>>> Heya,
> >>>>>
> >>>>> This is on Intel Haswell.
> >>>>>
> >>>>> First, some version info:
> >>>>>
> >>>>> L0, L1 -- both of them have same versions of kernel, qemu:
> >>>>>
> 
> I tried to reproduce such a problem, and I found L2 (Linux) hangs in
> SeaBIOS, after line "iPXE (http://ipxe.org) ...". It happens with or
> w/o VMCS shadowing (and even without my virtual EPT patches). I didn't
> realize this problem until I updated the L1 kernel to the latest (e.g.
> 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible
> that the L1 kernel exposed a bug with the nested virtualization, as we
> saw such cases before.
> 
This is probably fixed by 8d76c49e9ffeee839bc0b7a3278a23f99101263e. Try
it please.

--
			Gleb.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12  8:32           ` Gleb Natapov
@ 2013-05-12 12:30             ` Kashyap Chamarthy
  2013-05-12 12:38               ` Gleb Natapov
  0 siblings, 1 reply; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-12 12:30 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Nakajima, Jun, Jan Kiszka, kvm@vger.kernel.org

>>
>> I tried to reproduce such a problem, and I found L2 (Linux) hangs in
>> SeaBIOS, after line "iPXE (http://ipxe.org) ...". It happens with or
>> w/o VMCS shadowing (and even without my virtual EPT patches). I didn't
>> realize this problem until I updated the L1 kernel to the latest (e.g.
>> 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible
>> that the L1 kernel exposed a bug with the nested virtualization, as we
>> saw such cases before.
>>
> This is probably fixed by 8d76c49e9ffeee839bc0b7a3278a23f99101263e. Try
> it please.

I don't see the above SeaBIOS hang, however I'm able to consistently
reproduce this stack trace when booting L1 guest:

============
....
[    2.516894] VFS: Cannot open root device "mapper/fedora-root" or
unknown-block(0,0): error -6
[    2.527636] Please append a correct "root=" boot option; here are
the available partitions:
[    2.538792] Kernel panic - not syncing: VFS: Unable to mount root
fs on unknown-block(0,0)
[    2.539716] Pid: 1, comm: swapper/0 Not tainted 3.8.11-200.fc18.x86_64 #1
[    2.539716] Call Trace:
[    2.539716]  [<ffffffff81649c19>] panic+0xc1/0x1d0
[    2.539716]  [<ffffffff81d010e0>] mount_block_root+0x1fa/0x2ac
[    2.539716]  [<ffffffff81d011e9>] mount_root+0x57/0x5b
[    2.539716]  [<ffffffff81d0132a>] prepare_namespace+0x13d/0x176
[    2.539716]  [<ffffffff81d00e1c>] kernel_init_freeable+0x1cf/0x1da
[    2.539716]  [<ffffffff81d00610>] ? do_early_param+0x8c/0x8c
[    2.539716]  [<ffffffff81637ca0>] ? rest_init+0x80/0x80
[    2.539716]  [<ffffffff81637cae>] kernel_init+0xe/0xf0
[    2.539716]  [<ffffffff8165bd6c>] ret_from_fork+0x7c/0xb0
[    2.539716]  [<ffffffff81637ca0>] ? rest_init+0x80/0x80
[    2.539716] Uhhuh. NMI received for unknown reason 30 on CPU 1.
[    2.539716] Do you have a strange power saving mode enabled?
[    2.539716] Dazed and confused, but trying to continue
[    2.539716] Uhhuh. NMI received for unknown reason 20 on CPU 1.
============

However, L1 boots just fine.

When I try to boot L2, it throws this different stack trace.
============
[176092.303585]   lock(&dev->device_lock);
[176092.307947]
[176092.307947]  *** DEADLOCK ***
[176092.307947]
[176092.314943] 2 locks held by systemd/1:
[176092.319283]  #0:  (misc_mtx){+.+.+.}, at: [<ffffffff814534b8>]
misc_open+0x28/0x1d0
[176092.328104]  #1:  (&wdd->lock){+.+...}, at: [<ffffffff81557f22>]
watchdog_start+0x22/0x80
[176092.337532]
[176092.337532] stack backtrace:
[176092.342661] CPU: 1 PID: 1 Comm: systemd Not tainted
3.10.0-0.rc0.git23.1.fc20.x86_64 #1
[176092.351823] Hardware name: Intel Corporation Shark Bay Client
platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055
01/28/2013
[176092.366101]  ffffffff8257d070 ffff880241b1b9c0 ffffffff81719128
ffff880241b1ba00
[176092.374617]  ffffffff81714d75 ffff880241b1ba50 ffff880241b80960
ffff880241b80000
[176092.383130]  0000000000000002 0000000000000002 ffff880241b80960
ffff880241b1bac0
[176092.391647] Call Trace:
[176092.394514]  [<ffffffff81719128>] dump_stack+0x19/0x1b
2m  OK  ] Re[176092.400430]  [<ffffffff81714d75>] print_circular_bug+0x201/0x210
[176092.408898]  [<ffffffff810db094>] __lock_acquire+0x17c4/0x1b30
ached target Shu[176092.415602]  [<ffffffff81720d7c>] ?
_raw_spin_unlock_irq+0x2c/0x50
[176092.424276]  [<ffffffff810dbbf2>] lock_acquire+0xa2/0x1f0
tdown.
[176092.430489]  [<ffffffff8149028d>] ? mei_wd_ops_start+0x2d/0xf0
[176092.438070]  [<ffffffff8171d590>] mutex_lock_nested+0x80/0x400
[176092.444772]  [<ffffffff8149028d>] ? mei_wd_ops_start+0x2d/0xf0
[176092.451471]  [<ffffffff8149028d>] ? mei_wd_ops_start+0x2d/0xf0
[176092.458172]  [<ffffffff81557f22>] ? watchdog_start+0x22/0x80
[176092.464678]  [<ffffffff81557f22>] ? watchdog_start+0x22/0x80
[176092.471182]  [<ffffffff8149028d>] mei_wd_ops_start+0x2d/0xf0
[176092.477687]  [<ffffffff81557f5d>] watchdog_start+0x5d/0x80
[176092.483994]  [<ffffffff81558168>] watchdog_open+0x88/0xf0
[176092.490214]  [<ffffffff81453547>] misc_open+0xb7/0x1d0
[176092.496128]  [<ffffffff811e15d2>] chrdev_open+0x92/0x1d0
[176092.502240]  [<ffffffff811da57b>] do_dentry_open+0x24b/0x300
[176092.508745]  [<ffffffff812e8e7c>] ? security_inode_permission+0x1c/0x30
[176092.516330]  [<ffffffff811e1540>] ? cdev_put+0x30/0x30
[176092.522243]  [<ffffffff811da670>] finish_open+0x40/0x50
[176092.528256]  [<ffffffff811ec139>] do_last+0x4d9/0xe40
[176092.534071]  [<ffffffff811ecb53>] path_openat+0xb3/0x530
[176092.540193]  [<ffffffff810acc1f>] ? local_clock+0x5f/0x70
[176092.546403]  [<ffffffff8101fcf5>] ? native_sched_clock+0x15/0x80
[176092.553301]  [<ffffffff810d5d9d>] ? trace_hardirqs_off+0xd/0x10
[176092.560099]  [<ffffffff811ed658>] do_filp_open+0x38/0x80
[176092.566211]  [<ffffffff81720c77>] ? _raw_spin_unlock+0x27/0x40
[176092.572913]  [<ffffffff811fc39f>] ? __alloc_fd+0xaf/0x200
[176092.579123]  [<ffffffff811db9a9>] do_sys_open+0xe9/0x1c0
[176092.585235]  [<ffffffff811dba9e>] SyS_open+0x1e/0x20
[176092.590953]  [<ffffffff8172a999>] system_call_fastpath+0x16/0x1b
Sending SIGTERM to remaining processes...
[176092.622745] systemd-journald[338]: Received SIGTERM
Sending SIGKILL to remaining processes...
Hardware watchdog 'INTCAMT', version 0
Unmounting file systems.
Unmounting /sys/kernel/config.
Unmounting /dev/mqueue.
Unmounting /dev/hugepages.
Unmounting /sys/kernel/debug.
[176094.363845] EXT4-fs (dm-1): re-mounted. Opts: (null)
[176094.548631] EXT4-fs (dm-1): re-mounted. Opts: (null)
[176094.554450] EXT4-fs (dm-1): re-mounted. Opts: (null)
All filesystems unmounted.
Deactivating swaps.
All swaps deactivated.
Detaching loop devices.
All loop devices detached.
Detaching DM devices.
Detaching DM 253:2.
Detaching DM 253:0.
Not all DM devices detached, 1 left.
Detaching DM devices.
Not all DM devices detached, 1 left.
Cannot finalize remaining file systems and devices, giving up.
Storage is finalized.
Successfully changed into root pivot.
Returning to initrd...
[176094.675812] dracut Warning: Killing all remaining processes
============


L1 Kernel:  3.10.0-0.rc0.git26.1.fc20.x86_64

L2 Kernel: 3.10.0-0.rc0.git26.1.fc20.x86_64

I noted how I reproduced this in my previous emails to this thread.

Am I doing anything plainly incorrect?

Thanks in advance.

/kashyap

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 12:30             ` Kashyap Chamarthy
@ 2013-05-12 12:38               ` Gleb Natapov
  2013-05-12 12:42                 ` Kashyap Chamarthy
  0 siblings, 1 reply; 30+ messages in thread
From: Gleb Natapov @ 2013-05-12 12:38 UTC (permalink / raw)
  To: Kashyap Chamarthy; +Cc: Nakajima, Jun, Jan Kiszka, kvm@vger.kernel.org

On Sun, May 12, 2013 at 06:00:38PM +0530, Kashyap Chamarthy wrote:
> >>
> >> I tried to reproduce such a problem, and I found L2 (Linux) hangs in
> >> SeaBIOS, after line "iPXE (http://ipxe.org) ...". It happens with or
> >> w/o VMCS shadowing (and even without my virtual EPT patches). I didn't
> >> realize this problem until I updated the L1 kernel to the latest (e.g.
> >> 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible
> >> that the L1 kernel exposed a bug with the nested virtualization, as we
> >> saw such cases before.
> >>
> > This is probably fixed by 8d76c49e9ffeee839bc0b7a3278a23f99101263e. Try
> > it please.
> 
> I don't see the above SeaBIOS hang, however I'm able to consistently
> reproduce this stack trace when booting L1 guest:
> 
You mean L2 here?

L2 guest cannot find root file system. Unlikely related to KVM.

> ============
> ....
> [    2.516894] VFS: Cannot open root device "mapper/fedora-root" or
> unknown-block(0,0): error -6
> [    2.527636] Please append a correct "root=" boot option; here are
> the available partitions:
> [    2.538792] Kernel panic - not syncing: VFS: Unable to mount root
> fs on unknown-block(0,0)
> [    2.539716] Pid: 1, comm: swapper/0 Not tainted 3.8.11-200.fc18.x86_64 #1
> [    2.539716] Call Trace:
> [    2.539716]  [<ffffffff81649c19>] panic+0xc1/0x1d0
> [    2.539716]  [<ffffffff81d010e0>] mount_block_root+0x1fa/0x2ac
> [    2.539716]  [<ffffffff81d011e9>] mount_root+0x57/0x5b
> [    2.539716]  [<ffffffff81d0132a>] prepare_namespace+0x13d/0x176
> [    2.539716]  [<ffffffff81d00e1c>] kernel_init_freeable+0x1cf/0x1da
> [    2.539716]  [<ffffffff81d00610>] ? do_early_param+0x8c/0x8c
> [    2.539716]  [<ffffffff81637ca0>] ? rest_init+0x80/0x80
> [    2.539716]  [<ffffffff81637cae>] kernel_init+0xe/0xf0
> [    2.539716]  [<ffffffff8165bd6c>] ret_from_fork+0x7c/0xb0
> [    2.539716]  [<ffffffff81637ca0>] ? rest_init+0x80/0x80
> [    2.539716] Uhhuh. NMI received for unknown reason 30 on CPU 1.
> [    2.539716] Do you have a strange power saving mode enabled?
> [    2.539716] Dazed and confused, but trying to continue
> [    2.539716] Uhhuh. NMI received for unknown reason 20 on CPU 1.
> ============
> 
> However, L1 boots just fine.
> 
> When I try to boot L2, it throws this different stack trace.
Who is "it"? The stack trace below is from L0, judging by the hardware name.
Again not KVM related.

> ============
> [176092.303585]   lock(&dev->device_lock);
> [176092.307947]
> [176092.307947]  *** DEADLOCK ***
> [176092.307947]
> [176092.314943] 2 locks held by systemd/1:
> [176092.319283]  #0:  (misc_mtx){+.+.+.}, at: [<ffffffff814534b8>]
> misc_open+0x28/0x1d0
> [176092.328104]  #1:  (&wdd->lock){+.+...}, at: [<ffffffff81557f22>]
> watchdog_start+0x22/0x80
> [176092.337532]
> [176092.337532] stack backtrace:
> [176092.342661] CPU: 1 PID: 1 Comm: systemd Not tainted
> 3.10.0-0.rc0.git23.1.fc20.x86_64 #1
> [176092.351823] Hardware name: Intel Corporation Shark Bay Client
> platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055
> 01/28/2013
> [176092.366101]  ffffffff8257d070 ffff880241b1b9c0 ffffffff81719128
> ffff880241b1ba00
> [176092.374617]  ffffffff81714d75 ffff880241b1ba50 ffff880241b80960
> ffff880241b80000
> [176092.383130]  0000000000000002 0000000000000002 ffff880241b80960
> ffff880241b1bac0
> [176092.391647] Call Trace:
> [176092.394514]  [<ffffffff81719128>] dump_stack+0x19/0x1b
> 2m  OK  ] Re[176092.400430]  [<ffffffff81714d75>] print_circular_bug+0x201/0x210
> [176092.408898]  [<ffffffff810db094>] __lock_acquire+0x17c4/0x1b30
> ached target Shu[176092.415602]  [<ffffffff81720d7c>] ?
> _raw_spin_unlock_irq+0x2c/0x50
> [176092.424276]  [<ffffffff810dbbf2>] lock_acquire+0xa2/0x1f0
> tdown.
> [176092.430489]  [<ffffffff8149028d>] ? mei_wd_ops_start+0x2d/0xf0
> [176092.438070]  [<ffffffff8171d590>] mutex_lock_nested+0x80/0x400
> [176092.444772]  [<ffffffff8149028d>] ? mei_wd_ops_start+0x2d/0xf0
> [176092.451471]  [<ffffffff8149028d>] ? mei_wd_ops_start+0x2d/0xf0
> [176092.458172]  [<ffffffff81557f22>] ? watchdog_start+0x22/0x80
> [176092.464678]  [<ffffffff81557f22>] ? watchdog_start+0x22/0x80
> [176092.471182]  [<ffffffff8149028d>] mei_wd_ops_start+0x2d/0xf0
> [176092.477687]  [<ffffffff81557f5d>] watchdog_start+0x5d/0x80
> [176092.483994]  [<ffffffff81558168>] watchdog_open+0x88/0xf0
> [176092.490214]  [<ffffffff81453547>] misc_open+0xb7/0x1d0
> [176092.496128]  [<ffffffff811e15d2>] chrdev_open+0x92/0x1d0
> [176092.502240]  [<ffffffff811da57b>] do_dentry_open+0x24b/0x300
> [176092.508745]  [<ffffffff812e8e7c>] ? security_inode_permission+0x1c/0x30
> [176092.516330]  [<ffffffff811e1540>] ? cdev_put+0x30/0x30
> [176092.522243]  [<ffffffff811da670>] finish_open+0x40/0x50
> [176092.528256]  [<ffffffff811ec139>] do_last+0x4d9/0xe40
> [176092.534071]  [<ffffffff811ecb53>] path_openat+0xb3/0x530
> [176092.540193]  [<ffffffff810acc1f>] ? local_clock+0x5f/0x70
> [176092.546403]  [<ffffffff8101fcf5>] ? native_sched_clock+0x15/0x80
> [176092.553301]  [<ffffffff810d5d9d>] ? trace_hardirqs_off+0xd/0x10
> [176092.560099]  [<ffffffff811ed658>] do_filp_open+0x38/0x80
> [176092.566211]  [<ffffffff81720c77>] ? _raw_spin_unlock+0x27/0x40
> [176092.572913]  [<ffffffff811fc39f>] ? __alloc_fd+0xaf/0x200
> [176092.579123]  [<ffffffff811db9a9>] do_sys_open+0xe9/0x1c0
> [176092.585235]  [<ffffffff811dba9e>] SyS_open+0x1e/0x20
> [176092.590953]  [<ffffffff8172a999>] system_call_fastpath+0x16/0x1b
> Sending SIGTERM to remaining processes...
> [176092.622745] systemd-journald[338]: Received SIGTERM
> Sending SIGKILL to remaining processes...
> Hardware watchdog 'INTCAMT', version 0
> Unmounting file systems.
> Unmounting /sys/kernel/config.
> Unmounting /dev/mqueue.
> Unmounting /dev/hugepages.
> Unmounting /sys/kernel/debug.
> [176094.363845] EXT4-fs (dm-1): re-mounted. Opts: (null)
> [176094.548631] EXT4-fs (dm-1): re-mounted. Opts: (null)
> [176094.554450] EXT4-fs (dm-1): re-mounted. Opts: (null)
> All filesystems unmounted.
> Deactivating swaps.
> All swaps deactivated.
> Detaching loop devices.
> All loop devices detached.
> Detaching DM devices.
> Detaching DM 253:2.
> Detaching DM 253:0.
> Not all DM devices detached, 1 left.
> Detaching DM devices.
> Not all DM devices detached, 1 left.
> Cannot finalize remaining file systems and devices, giving up.
> Storage is finalized.
> Successfully changed into root pivot.
> Returning to initrd...
> [176094.675812] dracut Warning: Killing all remaining processes
> ============
> 
> 
> L1 Kernel:  3.10.0-0.rc0.git26.1.fc20.x86_64
> 
> L2 Kernel: 3.10.0-0.rc0.git26.1.fc20.x86_64
> 
> How I re-produced this, I noted it in my previous emails to this thread.
> 
> Am I doing anything plain incorrect ?
> 

--
			Gleb.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 12:38               ` Gleb Natapov
@ 2013-05-12 12:42                 ` Kashyap Chamarthy
  2013-05-12 12:59                   ` Abel Gordon
  0 siblings, 1 reply; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-12 12:42 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Nakajima, Jun, Jan Kiszka, kvm@vger.kernel.org

On Sun, May 12, 2013 at 6:08 PM, Gleb Natapov <gleb@redhat.com> wrote:
> On Sun, May 12, 2013 at 06:00:38PM +0530, Kashyap Chamarthy wrote:
>> >>
>> >> I tried to reproduce such a problem, and I found L2 (Linux) hangs in
>> >> SeaBIOS, after line "iPXE (http://ipxe.org) ...". It happens with or
>> >> w/o VMCS shadowing (and even without my virtual EPT patches). I didn't
>> >> realize this problem until I updated the L1 kernel to the latest (e.g.
>> >> 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible
>> >> that the L1 kernel exposed a bug with the nested virtualization, as we
>> >> saw such cases before.
>> >>
>> > This is probably fixed by 8d76c49e9ffeee839bc0b7a3278a23f99101263e. Try
>> > it please.
>>
>> I don't see the above SeaBIOS hang, however I'm able to consistently
>> reproduce this stack trace when booting L1 guest:
>>
> You mean L2 here?

Yes. (Sorry about that.)

>
> L2 guest cannot find root file system. Unlikely related to KVM.

Yeah, fair enough.

>>
>> However, L1 boots just fine.
>>
>> When I try to boot L2, it throws this different stack trace.
> Who is "it"? The stack trace below is from L0, judging by the hardware name.
> Again not KVM related.

Again, sorry :(.  I was just about to reply that this was the physical host.

I'm testing by disabling VMCS Shadowing per Jan Kiszka's suggestion,
and retrying. But I doubt that's the reason my L2 is seg-faulting. If
it still fails, I'll try to create a new L2 to see if I can reproduce
it more consistently.

Thanks for your response.

/kashyap

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 12:42                 ` Kashyap Chamarthy
@ 2013-05-12 12:59                   ` Abel Gordon
  2013-05-12 13:06                     ` Kashyap Chamarthy
  0 siblings, 1 reply; 30+ messages in thread
From: Abel Gordon @ 2013-05-12 12:59 UTC (permalink / raw)
  To: Kashyap Chamarthy
  Cc: Gleb Natapov, Jan Kiszka, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner

Kashyap Chamarthy <kashyap.cv@gmail.com> wrote on 12/05/2013 03:42:33 PM:

> Again, sorry :(.  I was just about to reply that this was physical host.
>
> I'm testing by disabling VMCS Shadowing per Jan Kiszka's suggestion,
> and retrying. But I doubt that's the reason my L2 is seg-faulting. If
> it still fails, I'll try to create a new L2 to see I can reproduce
> more consistently.

I doubt shadow-vmcs is related to this issue.
Note shadow vmcs is disabled unless you have a processor
that supports this feature. Do you ?! Also note you can disable
shadow-vmcs using the kvm-intel kernel module parameter
"enable_shadow_vmcs".

Anyway, if you conclude this is related to shadow-vmcs let me know and
I'll try to help.

Regards,
Abel.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 12:59                   ` Abel Gordon
@ 2013-05-12 13:06                     ` Kashyap Chamarthy
  2013-05-12 13:31                       ` Kashyap Chamarthy
  2013-05-12 13:48                       ` Abel Gordon
  0 siblings, 2 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-12 13:06 UTC (permalink / raw)
  To: Abel Gordon
  Cc: Gleb Natapov, Jan Kiszka, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner

On Sun, May 12, 2013 at 6:29 PM, Abel Gordon <ABELG@il.ibm.com> wrote:
>
>
> Kashyap Chamarthy <kashyap.cv@gmail.com> wrote on 12/05/2013 03:42:33 PM:
>
>> Again, sorry :(.  I was just about to reply that this was physical host.
>>
>> I'm testing by disabling VMCS Shadowing per Jan Kiszka's suggestion,
>> and retrying. But I doubt that's the reason my L2 is seg-faulting. If
>> it still fails, I'll try to create a new L2 to see I can reproduce
>> more consistently.
>
> I doubt shadow-vmcs is related to this issue.

Indeed. I just re-tested w/o it, and it has no effect.

I'm trying a guest w/ newer kernel in L2.

> Note shadow vmcs is disabled unless you have a processor
> that supports this feature. Do you ?!

Yes, I noted this in my previous email. I'm using Intel Haswell.

Here's the info from the MSR bits on the machine (from `Table 35-3`, MSRs
in Processors Based on Intel Core Microarchitecture, Volume 3C of the
SDM):
--------------------------------------------
    # Read msr value
    $ rdmsr 0x48B
    7cff00000000

    # Check Shadow VMCS is enabled:
    $ rdmsr 0x00000485
    300481e5
--------------------------------------------
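
For anyone who wants to decode the first value rather than eyeball it, here is
a minimal sketch. It assumes the SDM layout in which MSR 0x48B
(IA32_VMX_PROCBASED_CTLS2) reports the "allowed-1" settings of the secondary
processor-based controls in its upper 32 bits, and that VMCS shadowing is
control bit 14. The function names are made up for illustration.

```python
# Assumption: IA32_VMX_PROCBASED_CTLS2 (MSR 0x48B) keeps the "allowed-1"
# settings in bits 63:32, and VMCS shadowing is control bit 14.
VMCS_SHADOWING_BIT = 14


def allowed1(msr_value: int, control_bit: int) -> bool:
    """Return True if the MSR says the given control may be set to 1."""
    return bool((msr_value >> (32 + control_bit)) & 1)


def supports_vmcs_shadowing(msr_value: int) -> bool:
    """Check the VMCS shadowing capability bit in a raw 0x48B value."""
    return allowed1(msr_value, VMCS_SHADOWING_BIT)


if __name__ == "__main__":
    # Value printed by `rdmsr 0x48B` above.
    print(supports_vmcs_shadowing(0x7CFF00000000))
```

This only tells you what the CPU is capable of; whether KVM actually uses the
feature is a separate question, answered by the module parameter.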

And, the kvm_intel module parameters:
--------------------------------------------
    # nested
    $ cat /sys/module/kvm_intel/parameters/nested
    Y

    # shadow VMCS
    $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
    Y
--------------------------------------------
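
The same check can be scripted when comparing L0 and L1. A rough sketch,
assuming the parameters live under /sys/module/kvm_intel/parameters as in the
commands above; `read_params` is a hypothetical helper, not part of any
existing tool.

```python
from pathlib import Path

# sysfs directory used by the `cat` commands above.
PARAM_DIR = Path("/sys/module/kvm_intel/parameters")


def read_params(names, param_dir=PARAM_DIR):
    """Map each parameter name to its sysfs value, or None if unreadable."""
    result = {}
    for name in names:
        try:
            result[name] = (Path(param_dir) / name).read_text().strip()
        except OSError:
            # Module not loaded, or this kernel does not have the parameter.
            result[name] = None
    return result


if __name__ == "__main__":
    wanted = ["nested", "enable_shadow_vmcs", "enable_apicv", "ept"]
    for name, value in read_params(wanted).items():
        print(f"{name}: {value}")
```

Run it on both L0 and L1 and diff the output; a None usually just means the
module is not loaded or the kernel predates that parameter.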


Just for reference, here's the detailed procedure I noted  while
testing it on Haswell --
https://raw.github.com/kashyapc/nvmx-haswell/master/SETUP-nVMX.rst

> Also note you can disable
> shadow-vmcs using the kvm-intel kernel module parameter
> "enable_shadow_vmcs".

Yes, to test w/o shadow VMCS, I disabled it by adding "options
kvm-intel enable_shadow_vmcs=y" to /etc/modprobe.d/dist.conf & reboot
the host.

>
> Anyway, if you conclude this is related to shadow-vmcs let me know and
> I'll try to help.

So, from the above info shadow-vmcs is ruled-out. I'm trying to
investigate further, will post details if I have new findings.

Thank you for your help.

/kashyap

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 13:06                     ` Kashyap Chamarthy
@ 2013-05-12 13:31                       ` Kashyap Chamarthy
  2013-05-12 13:33                         ` Kashyap Chamarthy
  2013-05-12 15:52                         ` Gleb Natapov
  2013-05-12 13:48                       ` Abel Gordon
  1 sibling, 2 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-12 13:31 UTC (permalink / raw)
  To: Abel Gordon
  Cc: Gleb Natapov, Jan Kiszka, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner

> So, from the above info shadow-vmcs is ruled-out. I'm trying to
> investigate further, will post details if I have new findings.

Update:
---------

I just tried to create an L2 w/ the Fedora-19 TC4 compose of 11MAY2013, and I
continuously see the fragment below (F18/F19, whatever the L2 guest
is):

--------------------
....
[  217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[  217.938034] Do you have a strange power saving mode enabled?
.[  222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
[  222.524073] Do you have a strange power saving mode enabled?
[  222.524073] Dazed and confused, but trying to continue
[  243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
.....
--------------------
At the moment, L2 guest creation is stuck at the above message.

However, I have neither HPET nor NMI Watchdog enabled on L0/L1. I checked it by:

$ cat /etc/grub2.cfg  | egrep -i 'hpet|nmi'

I wonder if I'm missing something trivial, or maybe this is some kind
of bug that needs deeper investigation. But Jan Kiszka reported
he isn't seeing any problems (but he isn't using "Haswell").

Thanks.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 13:31                       ` Kashyap Chamarthy
@ 2013-05-12 13:33                         ` Kashyap Chamarthy
  2013-05-12 15:52                         ` Gleb Natapov
  1 sibling, 0 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-12 13:33 UTC (permalink / raw)
  To: Abel Gordon
  Cc: Gleb Natapov, Jan Kiszka, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner

> --------------------
> ....
> [  217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
> [  217.938034] Do you have a strange power saving mode enabled?
> .[  222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
> [  222.524073] Do you have a strange power saving mode enabled?
> [  222.524073] Dazed and confused, but trying to continue
> [  243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
> .....
> --------------------
> At the moment, L2 guest creation stuck at the above message
>

Well, not entirely stuck, it's moving, but w/ intermittent spitting of
the above message.
========
.
.
.
Installing nss-softokn-freebl (10/236)
[  716.751098] Uhhuh. NMI received for unknown reason 30 on CPU 1.
[  716.751098] Do you have a strange power saving mode enabled?
Installing glibc-common (11/236)d, but trying to continue

[  735.785034] Uhhuh. NMI received for unknown reason 20 on CPU 0.
[  735.785034] Do you have a strange power saving mode enabled?
[  735.785034] Dazed and confused, but trying to continue
[  736.502032] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[  736.502032] Do you have a strange power saving mode enabled?
[  736.502032] Dazed and confused, but trying to continue
[  737.204936] Uhhuh. NMI received for unknown reason 20 on CPU 1.
[  737.205051] Do you have a strange power saving mode enabled?
Installing glibc (12/236)confused, but trying to continue
.
.
.
========
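
A note on the numbers in that message, as a hedged sketch: as far as I can
tell from the x86 NMI code, the "reason" is the byte read from I/O port 0x61,
printed in hex, and only the top two bits (SERR and IOCHK) name an actual NMI
source. The bit names in the table below are my reading of that port layout,
so treat them as an assumption.

```python
# Assumption: "reason" is the hex byte read from I/O port 0x61, with the
# bit meanings below. Only SERR and IOCHK identify a real NMI source.
NMI_REASON_BITS = {
    0x80: "SERR (PCI system error)",
    0x40: "IOCHK (I/O check error)",
    0x20: "timer 2 output",
    0x10: "refresh toggle",
}


def decode_nmi_reason(reason_hex: str):
    """Return the names of the bits set in a reason string such as '30'."""
    reason = int(reason_hex, 16)
    return [name for bit, name in NMI_REASON_BITS.items() if reason & bit]


def is_known_cause(reason_hex: str) -> bool:
    """True only if SERR or IOCHK is set; otherwise the kernel says 'unknown'."""
    return bool(int(reason_hex, 16) & 0xC0)


if __name__ == "__main__":
    for reason in ("30", "20"):
        print(reason, decode_nmi_reason(reason), is_known_cause(reason))
```

Under that reading, both 30 and 20 carry only timer/refresh status bits, which
would fit an NMI that arrived without any hardware error source behind it.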

/kashyap

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 13:06                     ` Kashyap Chamarthy
  2013-05-12 13:31                       ` Kashyap Chamarthy
@ 2013-05-12 13:48                       ` Abel Gordon
  2013-05-12 15:35                         ` Gleb Natapov
  2013-05-12 16:29                         ` Kashyap Chamarthy
  1 sibling, 2 replies; 30+ messages in thread
From: Abel Gordon @ 2013-05-12 13:48 UTC (permalink / raw)
  To: Kashyap Chamarthy
  Cc: Gleb Natapov, Jan Kiszka, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner



Kashyap Chamarthy <kashyap.cv@gmail.com> wrote on 12/05/2013 04:06:40 PM:


> > Note shadow vmcs is disabled unless you have a processor
> > that supports this feature. Do you ?!
>
> Yes, I noted this in my previous email. I'm using Intel Haswell.
>
> Here's the info from MSR bits on the machine(From `Table 35-3`, MSRs
> in Processors Based on Intel Core Microarchitecture, Volume 3C of the
> SDM )
> --------------------------------------------
>     # Read msr value
>     $ rdmsr 0x48B
>     7cff00000000
>
>     # Check Shadow VMCS is enabled:
>     $ rdmsr 0x00000485
>     300481e5
> --------------------------------------------
>
> And, on the Kernel command line:
> --------------------------------------------
>     # nested
>     $ cat /sys/module/kvm_intel/parameters/nested
>     Y
>
>     # shadow VMCS
>     $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
>     Y
> --------------------------------------------

Yep, shadow-vmcs enabled :)


> Just for reference, here's the detailed procedure I noted  while
> testing it on Haswell --
> https://raw.github.com/kashyapc/nvmx-haswell/master/SETUP-nVMX.rst
>
> > Also note you can disable
> > shadow-vmcs using the kvm-intel kernel module parameter
> > "enable_shadow_vmcs".
>
> Yes, to test w/o shadow VMCS, I disabled it by adding "options
> kvm-intel enable_shadow_vmcs=y" to /etc/modprobe.d/dist.conf & reboot
> the host.

I assume you meant enable_shadow_vmcs=n :)

Small question: did you try to disable apicv/posted interrupts at L0 ?
(for L1 you can't enable these features because they are not emulated)





^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 13:48                       ` Abel Gordon
@ 2013-05-12 15:35                         ` Gleb Natapov
  2013-05-12 16:29                         ` Kashyap Chamarthy
  1 sibling, 0 replies; 30+ messages in thread
From: Gleb Natapov @ 2013-05-12 15:35 UTC (permalink / raw)
  To: Abel Gordon
  Cc: Kashyap Chamarthy, Jan Kiszka, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner

On Sun, May 12, 2013 at 04:48:28PM +0300, Abel Gordon wrote:
> 
> 
> Kashyap Chamarthy <kashyap.cv@gmail.com> wrote on 12/05/2013 04:06:40 PM:
> 
> 
> > > Note shadow vmcs is disabled unless you have a processor
> > > that supports this feature. Do you ?!
> >
> > Yes, I noted this in my previous email. I'm using Intel Haswell.
> >
> > Here's the info from MSR bits on the machine(From `Table 35-3`, MSRs
> > in Processors Based on Intel Core Microarchitecture, Volume 3C of the
> > SDM )
> > --------------------------------------------
> >     # Read msr value
> >     $ rdmsr 0x48B
> >     7cff00000000
> >
> >     # Check Shadow VMCS is enabled:
> >     $ rdmsr 0x00000485
> >     300481e5
> > --------------------------------------------
> >
> > And, on the Kernel command line:
> > --------------------------------------------
> >     # nested
> >     $ cat /sys/module/kvm_intel/parameters/nested
> >     Y
> >
> >     # shadow VMCS
> >     $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
> >     Y
> > --------------------------------------------
> 
> Yep, shadow-vmcs enabled :)
> 
> 
> > Just for reference, here's the detailed procedure I noted  while
> > testing it on Haswell --
> > https://raw.github.com/kashyapc/nvmx-haswell/master/SETUP-nVMX.rst
> >
> > > Also note you can disable
> > > shadow-vmcs using the kvm-intel kernel module parameter
> > > "enable_shadow_vmcs".
> >
> > Yes, to test w/o shadow VMCS, I disabled it by adding "options
> > kvm-intel enable_shadow_vmcs=y" to /etc/modprobe.d/dist.conf & reboot
> > the host.
> 
> I assume you meant enable_shadow_vmcs=n :)
> 
> Small question: did you try to disable apicv/posted interrupts at L0 ?
> (for L1 you can't enable these features because they are not emulated)
> 
AFAIK Haswell does not have apicv/posted interrupts. Not the one I have
access to anyway.

--
			Gleb.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 13:31                       ` Kashyap Chamarthy
  2013-05-12 13:33                         ` Kashyap Chamarthy
@ 2013-05-12 15:52                         ` Gleb Natapov
  2013-05-12 16:52                           ` Kashyap Chamarthy
  1 sibling, 1 reply; 30+ messages in thread
From: Gleb Natapov @ 2013-05-12 15:52 UTC (permalink / raw)
  To: Kashyap Chamarthy
  Cc: Abel Gordon, Jan Kiszka, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner

On Sun, May 12, 2013 at 07:01:43PM +0530, Kashyap Chamarthy wrote:
> > So, from the above info shadow-vmcs is ruled-out. I'm trying to
> > investigate further, will post details if I have new findings.
> 
> Update:
> ---------
> 
> I just tried to create L2 w/  Fedora-19 TC4 compose of 11MAY2013, I
> continuously see the below fragment (F18/F19, whatever the L2 guest
> is):
> 
> --------------------
> ....
> [  217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
> [  217.938034] Do you have a strange power saving mode enabled?
> .[  222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
> [  222.524073] Do you have a strange power saving mode enabled?
> [  222.524073] Dazed and confused, but trying to continue
> [  243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
> .....
> --------------------
> At the moment, L2 guest creation stuck at the above message
> 
Are those in L2 dmesg or L1?

> However I have neither HPET or NMI Watchdog enabled on L0/L1. I checked it by:
> 
> $ cat /etc/grub2.cfg  | egrep -i 'hpet|nmi'
> 
IIRC watchdog is enabled by default.

> I wonder if I'm missing something trivial, or maybe this is some kind
> of bug that needs more deeper investigation. But Jan Kiszka reported
> he isn't seeing any probs (but he isn't using "Haswell" ).
> 
> Thanks.

--
			Gleb.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 13:48                       ` Abel Gordon
  2013-05-12 15:35                         ` Gleb Natapov
@ 2013-05-12 16:29                         ` Kashyap Chamarthy
  1 sibling, 0 replies; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-12 16:29 UTC (permalink / raw)
  To: Abel Gordon
  Cc: Gleb Natapov, Jan Kiszka, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner

> Yep, shadow-vmcs enabled :)

:)  Good to clarify.

>
>
>> Just for reference, here's the detailed procedure I noted  while
>> testing it on Haswell --
>> https://raw.github.com/kashyapc/nvmx-haswell/master/SETUP-nVMX.rst
>>
>> > Also note you can disable
>> > shadow-vmcs using the kvm-intel kernel module parameter
>> > "enable_shadow_vmcs".
>>
>> Yes, to test w/o shadow VMCS, I disabled it by adding "options
>> kvm-intel enable_shadow_vmcs=y" to /etc/modprobe.d/dist.conf & reboot
>> the host.
>
> I assume you meant enable_shadow_vmcs=n :)

Yes, oops, typo :)

>
> Small question: did you try to disable apicv/posted interrupts at L0 ?

I don't have to disable it explicitly. As Gleb (correctly) noted in his
response, APIC-V is not present on Haswell machines, so it's disabled
by default.

    $ cat /sys/module/kvm_intel/parameters/enable_apicv
    N

(Side note: I did post the o/p of the above parameters and more in
the SETUP-nVMX.rst notes I pointed to above. But, I understand, that
document is a bit large :) )

> (for L1 you can't enable these features because they are not emulated)

Yes, Paolo clarified this to me on IRC, when I erroneously assumed so.
Thanks Paolo !

Thanks.

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 15:52                         ` Gleb Natapov
@ 2013-05-12 16:52                           ` Kashyap Chamarthy
  2013-05-13  6:31                             ` Jan Kiszka
  0 siblings, 1 reply; 30+ messages in thread
From: Kashyap Chamarthy @ 2013-05-12 16:52 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Abel Gordon, Jan Kiszka, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner

>> --------------------
>> ....
>> [  217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
>> [  217.938034] Do you have a strange power saving mode enabled?
>> .[  222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
>> [  222.524073] Do you have a strange power saving mode enabled?
>> [  222.524073] Dazed and confused, but trying to continue
>> [  243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
>> .....
>> --------------------
>> At the moment, L2 guest creation stuck at the above message
>>
> Are those in L2 dmesg or L1?

L2 dmesg.


>> $ cat /etc/grub2.cfg  | egrep -i 'hpet|nmi'
>>
> IIRC watchdog is enabled by default.

Indeed, you're right. I disabled the NMI watchdog on L1 and rebooted;
the newly created L2 guest now starts just fine.
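
The thread does not say how the watchdog was turned off. Two common
routes are the nmi_watchdog=0 boot parameter and the kernel.nmi_watchdog
sysctl. The sketch below only reports the current state, which is also
why an empty grep of grub2.cfg says nothing about the default. The
function name and the path argument are illustrative:

```shell
#!/bin/sh
# Report whether the NMI watchdog is on, reading the sysctl file directly.
# $1: sysctl file (defaults to /proc/sys/kernel/nmi_watchdog)
nmi_watchdog_state() {
    f=${1:-/proc/sys/kernel/nmi_watchdog}
    if [ ! -r "$f" ]; then
        echo unknown      # file missing, e.g. kernel built without the detector
    elif [ "$(cat "$f")" = "0" ]; then
        echo disabled
    else
        echo enabled
    fi
}
```

To switch it off at run time on L1 (as root), "sysctl -w
kernel.nmi_watchdog=0" should do.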

I'm cloning it to run another instance of L2. Later, I'll try some
kernel compiles inside L2 to see if I can get consistent, measurable
numbers.


Thanks all for your help.

/kashyap

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-12 16:52                           ` Kashyap Chamarthy
@ 2013-05-13  6:31                             ` Jan Kiszka
  2013-05-13  6:39                               ` Gleb Natapov
  0 siblings, 1 reply; 30+ messages in thread
From: Jan Kiszka @ 2013-05-13  6:31 UTC (permalink / raw)
  To: Kashyap Chamarthy
  Cc: Gleb Natapov, Abel Gordon, Nakajima, Jun, kvm@vger.kernel.org,
	kvm-owner@vger.kernel.org

On 2013-05-12 18:52, Kashyap Chamarthy wrote:
>>> --------------------
>>> ....
>>> [  217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
>>> [  217.938034] Do you have a strange power saving mode enabled?
>>> .[  222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
>>> [  222.524073] Do you have a strange power saving mode enabled?
>>> [  222.524073] Dazed and confused, but trying to continue
>>> [  243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
>>> .....
>>> --------------------
>>> At the moment, L2 guest creation stuck at the above message
>>>
>> Are those in L2 dmesg or L1?
> 
> L2 dmesg.
> 
> 
>>> $ cat /etc/grub2.cfg  | egrep -i 'hpet|nmi'
>>>
>> IIRC watchdog is enabled by default.
> 
> Indeed, you're right. I disabled NMI on L1, and rebooted the newly
> created L2 guest starts just fine.

NMI watchdogs go via some perf counters these days IIRC. Can anyone
tell me which of those may be used in Kashyap's setup? I'm probably
lacking them for my guests and therefore do not see the errors.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-13  6:31                             ` Jan Kiszka
@ 2013-05-13  6:39                               ` Gleb Natapov
  2013-05-13  6:45                                 ` Ren, Yongjie
  0 siblings, 1 reply; 30+ messages in thread
From: Gleb Natapov @ 2013-05-13  6:39 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Kashyap Chamarthy, Abel Gordon, Nakajima, Jun,
	kvm@vger.kernel.org, kvm-owner@vger.kernel.org

On Mon, May 13, 2013 at 08:31:33AM +0200, Jan Kiszka wrote:
> On 2013-05-12 18:52, Kashyap Chamarthy wrote:
> >>> --------------------
> >>> ....
> >>> [  217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
> >>> [  217.938034] Do you have a strange power saving mode enabled?
> >>> .[  222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
> >>> [  222.524073] Do you have a strange power saving mode enabled?
> >>> [  222.524073] Dazed and confused, but trying to continue
> >>> [  243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
> >>> .....
> >>> --------------------
> >>> At the moment, L2 guest creation stuck at the above message
> >>>
> >> Are those in L2 dmesg or L1?
> > 
> > L2 dmesg.
> > 
> > 
> >>> $ cat /etc/grub2.cfg  | egrep -i 'hpet|nmi'
> >>>
> >> IIRC watchdog is enabled by default.
> > 
> > Indeed, you're right. I disabled NMI on L1, and rebooted the newly
> > created L2 guest starts just fine.
> 
> NMI watchdogs go via some perf counters these days IIRC. Can anyone
> tell me which of those may be used in Kashyap's setup? I'm probably
> lacking them for my guests and therefore do not see the errors.
> 
Try running with -cpu host for L1. Your CPU definition probably lacks
the PMU leaf.
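
One quick way to see whether a guest was given the architectural PMU
(CPUID leaf 0xA) is the arch_perfmon flag in /proc/cpuinfo, which, as far
as I know, Linux sets on Intel CPUs when that leaf reports usable
counters. A sketch, with a made-up function name and a file argument for
testing:

```shell
#!/bin/sh
# Succeed if the cpuinfo "flags" line lists arch_perfmon.
# $1: cpuinfo file (defaults to /proc/cpuinfo)
has_arch_perfmon() {
    grep -E '^flags' "${1:-/proc/cpuinfo}" | grep -qw arch_perfmon
}
```

Run inside L1, "has_arch_perfmon && echo PMU visible" would tell the two
CPU definitions apart.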

--
			Gleb.

* RE: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-13  6:39                               ` Gleb Natapov
@ 2013-05-13  6:45                                 ` Ren, Yongjie
  2013-05-13  6:57                                   ` Jan Kiszka
  0 siblings, 1 reply; 30+ messages in thread
From: Ren, Yongjie @ 2013-05-13  6:45 UTC (permalink / raw)
  To: Gleb Natapov, Jan Kiszka
  Cc: Kashyap Chamarthy, Abel Gordon, Nakajima, Jun,
	kvm@vger.kernel.org, kvm-owner@vger.kernel.org

> -----Original Message-----
> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org]
> On Behalf Of Gleb Natapov
> Sent: Monday, May 13, 2013 2:39 PM
> To: Jan Kiszka
> Cc: Kashyap Chamarthy; Abel Gordon; Nakajima, Jun;
> kvm@vger.kernel.org; kvm-owner@vger.kernel.org
> Subject: Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest
> is rebooted.
> 
> On Mon, May 13, 2013 at 08:31:33AM +0200, Jan Kiszka wrote:
> > On 2013-05-12 18:52, Kashyap Chamarthy wrote:
> > >>> --------------------
> > >>> ....
> > >>> [  217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
> > >>> [  217.938034] Do you have a strange power saving mode enabled?
> > >>> .[  222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
> > >>> [  222.524073] Do you have a strange power saving mode enabled?
> > >>> [  222.524073] Dazed and confused, but trying to continue
> > >>> [  243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
> > >>> .....
> > >>> --------------------
> > >>> At the moment, L2 guest creation stuck at the above message
> > >>>
> > >> Are those in L2 dmesg or L1?
> > >
> > > L2 dmesg.
> > >
> > >
> > >>> $ cat /etc/grub2.cfg  | egrep -i 'hpet|nmi'
> > >>>
> > >> IIRC watchdog is enabled by default.
> > >
> > > Indeed, you're right. I disabled NMI on L1, and rebooted the newly
> > > created L2 guest starts just fine.
> >
> > NMI watchdogs go via some perf counters these days IIRC. Can anyone
> > tell me which of those may be used in Kashyap's setup? I'm probably
> > lacking them for my guests and therefore do not see the errors.
> >
> Try running with -cpu host for L1. Your CPU definition probably lacks
> PMU leaf.
> 
I hit the same NMI issue in L2, too.
L1: -cpu host  (or -cpu Haswell,+vmx)
L2: -cpu qemu64 by default

If I create L1 with '-cpu qemu64,+vmx', I don't see the NMI issue in L2.
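
To compare the two L1 configurations side by side, something like the
sketch below may help. The -cpu values are the ones from this mail; the
memory size, the image name and the function name are placeholders, and
the command is only printed, not run:

```shell
#!/bin/sh
# Print a QEMU command line for an L1 guest with the given CPU model.
# $1: value for -cpu, e.g. "host", "Haswell,+vmx" or "qemu64,+vmx"
# $2: L1 disk image (placeholder path)
l1_qemu_cmd() {
    printf 'qemu-system-x86_64 -enable-kvm -cpu %s -m 4096 -drive file=%s\n' \
        "$1" "$2"
}

l1_qemu_cmd host l1.img          # L1 setup where L2 shows the NMI messages
l1_qemu_cmd qemu64,+vmx l1.img   # L1 setup where L2 does not
```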

Best Regards,
     Yongjie (Jay)

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-13  6:45                                 ` Ren, Yongjie
@ 2013-05-13  6:57                                   ` Jan Kiszka
  2013-05-13  7:06                                     ` Gleb Natapov
  0 siblings, 1 reply; 30+ messages in thread
From: Jan Kiszka @ 2013-05-13  6:57 UTC (permalink / raw)
  To: Ren, Yongjie
  Cc: Gleb Natapov, Kashyap Chamarthy, Abel Gordon, Nakajima, Jun,
	kvm@vger.kernel.org, kvm-owner@vger.kernel.org

On 2013-05-13 08:45, Ren, Yongjie wrote:
>> -----Original Message-----
>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org]
>> On Behalf Of Gleb Natapov
>> Sent: Monday, May 13, 2013 2:39 PM
>> To: Jan Kiszka
>> Cc: Kashyap Chamarthy; Abel Gordon; Nakajima, Jun;
>> kvm@vger.kernel.org; kvm-owner@vger.kernel.org
>> Subject: Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest
>> is rebooted.
>>
>> On Mon, May 13, 2013 at 08:31:33AM +0200, Jan Kiszka wrote:
>>> On 2013-05-12 18:52, Kashyap Chamarthy wrote:
>>>>>> --------------------
>>>>>> ....
>>>>>> [  217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
>>>>>> [  217.938034] Do you have a strange power saving mode enabled?
>>>>>> .[  222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
>>>>>> [  222.524073] Do you have a strange power saving mode enabled?
>>>>>> [  222.524073] Dazed and confused, but trying to continue
>>>>>> [  243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
>>>>>> .....
>>>>>> --------------------
>>>>>> At the moment, L2 guest creation stuck at the above message
>>>>>>
>>>>> Are those in L2 dmesg or L1?
>>>>
>>>> L2 dmesg.
>>>>
>>>>
>>>>>> $ cat /etc/grub2.cfg  | egrep -i 'hpet|nmi'
>>>>>>
>>>>> IIRC watchdog is enabled by default.
>>>>
>>>> Indeed, you're right. I disabled NMI on L1, and rebooted the newly
>>>> created L2 guest starts just fine.
>>>
>>> NMI watchdogs go via some perf counters these days IIRC. Can anyone
>>> tell me which of those may be used in Kashyap's setup? I'm probably
>>> lacking them for my guests and therefore do not see the errors.
>>>
>> Try running with -cpu host for L1. Your CPU definition probably lacks
>> PMU leaf.
>>
> I met the same NMI issue in L2, too.
> L1: -cpu host  (or -cpu Haswell,+vmx)
> L2: -cpu qemu64 by default
> 
> If I use '-cpu qemu64,+vmx' to create L1, I'll not meet NMI issue in L2.
> 

That, and it looks like my guest kernel was lacking
CONFIG_LOCKUP_DETECTOR. Will rebuild and retest later.
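
A quick check along these lines can confirm it before rebuilding. This is
only a sketch: the function name is invented, and the default path assumes
the distro installs the kernel config as /boot/config-<release>, which not
every setup does:

```shell
#!/bin/sh
# Succeed if the kernel config has the lockup detector built in.
# $1: kernel config file (defaults to /boot/config-<running release>)
lockup_detector_enabled() {
    cfg=${1:-/boot/config-$(uname -r)}
    grep -q '^CONFIG_LOCKUP_DETECTOR=y' "$cfg" 2>/dev/null
}
```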

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

* Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
  2013-05-13  6:57                                   ` Jan Kiszka
@ 2013-05-13  7:06                                     ` Gleb Natapov
  0 siblings, 0 replies; 30+ messages in thread
From: Gleb Natapov @ 2013-05-13  7:06 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Ren, Yongjie, Kashyap Chamarthy, Abel Gordon, Nakajima, Jun,
	kvm@vger.kernel.org, kvm-owner@vger.kernel.org

On Mon, May 13, 2013 at 08:57:32AM +0200, Jan Kiszka wrote:
> On 2013-05-13 08:45, Ren, Yongjie wrote:
> >> -----Original Message-----
> >> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org]
> >> On Behalf Of Gleb Natapov
> >> Sent: Monday, May 13, 2013 2:39 PM
> >> To: Jan Kiszka
> >> Cc: Kashyap Chamarthy; Abel Gordon; Nakajima, Jun;
> >> kvm@vger.kernel.org; kvm-owner@vger.kernel.org
> >> Subject: Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest
> >> is rebooted.
> >>
> >> On Mon, May 13, 2013 at 08:31:33AM +0200, Jan Kiszka wrote:
> >>> On 2013-05-12 18:52, Kashyap Chamarthy wrote:
> >>>>>> --------------------
> >>>>>> ....
> >>>>>> [  217.938034] Uhhuh. NMI received for unknown reason 30 on CPU 0.
> >>>>>> [  217.938034] Do you have a strange power saving mode enabled?
> >>>>>> .[  222.523373] Uhhuh. NMI received for unknown reason 20 on CPU 0.
> >>>>>> [  222.524073] Do you have a strange power saving mode enabled?
> >>>>>> [  222.524073] Dazed and confused, but trying to continue
> >>>>>> [  243.860319] Uhhuh. NMI received for unknown reason 30 on CPU 0
> >>>>>> .....
> >>>>>> --------------------
> >>>>>> At the moment, L2 guest creation stuck at the above message
> >>>>>>
> >>>>> Are those in L2 dmesg or L1?
> >>>>
> >>>> L2 dmesg.
> >>>>
> >>>>
> >>>>>> $ cat /etc/grub2.cfg  | egrep -i 'hpet|nmi'
> >>>>>>
> >>>>> IIRC watchdog is enabled by default.
> >>>>
> >>>> Indeed, you're right. I disabled NMI on L1, and rebooted the newly
> >>>> created L2 guest starts just fine.
> >>>
> >>> NMI watchdogs go via some perf counters these days IIRC. Can anyone
> >>> tell me which of those may be used in Kashyap's setup? I'm probably
> >>> lacking them for my guests and therefore do not see the errors.
> >>>
> >> Try running with -cpu host for L1. Your CPU definition probably lacks
> >> PMU leaf.
> >>
> > I met the same NMI issue in L2, too.
> > L1: -cpu host  (or -cpu Haswell,+vmx)
> > L2: -cpu qemu64 by default
> > 
> > If I use '-cpu qemu64,+vmx' to create L1, I'll not meet NMI issue in L2.
> > 
> 
> That, and it looks like my guest kernel was lacking
> CONFIG_LOCKUP_DETECTOR. Will rebuild and retest later.
> 
It looks like NMIs injected by L0 into L1 are mistakenly injected into L2.
Can you test this by injecting an NMI into L1 via the qemu monitor?
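
With a QMP socket on the L1 guest's QEMU, the injection could be driven
as sketched below. qmp_capabilities and inject-nmi are QMP commands as
far as I know; the function name and the socket path in the note after
the code are placeholders. Under libvirt, "virsh inject-nmi <domain>"
should achieve the same.

```shell
#!/bin/sh
# Emit the QMP messages needed to inject an NMI: the capabilities
# handshake has to come first, then the inject-nmi command itself.
qmp_inject_nmi() {
    printf '%s\n' '{"execute": "qmp_capabilities"}'
    printf '%s\n' '{"execute": "inject-nmi"}'
}
```

The output would be piped into the monitor socket, for example
"qmp_inject_nmi | socat - UNIX-CONNECT:/tmp/l1-qmp.sock" if L1 was
started with "-qmp unix:/tmp/l1-qmp.sock,server,nowait". If the
"unknown reason" lines then show up in L2's dmesg rather than L1's, that
would support the misrouting theory.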

--
			Gleb.

Thread overview: 30+ messages
2013-05-10 13:00 [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted Kashyap Chamarthy
     [not found] ` <CAOaxAcZ1uyx-RrmDEiZhG2H8H5zTCK9iz1nHJKEJwUfhn=vZHA@mail.gmail.com>
2013-05-10 14:41   ` Kashyap Chamarthy
2013-05-10 15:12 ` Jan Kiszka
2013-05-10 15:24   ` Jan Kiszka
2013-05-10 15:39     ` Kashyap Chamarthy
2013-05-10 15:46       ` Kashyap Chamarthy
2013-05-10 16:33       ` Jan Kiszka
2013-05-10 17:40         ` Nakajima, Jun
2013-05-10 18:09           ` Jan Kiszka
2013-05-10 21:37             ` Kashyap Chamarthy
2013-05-11  6:55               ` Kashyap Chamarthy
2013-05-12  8:32           ` Gleb Natapov
2013-05-12 12:30             ` Kashyap Chamarthy
2013-05-12 12:38               ` Gleb Natapov
2013-05-12 12:42                 ` Kashyap Chamarthy
2013-05-12 12:59                   ` Abel Gordon
2013-05-12 13:06                     ` Kashyap Chamarthy
2013-05-12 13:31                       ` Kashyap Chamarthy
2013-05-12 13:33                         ` Kashyap Chamarthy
2013-05-12 15:52                         ` Gleb Natapov
2013-05-12 16:52                           ` Kashyap Chamarthy
2013-05-13  6:31                             ` Jan Kiszka
2013-05-13  6:39                               ` Gleb Natapov
2013-05-13  6:45                                 ` Ren, Yongjie
2013-05-13  6:57                                   ` Jan Kiszka
2013-05-13  7:06                                     ` Gleb Natapov
2013-05-12 13:48                       ` Abel Gordon
2013-05-12 15:35                         ` Gleb Natapov
2013-05-12 16:29                         ` Kashyap Chamarthy
2013-05-10 21:37         ` Kashyap Chamarthy
