From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gleb Natapov
Subject: Re: [nVMX with: v3.9-11789-ge0fd9af] Stack trace when L2 guest is rebooted.
Date: Sun, 12 May 2013 15:38:49 +0300
Message-ID: <20130512123849.GK10830@redhat.com>
References: <518D0E54.7000004@siemens.com> <518D111A.2070604@siemens.com> <518D216C.90107@siemens.com> <20130512083210.GD10830@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: "Nakajima, Jun", Jan Kiszka, "kvm@vger.kernel.org"
To: Kashyap Chamarthy
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:28294 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751122Ab3ELMi5 (ORCPT ); Sun, 12 May 2013 08:38:57 -0400
Content-Disposition: inline
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On Sun, May 12, 2013 at 06:00:38PM +0530, Kashyap Chamarthy wrote:
> >>
> >> I tried to reproduce such a problem, and I found L2 (Linux) hangs in
> >> SeaBIOS, after line "iPXE (http://ipxe.org) ...". It happens with or
> >> w/o VMCS shadowing (and even without my virtual EPT patches). I didn't
> >> realize this problem until I updated the L1 kernel to the latest (e.g.
> >> 3.9.0) from 3.7.0. L0 uses the kvm.git, next branch. It's possible
> >> that the L1 kernel exposed a bug with the nested virtualization, as we
> >> saw such cases before.
> >>
> > This is probably fixed by 8d76c49e9ffeee839bc0b7a3278a23f99101263e. Try
> > it please.
>
> I don't see the above SeaBIOS hang, however I'm able to consistently
> reproduce this stack trace when booting L1 guest:
>

You mean L2 here? L2 guest cannot find root file system. Unlikely related to KVM.

> ============
> ....
> [ 2.516894] VFS: Cannot open root device "mapper/fedora-root" or
> unknown-block(0,0): error -6
> [ 2.527636] Please append a correct "root=" boot option; here are
> the available partitions:
> [ 2.538792] Kernel panic - not syncing: VFS: Unable to mount root
> fs on unknown-block(0,0)
> [ 2.539716] Pid: 1, comm: swapper/0 Not tainted 3.8.11-200.fc18.x86_64 #1
> [ 2.539716] Call Trace:
> [ 2.539716] [] panic+0xc1/0x1d0
> [ 2.539716] [] mount_block_root+0x1fa/0x2ac
> [ 2.539716] [] mount_root+0x57/0x5b
> [ 2.539716] [] prepare_namespace+0x13d/0x176
> [ 2.539716] [] kernel_init_freeable+0x1cf/0x1da
> [ 2.539716] [] ? do_early_param+0x8c/0x8c
> [ 2.539716] [] ? rest_init+0x80/0x80
> [ 2.539716] [] kernel_init+0xe/0xf0
> [ 2.539716] [] ret_from_fork+0x7c/0xb0
> [ 2.539716] [] ? rest_init+0x80/0x80
> [ 2.539716] Uhhuh. NMI received for unknown reason 30 on CPU 1.
> [ 2.539716] Do you have a strange power saving mode enabled?
> [ 2.539716] Dazed and confused, but trying to continue
> [ 2.539716] Uhhuh. NMI received for unknown reason 20 on CPU 1.
> ============
>
> However, L1 boots just fine.
>
> When I try to boot L2, it throws this different stack trace.

Who is "it"? The stack trace below is from L0, judging by the hardware name.
Again, not KVM related.

> ============
> [176092.303585] lock(&dev->device_lock);
> [176092.307947]
> [176092.307947] *** DEADLOCK ***
> [176092.307947]
> [176092.314943] 2 locks held by systemd/1:
> [176092.319283] #0: (misc_mtx){+.+.+.}, at: []
> misc_open+0x28/0x1d0
> [176092.328104] #1: (&wdd->lock){+.+...}, at: []
> watchdog_start+0x22/0x80
> [176092.337532]
> [176092.337532] stack backtrace:
> [176092.342661] CPU: 1 PID: 1 Comm: systemd Not tainted
> 3.10.0-0.rc0.git23.1.fc20.x86_64 #1
> [176092.351823] Hardware name: Intel Corporation Shark Bay Client
> platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055
> 01/28/2013
> [176092.366101] ffffffff8257d070 ffff880241b1b9c0 ffffffff81719128
> ffff880241b1ba00
> [176092.374617] ffffffff81714d75 ffff880241b1ba50 ffff880241b80960
> ffff880241b80000
> [176092.383130] 0000000000000002 0000000000000002 ffff880241b80960
> ffff880241b1bac0
> [176092.391647] Call Trace:
> [176092.394514] [] dump_stack+0x19/0x1b
> 2m OK ] Re[176092.400430] [] print_circular_bug+0x201/0x210
> [176092.408898] [] __lock_acquire+0x17c4/0x1b30
> ached target Shu[176092.415602] [] ?
> _raw_spin_unlock_irq+0x2c/0x50
> [176092.424276] [] lock_acquire+0xa2/0x1f0
> tdown.
> [176092.430489] [] ? mei_wd_ops_start+0x2d/0xf0
> [176092.438070] [] mutex_lock_nested+0x80/0x400
> [176092.444772] [] ? mei_wd_ops_start+0x2d/0xf0
> [176092.451471] [] ? mei_wd_ops_start+0x2d/0xf0
> [176092.458172] [] ? watchdog_start+0x22/0x80
> [176092.464678] [] ? watchdog_start+0x22/0x80
> [176092.471182] [] mei_wd_ops_start+0x2d/0xf0
> [176092.477687] [] watchdog_start+0x5d/0x80
> [176092.483994] [] watchdog_open+0x88/0xf0
> [176092.490214] [] misc_open+0xb7/0x1d0
> [176092.496128] [] chrdev_open+0x92/0x1d0
> [176092.502240] [] do_dentry_open+0x24b/0x300
> [176092.508745] [] ? security_inode_permission+0x1c/0x30
> [176092.516330] [] ? cdev_put+0x30/0x30
> [176092.522243] [] finish_open+0x40/0x50
> [176092.528256] [] do_last+0x4d9/0xe40
> [176092.534071] [] path_openat+0xb3/0x530
> [176092.540193] [] ? local_clock+0x5f/0x70
> [176092.546403] [] ? native_sched_clock+0x15/0x80
> [176092.553301] [] ? trace_hardirqs_off+0xd/0x10
> [176092.560099] [] do_filp_open+0x38/0x80
> [176092.566211] [] ? _raw_spin_unlock+0x27/0x40
> [176092.572913] [] ? __alloc_fd+0xaf/0x200
> [176092.579123] [] do_sys_open+0xe9/0x1c0
> [176092.585235] [] SyS_open+0x1e/0x20
> [176092.590953] [] system_call_fastpath+0x16/0x1b
> Sending SIGTERM to remaining processes...
> [176092.622745] systemd-journald[338]: Received SIGTERM
> Sending SIGKILL to remaining processes...
> Hardware watchdog 'INTCAMT', version 0
> Unmounting file systems.
> Unmounting /sys/kernel/config.
> Unmounting /dev/mqueue.
> Unmounting /dev/hugepages.
> Unmounting /sys/kernel/debug.
> [176094.363845] EXT4-fs (dm-1): re-mounted. Opts: (null)
> [176094.548631] EXT4-fs (dm-1): re-mounted. Opts: (null)
> [176094.554450] EXT4-fs (dm-1): re-mounted. Opts: (null)
> All filesystems unmounted.
> Deactivating swaps.
> All swaps deactivated.
> Detaching loop devices.
> All loop devices detached.
> Detaching DM devices.
> Detaching DM 253:2.
> Detaching DM 253:0.
> Not all DM devices detached, 1 left.
> Detaching DM devices.
> Not all DM devices detached, 1 left.
> Cannot finalize remaining file systems and devices, giving up.
> Storage is finalized.
> Successfully changed into root pivot.
> Returning to initrd...
> [176094.675812] dracut Warning: Killing all remaining processes
> ============
>
>
> L1 Kernel: 3.10.0-0.rc0.git26.1.fc20.x86_64
>
> L2 Kernel: 3.10.0-0.rc0.git26.1.fc20.x86_64
>
> How I re-produced this, I noted it in my previous emails to this thread.
>
> Am I doing anything plain incorrect ?
>

--
Gleb.