From: Paolo Bonzini <pbonzini@redhat.com>
To: Jim Minter <jminter@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>,
Hannes Reinecke <hare@suse.de>
Subject: Re: [Qemu-devel] sda abort with virtio-scsi
Date: Thu, 4 Feb 2016 00:19:56 +0100
Message-ID: <56B28B1C.7060202@redhat.com>
In-Reply-To: <56B2754B.7030809@redhat.com>
On 03/02/2016 22:46, Jim Minter wrote:
> I am hitting the following VM lockup issue running a VM with latest
> RHEL7 kernel on a host also running latest RHEL7 kernel. FWIW I'm using
> virtio-scsi because I want to use discard=unmap. I ran the VM as follows:
>
> /usr/libexec/qemu-kvm -nodefaults \
> -cpu host \
> -smp 4 \
> -m 8192 \
> -drive discard=unmap,file=vm.qcow2,id=disk1,if=none,cache=unsafe \
> -device virtio-scsi-pci \
> -device scsi-disk,drive=disk1 \
> -netdev bridge,id=net0,br=br0 \
> -device virtio-net-pci,netdev=net0,mac=$(utils/random-mac.py) \
> -chardev socket,id=chan0,path=/tmp/rhev.sock,server,nowait \
> -chardev socket,id=chan1,path=/tmp/qemu.sock,server,nowait \
> -monitor unix:tmp/vm.sock,server,nowait \
> -device virtio-serial-pci \
> -device virtserialport,chardev=chan0,name=com.redhat.rhevm.vdsm \
> -device virtserialport,chardev=chan1,name=org.qemu.guest_agent.0 \
> -device cirrus-vga \
> -vnc none \
> -usbdevice tablet
>
> The host was busyish at the time, but not excessively (IMO). Nothing
> untoward in the host's kernel log; host storage subsystem is fine. I
> didn't get any qemu logs this time around, but I will when the issue
> next recurs. The VM's full kernel log is attached; here are the
> highlights:
Hannes, were you going to send a patch to disable timeouts?
>
> INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=60002 jiffies, g=5253, c=5252, q=0)
> sending NMI to all CPUs:
> NMI backtrace for cpu 1
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-327.4.5.el7.x86_64 #1
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> task: ffff88023417d080 ti: ffff8802341a4000 task.ti: ffff8802341a4000
> RIP: 0010:[<ffffffff81058e96>] [<ffffffff81058e96>] native_safe_halt+0x6/0x10
> RSP: 0018:ffff8802341a7e98 EFLAGS: 00000286
> RAX: 00000000ffffffed RBX: ffff8802341a4000 RCX: 0100000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
> RBP: ffff8802341a7e98 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> R13: ffff8802341a4000 R14: ffff8802341a4000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4978587008 CR3: 000000003645e000 CR4: 00000000003407e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
> ffff8802341a7eb8 ffffffff8101dbcf ffff8802341a4000 ffffffff81a68260
> ffff8802341a7ec8 ffffffff8101e4d6 ffff8802341a7f20 ffffffff810d62e5
> ffff8802341a7fd8 ffff8802341a4000 2581685d70de192c 7ba58fdb3a3bc8d4
> Call Trace:
> [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
> [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
> [<ffffffff810d62e5>] cpu_startup_entry+0x245/0x290
> [<ffffffff810475fa>] start_secondary+0x1ba/0x230
> Code: 00 00 00 00 00 55 48 89 e5 fa 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 66 0f 1f 84
> NMI backtrace for cpu 0
This is the NMI watchdog firing; the CPU got stuck for 20 seconds. The
issue was not a busy host, but busy storage (could it be a network
partition, if the disk is hosted on NFS?).
Assuming you're running RHEL's QEMU 1.5.3 (try
/usr/libexec/qemu-kvm --version, or rpm -qf /usr/libexec/qemu-kvm),
the watchdog firing is fixed in more recent QEMU, which has
asynchronous cancellation.
Thanks,
Paolo