qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Unresponsive linux guest once migrated
@ 2014-03-27 22:52 Chris Dunlop
From: Chris Dunlop @ 2014-03-27 22:52 UTC
  To: qemu-devel

Hi,

When I migrate a Linux guest VM, the guest on the receiving side
goes to 100% CPU as seen by the host and becomes unresponsive,
e.g. it doesn't respond to ping. The only way out I've found is to
destroy the guest.

This only seems to happen if the guest has been idle for an extended
period (e.g. overnight). When the guest has been used "a little"
(e.g. logging in and looking around; it normally isn't doing
anything), I've migrated it 100 times in a row without any problems.
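
For anyone wanting to repeat the back-and-forth migration test, here is a
minimal sketch of such a loop, assuming the guest is managed by libvirt and
reachable via virsh over ssh. The host names, the guest address and the
idle_seconds knob are hypothetical placeholders, not details from my setup:

```python
import subprocess
import sys
import time


def migrate_cmd(domain, src_host, dst_host):
    """argv for a live migration of domain from src_host to dst_host."""
    return ["virsh", "-c", f"qemu+ssh://{src_host}/system",
            "migrate", "--live", domain, f"qemu+ssh://{dst_host}/system"]


def ping_cmd(addr):
    """argv for a single ping with a 2 second timeout (Linux iputils syntax)."""
    return ["ping", "-c", "1", "-W", "2", addr]


def bounce(domain, guest_addr, hosts, rounds, idle_seconds=0):
    """Migrate back and forth between two hosts.

    Returns the 1-based round after which the guest stopped answering
    ping, or None if it survived every round.
    """
    for i in range(rounds):
        src, dst = hosts[i % 2], hosts[(i + 1) % 2]
        time.sleep(idle_seconds)  # leave the guest idle before migrating
        subprocess.run(migrate_cmd(domain, src, dst), check=True)
        if subprocess.run(ping_cmd(guest_addr)).returncode != 0:
            return i + 1
    return None


if __name__ == "__main__" and len(sys.argv) > 4:
    # Hypothetical invocation: bounce.py test 192.0.2.10 hostA hostB
    failed = bounce(sys.argv[1], sys.argv[2], sys.argv[3:5], rounds=100)
    if failed is None:
        print("guest answered ping after every round")
    else:
        print("guest unresponsive after round", failed)
```

Raising idle_seconds is one way to probe how long the guest has to sit idle
before a migration triggers the hang.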

I've not had similar problems migrating Windows guests.

guest - debian wheezy, kernel 3.2.0-4-amd64
host - debian wheezy, kernel 3.10.33 x86_64 (self-compiled)
qemu - qemu_1.7.0+dfsg-2~bpo70+2 + rbd (self-compiled)

All guests use ceph rbd for backing store.

qemu-system-x86_64 -enable-kvm -name test -S -machine pc-1.0,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 620dd8e0-f24c-485d-a134-ba5961ce6531 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/test.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=rbd:pool/test:id=test:key=xxxxxxxxxxx=:auth_supported=cephx\;none,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:29:10:16,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -incoming tcp:[::]:49152 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

ps tells me the qemu-system-x86_64 process has 17 threads, and it's
the second-to-last of these that's consuming the CPU. strace on that
thread doesn't tell me much:

rt_sigtimedwait([BUS USR1], 0x7f5761957b30, {0, 0}, 8) = -1 EAGAIN (Resource temporarily unavailable)
rt_sigpending([])                       = 0
ioctl(16, KVM_RUN <unfinished ...>
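
One way to pin down which thread is burning the CPU, as a rough sketch: sample
utime + stime from /proc/<pid>/task/<tid>/stat twice and rank the threads by
the difference. The function names and the one second interval are
illustrative choices only:

```python
import os
import sys
import time


def parse_stat(line):
    """Return (tid, comm, utime + stime in clock ticks) from a /proc stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    tid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    rest = line[rparen + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return tid, comm, int(rest[11]) + int(rest[12])


def read_threads(pid):
    """Map tid -> (comm, cpu ticks) for every thread of pid."""
    threads = {}
    for name in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{name}/stat") as f:
                tid, comm, ticks = parse_stat(f.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # thread exited between listdir and read
        threads[tid] = (comm, ticks)
    return threads


def busiest(before, after):
    """Return [(delta ticks, tid, comm)] sorted busiest first."""
    deltas = []
    for tid, (comm, ticks) in after.items():
        if tid in before:
            deltas.append((ticks - before[tid][1], tid, comm))
    return sorted(deltas, reverse=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    pid = int(sys.argv[1])
    first = read_threads(pid)
    time.sleep(1.0)
    for delta, tid, comm in busiest(first, read_threads(pid))[:5]:
        print(f"tid={tid} comm={comm} ticks={delta}")
```

The tid it reports can then be handed to strace -p to watch just that thread.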

Using 'echo l > /proc/sysrq-trigger' a few times shows me the CPU
running that thread is always at vmx_vcpu_run+0x5eb, e.g.:

[571745.343753] NMI backtrace for cpu 2
[571745.343779] CPU: 2 PID: 31618 Comm: qemu-system-x86 Tainted: G           O 3.10.33-otn-00017-g510ea14 #2
[571745.343827] Hardware name: Supermicro X8DTH-i/6/iF/6F/X8DTH, BIOS 2.0c       07/19/11   
[571745.343871] task: ffff880002f99380 ti: ffff8801acaf0000 task.ti: ffff8801acaf0000
[571745.343915] RIP: 0010:[<ffffffffa104130b>]  [<ffffffffa104130b>] vmx_vcpu_run+0x5eb/0x670 [kvm_intel]
[571745.343978] RSP: 0018:ffff8801acaf3cc8  EFLAGS: 00000082
[571745.344004] RAX: 0000000080000202 RBX: 0000000001443980 RCX: ffff8801fd698000
[571745.344046] RDX: 0000000000000200 RSI: 00000000693e2680 RDI: ffff8801fd698000
[571745.344089] RBP: ffff8801acaf3d38 R08: 00000000693e9b40 R09: 0000000000000000
[571745.344131] R10: 0000000000000f08 R11: 0000000000000000 R12: 0000000000000000
[571745.344174] R13: 0000000000000001 R14: 0000000000000014 R15: ffffffffffffffff
[571745.344217] FS:  00007f5609fec700(0000) GS:ffff88081fc80000(0000) knlGS:fffff801388f8000
[571745.344261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[571745.344288] CR2: 0000000001449b8a CR3: 00000006eaa6c000 CR4: 00000000000027e0
[571745.344330] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[571745.344373] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[571745.344415] Stack:
[571745.344435]  ffff8801acaf3d38 ffffffffa1042576 0000000000000000 ffff8801fd698000
[571745.344487]  0000000200000000 ffff8801fd698000 ffff8801acaf3d18 ffff8805cbfbc040
[571745.344539]  0000000000000002 ffff8805cbfbc040 0000000000000001 0000000000000000
[571745.344590] Call Trace:
[571745.344615]  [<ffffffffa1042576>] ? vmx_handle_exit+0xf6/0x8d0 [kvm_intel]
[571745.344661]  [<ffffffffa0459341>] kvm_arch_vcpu_ioctl_run+0x9a1/0x1100 [kvm]
[571745.344699]  [<ffffffffa04543d7>] ? kvm_arch_vcpu_load+0x57/0x1e0 [kvm]
[571745.344734]  [<ffffffffa0444d24>] kvm_vcpu_ioctl+0x2b4/0x580 [kvm]
[571745.344767]  [<ffffffffa04468ef>] ? kvm_vm_ioctl+0x57f/0x5f0 [kvm]
[571745.344797]  [<ffffffff81147090>] do_vfs_ioctl+0x90/0x520
[571745.344825]  [<ffffffff8106fd98>] ? __enqueue_entity+0x78/0x80 
[571745.344853]  [<ffffffff81083b38>] ? SyS_futex+0x98/0x1a0
[571745.344887]  [<ffffffffa044e1b4>] ? kvm_on_user_return+0x64/0x70 [kvm]
[571745.344916]  [<ffffffff81147570>] SyS_ioctl+0x50/0x90
[571745.344944]  [<ffffffff813bf782>] system_call_fastpath+0x16/0x1b
[571745.344971] Code: 82 1c 02 00 00 a8 10 0f 84 8b fa ff ff e9 66 ff ff ff 66 0f 1f 44 00 00 85 c0 0f 89 51 fd ff ff 48 8b 7d a8 e8 87 9f 40 ff cd 02 <48> 8b 7d a8 e8 9c 9f 40 ff e9 38 fd ff ff 48 89 f9 48 c1 e9 0d 
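
To check "always at the same place" over many samples rather than by eye, a
small sketch that tallies the RIP symbol+offset for qemu threads in saved
dmesg output. It assumes the backtrace layout shown above (as printed by this
3.10 kernel; other kernel versions format the RIP line differently), and the
helper name is made up for illustration:

```python
import re
import sys
from collections import Counter

# Matches the "Comm:" field of the "CPU: ... PID: ... Comm: ..." line.
COMM_RE = re.compile(r"Comm: (\S+)")
# Matches e.g. "RIP: 0010:[<addr>]  [<addr>] vmx_vcpu_run+0x5eb/0x670 [kvm_intel]"
# and captures "vmx_vcpu_run+0x5eb".
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:\[<[0-9a-f]+>\]\s+\[<[0-9a-f]+>\]\s+"
    r"(\S+\+0x[0-9a-f]+)/0x[0-9a-f]+"
)


def tally_rip(log_lines, comm_prefix="qemu-system"):
    """Count RIP locations for backtraces whose Comm starts with comm_prefix."""
    counts = Counter()
    current_comm = None
    for line in log_lines:
        m = COMM_RE.search(line)
        if m:
            current_comm = m.group(1)  # start of a new backtrace
            continue
        m = RIP_RE.search(line)
        if m and current_comm and current_comm.startswith(comm_prefix):
            counts[m.group(1)] += 1
            current_comm = None  # count each backtrace once
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Expects a file holding dmesg output captured after several
    # 'echo l > /proc/sysrq-trigger' runs.
    with open(sys.argv[1]) as f:
        for location, n in tally_rip(f).most_common():
            print(n, location)
```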


What can I do to help track this down?

Cheers,

Chris


Thread overview: 11+ messages
2014-03-27 22:52 [Qemu-devel] Unresponsive linux guest once migrated Chris Dunlop
2014-03-27 23:29 ` Marcin Gibuła
2014-03-27 23:59   ` Chris Dunlop
2014-03-31  8:39     ` Marcin Gibuła
2014-04-02  5:41       ` Chris Dunlop
2014-04-02  8:45         ` Marcin Gibuła
2014-04-02  9:04           ` Dr. David Alan Gilbert
2014-04-02  9:30             ` Marcin Gibuła
2014-04-02  9:39               ` Dr. David Alan Gilbert
2014-04-02 10:18                 ` Marcin Gibuła
2014-04-02 17:05                 ` Marcin Gibuła
