From: Mikhail Sennikovskii <mikhail.sennikovskii@profitbricks.com>
To: kvm@vger.kernel.org
Subject: Windows 2008 Guest BSODS with CLOCK_WATCHDOG_TIMEOUT on VM migration
Date: Tue, 27 Jan 2015 14:55:28 +0100
Message-ID: <54C798D0.6010600@profitbricks.com>
In-Reply-To: <54C254AF.7010101@profitbricks.com>
Hi all,
I've posted the mail below to the qemu-devel mailing list, but I've got
no response there.
That's why I decided to re-post it here; besides, I think this could be
a kvm-specific issue.
Some additional things to note:
I can reproduce the issue on my Debian 7 with the 3.16.0-0.bpo.4-amd64
kernel as well.
I typically use a max_downtime adjusted to 1 second instead of the
default 30 ms.
I also noticed that the issue happens much more rarely if I increase
the migration bandwidth, e.g. like this:
diff --git a/migration.c b/migration.c
index 26f4b65..d2e3b39 100644
--- a/migration.c
+++ b/migration.c
@@ -36,7 +36,7 @@ enum {
MIG_STATE_COMPLETED,
};
-#define MAX_THROTTLE (32 << 20) /* Migration speed throttling */
+#define MAX_THROTTLE (90 << 20) /* Migration speed throttling */
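
For reference, MAX_THROTTLE is in bytes per second, so the change above
raises the default limit from 32 MiB/s to 90 MiB/s. The same tuning can
probably be done at run time without patching, using the QMP commands
migrate_set_speed (value in bytes per second) and migrate_set_downtime
(value in seconds). A minimal sketch follows; the helper names
mib_to_bytes and qmp_migrate_tuning are made up for illustration and it
only prints the commands:

```shell
#!/bin/sh
# Convert a bandwidth in MiB/s to bytes/s, the same way MAX_THROTTLE
# does it with "<< 20".
mib_to_bytes() {
    echo $(( $1 << 20 ))
}

# Print the QMP commands that set migration speed and max downtime.
# $1: speed in MiB/s, $2: max downtime in seconds.
# (Hypothetical helper; it only prints JSON, it does not talk to QEMU.)
qmp_migrate_tuning() {
    printf '{ "execute": "qmp_capabilities" }\n'
    printf '{ "execute": "migrate_set_speed", "arguments": { "value": %s } }\n' \
        "$(mib_to_bytes "$1")"
    printf '{ "execute": "migrate_set_downtime", "arguments": { "value": %s } }\n' \
        "$2"
}
```

"qmp_migrate_tuning 90 1" prints the commands for 90 MiB/s and 1 second
of downtime. They would have to be sent to the QMP socket of the
from-VM, which assumes the socket is not already held by the test
harness.
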
Like I said below, I would be glad to provide you with any additional
information.
Thanks,
Mikhail
On 23.01.2015 15:03, Mikhail Sennikovskii wrote:
> Hi all,
>
> I'm running a slightly modified migration-over-TCP test in virt-test,
> which migrates one "smp=2" VM to another on the same host over TCP
> and puts some dummy CPU load inside the guest during migration.
> After a series of runs I always get a CLOCK_WATCHDOG_TIMEOUT BSOD
> inside the guest, which happens when
> "
> An expected clock interrupt was not received on a secondary processor
> in an MP system within the allocated interval. This indicates that the
> specified processor is hung and not processing interrupts.
> "
>
> This seems to happen with any qemu version I've tested (1.2 and above,
> including upstream).
> I was testing it with the 3.13.0-44-generic kernel on my Ubuntu
> 14.04.1 LTS SMP4 host, as well as with the 3.12.26-1 kernel on a
> Debian 6 SMP6 host.
>
> One thing I noticed is that putting a dummy CPU load on the HOST
> (like running multiple instances of the "while true; do false; done"
> script) in parallel with the migration makes the issue quite easy to
> reproduce.
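
The host load mentioned above can be generated roughly like this. It is
only a sketch: the function names start_host_load and stop_host_load
are made up for illustration, and the number of loops to start is a
guess that depends on the host's CPU count.

```shell
#!/bin/sh
# PIDs of the busy loops started so far.
LOAD_PIDS=""

# Start $1 busy loops in the background; each one keeps a host CPU busy.
start_host_load() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        # Same loop as quoted above, run in a background subshell.
        ( while true; do false; done ) &
        LOAD_PIDS="$LOAD_PIDS $!"
        i=$((i + 1))
    done
}

# Kill all busy loops started by start_host_load and reap them.
stop_host_load() {
    for pid in $LOAD_PIDS; do
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true
    done
    LOAD_PIDS=""
}
```

For example, "start_host_load 4" before starting the migration and
"stop_host_load" once it has finished.
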
>
>
> Looking inside the Windows crash dump, the second CPU is just running
> at IRQL 0, and it is apparently not hung, as Windows is able to save
> its state in the crash dump correctly, which implies running some code
> on it.
> So this appears to be a timing issue (e.g. the host scheduler does not
> schedule the thread executing the secondary CPU's code in time).
>
> Could you give me some insight on this, i.e. is there a way to
> customize QEMU/KVM to avoid this issue?
>
> If you think this might be a qemu/kvm issue, I can provide you with
> any info, like Windows crash dumps, or the test case to reproduce this.
>
>
> qemu is started as:
>
> from-VM:
>
> qemu-system-x86_64 \
> -S \
> -name 'virt-tests-vm1' \
> -sandbox off \
> -M pc-1.0 \
> -nodefaults \
> -vga std \
> -chardev socket,id=qmp_id_qmp1,path=/tmp/monitor-qmp1-20150123-112624-aFZmIkNT,server,nowait \
> -mon chardev=qmp_id_qmp1,mode=control \
> -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150123-112624-aFZmIkNT,server,nowait \
> -device isa-serial,chardev=serial_id_serial0 \
> -chardev socket,id=seabioslog_id_20150123-112624-aFZmIkNT,path=/tmp/seabios-20150123-112624-aFZmIkNT,server,nowait \
> -device isa-debugcon,chardev=seabioslog_id_20150123-112624-aFZmIkNT,iobase=0x402 \
> -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
> -drive id=drive_image1,if=none,file=/path/to/image.qcow2 \
> -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=04 \
> -device virtio-net-pci,mac=9a:74:75:76:77:78,id=idFdaC4M,vectors=4,netdev=idKFZNXH,bus=pci.0,addr=05 \
> -netdev user,id=idKFZNXH,hostfwd=tcp::5000-:22,hostfwd=tcp::5001-:10023 \
> -m 2G \
> -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
> -cpu phenom \
> -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
> -vnc :0 \
> -rtc base=localtime,clock=host,driftfix=none \
> -boot order=cdn,once=c,menu=off \
> -enable-kvm
>
> to-VM:
>
> qemu-system-x86_64 \
> -S \
> -name 'virt-tests-vm1' \
> -sandbox off \
> -M pc-1.0 \
> -nodefaults \
> -vga std \
> -chardev socket,id=qmp_id_qmp1,path=/tmp/monitor-qmp1-20150123-112750-VehjvEqK,server,nowait \
> -mon chardev=qmp_id_qmp1,mode=control \
> -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150123-112750-VehjvEqK,server,nowait \
> -device isa-serial,chardev=serial_id_serial0 \
> -chardev socket,id=seabioslog_id_20150123-112750-VehjvEqK,path=/tmp/seabios-20150123-112750-VehjvEqK,server,nowait \
> -device isa-debugcon,chardev=seabioslog_id_20150123-112750-VehjvEqK,iobase=0x402 \
> -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
> -drive id=drive_image1,if=none,file=/path/to/image.qcow2 \
> -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=04 \
> -device virtio-net-pci,mac=9a:74:75:76:77:78,id=idI46M9C,vectors=4,netdev=idl9vRQt,bus=pci.0,addr=05 \
> -netdev user,id=idl9vRQt,hostfwd=tcp::5002-:22,hostfwd=tcp::5003-:10023 \
> -m 2G \
> -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
> -cpu phenom \
> -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
> -vnc :1 \
> -rtc base=localtime,clock=host,driftfix=none \
> -boot order=cdn,once=c,menu=off \
> -enable-kvm \
> -incoming tcp:0:5200
>
>
> Thanks,
> Mikhail