From: Mikhail Sennikovskii <mikhail.sennikovskii@profitbricks.com>
To: Jidong Xiao <jidong.xiao@gmail.com>
Cc: KVM <kvm@vger.kernel.org>
Subject: Re: Windows 2008 Guest BSODS with CLOCK_WATCHDOG_TIMEOUT on VM migration
Date: Thu, 29 Jan 2015 08:57:31 +0100
Message-ID: <54C9E7EB.5090402@profitbricks.com>
In-Reply-To: <CAG4AFWbXkTh1U+v6sWCttZAu9UhZkSz=BvGORFBPw8LPpAbFvw@mail.gmail.com>

Hi Jidong,

right, this issue is SMP-specific.

Mikhail

On 27.01.2015 20:09, Jidong Xiao wrote:
> On Tue, Jan 27, 2015 at 5:55 AM, Mikhail Sennikovskii
> <mikhail.sennikovskii@profitbricks.com> wrote:
>> Hi all,
>>
>> I posted the mail below to the qemu-dev mailing list, but got no
>> response there.
>> That's why I decided to re-post it here as well; besides, I think
>> this could be a kvm-specific issue.
>>
>> Some additional things to note:
>> I can reproduce the issue on my Debian 7 with the 3.16.0-0.bpo.4-amd64
>> kernel as well.
>> I typically use a max_downtime adjusted to 1 second instead of the
>> default 30 ms.
>> I also noticed that the issue happens much more rarely if I increase the
>> migration bandwidth, e.g. like this:
>>
>> diff --git a/migration.c b/migration.c
>> index 26f4b65..d2e3b39 100644
>> --- a/migration.c
>> +++ b/migration.c
>> @@ -36,7 +36,7 @@ enum {
>>       MIG_STATE_COMPLETED,
>>   };
>>
>> -#define MAX_THROTTLE  (32 << 20)      /* Migration speed throttling */
>> +#define MAX_THROTTLE  (90 << 20)      /* Migration speed throttling */
>>
>> Like I said below, I would be glad to provide you with any additional
>> information.
>>
>> Thanks,
>> Mikhail
>>
> Hi, Mikhail,
>
> So if you choose to use one vcpu, instead of smp, this issue would not
> happen, right?
>
> -Jidong
>
>> On 23.01.2015 15:03, Mikhail Sennikovskii wrote:
>>> Hi all,
>>>
>>> I'm running a slightly modified migration-over-TCP test in virt-test, which
>>> migrates one "smp=2" VM to another on the same host over TCP
>>> and puts some dummy CPU load on the GUEST during migration. After
>>> a series of runs I always get a CLOCK_WATCHDOG_TIMEOUT BSOD
>>> inside the guest,
>>> which happens when
>>> "
>>> An expected clock interrupt was not received on a secondary processor in
>>> an MP system within the allocated interval. This indicates that the
>>> specified processor is hung and not processing interrupts.
>>> "
>>>
>>> This seems to happen with any qemu version I've tested (1.2 and above,
>>> including upstream).
>>> I tested with the 3.13.0-44-generic kernel on an Ubuntu 14.04.1 LTS
>>> host (SMP4), as well as with the 3.12.26-1 kernel on a Debian 6 host
>>> (SMP6).
>>>
>>> One thing I noticed is that putting a dummy CPU load on the HOST (like
>>> running multiple instances of the "while true; do false; done" script) in
>>> parallel with the migration makes the issue quite easy to
>>> reproduce.
>>>
>>>
>>> Looking inside the Windows crash dump, the second CPU is just running at
>>> IRQL 0, and it is apparently not hung, as Windows is able to save its state
>>> in the crash dump correctly, which implies running some code on it.
>>> So this appears to be a timing issue (e.g. the host scheduler does
>>> not schedule the thread executing the secondary CPU's code in time).
>>>
>>> Could you give me some insight on this, i.e. is there a way to customize
>>> QEMU/KVM to avoid such an issue?
>>>
>>> If you think this might be a qemu/kvm issue, I can provide you with any
>>> info, like Windows crash dumps, or the test case to reproduce this.
>>>
>>>
>>> qemu is started as:
>>>
>>> from-VM:
>>>
>>> qemu-system-x86_64 \
>>>      -S  \
>>>      -name 'virt-tests-vm1'  \
>>>      -sandbox off  \
>>>      -M pc-1.0  \
>>>      -nodefaults  \
>>>      -vga std  \
>>>      -chardev socket,id=qmp_id_qmp1,path=/tmp/monitor-qmp1-20150123-112624-aFZmIkNT,server,nowait \
>>>      -mon chardev=qmp_id_qmp1,mode=control  \
>>>      -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150123-112624-aFZmIkNT,server,nowait \
>>>      -device isa-serial,chardev=serial_id_serial0  \
>>>      -chardev socket,id=seabioslog_id_20150123-112624-aFZmIkNT,path=/tmp/seabios-20150123-112624-aFZmIkNT,server,nowait \
>>>      -device isa-debugcon,chardev=seabioslog_id_20150123-112624-aFZmIkNT,iobase=0x402 \
>>>      -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
>>>      -drive id=drive_image1,if=none,file=/path/to/image.qcow2 \
>>>      -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=04 \
>>>      -device virtio-net-pci,mac=9a:74:75:76:77:78,id=idFdaC4M,vectors=4,netdev=idKFZNXH,bus=pci.0,addr=05 \
>>>      -netdev user,id=idKFZNXH,hostfwd=tcp::5000-:22,hostfwd=tcp::5001-:10023  \
>>>      -m 2G  \
>>>      -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
>>>      -cpu phenom \
>>>      -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
>>>      -vnc :0  \
>>>      -rtc base=localtime,clock=host,driftfix=none  \
>>>      -boot order=cdn,once=c,menu=off \
>>>      -enable-kvm
>>>
>>> to-VM:
>>>
>>> qemu-system-x86_64 \
>>>      -S  \
>>>      -name 'virt-tests-vm1'  \
>>>      -sandbox off  \
>>>      -M pc-1.0  \
>>>      -nodefaults  \
>>>      -vga std  \
>>>      -chardev socket,id=qmp_id_qmp1,path=/tmp/monitor-qmp1-20150123-112750-VehjvEqK,server,nowait \
>>>      -mon chardev=qmp_id_qmp1,mode=control  \
>>>      -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150123-112750-VehjvEqK,server,nowait \
>>>      -device isa-serial,chardev=serial_id_serial0  \
>>>      -chardev socket,id=seabioslog_id_20150123-112750-VehjvEqK,path=/tmp/seabios-20150123-112750-VehjvEqK,server,nowait \
>>>      -device isa-debugcon,chardev=seabioslog_id_20150123-112750-VehjvEqK,iobase=0x402 \
>>>      -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
>>>      -drive id=drive_image1,if=none,file=/path/to/image.qcow2 \
>>>      -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=04 \
>>>      -device virtio-net-pci,mac=9a:74:75:76:77:78,id=idI46M9C,vectors=4,netdev=idl9vRQt,bus=pci.0,addr=05 \
>>>      -netdev user,id=idl9vRQt,hostfwd=tcp::5002-:22,hostfwd=tcp::5003-:10023  \
>>>      -m 2G  \
>>>      -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
>>>      -cpu phenom \
>>>      -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
>>>      -vnc :1  \
>>>      -rtc base=localtime,clock=host,driftfix=none  \
>>>      -boot order=cdn,once=c,menu=off \
>>>      -enable-kvm \
>>>      -incoming tcp:0:5200
>>>
>>>
>>> Thanks,
>>> Mikhail
>>

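For orientation, the two MAX_THROTTLE values in the quoted patch can be turned into rough numbers. The sketch below is only back-of-the-envelope arithmetic: it assumes the throttle is the sole limit, that guest RAM is sent exactly once, and that nothing is re-dirtied or compressed, none of which holds for a real migration under load. The helper names are made up for illustration and are not part of QEMU.

```python
MIB = 1 << 20  # bytes per MiB


def transfer_seconds(ram_bytes, bytes_per_second):
    """Time for one full pass over guest RAM at a fixed throttle."""
    return ram_bytes / bytes_per_second


def budget_mib(bytes_per_second, downtime_seconds):
    """Amount of data that fits into the allowed downtime window, in MiB."""
    return bytes_per_second * downtime_seconds / MIB


GUEST_RAM = 2 * 1024 * MIB  # "-m 2G" from the quoted command line
THROTTLES = {"stock": 32 << 20, "patched": 90 << 20}  # MAX_THROTTLE values

for label, rate in THROTTLES.items():
    print(f"{label}: full pass {transfer_seconds(GUEST_RAM, rate):.1f} s, "
          f"30 ms budget {budget_mib(rate, 0.030):.2f} MiB, "
          f"1 s budget {budget_mib(rate, 1.0):.0f} MiB")
```

Under these assumptions a single pass over 2 GiB takes 64 s at the stock throttle and roughly 23 s at the patched one, so the patched build spends much less time in the phase where the guest and the migration thread compete for host CPU. Whether that is the actual reason the BSOD becomes rarer is not established in this thread.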

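The bandwidth and downtime settings discussed in the quoted mail can probably also be changed at runtime over the QMP socket that both command lines already open, instead of rebuilding QEMU. The sketch below only builds the JSON messages; it does not talk to a socket. It assumes the `migrate_set_speed` (bytes per second) and `migrate_set_downtime` (seconds) QMP commands, which were available in QEMU releases of that period and were later deprecated in favour of `migrate-set-parameters`, so check the QMP reference for the version in use. It is also an assumption that setting the speed this way has the same effect as the MAX_THROTTLE patch. The function names are hypothetical.

```python
import json


def qmp_command(name, **arguments):
    """Serialize one QMP command as a JSON string."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return json.dumps(msg)


def migration_tuning_commands(speed_bytes_per_s, downtime_s):
    """Messages to send on the source VM's QMP socket before migrating."""
    return [
        # QMP requires capability negotiation before any other command.
        qmp_command("qmp_capabilities"),
        # Assumed command names; deprecated/removed in later QEMU versions.
        qmp_command("migrate_set_speed", value=speed_bytes_per_s),
        qmp_command("migrate_set_downtime", value=downtime_s),
    ]


if __name__ == "__main__":
    # 90 MiB/s and 1 s, matching the values mentioned in the quoted mail.
    for line in migration_tuning_commands(90 << 20, 1.0):
        print(line)
```

Each printed line would be written to the monitor socket (for example the /tmp/monitor-qmp1-... path from the from-VM command line) after reading the QMP greeting.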
Thread overview: 5+ messages
     [not found] <20150118030317.23598.27686.malonedeb@chaenomeles.canonical.com>
     [not found] ` <54C254AF.7010101@profitbricks.com>
2015-01-27 13:55   ` Windows 2008 Guest BSODS with CLOCK_WATCHDOG_TIMEOUT on VM migration Mikhail Sennikovskii
2015-01-27 19:09     ` Jidong Xiao
2015-01-28  6:42       ` Zhang Haoyu
2015-01-29  7:56         ` Mikhail Sennikovskii
2015-01-29  7:57       ` Mikhail Sennikovskii [this message]