From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mikhail Sennikovskii
Subject: Re: Windows 2008 Guest BSODS with CLOCK_WATCHDOG_TIMEOUT on VM migration
Date: Thu, 29 Jan 2015 08:56:39 +0100
Message-ID: <54C9E7B7.9030501@profitbricks.com>
References: <20150118030317.23598.27686.malonedeb@chaenomeles.canonical.com>, <54C254AF.7010101@profitbricks.com>, <54C798D0.6010600@profitbricks.com>, <201501281442344387170@sangfor.com.cn>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: KVM
To: Zhang Haoyu , Jidong Xiao
Return-path: 
Received: from mail-wi0-f180.google.com ([209.85.212.180]:51927 "EHLO mail-wi0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752160AbbA2H4q (ORCPT ); Thu, 29 Jan 2015 02:56:46 -0500
Received: by mail-wi0-f180.google.com with SMTP id h11so20719107wiw.1 for ; Wed, 28 Jan 2015 23:56:44 -0800 (PST)
In-Reply-To: <201501281442344387170@sangfor.com.cn>
Sender: kvm-owner@vger.kernel.org
List-ID: 

Hi Zhang,

Thanks a lot for the suggestion, it indeed worked for me!
I.e. after adding hv_relaxed to the list of CPU properties, I can no
longer reproduce the BSOD on migration with any kernel version I have
tried so far.

Thanks for your help,
Mikhail

On 28.01.2015 07:42, Zhang Haoyu wrote:
> On 2015-01-28 03:10:23, Jidong Xiao wrote:
>> On Tue, Jan 27, 2015 at 5:55 AM, Mikhail Sennikovskii wrote:
>>> Hi all,
>>>
>>> I've posted the below mail to the qemu-dev mailing list, but I got no
>>> response there.
>>> That's why I decided to re-post it here as well; besides, I think
>>> this could be a kvm-specific issue.
>>>
>>> Some additional things to note:
>>> I can reproduce the issue on my Debian 7 with the 3.16.0-0.bpo.4-amd64
>>> kernel as well.
>>> I would typically use a max_downtime adjusted to 1 second instead of
>>> the default 30 ms.
>>> I also noticed that the issue happens much more rarely if I increase
>>> the migration bandwidth, i.e.
like
>>>
>>> diff --git a/migration.c b/migration.c
>>> index 26f4b65..d2e3b39 100644
>>> --- a/migration.c
>>> +++ b/migration.c
>>> @@ -36,7 +36,7 @@ enum {
>>>     MIG_STATE_COMPLETED,
>>> };
>>>
>>> -#define MAX_THROTTLE (32 << 20) /* Migration speed throttling */
>>> +#define MAX_THROTTLE (90 << 20) /* Migration speed throttling */
>>>
>>> Like I said below, I would be glad to provide you with any additional
>>> information.
>>>
>>> Thanks,
>>> Mikhail
>>>
>> Hi, Mikhail,
>>
>> So if you choose to use one vcpu, instead of smp, this issue would not
>> happen, right?
>>
> I think you can try the cpu feature hv_relaxed, like
> -cpu Haswell,hv_relaxed
>
>> -Jidong
>>
>>> On 23.01.2015 15:03, Mikhail Sennikovskii wrote:
>>>> Hi all,
>>>>
>>>> I'm running a slightly modified migration-over-tcp test in virt-test,
>>>> which does a migration from one "smp=2" VM to another on the same host
>>>> over TCP, and exposes some dummy CPU load inside the GUEST during
>>>> migration. After a series of runs I always get a CLOCK_WATCHDOG_TIMEOUT
>>>> BSOD inside the guest, which happens when
>>>> "
>>>> An expected clock interrupt was not received on a secondary processor in
>>>> an MP system within the allocated interval. This indicates that the
>>>> specified processor is hung and not processing interrupts.
>>>> "
>>>>
>>>> This seems to happen with any qemu version I've tested (1.2 and above,
>>>> including upstream), and I was testing it with the 3.13.0-44-generic
>>>> kernel on my Ubuntu 14.04.1 LTS host with 4-way SMP, as well as with the
>>>> 3.12.26-1 kernel on a Debian 6 host with 6-way SMP.
>>>>
>>>> One thing I noticed is that exposing a dummy CPU load on the HOST (like
>>>> running multiple instances of a "while true; do false; done" script) in
>>>> parallel with the migration makes the issue quite easily reproducible.
>>>>
>>>> Looking inside the Windows crash dump, the second CPU is just running at
>>>> IRQL 0, and it is apparently not hung, as Windows is able to save its
>>>> state in the crash dump correctly, which implies that code is still
>>>> running on it.
>>>> So this apparently is some timing issue (e.g. the host scheduler does
>>>> not schedule the thread executing the secondary CPU's code in time).
>>>>
>>>> Could you give me some insight on this, i.e. is there a way to configure
>>>> QEMU/KVM to avoid such an issue?
>>>>
>>>> If you think this might be a qemu/kvm issue, I can provide you with any
>>>> info, like Windows crash dumps, or the test case to reproduce this.
>>>>
>>>>
>>>> qemu is started as:
>>>>
>>>> from-VM:
>>>>
>>>> qemu-system-x86_64 \
>>>> -S \
>>>> -name 'virt-tests-vm1' \
>>>> -sandbox off \
>>>> -M pc-1.0 \
>>>> -nodefaults \
>>>> -vga std \
>>>> -chardev socket,id=qmp_id_qmp1,path=/tmp/monitor-qmp1-20150123-112624-aFZmIkNT,server,nowait \
>>>> -mon chardev=qmp_id_qmp1,mode=control \
>>>> -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150123-112624-aFZmIkNT,server,nowait \
>>>> -device isa-serial,chardev=serial_id_serial0 \
>>>> -chardev socket,id=seabioslog_id_20150123-112624-aFZmIkNT,path=/tmp/seabios-20150123-112624-aFZmIkNT,server,nowait \
>>>> -device isa-debugcon,chardev=seabioslog_id_20150123-112624-aFZmIkNT,iobase=0x402 \
>>>> -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
>>>> -drive id=drive_image1,if=none,file=/path/to/image.qcow2 \
>>>> -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=04 \
>>>> -device virtio-net-pci,mac=9a:74:75:76:77:78,id=idFdaC4M,vectors=4,netdev=idKFZNXH,bus=pci.0,addr=05 \
>>>> -netdev user,id=idKFZNXH,hostfwd=tcp::5000-:22,hostfwd=tcp::5001-:10023 \
>>>> -m 2G \
>>>> -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
>>>> -cpu phenom \
>>>> -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
>>>> -vnc :0 \
>>>> -rtc
>>>> base=localtime,clock=host,driftfix=none \
>>>> -boot order=cdn,once=c,menu=off \
>>>> -enable-kvm
>>>>
>>>> to-VM:
>>>>
>>>> qemu-system-x86_64 \
>>>> -S \
>>>> -name 'virt-tests-vm1' \
>>>> -sandbox off \
>>>> -M pc-1.0 \
>>>> -nodefaults \
>>>> -vga std \
>>>> -chardev socket,id=qmp_id_qmp1,path=/tmp/monitor-qmp1-20150123-112750-VehjvEqK,server,nowait \
>>>> -mon chardev=qmp_id_qmp1,mode=control \
>>>> -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150123-112750-VehjvEqK,server,nowait \
>>>> -device isa-serial,chardev=serial_id_serial0 \
>>>> -chardev socket,id=seabioslog_id_20150123-112750-VehjvEqK,path=/tmp/seabios-20150123-112750-VehjvEqK,server,nowait \
>>>> -device isa-debugcon,chardev=seabioslog_id_20150123-112750-VehjvEqK,iobase=0x402 \
>>>> -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
>>>> -drive id=drive_image1,if=none,file=/path/to/image.qcow2 \
>>>> -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=04 \
>>>> -device virtio-net-pci,mac=9a:74:75:76:77:78,id=idI46M9C,vectors=4,netdev=idl9vRQt,bus=pci.0,addr=05 \
>>>> -netdev user,id=idl9vRQt,hostfwd=tcp::5002-:22,hostfwd=tcp::5003-:10023 \
>>>> -m 2G \
>>>> -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
>>>> -cpu phenom \
>>>> -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
>>>> -vnc :1 \
>>>> -rtc base=localtime,clock=host,driftfix=none \
>>>> -boot order=cdn,once=c,menu=off \
>>>> -enable-kvm \
>>>> -incoming tcp:0:5200
>>>>
>>>>
>>>> Thanks,
>>>> Mikhail
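
For readers hitting the same CLOCK_WATCHDOG_TIMEOUT: the workarounds
discussed above can be combined without rebuilding QEMU. hv_relaxed is an
ordinary -cpu property, and the migration bandwidth and allowed downtime
are tunable at runtime from the HMP monitor, which achieves the same
effect as the MAX_THROTTLE patch in the thread. A minimal sketch, where
the CPU model, image path, destination host name, and the 90 MB/s and
1-second values are illustrative (mirroring the numbers in the thread),
not prescriptive:

```sh
# Source VM: enable the Hyper-V "relaxed timing" enlightenment so the
# Windows guest tolerates delayed clock interrupts during migration.
qemu-system-x86_64 -enable-kvm -m 2G -smp 2 \
    -cpu Haswell,hv_relaxed \
    -drive file=/path/to/image.qcow2,if=virtio

# In the HMP monitor, raise the bandwidth cap and allowed downtime at
# runtime instead of patching MAX_THROTTLE in migration.c:
#   (qemu) migrate_set_speed 90m
#   (qemu) migrate_set_downtime 1
#   (qemu) migrate -d tcp:destination-host:5200
```

Note that hv_relaxed must be set identically on both the source and
destination command lines, since the guest-visible CPUID must not change
across migration.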