From: Harri Olin
Subject: Re: Freezing Windows 2008 x64bit guest
Date: Wed, 21 Jul 2010 13:09:06 +0300
Message-ID: <4C46C742.20808@gmail.com>
References: <20100715131944.GA24978@mp3.niederrhein.de>
 <20100715134441.GQ4689@redhat.com>
 <4C43FBEE.6000409@gmail.com>
 <20100719074207.GA4689@redhat.com>
 <4C4692DB.1000703@gmail.com>
 <20100721083728.GI27238@redhat.com>
 <4C46BC65.9050003@gmail.com>
 <20100721094846.GJ27238@redhat.com>
In-Reply-To: <20100721094846.GJ27238@redhat.com>
To: Gleb Natapov
Cc: Christoph Adomeit , "kvm@vger.kernel.org"
Sender: kvm-owner@vger.kernel.org

Gleb Natapov wrote:
> On Wed, Jul 21, 2010 at 12:22:45PM +0300, Harri Olin wrote:
>> Gleb Natapov wrote:
>>> On Wed, Jul 21, 2010 at 09:25:31AM +0300, Harri Olin wrote:
>>>> Gleb Natapov wrote:
>>>>> On Mon, Jul 19, 2010 at 10:17:02AM +0300, Harri Olin wrote:
>>>>>> Gleb Natapov wrote:
>>>>>>> On Thu, Jul 15, 2010 at 03:19:44PM +0200, Christoph Adomeit wrote:
>>>>>>>> But one Windows 2008 64 Bit Server Standard is freezing regularly.
>>>>>>>> This happens sometimes 3 times a day, sometimes it takes 2 days
>>>>>>>> until freeze. The Windows Machine is a clean fresh install.
>>>>>> I think I have seen same problem occur on my Windows 2008 SBS SP2
>>>>>> 64bit system, but a bit less often, only like once a week.
>>>>>> Now I haven't seen crashes but only freezes with qemu on 100% and
>>>>>> virtual system unresponsive.
>>>>>>
>>>>>> qemu command line:
>>>>>> /usr/local/qemu-kvm-0.11.1/bin/qemu-system-x86_64 -drive
>>>>>> file=/dev/rigelvg/w2k8system,cache=none,boot=on -drive
>>>>>> file=/dev/rigelvg/w2k8data,cache=none -m 6144 -vnc :1 -net
>>>>>> nic,macaddr=C0:FF:12:FB:AA:01,model=e1000 -net tap -smp 4 -localtime
>>>>>>
>>>>>>
>>>>> Try with different model then e1000 please. Default driver that comes
>>>>> with Windows known to have problems.
>>>> Didn't help, changed to default realtek emulation, but system
>>>> freezed again. Command line:
>>>> /usr/local/qemu-kvm-0.11.1/bin/qemu-system-x86_64 -drive
>>>> file=/dev/rigelvg/w2k8system,cache=none,boot=on -drive
>>>> file=/dev/rigelvg/w2k8data,cache=none -m 6144 -vnc :1 -net
>>>> nic,macaddr=C0:FF:12:FB:AA:01 -net tap -smp 4 -localtime
>>>>
>>>> This time the virtual system was not totally unresponsive somehow,
>>>> system pinged just fine and I think DNS server worked too. However
>>>> on console mouse moved but didn't react to clicking, ctrl-alt-del,
>>>> etc.
>>>>
>>> Does sendkey from monitor works? qemu-kvm-0.11.1 is very old and this is
>>> not total freeze which even harder to debug. I don't see anything
>>> extraordinary in your logs. 4643 interrupt per second for 4 cpus is
>>> normal if windows runs multimedia or other app that need hi-res timers.
>>> Does your host swapping? Is there any chance that you can try upstream qemu-kvm?
>>>
>> I haven't tested sendkey. I originally tried 0.12.2 or so but with
>> that windows crashed more often with bluescreen, I went down to
>> 0.11.1. This might be related to -td-rtc-hack and/or e1000 as using
>> td-rtc-hack on 0.11.1 windows crashed too to bluescreen.
>>
> With your guest -td-rtc-hack does nothing anyway, so you can just drop
> it. Although it shouldn't cause BSOD with upstream (it had such bug in
> the past and I am not sure that 0.12 has the fix). And you are using
> HPET anyway so -td-rtc-hack should be nop for you.

Ok, I think I added it because the Windows guest clock lagged behind a
lot, but I didn't check whether it helped.

>> Only thing that catches my eye on stats frozen vs normal is
>> irq_exits which jumps from 200 to 10000 when frozen and irq_window
>> which goes 10x.
> Hm, default Windows timer rate is 64Hz, so if nothing happens in the
> guest you should see at least 256 irq injection per second. In your KVM
> start output I only see 4655 irq injection per second not 10000. And
> trace shows that most of them are timer interrupts (vector 209 IIRC).

The kvm_stat output from a frozen guest can be seen here; when I grabbed
this, Windows did not answer pings.
stats: http://mizar.remote.agasha.com/k/kvm/kvm_stat_hang.txt
trace: http://mizar.remote.agasha.com/k/kvm/kvm_trace_hang.txt

Here's another stat dump from a frozen state; when I got this, Windows
did answer pings but was otherwise unresponsive:
stats: http://mizar.remote.agasha.com/k/kvm/kvm_stat_2_log.txt
trace: http://mizar.remote.agasha.com/k/kvm/kvm_trace_2.txt

Both times irq_exits went to about 10000 while irq_injections stayed at
around 4600. Both times the qemu process was at 100% CPU load.

Note that the stats quoted below are from normal operation, where
everything works.

>> Host has swap partition enabled but hasn't swapped once yet,
>> currently there's over 1GB of free.
> And how many cpus does it have?

The host has one quad-core Q8300 and 8GB of memory.

> BTW I see a lot of instruction emulations. That is strange. Looking at
> you trace again I see that the bulk of the emulations are caused by mmio
> to HPET. Please run with -no-hpet flag.

I'll do that later today too.

>
>> I'll update host kernel to latest stable and qemu-kvm to upstream
>> later today.
>>
>> Here's stats from running system:
>> efer_reload                  0      0
>> exits                386474714  23272
>> fpu_reload           173420913  12775
>> halt_exits            51894921   4294
>> halt_wakeup           51039254   4223
>> host_state_reload    188645310  13420
>> hypercalls                   0      0
>> insn_emulation       212729430  15545
>> insn_emulation_fail        324      0
>> invlpg                 5177828    134
>> io_exits              23132581    161
>> irq_exits              7485833    225
>> irq_injections        59239306   4655
>> irq_window             2147438    153
>> largepages                   0      0
>> mmio_exits           110931423   9026
>> mmu_cache_miss          468410      0
>> mmu_flooded             193153      0
>> mmu_pde_zapped          352178      0
>> mmu_pte_updated         678023      0
>> mmu_pte_write           587175      0
>> mmu_recycled             34998      0
>> mmu_shadow_zapped       454073      0
>> mmu_unsync               19084      0
>> nmi_injections               0      0
>> nmi_window                   0      0
>> pf_fixed              25171943    120
>> pf_guest               6209034      2
>> remote_tlb_flush       3044169     23
>> request_irq                  0      0
>> signal_exits                 0      0
>> tlb_flush              8422950    149
>>
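
The frozen-vs-normal comparison discussed above can be sketched as a small
script. This is only an illustration, not a tool anyone in the thread used.
The helper names (parse_rates, rate_jumps, idle_timer_irqs) and the 5x
threshold are hypothetical. The baseline rows are copied from the table
quoted above. The frozen rates are rough figures from the prose in this mail
(irq_exits near 10000, irq_injections near 4600, irq_window about 10x), not
values read from the linked log files.

```python
"""Sketch: flag kvm_stat counters whose per-second rate jumps between snapshots."""

# Three rows copied from the "normal operation" table quoted in this mail
# (columns: counter name, running total, events per second).
BASELINE_TEXT = """\
>> irq_exits 7485833 225
>> irq_injections 59239306 4655
>> irq_window 2147438 153
"""

# ASSUMPTION: approximate frozen-state rates taken from the prose in this
# thread, not from the linked kvm_stat log files.
FROZEN_RATES = {"irq_exits": 10000, "irq_injections": 4600, "irq_window": 1530}


def parse_rates(text):
    """Return {counter: rate} from 'name total rate' lines, ignoring '>' quoting."""
    rates = {}
    for line in text.splitlines():
        fields = line.lstrip("> ").split()
        if len(fields) == 3 and fields[1].isdigit() and fields[2].isdigit():
            rates[fields[0]] = int(fields[2])
    return rates


def rate_jumps(baseline, frozen, factor=5.0):
    """Return {counter: (old, new)} where new >= factor * old (old floored at 1)."""
    return {
        name: (old, frozen[name])
        for name, old in baseline.items()
        if name in frozen and frozen[name] >= factor * max(old, 1)
    }


def idle_timer_irqs(timer_hz=64, vcpus=4):
    """Timer injections per second expected from an idle guest (64 Hz per vcpu)."""
    return timer_hz * vcpus


if __name__ == "__main__":
    baseline = parse_rates(BASELINE_TEXT)
    for name, (old, new) in rate_jumps(baseline, FROZEN_RATES).items():
        print(f"{name}: {old}/s -> {new}/s")
    print("idle floor:", idle_timer_irqs(), "irq injections/s")
```

With these inputs only irq_exits and irq_window are flagged, while
irq_injections stays close to its baseline, which matches the observation
above. The last function reproduces the 64Hz x 4 vcpus = 256 figure from
Gleb's reply.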