* kvm-63: Windows Server 2003 randomly hangs with 100% CPU
@ 2008-03-28 14:21 Felix Leimbach
2008-03-30 15:44 ` Felix Leimbach
0 siblings, 1 reply; 3+ messages in thread
From: Felix Leimbach @ 2008-03-28 14:21 UTC (permalink / raw)
To: kvm-devel
Hello list,
I encountered two Windows Server 2003 Std SP1 guests which would
randomly hang with 100% CPU after several hours of normal operation.
The guest would not respond at all, not even to ping requests. Its
process on the host uses 100% CPU.
The Host:
Dual-Core AMD Opteron(tm) Processor 2212
10GB DDRII RAM
Linux 2.6.24.3
KVM-63 userspace
Other KVM guests (Linux and Windows Vista) are rock-solid
The two problematic guests:
Imaged from VMware ESX 2.5 with reinstall of the ACPI Uniprocessor HAL
and drivers.
Switching to the Standard PC HAL and using -no-acpi does not help.
I first tried the new virtio NIC and later switched to ne2k_pci, but
that had no effect.
One guest uses a raw hard disk image, the other qcow2.
The guests are launched with this command-line:
kvm -hda bonus-system.qcow2 -m 1024 \
  -net nic,model=ne2k_pci,macaddr=52:54:00:74:A1:0C \
  -net tap,ifname=vm-bonus \
  -vnc 10.73.250.1:2 -k de \
  -monitor tcp:10.73.250.1:5952,server,nowait \
  -localtime
Currently I'm testing with a fresh install of Win2003 STD SP1 to rule
out the imaged-from-ESX factor.
Also I'm running one of the machines with -no-kvm now and will report
back how that works out.
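To keep the test variants mentioned here (-no-acpi, -no-kvm) comparable, the
command line could be generated from one base list. This is only a sketch:
build_kvm_args is a hypothetical helper, and the arguments are simply the
ones quoted in this message.

```python
# Base arguments copied from the command line quoted in this message.
BASE_ARGS = [
    "kvm", "-hda", "bonus-system.qcow2", "-m", "1024",
    "-net", "nic,model=ne2k_pci,macaddr=52:54:00:74:A1:0C",
    "-net", "tap,ifname=vm-bonus",
    "-vnc", "10.73.250.1:2", "-k", "de",
    "-monitor", "tcp:10.73.250.1:5952,server,nowait",
    "-localtime",
]


def build_kvm_args(no_kvm=False, no_acpi=False):
    """Return the argv list for one test variant (hypothetical helper)."""
    args = list(BASE_ARGS)  # copy so the base list is never modified
    if no_kvm:
        args.append("-no-kvm")   # run the guest without KVM acceleration
    if no_acpi:
        args.append("-no-acpi")  # run the guest without ACPI
    return args


# Print the three variants discussed in this message.
for variant in ({}, {"no_acpi": True}, {"no_kvm": True}):
    print(" ".join(build_kvm_args(**variant)))
```

Changing only one flag per run makes it easier to attribute a hang, or its
absence, to that flag.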
Any idea how to go about debugging this?
Also I see interesting syslog entries, but they don't seem directly
related to the hangs, as the times do not match.
Mar 27 14:30:26 kernsrc@obelix kvm: 1363: cpu0 kvm_set_msr_common: MSR_IA32_MC0_STATUS 0x0, nop
Mar 27 15:46:52 kernsrc@obelix Ignoring de-assert INIT to vcpu 0
Mar 27 17:20:45 kernsrc@obelix Ignoring de-assert INIT to vcpu 0
Mar 27 17:37:30 kernsrc@obelix apic write: bad size=1 fee00030
Mar 27 17:37:30 kernsrc@obelix Ignoring de-assert INIT to vcpu 0
Mar 27 17:42:19 kernsrc@obelix apic write: bad size=1 fee00030
Mar 27 17:42:19 kernsrc@obelix Ignoring de-assert INIT to vcpu 0
Mar 27 17:55:34 kernsrc@obelix apic write: bad size=1 fee00030
Mar 27 17:55:34 kernsrc@obelix Ignoring de-assert INIT to vcpu 0
Mar 27 20:13:14 kernsrc@obelix apic write: bad size=1 fee00030
Mar 27 20:13:14 kernsrc@obelix Ignoring de-assert INIT to vcpu 0
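Whether the log times match a hang can be checked mechanically. The sketch
below parses a few of the lines above and lists the entries within a window
around a hang time. The year (2008, taken from the mail date), the
five-minute window, and the hang time in the usage note are assumptions;
parse_line and entries_near are hypothetical helper names.

```python
from datetime import datetime, timedelta

# Sample lines copied from the syslog excerpt above.
LOG_LINES = [
    "Mar 27 15:46:52 kernsrc@obelix Ignoring de-assert INIT to vcpu 0",
    "Mar 27 17:37:30 kernsrc@obelix apic write: bad size=1 fee00030",
    "Mar 27 17:37:30 kernsrc@obelix Ignoring de-assert INIT to vcpu 0",
    "Mar 27 20:13:14 kernsrc@obelix apic write: bad size=1 fee00030",
]


def parse_line(line, year=2008):
    """Split a syslog line into (timestamp, message).

    Syslog omits the year, so it is passed in as an assumption.
    """
    month, day, clock, _host, message = line.split(None, 4)
    stamp = datetime.strptime(f"{year} {month} {day} {clock}",
                              "%Y %b %d %H:%M:%S")
    return stamp, message


def entries_near(lines, hang_time, window=timedelta(minutes=5)):
    """Return the parsed entries logged within `window` of `hang_time`."""
    return [(stamp, message)
            for stamp, message in map(parse_line, lines)
            if abs(stamp - hang_time) <= window]
```

For example, entries_near(LOG_LINES, datetime(2008, 3, 27, 22, 0, 0)) uses a
made-up hang time and returns an empty list, which is the "times do not
match" case described above.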
Cheers,
FL
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
* Re: kvm-63: Windows Server 2003 randomly hangs with 100% CPU
2008-03-28 14:21 kvm-63: Windows Server 2003 randomly hangs with 100% CPU Felix Leimbach
@ 2008-03-30 15:44 ` Felix Leimbach
2008-04-01 15:39 ` Felix Leimbach
0 siblings, 1 reply; 3+ messages in thread
From: Felix Leimbach @ 2008-03-30 15:44 UTC (permalink / raw)
To: kvm-devel
Update:
The same mysterious 100% CPU Windows hangs occur with KVM-64 on a
2.6.24.3 kernel.
While I could not yet reproduce the hang on a cleanly installed Windows
Server 2003 STD guest, the same hang did occur on an idle Windows Vista
guest which was cleanly installed in the KVM environment.
So I guess my transferring the guests from an ESX 2.5 machine does *not*
cause the hangs.
The Linux guests (2.6.24 kernels) run 100% stable.
I'm still willing to assist in debugging this if a KVM developer is
interested.
A TCP monitor port is open and I'm able to run gdb on the hung machine
(if that is of any use with closed-source guests).
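As a rough illustration of what the open monitor port allows, the sketch
below sends a single command to a TCP monitor and returns the raw reply.
query_monitor is a hypothetical helper; the address in the usage comment is
the -monitor endpoint from the command line in the first message, and
"info registers" is assumed to be available in this userspace version.

```python
import socket


def query_monitor(host, port, command, timeout=2.0):
    """Send one command to a QEMU/KVM TCP monitor and return the raw reply.

    The reply includes whatever the monitor prints (banner, prompt, echo);
    no attempt is made to parse it.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\n")
        try:
            while True:
                data = sock.recv(4096)
                if not data:       # the monitor closed the connection
                    break
                chunks.append(data)
        except socket.timeout:     # no further output within the timeout
            pass
    return b"".join(chunks).decode("ascii", errors="replace")


# Usage against the monitor from the first message (not run here):
# print(query_monitor("10.73.250.1", 5952, "info registers"))
```

Capturing the register dump a few times during a hang would show whether the
guest is looping at one instruction pointer or still making progress.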
Next I'm going to test on 2.6.25-rc7-git5 with KVM-64.
On a side note: the help texts in the 2.6.25 kernel for the new virtio
{pci,balloon} drivers do not state whether the drivers are intended for
the guest, the host, or both. I think that would be worth pointing
out, as is done in the CONFIG_KVM help text (the keyword being
"hosting" there).
* Re: kvm-63: Windows Server 2003 randomly hangs with 100% CPU
2008-03-30 15:44 ` Felix Leimbach
@ 2008-04-01 15:39 ` Felix Leimbach
0 siblings, 0 replies; 3+ messages in thread
From: Felix Leimbach @ 2008-04-01 15:39 UTC (permalink / raw)
To: kvm-devel
> Next I'm going to test on 2.6.25-rc7-git5 with KVM-64.
Running on 2.6.25-rc7-git5 with KVM-64 for 48 hours now and no more hangs!
Note that the longest uptime of the Windows 2003 Std guest before this was
around 20 hours.
So it *seems* that 2.6.25 does indeed solve the problem, although a
couple more days of testing are needed to be sure.
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel