* Guest Hang Bugs
@ 2009-01-14 19:17 James Thomason
2009-01-15 15:27 ` Avi Kivity
0 siblings, 1 reply; 5+ messages in thread
From: James Thomason @ 2009-01-14 19:17 UTC (permalink / raw)
To: kvm
Hello,
I am able to reliably reproduce a condition where a guest goes into a tight
loop or spinlock on all running cores. The scenario is exactly as described
in bug 2351676, though my environment differs as detailed below. My
observation is that the issue is correlated to the number of VCPUs assigned
to the guest and CPU load. The higher the number of VCPUs and CPU
utilization, the more easily it is triggered. If a KVM developer is
interested in debugging live, I might be able to arrange getting the system
in question into a DMZ. A review of the kvm tracker leads me to believe
that the following bugs are possibly related:
[ 2351676 ] Guests hang periodically on Ubuntu-8.10
[ 2353811 ] Solaris 10 guest unstable
[ 2494730 ] Guests "stalling" on kvm-82
[ 2138079 ] kvm locks up system
[ 2113643 ] guests AND host still getting stuck under CPU load
KVM Host Configuration:
4 x Quad-Core AMD Opteron Processors (8346 HE @ 1.8 GHz)
64GB DDR2 667 MHz
Fedora 10 x64
Kernel 2.6.28
KVM-82
KVM Guest Configuration:
32GB Memory
1 to 16 VCPUs
CentOS 5.2 x64
Kernel 2.6.28
IDE disk
e1000 NIC
Regards,
James
* Re: Guest Hang Bugs
2009-01-14 19:17 Guest Hang Bugs James Thomason
@ 2009-01-15 15:27 ` Avi Kivity
2009-01-15 22:49 ` James Thomason
0 siblings, 1 reply; 5+ messages in thread
From: Avi Kivity @ 2009-01-15 15:27 UTC (permalink / raw)
To: James Thomason; +Cc: kvm
James Thomason wrote:
> Hello,
>
> I am able to reliably reproduce a condition where a guest goes into a tight
> loop or spinlock on all running cores. The scenario is exactly as described
> in bug 2351676, though my environment differs as detailed below. My
> observation is that the issue is correlated to the number of VCPUs assigned
> to the guest and CPU load. The higher the number of VCPUs and CPU
> utilization, the more easily it is triggered. If a KVM developer is
> interested in debugging live, I might be able to arrange getting the system
> in question into a DMZ. A review of the kvm tracker leads me to believe
> that the following bugs are possibly related:
>
> [ 2351676 ] Guests hang periodically on Ubuntu-8.10
> [ 2353811 ] Solaris 10 guest unstable
>
> [ 2494730 ] Guests "stalling" on kvm-82
> [ 2138079 ] kvm locks up system
> [ 2113643 ] guests AND host still getting stuck under CPU load
>
>
Can you try adding clocksource=acpi_pm to the _guest_ kernel command line?
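For anyone following along, here is a minimal sketch of adding that parameter to a guest boot entry. The helper name `add_kernel_param` and the sample kernel line are hypothetical and only illustrate the idea; the one detail taken from this thread is `clocksource=acpi_pm`.

```python
def add_kernel_param(cmdline: str, param: str) -> str:
    """Return cmdline with param appended.

    Any existing token with the same key (the text before '=') is dropped
    first, so the parameter is not specified twice.
    """
    key = param.split("=", 1)[0]
    kept = [tok for tok in cmdline.split() if tok.split("=", 1)[0] != key]
    kept.append(param)
    return " ".join(kept)


# Hypothetical guest boot entry, for illustration only.
example = "kernel /vmlinuz root=/dev/sda1 ro quiet"
print(add_kernel_param(example, "clocksource=acpi_pm"))
```

The resulting line would go into the guest's boot loader configuration, not the host's.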
--
error compiling committee.c: too many arguments to function
* Re: Guest Hang Bugs
2009-01-15 15:27 ` Avi Kivity
@ 2009-01-15 22:49 ` James Thomason
2009-01-16 1:26 ` James Thomason
0 siblings, 1 reply; 5+ messages in thread
From: James Thomason @ 2009-01-15 22:49 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm
On 1/15/09 7:27 AM, "Avi Kivity" <avi@redhat.com> wrote:
> Can you try adding clocksource=acpi_pm to the _guest_ kernel command line?
Avi,
I booted the guest with clocksource=acpi_pm and it has been running under
heavy load for about 4 hours with kvm -smp 12. Considering that the guest
previously would not even boot with kvm -smp 12, I think this is the root
cause. As another data point, I was not able to trigger the bug when
running a guest with all active cores on socket 0 only, but I was able to
trigger it when putting half the cores on socket 1. So in a nutshell the
kernel uses an unreliable clock source by default on SMP systems?
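To confirm which clock source a running guest is actually using, the kernel exposes it through sysfs. The sketch below reads the standard `clocksource0` entries; the function name `read_clocksources` is an illustrative choice, and the script is assumed to run inside the guest.

```python
from pathlib import Path

# Standard sysfs location for the clocksource selection on Linux.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base: Path = SYSFS_DIR):
    """Return (current, available) clock sources as reported by the kernel."""
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


if __name__ == "__main__":
    try:
        current, available = read_clocksources()
    except OSError as exc:
        # Not Linux, or sysfs is not mounted where expected.
        print(f"could not read clocksource info: {exc}")
    else:
        print(f"current: {current}; available: {', '.join(available)}")
```

If the boot parameter took effect, the current entry should read `acpi_pm`.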
Thanks for your help!
Regards,
James
* Re: Guest Hang Bugs
2009-01-15 22:49 ` James Thomason
@ 2009-01-16 1:26 ` James Thomason
2009-01-18 9:15 ` Avi Kivity
0 siblings, 1 reply; 5+ messages in thread
From: James Thomason @ 2009-01-16 1:26 UTC (permalink / raw)
To: James Thomason, Avi Kivity; +Cc: kvm
Avi and others,
The system under load finally panicked and all cores hit 100%, so I spoke
too soon. Unfortunately my console was garbled and I was not able to get the
text of that panic.
Afterwards I am unable to boot the guest with kvm -smp > 2 because the guest
panics at boot time. The text of that boot time panic is below. However,
booting the guest with kvm -smp 1 works. After rebooting the KVM host, I am
able to boot the guest with kvm -smp 12 etc again.
I will try to induce the guest to panic again and see if I can capture it.
Configuration:
KVM HOST:
Ubuntu 8.10
KVM-83, kernel 2.6.29-rc1
KVM Guest:
Ubuntu 8.10
2.6.27-9-server
Text of Guest Boot Panic:
Boot from (hd0,0) ext3 cd532b94-1a97-4337-bfe9-75e699839bab
Starting up ...
[ 3.290000] Stuck ??
[ 3.320000] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 3.320000] IP: [<ffffffff802e35de>] kmem_cache_create+0x7e/0x1a0
[ 3.320000] PGD 0
[ 3.320000] Oops: 0002 [1] SMP
[ 3.320000] CPU 0
[ 3.320000] Modules linked in:
[ 3.320000] Pid: 1, comm: swapper Tainted: G S 2.6.27-9-server #1
[ 3.320000] RIP: 0010:[<ffffffff802e35de>] [<ffffffff802e35de>] kmem_cache_create+0x7e/0x1a0
[ 3.320000] RSP: 0018:ffff88081cc7be50 EFLAGS: 00010293
[ 3.320000] RAX: 0000000000000080 RBX: ffffffff806df710 RCX: 0000000000000001
[ 3.320000] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
[ 3.320000] RBP: ffff88081cc7be90 R08: 0000000000000040 R09: ffffffff806e1e18
[ 3.320000] R10: ffff88081ccfdc00 R11: 0000000000000400 R12: 0000000000000080
[ 3.320000] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff805f9c69
[ 3.320000] FS: 0000000000000000(0000) GS:ffffffff806e1a80(0000) knlGS:0000000000000000
[ 3.320000] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 3.320000] CR2: 0000000000000018 CR3: 0000000000201000 CR4: 00000000000006e0
[ 3.320000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3.320000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3.320000] Process swapper (pid: 1, threadinfo ffff88081cc7a000, task ffff88081cc80000)
[ 3.320000] Stack: ffffffff80743eb2 0000000000000000 0000000000042000 ffffffff80784848
[ 3.320000] ffffffff807449ac 0000000000000000 0000000000093360 0000000000000000
[ 3.320000] ffff88081cc7bea0 ffffffff80744a09 ffff88081cc7bf10 ffffffff8020a041
[ 3.320000] Call Trace:
[ 3.320000] [<ffffffff80743eb2>] ? init_pipe_fs+0x0/0x56
[ 3.320000] [<ffffffff807449ac>] ? eventpoll_init+0x0/0x8a
[ 3.320000] [<ffffffff80744a09>] eventpoll_init+0x5d/0x8a
[ 3.320000] [<ffffffff8020a041>] do_one_initcall+0x41/0x170
[ 3.320000] [<ffffffff8029ff09>] ? register_irq_proc+0x19/0xf0
[ 3.320000] [<ffffffff80330000>] ? load_elf_binary+0xc70/0x11c0
[ 3.320000] [<ffffffff807207fb>] kernel_init+0xd2/0x128
[ 3.320000] [<ffffffff802444d7>] ? schedule_tail+0x27/0x70
[ 3.320000] [<ffffffff80213c99>] child_rip+0xa/0x11
[ 3.320000] [<ffffffff80720729>] ? kernel_init+0x0/0x128
[ 3.320000] [<ffffffff80213c8f>] ? child_rip+0x0/0x11
[ 3.320000]
[ 3.320000]
[ 3.320000] Code: 83 40 7c 01 8b 40 0c bf ff ff ff ff 44 39 e0 41 0f 4e c4 89 43 0c eb 16 0f 1f 44 00 00 48 63 c7 48 8b 94 c3 f0 02 00 00 8b 43 0c <89> 42 18 48 c7 c6 18 1e 6e 80 e8 b3 ed 0b 00 83 f8 3f 89 c7 7e
[ 3.320000] RIP [<ffffffff802e35de>] kmem_cache_create+0x7e/0x1a0
[ 3.320000] RSP <ffff88081cc7be50>
[ 3.320000] CR2: 0000000000000018
[ 3.320000] ---[ end trace 4eaa2a86a8e2da22 ]---
[ 3.320000] Kernel panic - not syncing: Attempted to kill init!
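As a reading aid for the trace above: the value after "Oops:" is the x86 page-fault error code, printed in hex. The sketch below decodes its low bits. The function name `decode_oops_code` is illustrative, and the bit meanings are those of the x86 page-fault error code, not something stated in this thread.

```python
def decode_oops_code(code: int) -> dict:
    """Decode the low bits of an x86 page-fault error code."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = fault taken in kernel mode, 1 = in user mode
        "mode": "user" if code & 0x4 else "kernel",
        # bit 3: a reserved bit was set in a page-table entry
        "reserved_bit_set": bool(code & 0x8),
        # bit 4: the fault was caused by an instruction fetch
        "instruction_fetch": bool(code & 0x10),
    }


# "Oops: 0002" from the trace above; the kernel prints this field in hex.
info = decode_oops_code(int("0002", 16))
print(info)
```

Read this way, 0002 is a kernel-mode write to a page that is not present. That matches the rest of the dump: the marked bytes `<89> 42 18` appear to encode a store of EAX to offset 0x18 from RDX, RDX is zero, and CR2 holds 0x18.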
* Re: Guest Hang Bugs
2009-01-16 1:26 ` James Thomason
@ 2009-01-18 9:15 ` Avi Kivity
0 siblings, 0 replies; 5+ messages in thread
From: Avi Kivity @ 2009-01-18 9:15 UTC (permalink / raw)
To: James Thomason; +Cc: kvm
James Thomason wrote:
> Avi and others,
>
> The system under load finally panicked and all cores hit 100%, so I spoke
> too soon. Unfortunately my console was garbled and I was not able to get the
> text of that panic.
>
That's probably an unrelated bug.
Did the host or guest panic?
> Afterwards I am unable to boot the guest with kvm -smp > 2 because the guest
> panics at boot time. The text of that boot time panic is below. However,
> booting the guest with kvm -smp 1 works. After rebooting the KVM host, I am
> able to boot the guest with kvm -smp 12 etc again.
>
> I will try to induce the guest to panic again and see if I can capture it.
>
> Configuration:
>
> KVM HOST:
> Ubuntu 8.10
> KVM 83 2.6.29-rc1
>
> KVM Guest:
> Ubuntu 8.10
> 2.6.27-9-server
>
> Text of Guest Boot Panic:
>
> Boot from (hd0,0) ext3 cd532b94-1a97-4337-bfe9-75e699839bab
> Starting up ...
> [ 3.290000] Stuck ??
>
Anything before this?
--
error compiling committee.c: too many arguments to function