From: Juergen Gross <juergen.gross@ts.fujitsu.com>
To: Andre Przywara <andre.przywara@amd.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
Keir Fraser <keir.fraser@eu.citrix.com>
Subject: Re: Hypervisor crash(!) on xl cpupool-numa-split
Date: Fri, 28 Jan 2011 12:44:00 +0100
Message-ID: <4D42AC00.8050109@ts.fujitsu.com>
In-Reply-To: <4D42A35D.3050507@amd.com>
On 01/28/11 12:07, Andre Przywara wrote:
> Juergen Gross wrote:
>> On 01/28/11 00:18, Andre Przywara wrote:
>>> Hi,
>>>
>>> when I boot my machine without restricting Dom0 (dom0_mem=
>>> dom0_max_vcpus=) I get a _hypervisor_ crash when I run
>>> # xl cpupool-numa-split
>>> If Dom0's resources are limited on the Xen cmdline, everything works
>>> fine.
>>> The crashdump points to a scheduling problem with weights, so I assume
>>> the NUMA distribution algorithm somehow fools the hypervisor completely.
>>>
>>> I will investigate this further tomorrow, but maybe someone has some
>>> good idea.
>>
>> I've seen this once with an older cpupool version on a 24 processor
>> machine.
>> It was NOT related to NUMA, but did occur only on reboot after a Dom0
>> panic.
>> The machine had an init script creating a cpupool and populating it with
>> cpus. The machine was then in a panic loop due to the BUG in sched_acct
>> until it was reset manually. After the reset the problem was gone.
>>
>> As I was never able to reproduce the problem later (the same software is
>> running on dozens of machines!), I assumed there was a problem related to
>> the first Dom0 panic, maybe some corrupted BIOS tables.
>>
>> Can the crash be reproduced easily?
> Yes.
> If I don't specify dom0_max_vcpus= and dom0_mem= on the Xen cmdline, I
> can reliably trigger the crash with xl cpupool-numa-split.
> Omitting only dom0_max_vcpus does not suffice.
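
Do I understand correctly? There is no crash with only dom0_max_vcpus= set,
and no crash with only dom0_mem= set?

Could you try this patch? It makes the hypervisor stop immediately if either
allocation in schedule_cpu_switch() returns NULL.
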
diff -r b59f04eb8978 xen/common/schedule.c
--- a/xen/common/schedule.c	Fri Jan 21 18:06:23 2011 +0000
+++ b/xen/common/schedule.c	Fri Jan 28 12:42:46 2011 +0100
@@ -1301,7 +1301,9 @@ void schedule_cpu_switch(unsigned int cp
 
     idle = idle_vcpu[cpu];
     ppriv = SCHED_OP(new_ops, alloc_pdata, cpu);
+    BUG_ON(ppriv == NULL);
     vpriv = SCHED_OP(new_ops, alloc_vdata, idle, idle->domain->sched_priv);
+    BUG_ON(vpriv == NULL);
 
     pcpu_schedule_lock_irqsave(cpu, flags);
 
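
The two added checks can be illustrated outside the hypervisor with a small
user-space sketch. It is only an illustration under stated assumptions: the
names check_sched_allocs(), require_sched_allocs() and the alloc_check enum
are hypothetical and do not exist in Xen, and assert() stands in for BUG_ON().
The pointers are tested in the same order as in the patch (ppriv first, then
vpriv), so the result tells which allocation came back NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch; not part of Xen. */
enum alloc_check {
    ALLOC_OK = 0,
    ALLOC_PDATA_FAILED,   /* per-CPU scheduler data (ppriv) is NULL */
    ALLOC_VDATA_FAILED    /* per-VCPU scheduler data (vpriv) is NULL */
};

/*
 * Report which of the two private-data pointers is NULL, testing them
 * in the same order as the patch does: ppriv first, then vpriv.
 */
enum alloc_check check_sched_allocs(const void *ppriv, const void *vpriv)
{
    if (ppriv == NULL)
        return ALLOC_PDATA_FAILED;
    if (vpriv == NULL)
        return ALLOC_VDATA_FAILED;
    return ALLOC_OK;
}

/*
 * User-space analogue of the BUG_ON() calls: abort at the point of the
 * failed allocation instead of continuing with a NULL pointer.
 */
void require_sched_allocs(const void *ppriv, const void *vpriv)
{
    assert(check_sched_allocs(ppriv, vpriv) == ALLOC_OK);
}
```

Stopping at the allocation site, rather than at a later dereference, would
show directly whether a failed allocation is behind the crash.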
--
Juergen Gross Principal Developer Operating Systems
TSP ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28 Internet: ts.fujitsu.com
D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html