From: Andre Przywara <andre.przywara@amd.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>,
	Jeremy Fitzhardinge <jeremy@goop.org>,
	xen-devel <xen-devel@lists.xen.org>
Subject: Re: Dom0 crash with old style AMD NUMA detection
Date: Fri, 21 Sep 2012 19:49:31 +0200
Message-ID: <505CA8AB.6000808@amd.com>
In-Reply-To: <20120817142237.GA8467@phenom.dumpdata.com>

On 08/17/2012 04:22 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 03, 2012 at 08:36:28AM -0400, Konrad Rzeszutek Wilk wrote:
>> On Fri, Aug 03, 2012 at 02:20:31PM +0200, Andre Przywara wrote:

Sorry Konrad, almost forgot.
Comment (and Ack) below...

>>> we see Dom0 crashes due to the kernel detecting the NUMA topology not by
>>> ACPI, but directly from the northbridge (CONFIG_AMD_NUMA).
>>>
>>> This will detect the actual NUMA config of the physical machine, but
>>> will crash about the mismatch with Dom0's virtual memory. Variation of
>>> the theme: Dom0 sees what it's not supposed to see.
>>>
>>> This happens with the said config option enabled and on a machine where
>>> this scanning is still enabled (K8 and Fam10h, not Bulldozer class)
>>>
>>> We have this dump then:
>>> [    0.000000] NUMA: Warning: node ids are out of bound, from=-1 to=-1
>>> distance=10
>>> [    0.000000] Scanning NUMA topology in Northbridge 24
>>> [    0.000000] Number of physical nodes 4
>>> [    0.000000] Node 0 MemBase 0000000000000000 Limit 0000000040000000
>>> [    0.000000] Node 1 MemBase 0000000040000000 Limit 0000000138000000
>>> [    0.000000] Node 2 MemBase 0000000138000000 Limit 00000001f8000000
>>> [    0.000000] Node 3 MemBase 00000001f8000000 Limit 0000000238000000
>>> [    0.000000] Initmem setup node 0 0000000000000000-0000000040000000
>>> [    0.000000]   NODE_DATA [000000003ffd9000 - 000000003fffffff]
>>> [    0.000000] Initmem setup node 1 0000000040000000-0000000138000000
>>> [    0.000000]   NODE_DATA [0000000137fd9000 - 0000000137ffffff]
>>> [    0.000000] Initmem setup node 2 0000000138000000-00000001f8000000
>>> [    0.000000]   NODE_DATA [00000001f095e000 - 00000001f0984fff]
>>> [    0.000000] Initmem setup node 3 00000001f8000000-0000000238000000
>>> [    0.000000] Cannot find 159744 bytes in node 3
>>> [    0.000000] BUG: unable to handle kernel NULL pointer dereference at
>>> (null)
>>> [    0.000000] IP: [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96
>>> [    0.000000] PGD 0
>>> [    0.000000] Oops: 0000 [#1] SMP
>>> [    0.000000] CPU 0
>>> [    0.000000] Modules linked in:
>>> [    0.000000]
>>> [    0.000000] Pid: 0, comm: swapper Not tainted 3.3.6 #1 AMD Dinar/Dinar
>>> [    0.000000] RIP: e030:[<ffffffff81d220e6>]  [<ffffffff81d220e6>]
>>> __alloc_bootmem_node+0x43/0x96
>>> [    0.000000] RSP: e02b:ffffffff81c01de8  EFLAGS: 00010046
>>> [    0.000000] RAX: 0000000000000000 RBX: 00000000000000c0 RCX:
>>> 0000000000000000
>>> [    0.000000] RDX: 0000000000000040 RSI: 00000000000000c0 RDI:
>>> 0000000000000000
>>> [    0.000000] RBP: ffffffff81c01e08 R08: 0000000000000000 R09:
>>> 0000000000000000
>>> [    0.000000] R10: 0000000000098000 R11: 0000000000000000 R12:
>>> 0000000000000000
>>> [    0.000000] R13: 0000000000000000 R14: 0000000000000040 R15:
>>> 0000000000000003
>>> [    0.000000] FS:  0000000000000000(0000) GS:ffffffff81ced000(0000)
>>> knlGS:0000000000000000
>>> [    0.000000] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    0.000000] CR2: 0000000000000000 CR3: 0000000001c05000 CR4:
>>> 0000000000000660
>>> [    0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>> 0000000000000000
>>> [    0.000000] DR3: 0000000000000000 DR6: 0000000000000000 DR7:
>>> 0000000000000000
>>> [    0.000000] Process swapper (pid: 0, threadinfo ffffffff81c00000,
>>> task ffffffff81c0d020)
>>> [    0.000000] Stack:
>>> [    0.000000]  00000000000000c0 0000000000000003 0000000000000000
>>> 000000000000003f
>>> [    0.000000]  ffffffff81c01e68 ffffffff81d23024 0000000000400000
>>> 0000000000000002
>>> [    0.000000]  0000000000080000 ffff8801f055e000 ffff8801f055e1f8
>>> 0000000000000000
>>> [    0.000000] Call Trace:
>>> [    0.000000]  [<ffffffff81d23024>]
>>> sparse_early_usemaps_alloc_node+0x64/0x178
>>> [    0.000000]  [<ffffffff81d23348>] sparse_init+0xe4/0x25a
>>> [    0.000000]  [<ffffffff81d16840>] paging_init+0x13/0x22
>>> [    0.000000]  [<ffffffff81d07fbb>] setup_arch+0x9c6/0xa9b
>>> [    0.000000]  [<ffffffff81683954>] ? printk+0x3c/0x3e
>>> [    0.000000]  [<ffffffff81d01a38>] start_kernel+0xe5/0x468
>>> [    0.000000]  [<ffffffff81d012cf>] x86_64_start_reservations+0xba/0xc1
>>> [    0.000000]  [<ffffffff81007153>] ? xen_setup_runstate_info+0x2c/0x36
>>> [    0.000000]  [<ffffffff81d050ee>] xen_start_kernel+0x565/0x56c
>>> [    0.000000] Code: 79 bc 3e ff 85 c0 74 23 80 3d 19 e9 21 00 00 75 59
>>> be 2a
>>> 01 00 00 48 c7 c7 d0 55 a8 81 e8 b6 dc 31 ff c6 05 ff e8 21 00 01 eb 3f
>>> <41>  8b
>>> bc 24 60 60 02 00 49 83 c8 ff 4c 89 e9 4c 89 f2 48 89 de
>>> [    0.000000] RIP  [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96
>>> [    0.000000]  RSP<ffffffff81c01de8>
>>> [    0.000000] CR2: 0000000000000000
>>> [    0.000000] ---[ end trace a7919e7f17c0a725 ]---
>>> [    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
>>> (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
>>>
>>>
>>>
>>> The obvious solution would be to explicitly deny northbridge scanning
>>> when running as Dom0, though I am not sure how to implement this without
>>> upsetting the other kernel folks about "that crappy Xen thing" again ;-)
>>
>> Heh.
>> Is there a numa=0 option that could be used to override it to turn it
>> off?
>
> Not compile tested.. but was thinking something like this:
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index 43fd630..838cc1f 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -17,6 +17,7 @@
>   #include <asm/e820.h>
>   #include <asm/setup.h>
>   #include <asm/acpi.h>
> +#include <asm/numa.h>
>   #include <asm/xen/hypervisor.h>
>   #include <asm/xen/hypercall.h>
>
> @@ -528,4 +529,7 @@ void __init xen_arch_setup(void)
>   	disable_cpufreq();
>   	WARN_ON(set_pm_idle_to_default());
>   	fiddle_vdso();
> +#ifdef CONFIG_NUMA
> +	numa_off = 1;
> +#endif
>   }
>

Acked-by: Andre Przywara <andre.przywara@amd.com>

I compiled and boot-tested this on my (single node ;-) test box.
First bare-metal, dmesg: No NUMA configuration found
Then again, but with numa=off on the cmd-line: NUMA turned off
Then under Xen as Dom0 kernel: NUMA turned off

So under Xen the code behaves as if numa=off had been specified
explicitly, which is what we want.
I couldn't get hold of the test machine (an old K8 server) on which the
bug was originally triggered; that's why I'm reluctant to give my Tested-by.
Will try this ASAP.
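
To illustrate why the three boot results above come out the way they do,
here is a minimal user-space sketch of the selection logic as I understand
it: hardware probes (ACPI, northbridge scan) are only attempted when
numa_off is clear, and otherwise the kernel falls back to a single flat
node. This is a simplified model, not the kernel code. All names in it
(probe_fn, model_numa_init, the enum values, the two dummy probes) are
made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe type: returns 0 if a NUMA topology was found. */
typedef int (*probe_fn)(void);

/* Dummy probes standing in for the ACPI and northbridge detection paths. */
int probe_fails(void)    { return -1; } /* e.g. single-node box */
int probe_succeeds(void) { return 0; }  /* e.g. northbridge scan finds nodes */

enum numa_result {
    NUMA_DETECTED,  /* a hardware probe succeeded */
    NUMA_NOT_FOUND, /* fallback; would log "No NUMA configuration found" */
    NUMA_OFF        /* fallback; would log "NUMA turned off" */
};

/*
 * Model of the decision: with numa_off set, no probe is ever called, so
 * the physical topology is never looked at. If numa_off is clear and
 * every probe fails, the same flat fallback is used, with a different
 * log message.
 */
enum numa_result model_numa_init(bool numa_off, const probe_fn *probes,
                                 size_t nprobes)
{
    if (!numa_off) {
        for (size_t i = 0; i < nprobes; i++)
            if (probes[i]() == 0)
                return NUMA_DETECTED;
    }
    return numa_off ? NUMA_OFF : NUMA_NOT_FOUND;
}
```

With the patch, the Dom0 case corresponds to calling model_numa_init()
with numa_off set to true: even on a box where the northbridge probe would
succeed, the probe is skipped and the result is the "NUMA turned off"
fallback.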

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany

Thread overview: 14+ messages
2012-08-03 12:20 Dom0 crash with old style AMD NUMA detection Andre Przywara
2012-08-03 12:36 ` Konrad Rzeszutek Wilk
2012-08-17 14:22   ` Konrad Rzeszutek Wilk
2012-09-14 18:58     ` Konrad Rzeszutek Wilk
2012-09-17  7:29       ` Andre Przywara
2012-09-17 19:14         ` Konrad Rzeszutek Wilk
2012-09-18  9:57           ` Andre Przywara
2012-09-18 13:44             ` Konrad Rzeszutek Wilk
2012-09-18 16:50               ` Andre Przywara
2012-09-18 14:55                 ` Konrad Rzeszutek Wilk
2012-09-21 17:49     ` Andre Przywara [this message]
2012-09-21 17:48       ` Konrad Rzeszutek Wilk
2012-09-21 23:46         ` Andre Przywara
2012-09-24 13:48           ` Konrad Rzeszutek Wilk
