From: Andre Przywara <andre.przywara@amd.com>
To: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>,
	Jeremy Fitzhardinge <jeremy@goop.org>,
	xen-devel <xen-devel@lists.xen.org>,
	Dario Faggioli <raistlin@linux.it>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: Dom0 crash with old style AMD NUMA detection
Date: Tue, 18 Sep 2012 11:57:33 +0200
Message-ID: <5058458D.7030603@amd.com>
In-Reply-To: <20120917191432.GA18552@phenom.dumpdata.com>

On 09/17/2012 09:14 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Sep 17, 2012 at 09:29:22AM +0200, Andre Przywara wrote:
>> On 09/14/2012 08:58 PM, Konrad Rzeszutek Wilk wrote:
>>>>>> [    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
>>>>>> (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
>>>>>>
>>>>>>
>>>>>>
>>>>>> The obvious solution would be to explicitly deny northbridge scanning
>>>>>> when running as Dom0, though I am not sure how to implement this without
>>>>>> upsetting the other kernel folks about "that crappy Xen thing" again ;-)
>>>>>
>>>>> Heh.
>>>>> Is there a numa=0 option that could be used to override it to turn it
>>>>> off?
>>>>
>>>> Not compile tested.. but was thinking something like this:
>>>
>>> ping?
>>
>> That looks good to me - at least for the time being.
>
> OK, can I've your Tested-by/Acked-by on it pls?
>
>> I just want to check how this interacts with upcoming Dom0 NUMA
>> support. It wouldn't be too clever if we deliberately disable NUMA
>
> We can always revert this patch in future versions of Linux.

I don't like this idea. Then we would have Linux kernels up to 3.5 
working, and again from, say, 3.8 on, but 3.6 and 3.7 could not use 
NUMA. That would be pretty unfortunate.

I haven't checked back with Dario, but I suspect that we will use ACPI 
to inject the NUMA topology into Dom0. Even if not, a general "numa=off" 
for Dom0 is too much of a sledgehammer for me.

>> and future Xen version will allow us to use it. So let me check if I
>> can confine this turn-off to the fallback K8 northbridge reading.
>
> This potentially could work, but I would prefer to not do it for 3.6.

Mmh, I don't get the idea of your patch below. One can always read the 
NUMA topology from the AMD northbridge, but this is deprecated in favor 
of ACPI. The amdtopology.c code was only there to enable NUMA for very 
early Opterons, where BIOSes didn't provide (sane) SRAT tables.
Though we disallow ACPI for NUMA on Dom0, this northbridge scanning 
unfortunately "shines through" the virtualization, actually revealing 
the host system's NUMA topology, which is usually quite different from 
Dom0's.

So instead I would rather do something like this:

diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index bfacd2c..7811c0d 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -20,6 +20,8 @@

  extern int numa_off;

+extern bool deny_amd_nb_numa_scan;
+
  /*
   * __apicid_to_node[] stores the raw mapping between physical apicid and
   * node and is used to initialize cpu_to_node mapping.
diff --git a/arch/x86/mm/amdtopology.c b/arch/x86/mm/amdtopology.c
index 5247d01..f223a67 100644
--- a/arch/x86/mm/amdtopology.c
+++ b/arch/x86/mm/amdtopology.c
@@ -29,6 +29,8 @@

  static unsigned char __initdata nodeids[8];

+bool deny_amd_nb_numa_scan = 0;
+
  static __init int find_northbridge(void)
  {
  	int num;
@@ -78,6 +80,9 @@ int __init amd_numa_init(void)
  	u32 nodeid, reg;
  	unsigned int bits, cores, apicid_base;

+	if (deny_amd_nb_numa_scan)
+		return -ENOENT;
+
  	if (!early_pci_allowed())
  		return -EINVAL;

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index d11ca11..6db63c0 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -532,6 +532,8 @@ void __init xen_arch_setup(void)
  	}
  #endif

+	deny_amd_nb_numa_scan = 1;
+
  	memcpy(boot_command_line, xen_start_info->cmd_line,
  	       MAX_GUEST_CMDLINE > COMMAND_LINE_SIZE ?
  	       COMMAND_LINE_SIZE : MAX_GUEST_CMDLINE);

This would just turn off this one kind of NUMA discovery for Dom0.
The patch is admittedly a bit rough (not sure about the proper placement 
into #ifdef's, for instance) and not well tested yet.
Also one could think about using a more general variable name to cover 
other hardware things in the future that Dom0 shouldn't use.
So this is no longer something for 3.6, probably not even for 3.7.
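
To illustrate what such a gate does, here is a minimal user-space 
sketch, not kernel code. It assumes the init path tries ACPI first, then 
the northbridge scan, and finally falls back to a single flat node. The 
model_* functions, acpi_numa_allowed and enum numa_source are invented 
for this illustration and are not kernel symbols; only the name 
deny_amd_nb_numa_scan is taken from the patch above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Flag name taken from the patch; here it is just a plain global. */
static bool deny_amd_nb_numa_scan;
/* Hypothetical: false models "ACPI NUMA is not available to Dom0". */
static bool acpi_numa_allowed;

/* Hypothetical result type: which method supplied the topology. */
enum numa_source { NUMA_FROM_ACPI, NUMA_FROM_AMD_NB, NUMA_FLAT };

/* Models SRAT parsing: succeeds only if ACPI NUMA is permitted. */
static int model_acpi_numa_init(void)
{
	return acpi_numa_allowed ? 0 : -ENOENT;
}

/* Models amd_numa_init() with the early exit placed before any scan. */
static int model_amd_numa_init(void)
{
	if (deny_amd_nb_numa_scan)
		return -ENOENT;
	return 0;	/* pretend the northbridge registers gave a topology */
}

/* Try each discovery method in order, ending with one flat node. */
static enum numa_source model_numa_init(void)
{
	if (model_acpi_numa_init() == 0)
		return NUMA_FROM_ACPI;
	if (model_amd_numa_init() == 0)
		return NUMA_FROM_AMD_NB;
	return NUMA_FLAT;
}

/* Walks through the cases discussed in this thread. */
static void model_demo(void)
{
	/* Dom0 today: no ACPI NUMA, so the scan result shines through. */
	acpi_numa_allowed = false;
	deny_amd_nb_numa_scan = false;
	assert(model_numa_init() == NUMA_FROM_AMD_NB);

	/* Dom0 with the gate set: only the flat fallback remains. */
	deny_amd_nb_numa_scan = true;
	assert(model_numa_init() == NUMA_FLAT);

	/* A future ACPI-based Dom0 topology is not affected by the gate. */
	acpi_numa_allowed = true;
	assert(model_numa_init() == NUMA_FROM_ACPI);
}
```

The last case is the point of gating only the fallback: unlike a global 
"numa=off", the flag leaves any other discovery method untouched.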

What if we drop the patch for this problem entirely for 3.6 and 
recommend "numa=off" as a workaround? This is much less sticky than a 
kernel patch and could appear in the Xen wiki, for instance.
After all this isn't a strict regression (appears with every 3.x kernel, 
AFAICT).
Most of the time the northbridge scanning will yield bogus results, so 
the kernel eventually discards them, but sometimes a result seems to 
slip through and cause trouble.
Also it does not trigger on newer (Bulldozer) class CPUs, since we 
deliberately avoided adding the new northbridge PCI-ID for this routine.
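
As a rough illustration of why an unlisted PCI ID keeps the routine 
quiet, the following user-space sketch models the ID comparison. The 
three device IDs are the function 0 IDs from the quoted patch below; the 
helper names are made up for this example, and 0x1600 stands in for a 
device ID that is not on the list (to my knowledge the family 15h 
value, but treat that as an assumption).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_ID_AMD 0x1022u

/*
 * The 32-bit word at PCI config offset 0x00 holds the vendor ID in the
 * low 16 bits and the device ID in the high 16 bits.
 */
static uint32_t pci_id_header(uint16_t vendor, uint16_t device)
{
	return (uint32_t)vendor | ((uint32_t)device << 16);
}

/* Hypothetical helper: true if the header is on the scanned list. */
static bool is_scanned_nb_func0(uint32_t header)
{
	/* Function 0 device IDs as listed in the quoted patch. */
	static const uint16_t known[] = { 0x1100, 0x1200, 0x1300 };
	size_t i;

	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++)
		if (header == pci_id_header(PCI_VENDOR_ID_AMD, known[i]))
			return true;
	return false;
}

/* Small demonstration of matching and non-matching headers. */
static void pci_id_demo(void)
{
	/* A listed AMD northbridge ID matches. */
	assert(is_scanned_nb_func0(pci_id_header(0x1022, 0x1100)));
	/* An AMD device ID that is not listed is skipped. */
	assert(!is_scanned_nb_func0(pci_id_header(0x1022, 0x1600)));
	/* A listed device ID under another vendor does not match. */
	assert(!is_scanned_nb_func0(pci_id_header(0x8086, 0x1100)));
}
```

With no matching device the search finds no northbridge, so the scan 
never reaches the topology registers.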

Regards,
Andre.

>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index a4790bf..b4edce4 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -17,6 +17,7 @@
>   #include <asm/e820.h>
>   #include <asm/setup.h>
>   #include <asm/acpi.h>
> +#include <asm/numa.h>
>   #include <asm/xen/hypervisor.h>
>   #include <asm/xen/hypercall.h>
>
> @@ -483,7 +484,32 @@ void __cpuinit xen_enable_sysenter(void)
>   	if(ret != 0)
>   		setup_clear_cpu_cap(sysenter_feature);
>   }
> +#ifdef CONFIG_AMD_NUMA
> +int __cpuinit xen_amd_k8(void)
> +{
> +	int num;
> +
> +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> +		return -ENOENT;
> +
> +	for (num = 0; num < 32; num++) {
> +		u32 header;
> +
> +		header = read_pci_config(0, num, 0, 0x00);
> +		if (header != (PCI_VENDOR_ID_AMD | (0x1100<<16)) &&
> +			header != (PCI_VENDOR_ID_AMD | (0x1200<<16)) &&
> +			header != (PCI_VENDOR_ID_AMD | (0x1300<<16)))
> +			continue;
>
> +		header = read_pci_config(0, num, 1, 0x00);
> +		if (header != (PCI_VENDOR_ID_AMD | (0x1101<<16)) &&
> +			header != (PCI_VENDOR_ID_AMD | (0x1201<<16)) &&
> +			header != (PCI_VENDOR_ID_AMD | (0x1301<<16)))
> +			continue;
> +		return num;
> +	}
> +	return -ENOENT;
> +#endif
>   void __cpuinit xen_enable_syscall(void)
>   {
>   #ifdef CONFIG_X86_64
> @@ -542,4 +568,8 @@ void __init xen_arch_setup(void)
>   	disable_cpufreq();
>   	WARN_ON(set_pm_idle_to_default());
>   	fiddle_vdso();
> +#ifdef CONFIG_AMD_NUMA
> +	if (xen_amd_k8() >= 0)
> +		numa_off=1;
> +#endif
>   }
>



-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
