From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: Dom0 crash with old style AMD NUMA detection
Date: Mon, 24 Sep 2012 09:48:50 -0400
Message-ID: <20120924134850.GC31618@phenom.dumpdata.com>
References: <501BC20F.3040205@amd.com> <20120803123628.GB10670@andromeda.dapyr.net> <20120817142237.GA8467@phenom.dumpdata.com> <505CA8AB.6000808@amd.com> <20120921174833.GC6821@phenom.dumpdata.com> <505CFC71.5090702@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <505CFC71.5090702@amd.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andre Przywara
Cc: Konrad Rzeszutek Wilk, Jeremy Fitzhardinge, xen-devel, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

On Sat, Sep 22, 2012 at 01:46:57AM +0200, Andre Przywara wrote:
> On 09/21/2012 07:48 PM, Konrad Rzeszutek Wilk wrote:
> >>Acked-by: Andre Przywara
> >>
> >>I compiled and boot-tested this on my (single node ;-) test box.
> >>First bare-metal, dmesg: No NUMA configuration found
> >>Then again, but with numa=off on the cmd-line: NUMA turned off
> >>Then under Xen as Dom0 kernel: NUMA turned off
> >>
> >>So the code behaves under Xen as one would have explicitly specified
> >>numa=off, which is what we want.
> >
> >Right.
> >>I couldn't get hold of the test machine (old K8 server) that the bug
> >>was once triggered, that's why I'm reluctant to give my Tested-by.
> >>Will try this ASAP.
> >
> >OK, will wait with this - it would be a bit silly if the patch did not
> >fix the issue :-)
>
> Thanks for your patience. I tried some machines, it not only affects
> K8s, but also Barcelonas and Magny-Cours.
> Boot those with a Xen HV and restrict Dom0's memory to something
> well below the first node's size (say dom0_mem=512M). If the 3.x
> Dom0 kernel has CONFIG_AMD_NUMA compiled in, the box will crash,
> because the hardware's NUMA info read from the northbridge does not
> match Dom0's understanding of its memory.
> With your fix the box booted fine, NUMA is turned off and everyone is happy.
> Double checked by commenting out the numa_off=1 line in your patch:
> crash again. So this line definitely fixes this.
>
> Tested-by: Andre Przywara

OK, send out a git pull for it today. If Linus doesn't take it, I will
just have to do it in v3.7 time-frame and do the stable kernel backport.

Thanks again for testing and reporting this!
>
> Regards,
> Andre.
>
> --
> Andre Przywara
> AMD-Operating System Research Center (OSRC), Dresden, Germany
>