From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: Xen-4.3 and -unstable regression from changeset "numa-sched: leave node-affinity alone if not in 'auto' mode"
Date: Thu, 28 Nov 2013 15:09:49 +0000
Message-ID: <52975CBD.8010408@eu.citrix.com>
References: <529737AD.7070708@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <529737AD.7070708@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andrew Cooper, Xen-devel List, Dario Faggioli
Cc: Jan Beulich
List-Id: xen-devel@lists.xenproject.org

On 11/28/2013 12:31 PM, Andrew Cooper wrote:
> Hello,
>
> I have recently positively identified
> b54a623efbcf5bff25c55117add1b4427b4e2f1b as causing a boot failure.
>
> Serial log is attached. The crash is completely deterministic, and is
> from an IBM xSeries 3530 M4 server.
>
> Given the crash and bad patch, I suspect it is more to do with the
> NUMA/memory layout than the specifics of the server.
>
> Dario: Being your patch, do you have any ideas?

Do you have a xen-syms you can use to find out what line the crash
happened at?

Dom0 should have auto_node_affinity set at this point; so before this
patch you'd have:

    nodemask = NODE_MASK_NONE;
    [set nodes in nodemask from cpumask]
    d->node_affinity = nodemask

After, you have:

    nodes_clear(d->node_affinity)
    [set nodes in d->node_affinity from cpumask]

Everything looks like it should be the same.

Can you try just reverting what's in the positive side of the if()?
I.e., adding back in nodemask = NODE_MASK_NONE at the top, and the
nodemask copying, and see what happens?

 -George
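
For reference, the two variants described above can be sketched as a small standalone C program. This is only an illustration under stated assumptions: the plain integer masks, the two-cpus-per-node topology, and every name ending in _sketch are hypothetical stand-ins, not Xen's actual nodemask/cpumask API or the code from the changeset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT Xen's nodemask_t/cpumask_t. */
#define NR_CPUS_SKETCH 8

typedef uint32_t mask_t;            /* one bit per cpu, or one bit per node */

struct domain_sketch {
    mask_t node_affinity;
};

/* Assumed topology: two cpus per node. */
static int cpu_to_node_sketch(int cpu)
{
    return cpu / 2;
}

/* "Before": build the result in a local mask, then copy it in one step. */
static void update_before(struct domain_sketch *d, mask_t cpumask)
{
    mask_t nodemask = 0;            /* nodemask = NODE_MASK_NONE */

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        if (cpumask & (1u << cpu))
            nodemask |= 1u << cpu_to_node_sketch(cpu);

    d->node_affinity = nodemask;    /* d->node_affinity = nodemask */
}

/* "After": clear the domain's mask and build the result in place. */
static void update_after(struct domain_sketch *d, mask_t cpumask)
{
    d->node_affinity = 0;           /* nodes_clear(d->node_affinity) */

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        if (cpumask & (1u << cpu))
            d->node_affinity |= 1u << cpu_to_node_sketch(cpu);
}

/* Returns 1 if both variants end with the same mask for every cpumask. */
static int variants_agree(void)
{
    for (mask_t m = 0; m < (1u << NR_CPUS_SKETCH); m++) {
        struct domain_sketch a = { 0xF }, b = { 0xF };

        update_before(&a, m);
        update_after(&b, m);
        if (a.node_affinity != b.node_affinity)
            return 0;
    }
    return 1;
}

/* Convenience check that can be called from a test harness or main(). */
static void self_check(void)
{
    assert(variants_agree());
}
```

In this sketch the final value of node_affinity is identical in both variants. The only difference it shows is the intermediate state: the in-place variant leaves d->node_affinity empty or partially filled while the loop runs. Whether that has anything to do with the crash is not established here.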