From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754743AbZJESvD (ORCPT ); Mon, 5 Oct 2009 14:51:03 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1754702AbZJESvB (ORCPT ); Mon, 5 Oct 2009 14:51:01 -0400
Received: from one.firstfloor.org ([213.235.205.2]:40377 "EHLO one.firstfloor.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754681AbZJESvB (ORCPT ); Mon, 5 Oct 2009 14:51:01 -0400
Date: Mon, 5 Oct 2009 20:50:24 +0200
From: Andi Kleen
To: Yinghai Lu
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, "H. Peter Anvin",
	Suresh Siddha, Tejun Heo, "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH] x86: use near online node instead of round bin for numa
Message-ID: <20091005185024.GS1656@one.firstfloor.org>
References: <4AC7974C.20304@kernel.org> <87my45yk79.fsf@basil.nowhere.org>
	<4ACA3677.40407@kernel.org> <20091005183539.GR1656@one.firstfloor.org>
	<4ACA3DAE.6030300@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4ACA3DAE.6030300@kernel.org>
User-Agent: Mutt/1.4.2.2i
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 05, 2009 at 11:40:46AM -0700, Yinghai Lu wrote:
> Andi Kleen wrote:
> > On Mon, Oct 05, 2009 at 11:09:59AM -0700, Yinghai Lu wrote:
> >> Andi Kleen wrote:
> >>> Yinghai Lu writes:
> >>>
> >>>> cpu to node mapping is set in following sequence:
> >>>> 1. numa_init_array: set up roundbin from cpu to online node
> >>>> 2. init_cpu_to_node: set that according to apicid_to_node[] according to srat
> >>>>    only handle that node is online, and leave other cpu on node
> >>>>    without ram (aka not online) to still round-bin
> >>>> 3. later srat_detect_node for intel/amd, will use first_online node or near by
> >>>>    node.
> >>>>
> >>>> problem is that setup_per_cpu_areas() is called between 2 and 3. the per_cpu
> >>>> for cpu on node with ram is on different node. and could put that on node with
> >>>> two hops away.
> >>>>
> >>>> so try add find_near_online_node() and call int init_cpu_to_node()
> >>> This fallback case should not really happen anyways, unless the BIOS is buggy
> >>> (in this case it might better to completely reject the SRAT because
> >>> more might be wrong).
> >> SRAT is right, and some node has no ram installed.
> >
> > In this case there should be still a PXM to define the CPU locality -- your BIOS is broken.
> > Please fix it there.
>
> I don't think so.

Let's put it like this: your BIOS does not describe the full system topology
which is a severe BIOS bug. Putting hacks into Linux to work around
that is not the right solution.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.