From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 26 Jan 2004 16:09:37 -0800
From: "Martin J. Bligh"
To: Andrew Theurer, Nick Piggin
Cc: Rusty Russell, linux-kernel@vger.kernel.org
Subject: Re: New NUMA scheduler and hotplug CPU
Message-ID: <35060000.1075162177@flay>
In-Reply-To: <200401261740.12657.habanero@us.ibm.com>
References: <20040125235431.7BC192C0FF@lists.samba.org> <315060000.1075134874@[10.10.2.4]> <40159C41.9030707@cyberone.com.au> <200401261740.12657.habanero@us.ibm.com>

> Call me crazy, but why not let the topology be determined via userspace at
> a more appropriate time? When you hotplug, you tell it where in the
> scheduler to plug it. Have structures in the scheduler which represent the
> nodes-runqueues-cpus topology (in the past I tried node/rq/cpu structs
> with simple pointers), but let the topology be built based on the user's
> desires through hotplug.

Well, I agree with the "at a more appropriate time" bit. But there's no real
need to make a bunch of complicated stuff out in userspace for this - we're
trying to lay out the scheduler domains according to the hardware topology
of the machine. It's not a userspace namespace or anything. Having userspace
fishing down way deep in hardware-specific stuff is silly - the kernel is
there as a hardware abstraction layer.

Now if you wanted to use sched domains for workload management or something
and involve userspace, then yes ... that'd be more appropriate.

> For example, you boot on just the boot cpu, which by default is in the
> first node on the first runqueue. All other cpus, whether being "booted"
> for the first time or hotplugged (maybe now there's really no difference),
> are told by the hotplug operation where they should go: which node and
> which runqueue. HT cpus work even better, because you can hotplug
> siblings, one at a time if you want, to the same runqueue. Or if you have
> cpus sharing a die, same thing - lots of choices here. This removes any
> per-arch updates to the kernel for things like scheduler topology, and
> lets them move somewhere easier to change, like userspace.

Ummm ... but *none* of that is policy stuff to be dictated - it's all just
the hardware layout of the machine. You cannot "decide" as the sysadmin
which node a CPU is in, or which HT sibling it has. It's just there ;-)
The only thing you could possibly dictate is the CPU number you want
assigned to the new CPU, which, frankly, I think is pointless - they're
arbitrary tags, and always have been.

M.
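
P.S. For concreteness, the node/rq/cpu structs with simple pointers that
Andrew mentions having tried would presumably be something like the sketch
below. Purely illustrative - the names (sched_node/sched_rq/sched_cpu), the
sizing constants, and the attach helper are invented here, not taken from
any tree:

    #define MAX_CPUS_PER_RQ   8
    #define MAX_RQS_PER_NODE  8

    struct sched_cpu {
            int id;                          /* logical cpu number */
            struct sched_rq *rq;             /* runqueue this cpu serves */
    };

    struct sched_rq {
            struct sched_node *node;         /* parent node */
            struct sched_cpu *cpus[MAX_CPUS_PER_RQ];
            int nr_cpus;
    };

    struct sched_node {
            struct sched_rq *rqs[MAX_RQS_PER_NODE];
            int nr_rqs;
    };

    /*
     * At hotplug time the caller names the node and runqueue the new
     * cpu belongs to; HT siblings just name the same runqueue and end
     * up sharing it.
     */
    static void sched_cpu_attach(struct sched_cpu *cpu,
                                 struct sched_node *node,
                                 struct sched_rq *rq)
    {
            cpu->rq = rq;
            rq->node = node;
            rq->cpus[rq->nr_cpus++] = cpu;
    }

The structs are trivial either way - the argument above is only about who
gets to fill them in.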