From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 26 Jan 2004 15:24:31 -0800
From: "Martin J. Bligh"
To: Nick Piggin
Cc: Rusty Russell, linux-kernel@vger.kernel.org
Subject: Re: New NUMA scheduler and hotplug CPU
Message-ID: <31860000.1075159471@flay>
In-Reply-To: <40159C41.9030707@cyberone.com.au>
References: <20040125235431.7BC192C0FF@lists.samba.org> <4014CF39.50209@cyberone.com.au> <315060000.1075134874@[10.10.2.4]> <40159C41.9030707@cyberone.com.au>
X-Mailer: Mulberry/2.1.2 (Linux/x86)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

>> Well, isn't it a bad idea to have CPUs in the data that are offline?
>> It'll throw off all your balancing calculations, won't it? You seemed
>> to be careful to do things like divide the total load on the node by
>> the number of CPUs on the node, and that'll get totally borked if you
>> have fake CPUs in there.
>
> I think it mostly does a good job of making sure to only take
> online CPUs into account. If there are places where it doesn't,
> then it shouldn't be too hard to fix.

It'd make the code a damned sight simpler and cleaner if you dropped
all that stuff and updated the structures when you hotplugged a CPU,
which is really the only sensible way to do it anyway ...

For instance, if I remove CPU X, then bring back a new CPU on another
node (or in another HT sibling pair) as CPU X, you'll need to update
all that stuff anyway. CPUs aren't at fixed positions in that map -
the ordering handed out is arbitrary.

>> To me, it'd make more sense to add the CPUs to the scheduler structures
>> as they get brought online. I can also imagine machines where you have
>> a massive (infinite?) variety of possible CPUs that could appear -
>> like a NUMA box where you could just plug in arbitrary numbers of new
>> nodes as you wanted.
>
> I guess so, but you'd still need NR_CPUS to be >= that arbitrary
> number.

Yup ... but you don't have to enumerate all possible positions that
way. See Linus's argument re dynamic device numbers and iSCSI disks,
etc. The same thing applies.

> Well, this would be the problem. I guess it's quite possible that
> one doesn't know the topology of newly added CPUs beforehand.
>
> Well, OK, this would require a per-architecture function to handle
> CPU hotplug. It could possibly just default to arch_init_sched_domains
> and completely reinitialise everything, which would be the simplest.

Yeah, it's not trivially simple. But then neither is the rest of CPU
hotplug, done right ;-) Requiring CPU hotplug callback hooks does seem
to be the right way to interface with the sched code, though ...

M.
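
To make the proposed interface concrete, here is a minimal sketch of
the kind of hotplug callback being discussed - assuming the 2.6-era
CPU notifier API (register_cpu_notifier with the CPU_ONLINE / CPU_DEAD
actions) and taking the simplest option Nick mentions: tear everything
down and rerun arch_init_sched_domains. The arch_destroy_sched_domains()
teardown helper and the exact signatures are assumptions for
illustration, not code quoted from the thread or from kernel/sched.c.

/*
 * Sketch only: rebuild the scheduler's domain/group topology on CPU
 * hotplug, so the balancing data never contains offline (or
 * renumbered) CPUs.
 */
#include <linux/init.h>
#include <linux/notifier.h>
#include <linux/cpu.h>

/* Assumed entry points into the sched domain code. */
extern void arch_init_sched_domains(void);
extern void arch_destroy_sched_domains(void);	/* hypothetical */

static int sched_domain_cpu_callback(struct notifier_block *nfb,
				     unsigned long action, void *hcpu)
{
	switch (action) {
	case CPU_ONLINE:	/* a CPU has just come up */
	case CPU_DEAD:		/* a CPU has just gone down */
		/*
		 * Simplest correct thing, per the discussion above:
		 * throw the whole topology away and let the arch
		 * rebuild it from the current online map.
		 */
		arch_destroy_sched_domains();
		arch_init_sched_domains();
		break;
	default:
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block sched_domain_nb = {
	.notifier_call = sched_domain_cpu_callback,
};

static int __init sched_domain_hotplug_init(void)
{
	register_cpu_notifier(&sched_domain_nb);
	return 0;
}
__initcall(sched_domain_hotplug_init);

Rebuilding from scratch on every hotplug event is heavier than patching
the structures in place, but hotplug is rare, and it guarantees the
load-balancing calculations only ever see online CPUs - which is the
failure mode the thread starts from.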