From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932341AbVHKSOY (ORCPT ); Thu, 11 Aug 2005 14:14:24 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S932342AbVHKSOY (ORCPT ); Thu, 11 Aug 2005 14:14:24 -0400 Received: from fmr23.intel.com ([143.183.121.15]:48609 "EHLO scsfmr003.sc.intel.com") by vger.kernel.org with ESMTP id S932341AbVHKSOX (ORCPT ); Thu, 11 Aug 2005 14:14:23 -0400 Date: Thu, 11 Aug 2005 11:14:11 -0700 From: "Siddha, Suresh B" To: Nick Piggin Cc: "Siddha, Suresh B" , Ingo Molnar , Andrew Morton , lkml , steiner@sgi.com, dvhltc@us.ibm.com, mbligh@mbligh.org Subject: Re: allow the load to grow upto its cpu_power (was Re: [Patch] don't kick ALB in the presence of pinned task) Message-ID: <20050811111411.A581@unix-os.sc.intel.com> References: <20050801174221.B11610@unix-os.sc.intel.com> <20050802092717.GB20978@elte.hu> <20050809160813.B1938@unix-os.sc.intel.com> <42F94A00.3070504@yahoo.com.au> <20050809190352.D1938@unix-os.sc.intel.com> <1123729750.5188.13.camel@npiggin-nld.site> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <1123729750.5188.13.camel@npiggin-nld.site>; from nickpiggin@yahoo.com.au on Thu, Aug 11, 2005 at 01:09:10PM +1000 Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Aug 11, 2005 at 01:09:10PM +1000, Nick Piggin wrote: > I have a variation on the 2nd part of your patch which I think > I would prefer. IMO it kind of generalises the current imbalance > calculation to handle this case rather than introducing a new > special case. There is a difference between our changes. When the system is lightly loaded, my patch minimizes the number of groups picking up that load. This will help in power savings for example in the context of CMP. There are more changes required (user or kernel) for complete power savings, but this is a direction towards that. How about this patch? -- Don't pull tasks from a group if that would cause the group's total load to drop below its total cpu_power (ie. cause the group to start going idle). Signed-off-by: Suresh Siddha Signed-off-by: Nick Piggin --- linux-2.6.13-rc5/kernel/sched.c~ 2005-08-09 13:30:19.067072328 -0700 +++ linux-2.6.13-rc5/kernel/sched.c 2005-08-11 09:29:55.937128384 -0700 @@ -1886,6 +1886,7 @@ { struct sched_group *busiest = NULL, *this = NULL, *group = sd->groups; unsigned long max_load, avg_load, total_load, this_load, total_pwr; + unsigned long excess_load, max_pull; int load_idx; max_load = this_load = total_load = total_pwr = 0; @@ -1932,7 +1933,7 @@ group = group->next; } while (group != sd->groups); - if (!busiest || this_load >= max_load) + if (!busiest || this_load >= max_load || max_load <= SCHED_LOAD_SCALE) goto out_balanced; avg_load = (SCHED_LOAD_SCALE * total_load) / total_pwr; @@ -1952,9 +1953,19 @@ * by pulling tasks to us. Be careful of negative numbers as they'll * appear as very large values with unsigned longs. */ + + /* Don't want to pull so many tasks that a group would go idle */ + excess_load = min(max_load - avg_load, max_load - SCHED_LOAD_SCALE); + + if (this_load < SCHED_LOAD_SCALE) + /* pull as many tasks so that this group is fully utilized */ + max_pull = max(avg_load - this_load, SCHED_LOAD_SCALE - this_load); + else + max_pull = avg_load - this_load; + /* How much load to actually move to equalise the imbalance */ - *imbalance = min((max_load - avg_load) * busiest->cpu_power, - (avg_load - this_load) * this->cpu_power) + *imbalance = min(excess_load * busiest->cpu_power, + max_pull * this->cpu_power) / SCHED_LOAD_SCALE; if (*imbalance < SCHED_LOAD_SCALE) {