Message-ID: <40242152.5030606@cyberone.com.au>
Date: Sat, 07 Feb 2004 10:20:50 +1100
From: Nick Piggin
To: Rick Lindsley
CC: "Martin J. Bligh", Anton Blanchard, akpm@osdl.org,
    linux-kernel@vger.kernel.org, dvhltc@us.ibm.com
Subject: Re: [PATCH] Load balancing problem in 2.6.2-mm1
In-Reply-To: <200402062311.i16NBdF14365@owlet.beaverton.ibm.com>

Rick Lindsley wrote:

>> OK, but do you agree that the rate we rebalance things like 2 vs 1
>> should be slower than the rate we rebalance 3 vs 1? Fairness is only
>> relevant over a long term imbalance anyway, so there should be a big
>> damper on "fairness only" rebalances.
>
> I think, given the precision we're granted via SCHED_LOAD_SCALE, in
> combination with the new "load average" (cpu_load) code, that we can
> achieve what we want.
>
> If cpu0 has 2 runnable tasks and cpu1 has 1 runnable task, won't we see
> the "load average" of cpu0 slowly approach 2, but not jump there?

Yep.

> Right now, we round up on all fractions, and Martin has proposed a
> patch which takes it the other way and rounds down.
> What if, in marginal cases like this where there is a small but
> persistent difference, we could bump the task to another cpu when it
> reaches (say) 1.8 or 1.9? That would keep it there longer for
> shorter-lived tasks, but for those long-runners, they'd eventually
> spread the pain around a little.
>
> And yes, a cpu_load of even 1.0 should *never* get migrated to a cpu
> with a load 0.0.

This isn't the load though, but the imbalance. It has already passed
through our imbalance_pct filter (or we are idle), so we can pretty
safely assume that we want to try to move at least one task.

> Instead of
>
>     *imbalance = (*imbalance + SCHED_LOAD_SCALE - 1) >> SCHED_LOAD_SHIFT;
>
> how about, for instance,
>
>     if (max_load <= SCHED_LOAD_SCALE)
>         *imbalance = 0;
>     else
>         *imbalance = (*imbalance + (SCHED_LOAD_SCALE / 6) - 1)
>                 >> SCHED_LOAD_SHIFT;
>
> The intent is to never move anything if max_load is 1 or less (what
> advantage is there?) and to create a slight tendency to round up at
> loads greater than that, which would still tend to leave things where
> they were until they'd been there a while. In fact the "bonus"
> (SCHED_LOAD_SCALE / 6 - 1) could be another configurable in the
> scheduling domain, so that at some level you're not interested in
> fairness and they just don't bounce at all.

Hopefully just tending to round down more would damp it better:

    *imbalance = (*imbalance + SCHED_LOAD_SCALE/2) >> SCHED_LOAD_SHIFT;

Or even remove the addition altogether.