From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933743AbcECQxn (ORCPT ); Tue, 3 May 2016 12:53:43 -0400
Received: from foss.arm.com ([217.140.101.70]:40271 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756115AbcECQxm (ORCPT ); Tue, 3 May 2016 12:53:42 -0400
Subject: Re: [PATCH 4/7] sched/fair: Clean up the logic in fix_small_imbalance()
To: Peter Zijlstra
References: <1461958364-675-1-git-send-email-dietmar.eggemann@arm.com>
 <1461958364-675-5-git-send-email-dietmar.eggemann@arm.com>
 <20160503101225.GM3430@twins.programming.kicks-ass.net>
Cc: linux-kernel@vger.kernel.org, Morten Rasmussen
From: Dietmar Eggemann
Message-ID: <5728D793.3070909@arm.com>
Date: Tue, 3 May 2016 17:53:39 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <20160503101225.GM3430@twins.programming.kicks-ass.net>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/05/16 11:12, Peter Zijlstra wrote:
> On Fri, Apr 29, 2016 at 08:32:41PM +0100, Dietmar Eggemann wrote:
>> Avoid the need to add scaled_busy_load_per_task on both sides of the if
>> condition to determine whether imbalance has to be set to
>> busiest->load_per_task or not.
>>
>> The imbn variable was introduced with commit 2dd73a4f09be ("[PATCH]
>> sched: implement smpnice") and the original if condition was
>>
>>   if (max_load - this_load >= busiest_load_per_task * imbn)
>>
>> which over time changed into the current version where
>> scaled_busy_load_per_task is to be found on both sides of
>> the if condition.
>
> This appears to have started with:
>
>   dd41f596cda0 ("sched: cfs core code")
>
> which for unexplained reasons does:
>
> -	if (max_load - this_load >= busiest_load_per_task * imbn) {
> +	if (max_load - this_load + SCHED_LOAD_SCALE_FUZZ >=
> +				   busiest_load_per_task * imbn) {
>
>
> And later patches (by me) change that FUZZ into a variable metric,
> because a fixed fuzz like that didn't at all work for the small loads
> that result from cgroup tasks.
>
>
>
> Now fix_small_imbalance() always hurt my head; it originated in the
> original sched_domain balancer from Nick which wasn't smpnice aware; and
> lives on until today.

I see, all this code is already in the history.git kernel.

>
> Its purpose is to determine if moving one task over is beneficial.
> However over time -- and smpnice started this -- the idea of _one_ task
> became quite muddled.
>
> With the fine grained load accounting of today; does it even make sense
> to ask this question? IOW. what does fix_small_imbalance() really gain
> us -- other than a head-ache?

So task priority breaks the assumption that 1 task is equivalent to
SCHED_LOAD_SCALE, and so does fine grained load accounting.

fix_small_imbalance() is called twice from calculate_imbalance(). If we
got rid of it, I don't know whether we should bail out of lb in case the
avg load values don't align nicely (busiest > sd avg > local) or just
continue w/ lb.

In the second case, where the imbalance value is raised (to
busiest->load_per_task), we can probably just continue w/ lb, hoping that
there is a task on the src rq which fits the smaller imbalance value.
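
For reference, the two forms of the condition quoted above can be compared
with a minimal standalone sketch. This is not kernel code: the helper names
and FUZZ_EXAMPLE are hypothetical, the value 32 is only a placeholder (the
thread does not give the real SCHED_LOAD_SCALE_FUZZ value), and max_load is
assumed to be at least this_load so the unsigned subtraction does not wrap.

```c
#include <stdbool.h>

/* Hypothetical placeholder; not the kernel's SCHED_LOAD_SCALE_FUZZ value. */
#define FUZZ_EXAMPLE 32UL

/*
 * Condition as introduced with smpnice (2dd73a4f09be), per the quote above.
 * Assumes max_load >= this_load, otherwise the unsigned difference wraps.
 */
static bool cond_smpnice(unsigned long max_load, unsigned long this_load,
			 unsigned long busiest_load_per_task,
			 unsigned long imbn)
{
	return max_load - this_load >= busiest_load_per_task * imbn;
}

/*
 * Same condition with a fixed fuzz added on the left-hand side, mirroring
 * the shape of the change in dd41f596cda0. Same assumption on the inputs.
 */
static bool cond_fixed_fuzz(unsigned long max_load, unsigned long this_load,
			    unsigned long busiest_load_per_task,
			    unsigned long imbn)
{
	return max_load - this_load + FUZZ_EXAMPLE >=
	       busiest_load_per_task * imbn;
}
```

With a load difference of 1024 and busiest_load_per_task = 1040, the first
form is false while the fuzzed form is true. A fixed fuzz only tips cases
where the difference is within that constant of the threshold, so its effect
depends on how large the constant is relative to the loads involved.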