Message-ID: <5511B157.6030200@arm.com>
Date: Tue, 24 Mar 2015 18:47:51 +0000
From: Dietmar Eggemann
To: Peter Zijlstra, Morten Rasmussen
CC: "mingo@redhat.com", "vincent.guittot@linaro.org", "yuyang.du@intel.com",
 "preeti@linux.vnet.ibm.com", "mturquette@linaro.org", "nico@linaro.org",
 "rjw@rjwysocki.net", Juri Lelli, "linux-kernel@vger.kernel.org"
Subject: Re: [RFCv3 PATCH 44/48] sched: Tipping point from energy-aware to conventional load balancing
References: <1423074685-6336-1-git-send-email-morten.rasmussen@arm.com>
 <1423074685-6336-45-git-send-email-morten.rasmussen@arm.com>
 <20150324152655.GT23123@twins.programming.kicks-ass.net>
In-Reply-To: <20150324152655.GT23123@twins.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 24/03/15 15:26, Peter Zijlstra wrote:
> On Wed, Feb 04, 2015 at 06:31:21PM +0000, Morten Rasmussen wrote:
>> From: Dietmar Eggemann
>>
>> Energy-aware load balancing bases on cpu usage so the upper bound of its
>> operational range is a fully utilized cpu. Above this tipping point it
>> makes more sense to use weighted_cpuload to preserve smp_nice.
>> This patch implements the tipping point detection in update_sg_lb_stats
>> as if one cpu is over-utilized the current energy-aware load balance
>> operation will fall back into the conventional weighted load based one.
>>
>> cc: Ingo Molnar
>> cc: Peter Zijlstra
>>
>> Signed-off-by: Dietmar Eggemann
>> ---
>>  kernel/sched/fair.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 6b79603..4849bad 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6723,6 +6723,10 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>>  		sgs->sum_weighted_load += weighted_cpuload(i);
>>  		if (idle_cpu(i))
>>  			sgs->idle_cpus++;
>> +
>> +		/* If cpu is over-utilized, bail out of ea */
>> +		if (env->use_ea && cpu_overutilized(i, env->sd))
>> +			env->use_ea = false;
>>  	}
>
> I don't immediately see why this is desired. Why would a single
> overloaded CPU be reason to quit? It could be the cpus simply aren't
> 'balanced' right and the group as a whole is still under utilized.

We want to play it safe here. E.g. in a system with more than two
clusters, this over-utilized cpu could run more than one high-priority
task on a cluster with energy-efficient cpus, and this cluster could
still not be the lb src on DIE level because a not over-utilized cluster
with less energy-efficient cpus (burning more energy) could be chosen
instead. We could construct cases where the other cpus in this
energy-efficient cluster can't help the over-utilized cpu during lb on
MC level.

I can see, though, that using per-cpu data in code which deals with sg's
goes against the sd scalability design, where we should rely on per-sg
and not per-cpu data.

By bailing out in such a scenario we at least guarantee the smpnice
behaviour provided by conventional CFS. We could also favor an sg with
an over-utilized cpu to become the src, but which one do we pick if
there are multiple potential src sg's with an over-utilized cpu?
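
To make the disagreement concrete, here is a minimal user-space sketch of
the two policies. It is not the kernel code: struct cpu_snap,
snap_overutilized() and both use_ea_*() helpers are hypothetical names,
and "usage >= capacity" is only an assumed stand-in for whatever
cpu_overutilized() in the series really checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu snapshot; not a kernel structure. */
struct cpu_snap {
	unsigned long usage;     /* current usage, same scale as capacity */
	unsigned long capacity;  /* capacity available to tasks */
};

/* Assumed definition: over-utilized once usage reaches capacity. */
static bool snap_overutilized(const struct cpu_snap *c)
{
	return c->usage >= c->capacity;
}

/* Per-cpu policy, as in the patch: one over-utilized cpu disables ea. */
static bool use_ea_per_cpu(const struct cpu_snap *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (snap_overutilized(&cpus[i]))
			return false;
	return true;
}

/* Per-group policy: only summed usage against summed capacity counts. */
static bool use_ea_per_group(const struct cpu_snap *cpus, size_t n)
{
	unsigned long usage = 0, capacity = 0;

	for (size_t i = 0; i < n; i++) {
		usage += cpus[i].usage;
		capacity += cpus[i].capacity;
	}
	return usage < capacity;
}
```

For a group with one saturated cpu and otherwise mostly idle cpus, the
per-cpu policy gives up on energy-aware balancing while the per-group
policy keeps it, which is exactly the case questioned above.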
>
> In that case we want to continue the balance pass to reach this
> equilibrium.
>