Date: Wed, 11 May 2016 03:16:46 +0800
From: Yuyang Du
To: Mike Galbraith
Cc: Peter Zijlstra, Chris Mason, Ingo Molnar, Matt Fleming,
	linux-kernel@vger.kernel.org
Subject: Re: sched: tweak select_idle_sibling to look for idle threads
Message-ID: <20160510191646.GA4870@intel.com>
In-Reply-To: <1462893965.3702.56.camel@gmail.com>
List-ID: <linux-kernel.vger.kernel.org>

On Tue, May 10, 2016 at 05:26:05PM +0200, Mike Galbraith wrote:
> On Tue, 2016-05-10 at 09:49 +0200, Mike Galbraith wrote:
>
> > Only whacking
> > cfs_rq_runnable_load_avg() with a rock makes schbench -m -t
> > -a work well.  'Course a rock in its gearbox also
> > rendered load balancing fairly busted for the general case :)
>
> Smaller rock doesn't injure heavy tbench, but more importantly, still
> demonstrates the issue when you want full spread.
> schbench -m4 -t38 -a
>
> cputime 30000 threads 38 p99 177
> cputime 30000 threads 39 p99 10160
>
> LB_TIP_AVG_HIGH
> cputime 30000 threads 38 p99 193
> cputime 30000 threads 39 p99 184
> cputime 30000 threads 40 p99 203
> cputime 30000 threads 41 p99 202
> cputime 30000 threads 42 p99 205
> cputime 30000 threads 43 p99 218
> cputime 30000 threads 44 p99 237
> cputime 30000 threads 45 p99 245
> cputime 30000 threads 46 p99 262
> cputime 30000 threads 47 p99 296
> cputime 30000 threads 48 p99 3308
>
> 47*4+4=nr_cpus yay

yay... and haha, "a perfect world"...

> ---
>  kernel/sched/fair.c     | 3 +++
>  kernel/sched/features.h | 1 +
>  2 files changed, 4 insertions(+)
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3027,6 +3027,9 @@ void remove_entity_load_avg(struct sched
>
>  static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq)
>  {
> +	if (sched_feat(LB_TIP_AVG_HIGH) && cfs_rq->load.weight > cfs_rq->runnable_load_avg*2)
> +		return cfs_rq->runnable_load_avg + min_t(unsigned long, NICE_0_LOAD,
> +			cfs_rq->load.weight/2);
>  	return cfs_rq->runnable_load_avg;
>  }

cfs_rq->runnable_load_avg is for sure no greater than (and in this case
much less than, maybe half of) load.weight. And load_avg is not
necessarily a rock in the gearbox that only impedes speeding up; it also
impedes slowing down. But I really don't know what kind of load
reference select_task_rq() should use. So maybe the real issue is that
we conflate two things: load balancing, and simply wanting an idle cpu?