From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752116AbcEGJF6 (ORCPT ); Sat, 7 May 2016 05:05:58 -0400
Received: from mga01.intel.com ([192.55.52.88]:7988 "EHLO mga01.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750936AbcEGJF4 (ORCPT ); Sat, 7 May 2016 05:05:56 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.24,589,1455004800"; d="scan'208";a="948135701"
Date: Sat, 7 May 2016 09:24:17 +0800
From: Yuyang Du
To: Mike Galbraith
Cc: Peter Zijlstra, Chris Mason, Ingo Molnar, Matt Fleming,
	linux-kernel@vger.kernel.org
Subject: Re: sched: tweak select_idle_sibling to look for idle threads
Message-ID: <20160507012417.GK16093@intel.com>
References: <20160405180822.tjtyyc3qh4leflfj@floor.thefacebook.com>
	<20160409190554.honue3gtian2p6vr@floor.thefacebook.com>
	<20160430124731.GE2975@worktop.cust.blueprintrf.com>
	<1462086753.9717.29.camel@suse.de>
	<20160501085303.GF2975@worktop.cust.blueprintrf.com>
	<1462094425.9717.45.camel@suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1462094425.9717.45.camel@suse.de>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, May 01, 2016 at 11:20:25AM +0200, Mike Galbraith wrote:
> On Sun, 2016-05-01 at 10:53 +0200, Peter Zijlstra wrote:
> > On Sun, May 01, 2016 at 09:12:33AM +0200, Mike Galbraith wrote:
> > > On Sat, 2016-04-30 at 14:47 +0200, Peter Zijlstra wrote:
> >
> > > > Can you guys have a play with this; I think one and two node tbench are
> > > > good, but I seem to be getting significant run to run variance on that,
> > > > so maybe I'm not doing it right.
> > >
> > > Nah, tbench is just variance prone. It got dinged up at clients=cores
> > > on my desktop box, on 4 sockets the high end got seriously dinged up.
> >
> > Ouch, yeah, big hurt. Lets try that again... :-)
>
> Yeah, box could use a little bandaid and a hug :)
>
> Playing with Chris' benchmark, seems the biggest problem is that we
> don't buddy up waker of many and it's wakees in a node.. ie the wake
> wide thing isn't necessarily our friend when there are multiple wakers
> of many. If I run an instance per node with one mother of all work in
> autobench mode, it works exactly as you'd expect, game over is when
> wakees = socket size. It never get's near that point if I let things
> wander, it beats itself up well before we get there.

Maybe give the criterion a bit of margin. It is not just that the wakee
count tends to equal llc_size; the numbers are wild enough to easily break
such a fragile condition. Something like:

	if (master * 100 < slave * factor * 110)
		return 0;

And since you accumulate the wakee count (and decay it at HZ), doesn't
this check tend to never be satisfied?

	if (slave < factor)
		return 0;
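
For reference, the margin idea above can be tried outside the kernel. What follows is only a userspace sketch, not the scheduler's code: the function name wake_wide_margin() and the margin_pct parameter are hypothetical, and master, slave and factor are assumed to stand for the waker's flip count, the wakee's flip count and the LLC size, as in the fragments above. With margin_pct = 0 it reduces to a plain "master < slave * factor" comparison; with margin_pct = 10 it matches the "* 100 < ... * 110" form.

```c
#include <stdbool.h>

/*
 * Userspace sketch of a wake-wide decision with a percentage margin.
 * Hypothetical helper: in the kernel the inputs would come from the
 * tasks' wakee flip counters and the LLC domain size.
 */
static bool wake_wide_margin(unsigned int master, unsigned int slave,
			     unsigned int factor, unsigned int margin_pct)
{
	/* Treat the larger flip count as the "master" side. */
	if (master < slave) {
		unsigned int tmp = master;

		master = slave;
		slave = tmp;
	}

	/* The wakee itself must flip at least 'factor' times. */
	if (slave < factor)
		return false;

	/*
	 * Require master to exceed slave * factor by margin_pct percent.
	 * Widen to 64 bits so the scaled products cannot overflow.
	 */
	if ((unsigned long long)master * 100 <
	    (unsigned long long)slave * factor * (100 + margin_pct))
		return false;

	return true;
}
```

With factor = 8 and slave = 8, a master count of 64 passes with no margin but is rejected with a 10% margin, so small fluctuations around the boundary no longer flip the decision.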