Date: Mon, 12 Oct 2015 10:12:31 +0800
From: Yuyang Du
To: Peter Zijlstra
Cc: Mike Galbraith, linux-kernel@vger.kernel.org
Subject: Re: 4.3 group scheduling regression
Message-ID: <20151012021230.GK11102@intel.com>
In-Reply-To: <20151012091206.GK3816@twins.programming.kicks-ass.net>
References: <20151008111959.GM3816@twins.programming.kicks-ass.net>
 <1444483369.2804.9.camel@gmail.com>
 <20151010170142.GI3816@twins.programming.kicks-ass.net>
 <1444530318.3363.40.camel@gmail.com>
 <1444585321.4169.18.camel@gmail.com>
 <20151012072344.GM3604@twins.programming.kicks-ass.net>
 <1444635897.3425.19.camel@gmail.com>
 <20151012080407.GJ3816@twins.programming.kicks-ass.net>
 <20151012005351.GJ11102@intel.com>
 <20151012091206.GK3816@twins.programming.kicks-ass.net>

On Mon, Oct 12, 2015 at 11:12:06AM +0200, Peter Zijlstra wrote:
> On Mon, Oct 12, 2015 at 08:53:51AM +0800, Yuyang Du wrote:
> > Good morning, Peter.
> >
> > On Mon, Oct 12, 2015 at 10:04:07AM +0200, Peter Zijlstra wrote:
> > > On Mon, Oct 12, 2015 at 09:44:57AM +0200, Mike Galbraith wrote:
> > > >
> > > > It's odd to me that things look pretty much the same in the good and
> > > > bad trees with hogs vs hogs or hogs vs tbench (with top anyway, just
> > > > adding up times). Seems Xorg+mplayer more or less playing cross-group
> > > > ping-pong must be the BadThing trigger.
> > >
> > > Ohh, wait, Xorg and mplayer are _not_ in the same group? I was assuming
> > > you had your entire user session in 1 (auto) group and it was competing
> > > against 8 manual cgroups.
> > >
> > > So how exactly are things configured?
> >
> > Hmm... my impression is that the naughty boy mplayer (+Xorg) isn't
> > favored, due to the per-CPU group entity share distribution. Let me dig
> > more.
>
> So in the old code we had 'magic' to deal with the case where a cgroup
> was consuming less than 1 CPU's worth of runtime. For example, a single
> task running in the group.
>
> In that scenario it might be possible that the group entity weight:
>
>   se->weight = (tg->shares * cfs_rq->weight) / tg->weight;
>
> strongly deviates from tg->shares; you want the single task to reflect
> the full group shares to the next level, due to the whole distributed
> approximation stuff.

Yeah, I thought so.

> I see you've deleted all that code; see the former
> __update_group_entity_contrib().

Probably not there; it actually was an icky way to adjust things.

> It could be that we need to bring that back. But let me think a little
> bit more on this.. I'm having a hard time waking :/

I am guessing it is in calc_tg_weight(), and the naughty boys do have to be
made more favored there, what a reality...

Mike, could you please test the following?
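To see the effect in isolation, here is a rough user-space model of the share
distribution above, with made-up numbers (tg->shares = 1024, some residual
load_avg still accounted on other CPUs, and a single just-woken task on this
CPU). It is only a sketch of the approximation, not the kernel code, and every
constant below is invented for illustration:

#include <stdio.h>

/* Made-up numbers, for illustration only. */
#define TG_SHARES		1024	/* configured group shares           */
#define OTHER_CPUS_LOAD		300	/* residual load_avg on other CPUs   */
#define THIS_CPU_LOAD_AVG	150	/* this cfs_rq's load_avg, still low */
#define THIS_CPU_LOAD_WEIGHT	1024	/* this cfs_rq's load.weight, 1 task */

/*
 * Rough model of calc_cfs_shares(): split the group's shares across CPUs
 * in proportion to this CPU's contribution versus the group-wide sum.
 */
static long entity_weight(long this_cpu_load)
{
	long tg_weight = OTHER_CPUS_LOAD + this_cpu_load;
	long shares = TG_SHARES * this_cpu_load;

	if (tg_weight)
		shares /= tg_weight;
	return shares;
}

int main(void)
{
	printf("based on cfs_rq_load_avg(): %ld\n",
	       entity_weight(THIS_CPU_LOAD_AVG));
	printf("based on load.weight      : %ld\n",
	       entity_weight(THIS_CPU_LOAD_WEIGHT));
	return 0;
}

With these made-up numbers the load_avg-based calculation gives the group
entity roughly 341 of its 1024 shares, while the instantaneous load.weight
gives roughly 791; the patch below moves the calculation to the latter.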
--
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4df37a4..b184da0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2370,7 +2370,7 @@ static inline long calc_tg_weight(struct task_group *tg, struct cfs_rq *cfs_rq)
 	 */
 	tg_weight = atomic_long_read(&tg->load_avg);
 	tg_weight -= cfs_rq->tg_load_avg_contrib;
-	tg_weight += cfs_rq_load_avg(cfs_rq);
+	tg_weight += cfs_rq->load.weight;
 
 	return tg_weight;
 }
 
@@ -2380,7 +2380,7 @@ static long calc_cfs_shares(struct cfs_rq *cfs_rq, struct task_group *tg)
 	long tg_weight, load, shares;
 
 	tg_weight = calc_tg_weight(tg, cfs_rq);
-	load = cfs_rq_load_avg(cfs_rq);
+	load = cfs_rq->load.weight;
 
 	shares = (tg->shares * load);
 	if (tg_weight)
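If I read the hunks right, both calc_tg_weight() and calc_cfs_shares() go
back to using the instantaneous cfs_rq->load.weight instead of the
PELT-averaged cfs_rq_load_avg(), so a just-woken single task immediately
contributes its full weight to the share calculation, which looks like the
behaviour the pre-rewrite code had.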