From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932227AbbJMIYa (ORCPT <rfc822;w@1wt.eu>);
	Tue, 13 Oct 2015 04:24:30 -0400
Received: from mga11.intel.com ([192.55.52.93]:45766 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752465AbbJMIYF (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Tue, 13 Oct 2015 04:24:05 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.17,677,1437462000"; 
   d="scan'208";a="663238043"
Date: Tue, 13 Oct 2015 08:35:35 +0800
From: Yuyang Du <yuyang.du@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>, linux-kernel@vger.kernel.org
Subject: Re: 4.3 group scheduling regression
Message-ID: <20151013003535.GO11102@intel.com>
References: <1444585321.4169.18.camel@gmail.com>
 <20151012072344.GM3604@twins.programming.kicks-ass.net>
 <1444635897.3425.19.camel@gmail.com>
 <20151012080407.GJ3816@twins.programming.kicks-ass.net>
 <20151012005351.GJ11102@intel.com>
 <20151012091206.GK3816@twins.programming.kicks-ass.net>
 <20151012021230.GK11102@intel.com>
 <1444645411.3534.5.camel@gmail.com>
 <20151012195516.GM11102@intel.com>
 <20151013080648.GP3604@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151013080648.GP3604@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 13, 2015 at 10:06:48AM +0200, Peter Zijlstra wrote:
> On Tue, Oct 13, 2015 at 03:55:17AM +0800, Yuyang Du wrote:
> 
> > I think maybe the real disease is the tg->load_avg is not updated in time.
> > I.e., it is after migrate, the source cfs_rq does not decrease its contribution
> > to the parent's tg->load_avg fast enough.
> 
> No, using the load_avg for shares calculation seems wrong; that would
> mean we'd first have to ramp up the avg before you react.
> 
> You want to react quickly to actual load changes, esp. going up.
> 
> We use the avg to guess the global group load, since that's the best
> compromise we have, but locally it doesn't make sense to use the avg if
> we have the actual values.

In Mike's case, since the mplayer group has only one active task, once
the task migrates, the source cfs_rq should contribute zero to the tg,
so at the destination the group entity should get the entire tg's
shares. All we need is for that zeroing to happen fast enough.

But yes, in the general case, load_avg (which includes the blocked load)
is likely to lag behind. Using the actual load.weight to accelerate the
process makes sense. It is especially helpful to the less hungry tasks.
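
To make the two points above concrete, here is a rough userspace sketch
of the shares arithmetic. It is not the kernel code: the toy_* names,
the struct layout and all the numbers are made up for illustration, and
it only assumes that a group entity's shares are roughly
tg_shares * local_load / tg_wide_load, where the tg-wide load is the
global average with this cfs_rq's stale contribution swapped for its
current local load.

```c
#include <assert.h>

#define MIN_SHARES 2L

/* Hypothetical, simplified stand-in for a per-CPU group cfs_rq. */
struct toy_cfs_rq {
	long weight_now;  /* instantaneous load.weight of queued tasks */
	long load_avg;    /* decayed average, includes blocked load */
	long tg_contrib;  /* what this cfs_rq last added to the tg-wide sum */
};

/*
 * Shares of the group entity on this CPU.
 * use_actual_weight != 0: use the instantaneous weight as local load.
 * use_actual_weight == 0: use the lagging average as local load.
 */
static long toy_group_shares(long tg_shares, long tg_load_avg,
			     const struct toy_cfs_rq *rq,
			     int use_actual_weight)
{
	long local = use_actual_weight ? rq->weight_now : rq->load_avg;
	/* Replace our stale contribution to the global sum with 'local'. */
	long tg_weight = tg_load_avg - rq->tg_contrib + local;
	long shares = tg_shares * local;

	assert(tg_shares >= MIN_SHARES);

	if (tg_weight > 0)
		shares /= tg_weight;

	/* Clamp to [MIN_SHARES, tg_shares]. */
	if (shares < MIN_SHARES)
		shares = MIN_SHARES;
	if (shares > tg_shares)
		shares = tg_shares;
	return shares;
}
```

Take a destination cfs_rq with weight_now = 1024, load_avg = 100 and
tg_contrib = 100, and tg_shares = 1024. While the source still
contributes a stale 1024 (tg_load_avg = 1124), the average-based
variant yields 91 and the weight-based one 512. Once the source
contribution is zeroed (tg_load_avg = 100), both yield the full 1024.
So in this toy model the fast zeroing is what restores the full share,
and the actual weight only shortens the time spent far below it.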