From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S934281AbcI0TVS (ORCPT <rfc822;w@1wt.eu>);
        Tue, 27 Sep 2016 15:21:18 -0400
Received: from mail-wm0-f44.google.com ([74.125.82.44]:35224 "EHLO
        mail-wm0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750762AbcI0TVK (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 27 Sep 2016 15:21:10 -0400
Date: Tue, 27 Sep 2016 20:21:07 +0100
From: Matt Fleming <matt@codeblueprint.co.uk>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@kernel.org>,
        linux-kernel <linux-kernel@vger.kernel.org>,
        Mike Galbraith <umgwanakikbuti@gmail.com>,
        Yuyang Du <yuyang.du@intel.com>,
        Dietmar Eggemann <dietmar.eggemann@arm.com>
Subject: Re: [PATCH] sched/fair: Do not decay new task load on first enqueue
Message-ID: <20160927192107.GB16071@codeblueprint.co.uk>
References: <20160923115808.2330-1-matt@codeblueprint.co.uk>
 <CAKfTPtCi_ekH0ENU+oUJsQka2XagvY=gk=RDZRfpLFWypKrroQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAKfTPtCi_ekH0ENU+oUJsQka2XagvY=gk=RDZRfpLFWypKrroQ@mail.gmail.com>
User-Agent: Mutt/1.5.24+41 (02bc14ed1569) (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 23 Sep, at 04:30:25PM, Vincent Guittot wrote:
> 
> Does it mean that you can see the perf drop that you mention below
> because load is decayed to 1002 instead of staying to 1024 ?
 
The performance drop comes from the fact that enqueueing/dequeueing a
task with load 1002 during fork() results in a zero runnable_load_avg,
which signals to the load balancer that the CPU is idle, so the next
time we fork() we'll pick the same CPU to enqueue on -- and the cycle
continues.
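
As a rough illustration of that cycle, here is a toy model in C. It is
not the kernel's fork-balancing code: struct toy_cpu, toy_pick_cpu() and
toy_fork() are hypothetical names, and the assumption is that each forked
task is dequeued again before the next fork() happens. Because the picker
only compares runnable load, the same CPU wins every time.

```c
/* Toy per-CPU state; the fields are illustrative, not kernel structures. */
struct toy_cpu {
	unsigned long runnable_load;	/* load of tasks currently enqueued */
	unsigned long blocked_load;	/* load left behind by dequeued tasks */
};

/* Pick the CPU with the lowest runnable load; ties go to the lowest index. */
static int toy_pick_cpu(const struct toy_cpu *cpus, int nr)
{
	int best = 0;

	for (int i = 1; i < nr; i++)
		if (cpus[i].runnable_load < cpus[best].runnable_load)
			best = i;
	return best;
}

/*
 * Fork one short-lived task. ASSUMPTION: the task is enqueued and then
 * dequeued before the next fork, so runnable_load is back to its old
 * value by the time the picker runs again.
 */
static int toy_fork(struct toy_cpu *cpus, int nr, unsigned long load)
{
	int cpu = toy_pick_cpu(cpus, nr);

	cpus[cpu].runnable_load += load;	/* enqueue */
	cpus[cpu].runnable_load -= load;	/* dequeue */
	cpus[cpu].blocked_load += load;		/* never seen by the picker */
	return cpu;
}
```

Calling toy_fork() repeatedly on a zero-initialised array keeps returning
CPU 0 while its blocked_load grows, which is the shape of the problem
described above.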

I mention the performance regression mainly because it's the thing
that led to me discovering this bug, and only a little as support for
applying the patch ;-)

> 1002 mainly comes from period_contrib being set to 1023 during
> init_entity_runnable_average so any delay longer than 1us between
> attach_entity_load_avg and enqueue_entity_load_avg will trig the decay
> of the load from 1024 to 1002
 
Right.
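
For reference, the 1024 -> 1002 step can be reproduced with a few lines
of C. This is a simplified sketch, not the code in kernel/sched/fair.c:
periods_crossed() and decay_one_period() are hypothetical helpers, and
the constant 0xfa83b2da is assumed to be the 32-bit fixed-point value of
y, where y^32 = 0.5 (so y is roughly 0.97857).

```c
#include <stdint.h>

/* ASSUMPTION: y * 2^32 with y^32 = 0.5, i.e. one period of PELT decay. */
#define TOY_PELT_Y_FIXED 0xfa83b2daULL

/*
 * Number of full 1024us periods completed once delta_us has elapsed,
 * given how much of the current period was already accounted for.
 */
static unsigned int periods_crossed(unsigned int period_contrib,
				    unsigned int delta_us)
{
	return (period_contrib + delta_us) / 1024;
}

/* Decay a load value by exactly one period using 32.32 fixed point. */
static uint64_t decay_one_period(uint64_t load)
{
	return (load * TOY_PELT_Y_FIXED) >> 32;
}
```

With period_contrib at 1023, a delay of 1us is enough for
periods_crossed() to return 1, and decay_one_period(1024) then yields
1002.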

> But this patch doesn't change the behavior of runnable_load_avg, isn't
> it ? it has only an impact on the initial value of p->se.avg.load_avg
> when the task is enqueued.
 
Correct. This patch does not guarantee that runnable_load_avg will be
non-zero; that just happened to be the case for the workload and the
machine I tested.

> > Arguably the real problem is that balancing on fork doesn't look at
> > the blocked contribution of tasks, only the runnable load and it's
> > possible for the two metrics to be wildly different on a relatively
> > idle system.
> 
> fair enough

I did have some patches somewhere to address this. I'll have to dig
them out.