From: Peter Zijlstra <peterz@infradead.org>
To: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	Ingo Molnar <mingo@kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Russell King - ARM Linux <linux@arm.linux.org.uk>,
	LAK <linux-arm-kernel@lists.infradead.org>,
	Preeti U Murthy <preeti@linux.vnet.ibm.com>,
	Mike Galbraith <efault@gmx.de>,
	Nicolas Pitre <nicolas.pitre@linaro.org>,
	"linaro-kernel@lists.linaro.org" <linaro-kernel@lists.linaro.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Dietmar Eggemann <Dietmar.Eggemann@arm.com>
Subject: Re: [PATCH v3 09/12] Revert "sched: Put rq's sched_avg under CONFIG_FAIR_GROUP_SCHED"
Date: Mon, 14 Jul 2014 15:20:52 +0200
Message-ID: <20140714132052.GY9918@twins.programming.kicks-ass.net>
In-Reply-To: <20140714125529.GN26542@e103034-lin>

On Mon, Jul 14, 2014 at 01:55:29PM +0100, Morten Rasmussen wrote:
> On Fri, Jul 11, 2014 at 09:12:38PM +0100, Peter Zijlstra wrote:
> > On Fri, Jul 11, 2014 at 07:39:29PM +0200, Vincent Guittot wrote:
> > > In my mind, arch_scale_cpu_freq was intended to scale the capacity of
> > > the CPU according to the current dvfs operating point.
> > > As it's no longer used anywhere now that we have arch_scale_cpu, we could
> > > probably remove it .. and see when it will become used.
> > 
> > I probably should have written comments when I wrote that code, but it
> > was meant to be used only where, as described above, we limit things.
> > Ondemand and such, which will temporarily decrease freq, will ramp it up
> > again at demand, and therefore lowering the capacity will skew things.
> > 
> > You'll put less load on because it's run slower, and then you'll run it
> > slower because there's less load on -> cyclic FAIL.
> 
> Agreed. We can't use a frequency scaled compute capacity for all
> load-balancing decisions. However, IMHO, it would be useful to know
> the current compute capacity in addition to the max compute capacity
> when considering energy costs. So we would have something like:
> 
> * capacity_max: cpu capacity at highest frequency.
> 
> * capacity_cur: cpu capacity at current frequency.
> 
> * capacity_avail: cpu capacity currently available. Basically
>   capacity_cur taking rt, deadline, and irq accounting into account.
> 
> capacity_max should probably include rt, deadline, and irq accounting as
> well. Or we need both?

I'm struggling to fully grasp your intent. We need DVFS-like accounting
for sure, and that means a current-freq hook, but I'm not entirely sure
how that relates to capacity.
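
For reference, the three quantities listed in the quoted text could be
sketched as below. This is only an illustration under stated
assumptions: the struct, the function name, the linear frequency
scaling and the single rt_irq_pct input are all hypothetical and are
not existing scheduler code. It also leaves capacity_max without
rt/deadline/irq accounting, which is the open question raised above.

```c
#include <assert.h>

#define CAPACITY_SCALE 1024u	/* assumed fixed-point unit for a full CPU */

/* Hypothetical container for the three values named in the quoted text. */
struct cpu_capacity {
	unsigned int max;	/* capacity at highest frequency */
	unsigned int cur;	/* capacity at current frequency */
	unsigned int avail;	/* cur minus time taken by rt/deadline/irq */
};

/*
 * Assumes capacity scales linearly with frequency and that rt, deadline
 * and irq activity is summarised as one percentage of CPU time.
 */
static struct cpu_capacity compute_capacity(unsigned int cur_khz,
					    unsigned int max_khz,
					    unsigned int rt_irq_pct)
{
	struct cpu_capacity c;

	assert(max_khz > 0 && cur_khz <= max_khz && rt_irq_pct <= 100);

	c.max = CAPACITY_SCALE;
	c.cur = (unsigned int)((unsigned long long)CAPACITY_SCALE *
			       cur_khz / max_khz);
	c.avail = c.cur * (100 - rt_irq_pct) / 100;
	return c;
}
```

With a CPU running at half its maximum frequency and a quarter of its
time consumed by rt/irq work, this model yields cur at half of max and
avail at three quarters of cur.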

> Based on your description arch_scale_freq_capacity() can't be abused to
> implement capacity_cur (and capacity_avail) unless it is repurposed.
> Nobody seems to implement it. Otherwise we would need something similar
> to update capacity_cur (and capacity_avail).

Yeah, I never got around to doing so. I started doing an APERF/MPERF SMT
capacity thing for x86 but never finished that. The naive implementation
suffered the same FAIL loop as above because APERF stops on idle. So
when idle your capacity drops to nothing, leading to no new work,
leading to more idle etc.

I never got around to fixing that (by adding an idle filter), and things
have somewhat bitrotted since.
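
The loop is easy to reproduce in a toy model. The sketch below is not
the unfinished x86 code; every name in it is hypothetical, and it
assumes a balancer that offers work in proportion to the advertised
capacity. The naive estimator counts idle time as lost capacity, the
filtered one ignores idle time entirely.

```c
#include <assert.h>

#define CAPACITY_SCALE 1024u	/* assumed fixed-point unit for a full CPU */

typedef unsigned int (*capacity_fn)(unsigned int busy_pct);

/* Naive: the counter stops on idle, so idle time looks like lost capacity. */
static unsigned int naive_capacity(unsigned int busy_pct)
{
	return CAPACITY_SCALE * busy_pct / 100;
}

/* Idle filter (toy version): an idle CPU is not a slow CPU. */
static unsigned int filtered_capacity(unsigned int busy_pct)
{
	(void)busy_pct;
	return CAPACITY_SCALE;
}

/*
 * Hypothetical balancer: each round it offers demand_pct of whatever
 * capacity the CPU advertises. Returns the busy percentage after
 * 'rounds' iterations, starting from busy_pct == demand_pct.
 */
static unsigned int settle_busy_pct(capacity_fn estimate,
				    unsigned int demand_pct, int rounds)
{
	unsigned int busy_pct = demand_pct;

	assert(demand_pct <= 100 && rounds >= 0);

	while (rounds-- > 0) {
		unsigned int capacity = estimate(busy_pct);

		busy_pct = demand_pct * capacity / CAPACITY_SCALE;
	}
	return busy_pct;
}
```

Starting from 80% busy, the naive estimator decays to zero load within
a few rounds, while the filtered one stays at 80%.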

> As a side note, we can potentially get into a similar fail cycle already
> due to the lack of scale invariance in the entity load tracking.

Yah, I think that got mentioned a long while ago.

> > > > In that same discussion ISTR a suggestion about adding avg_running time,
> > > > as opposed to the current avg_runnable. The sum of avg_running should be
> > > > much more accurate, and still react correctly to migrations.
> > > 
> > > I haven't looked in detail but I agree that avg_running would be much
> > > more accurate than avg_runnable and should probably fit the
> > > requirement. Does it mean that we could re-add the avg_running (or
> > > something similar) that has disappeared during the review of load avg
> > > tracking patchset ?
> > 
> > Sure, I think we killed it there because there wasn't an actual use for
> > it and I'm always in favour of stripping everything to their bare bones,
> > esp big and complex things.
> > 
> > And then later, add things back once we have need for it.
> 
> I think it is a useful addition to the set of utilization metrics. I
> don't think it is universally more accurate than runnable_avg. Actually
> quite the opposite when the cpu is overloaded. But for partially loaded
> cpus it is very useful if you don't want to factor in waiting time on
> the rq.

Well, different things different names. Utilization as per literature is
simply the fraction of CPU time actually used. In that sense running_avg
is about right for that. Our current runnable_avg is entirely different
(as I think we all agree by now).

But yes, in application the tipping point is u == 1: up until that
point pure utilization makes sense, after that our runnable_avg makes
more sense.
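
A minimal sketch of that tipping point, under idealized assumptions:
identical tasks, no waiting time on the rq below saturation, and every
task permanently runnable above it. The struct and function names are
made up for illustration and this is not the load-tracking code.

```c
#include <assert.h>

/* Per-CPU sums in percent of one CPU; hypothetical, for illustration. */
struct cpu_sums {
	unsigned int running_pct;	/* time the CPU actually executes tasks */
	unsigned int runnable_pct;	/* per-task runnable time, summed */
};

/* nr_tasks identical tasks, each wanting demand_pct of one CPU. */
static struct cpu_sums model_cpu(unsigned int nr_tasks,
				 unsigned int demand_pct)
{
	unsigned int total = nr_tasks * demand_pct;
	struct cpu_sums s;

	assert(demand_pct <= 100);

	if (total <= 100) {
		/* u <= 1: queueing ignored, both sums equal the demand. */
		s.running_pct = total;
		s.runnable_pct = total;
	} else {
		/* u pinned at 1: only the runnable sum still grows. */
		s.running_pct = 100;
		s.runnable_pct = nr_tasks * 100;
	}
	return s;
}
```

Two tasks at 30% each give 60 for both sums; four tasks at 50% each
give a running sum stuck at 100 while the runnable sum reads 400, so
only the latter still distinguishes degrees of overload.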
