linux-kernel.vger.kernel.org archive mirror
From: Quentin Perret <quentin.perret@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	mingo@kernel.org, linux-kernel@vger.kernel.org,
	rjw@rjwysocki.net, juri.lelli@redhat.com,
	dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com,
	viresh.kumar@linaro.org, valentin.schneider@arm.com
Subject: Re: [PATCH v5 00/10] track CPU utilization
Date: Mon, 4 Jun 2018 18:13:40 +0100
Message-ID: <20180604171339.GA25372@e108498-lin.cambridge.arm.com>
In-Reply-To: <20180604165047.GU12180@hirez.programming.kicks-ass.net>

On Monday 04 Jun 2018 at 18:50:47 (+0200), Peter Zijlstra wrote:
> On Fri, May 25, 2018 at 03:12:21PM +0200, Vincent Guittot wrote:
> > When both cfs and rt tasks compete to run on a CPU, we can see some frequency
> > drops with schedutil governor. In such case, the cfs_rq's utilization doesn't
> > reflect anymore the utilization of cfs tasks but only the remaining part that
> > is not used by rt tasks. We should monitor the stolen utilization and take
> > it into account when selecting OPP. This patchset doesn't change the OPP
> > selection policy for RT tasks but only for CFS tasks
> 
> So the problem is that when RT/DL/stop/IRQ happens and preempts CFS
> tasks, time continues and the CFS load tracking will see !running and
> decay things.
> 
> Then, when we get back to CFS, we'll have lower load/util than we
> expected.
> 
> In particular, your focus is on OPP selection, and where we would have
> say: u=1 (always running task), after being preempted by our RT task for
> a while, it will now have u=.5. With the effect that when the RT task
> goes to sleep we'll drop our OPP to .5 max -- which is 'wrong', right?
> 
> Your solution is to track RT/DL/stop/IRQ with the identical PELT average
> as we track cfs util. Such that we can then add the various averages to
> reconstruct the actual utilisation signal.
> 
> This should work for the case of the utilization signal on UP. It is
> less clear on SMP, where PELT migrates the signal around, but we don't
> do that to the per-rq signals we have for RT/DL/stop/IRQ.
> 
> There is also the 'complaint' that this ends up with 2 util signals for
> DL, complicating things.
> 
> 
> So this patch-set tracks the !cfs occupation using the same function,
> which is all good. But what, if instead of using that to compensate the
> OPP selection, we employ that to renormalize the util signal?
> 
> If we normalize util against the dynamic (rt_avg affected) cpu_capacity,
> then I think your initial problem goes away. Because while the RT task
> will push the util to .5, it will at the same time push the CPU capacity
> to .5, and renormalized that gives 1.
> 
>   NOTE: the renorm would then become something like:
>         scale_cpu = arch_scale_cpu_capacity() / rt_frac();

Isn't it equivalent? I mean, you can remove RT/DL/stop/IRQ from the CPU
capacity and compare the CFS util_avg against that, or you can add
RT/DL/stop/IRQ to the CFS util_avg and compare it to arch_scale_cpu_capacity().
Both should be interchangeable, no? By adding the RT/DL/IRQ PELT signals
to the CFS util_avg, Vincent is proposing to go with the latter, I think.
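
To make the two options concrete, here is a minimal sketch of both
comparisons. It is not kernel code: the function names and the
"other_util" parameter (the summed RT/DL/stop/IRQ utilization) are
hypothetical, and all values are assumed to be on the usual 0..1024
capacity scale.

```c
#include <stdbool.h>

/*
 * Option 1: remove the !CFS utilization from the CPU capacity and
 * compare the CFS utilization against what is left.
 * All names here are hypothetical, for illustration only.
 */
static inline bool cfs_fits_reduced_capacity(unsigned long cfs_util,
					     unsigned long other_util,
					     unsigned long capacity_orig)
{
	/* Nothing is left for CFS if !CFS already exceeds the capacity. */
	if (other_util > capacity_orig)
		return false;

	return cfs_util <= capacity_orig - other_util;
}

/*
 * Option 2: add the !CFS utilization to the CFS utilization and
 * compare the sum against the original capacity.
 */
static inline bool total_fits_orig_capacity(unsigned long cfs_util,
					    unsigned long other_util,
					    unsigned long capacity_orig)
{
	return cfs_util + other_util <= capacity_orig;
}
```

For the same inputs both helpers give the same answer, which is all the
"interchangeable" argument above relies on. The sketch only covers this
additive comparison, not a ratio-based renormalization.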

But aren't the signals we currently use to account for RT/DL/stop/IRQ in
cpu_capacity good enough for that? Can't we just add the difference between
capacity_orig_of() and capacity_of() to the CFS util and do OPP selection with
that (when !nr_rt_running)? Maybe add a min with the DL running_bw to be on
the safe side...?
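
Something along these lines, as a rough sketch only. The struct and
function below are hypothetical stand-ins rather than existing kernel
interfaces. It assumes running_bw has already been converted to capacity
units, reads "a min with running_bw" as using running_bw as a floor, and
assumes we keep requesting the max OPP while RT tasks are runnable.

```c
/* Hypothetical per-CPU snapshot; field names mirror the helpers above. */
struct cpu_snapshot {
	unsigned long capacity_orig;	/* stands in for capacity_orig_of(cpu) */
	unsigned long capacity;		/* stands in for capacity_of(cpu) */
	unsigned long cfs_util;		/* CFS util_avg */
	unsigned long dl_running_bw;	/* assumed already in capacity units */
	unsigned int nr_rt_running;
};

/* Utilization to feed into OPP selection, under the assumptions above. */
static inline unsigned long opp_util(const struct cpu_snapshot *s)
{
	unsigned long stolen, util;

	/* Assumption: RT policy is left alone, so ask for the max OPP. */
	if (s->nr_rt_running)
		return s->capacity_orig;

	/* Capacity taken away from CFS by RT/DL/stop/IRQ. */
	stolen = s->capacity_orig > s->capacity ?
		 s->capacity_orig - s->capacity : 0;

	util = s->cfs_util + stolen;

	/* Assumption: never request less than the DL running bandwidth. */
	if (util < s->dl_running_bw)
		util = s->dl_running_bw;

	/* Never request more than the CPU can deliver. */
	if (util > s->capacity_orig)
		util = s->capacity_orig;

	return util;
}
```

With capacity_orig = 1024, capacity_of = 512 and a CFS util of 512, this
asks for 1024, i.e. the always-running CFS task from the example above
keeps the max OPP after the RT task goes to sleep.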

> 
> 
> On IRC I mentioned stopping the CFS clock when preempted, and while that
> would result in fixed numbers, Vincent was right in pointing out the
> numbers will be difficult to interpret, since the meaning will be purely
> CPU local and I'm not sure you can actually fix it again with
> normalization.
> 
> Imagine, running a .3 RT task, that would push the (always running) CFS
> down to .7, but because we discard all !cfs time, it actually has 1. If
> we try and normalize that we'll end up with ~1.43, which is of course
> completely broken.
> 
> 
> _However_, all that happens for util, also happens for load. So the above
> scenario will also make the CPU appear less loaded than it actually is.
> 
> Now, we actually try and compensate for that by decreasing the capacity
> of the CPU. But because the existing rt_avg and PELT signals are so
> out-of-tune, this is likely to be less than ideal. With that fixed
> however, the best this appears to do is, as per the above, preserve the
> actual load. But what we really wanted is to actually inflate the load,
> such that someone will take load from us -- we're doing less actual work
> after all.
> 
> Possibly, we can do something like:
> 
> 	scale_cpu_capacity / (rt_frac^2)
> 
> for load, then we inflate the load and could maybe get rid of all this
> capacity_of() sprinkling, but that needs more thinking.
> 
> 
> But I really feel we need to consider both util and load, as this issue
> affects both.
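
For reference, the arithmetic in the examples above can be sketched in
fixed point as follows. This is only one reading of the formulas: the
helper names are hypothetical, and "cfs_frac" is assumed to be the
fraction of CPU time left to CFS, expressed on the 0..1024 scale (so .5
is 512 and .7 is roughly 717).

```c
#define SCHED_CAPACITY_SCALE	1024UL

/*
 * Renormalize a utilization against the fraction of the CPU left to
 * CFS: util / cfs_frac, in fixed point. Hypothetical helper.
 */
static inline unsigned long renorm_util(unsigned long util,
					unsigned long cfs_frac)
{
	/* Avoid dividing by zero when nothing is left for CFS. */
	if (!cfs_frac)
		cfs_frac = 1;

	return util * SCHED_CAPACITY_SCALE / cfs_frac;
}

/*
 * Inflate a load by applying the same factor twice, i.e.
 * load / cfs_frac^2, as one reading of the rt_frac^2 idea.
 */
static inline unsigned long inflate_load(unsigned long load,
					 unsigned long cfs_frac)
{
	return renorm_util(renorm_util(load, cfs_frac), cfs_frac);
}
```

With these, a util of 512 on a CPU with half its time left to CFS
renormalizes to 1024, while a util that is already 1024 because the
clock was stopped, with .7 left to CFS, comes out at about 1.43 times
the scale, which is the broken case described above. A load of 512 with
half the CPU left inflates to 2048.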


Thread overview: 99+ messages
2018-05-25 13:12 [PATCH v5 00/10] track CPU utilization Vincent Guittot
2018-05-25 13:12 ` [PATCH v5 01/10] sched/pelt: Move pelt related code in a dedicated file Vincent Guittot
2018-05-25 14:26   ` Quentin Perret
2018-05-25 16:14     ` Peter Zijlstra
2018-05-29  8:21       ` Quentin Perret
2018-05-25 18:04     ` Patrick Bellasi
2018-05-29 14:55       ` Quentin Perret
2018-05-29 15:02         ` Vincent Guittot
2018-05-29 15:04           ` Quentin Perret
2018-05-25 13:12 ` [PATCH v5 02/10] sched/rt: add rt_rq utilization tracking Vincent Guittot
2018-05-25 15:54   ` Patrick Bellasi
2018-05-29 13:29     ` Vincent Guittot
2018-05-30  9:32       ` Patrick Bellasi
2018-05-30 10:06         ` Vincent Guittot
2018-05-30 11:01           ` Patrick Bellasi
2018-05-30 14:39             ` Vincent Guittot
2018-05-25 13:12 ` [PATCH v5 03/10] cpufreq/schedutil: add rt " Vincent Guittot
2018-05-30  7:03   ` Viresh Kumar
2018-05-30  8:23     ` Vincent Guittot
2018-05-30  9:40   ` Patrick Bellasi
2018-05-30  9:53     ` Vincent Guittot
2018-05-30 16:46   ` Quentin Perret
2018-05-31  8:46     ` Juri Lelli
2018-06-01 16:23       ` Peter Zijlstra
2018-06-01 17:23         ` Patrick Bellasi
2018-06-04 10:17           ` Quentin Perret
2018-06-04 15:16             ` Patrick Bellasi
2018-05-25 13:12 ` [PATCH v5 04/10] sched/dl: add dl_rq " Vincent Guittot
2018-05-30 10:50   ` Patrick Bellasi
2018-05-30 11:51     ` Vincent Guittot
2018-05-25 13:12 ` [PATCH v5 05/10] cpufreq/schedutil: get max utilization Vincent Guittot
2018-05-28 10:12   ` Juri Lelli
2018-05-28 14:57     ` Vincent Guittot
2018-05-28 15:22       ` Juri Lelli
2018-05-28 16:34         ` Vincent Guittot
2018-05-31 10:27           ` Patrick Bellasi
2018-05-31 13:02             ` Vincent Guittot
2018-06-01 13:53               ` Vincent Guittot
2018-06-01 17:45                 ` Joel Fernandes
2018-06-04  6:41                   ` Vincent Guittot
2018-06-04  7:04                     ` Juri Lelli
2018-06-04  7:14                       ` Vincent Guittot
2018-06-04 10:12                         ` Juri Lelli
2018-06-04 12:35                           ` Vincent Guittot
2018-05-29  5:08     ` Joel Fernandes
2018-05-29  6:31       ` Juri Lelli
2018-05-29  6:48         ` Vincent Guittot
2018-05-29  9:47           ` Juri Lelli
2018-05-29  8:40   ` Quentin Perret
2018-05-29  9:52     ` Juri Lelli
2018-05-30  8:37       ` Quentin Perret
2018-05-30  8:51         ` Juri Lelli
2018-05-25 13:12 ` [PATCH v5 06/10] sched: remove rt and dl from sched_avg Vincent Guittot
2018-05-25 13:12 ` [PATCH v5 07/10] sched/irq: add irq utilization tracking Vincent Guittot
2018-05-30 15:55   ` Dietmar Eggemann
2018-05-30 18:45     ` Vincent Guittot
2018-05-31 16:54       ` Dietmar Eggemann
2018-06-06 16:06         ` Vincent Guittot
2018-06-07  8:29           ` Dietmar Eggemann
2018-06-07  8:44             ` Vincent Guittot
2018-06-07  9:06               ` Dietmar Eggemann
2018-05-25 13:12 ` [PATCH v5 08/10] cpufreq/schedutil: take into account interrupt Vincent Guittot
2018-05-28 10:41   ` Juri Lelli
2018-05-28 12:06     ` Vincent Guittot
2018-05-28 12:37       ` Juri Lelli
2018-05-25 13:12 ` [PATCH v5 09/10] sched: remove rt_avg code Vincent Guittot
2018-05-25 13:12 ` [PATCH v5 10/10] proc/sched: remove unused sched_time_avg_ms Vincent Guittot
2018-06-04 16:50 ` [PATCH v5 00/10] track CPU utilization Peter Zijlstra
2018-06-04 17:13   ` Quentin Perret [this message]
2018-06-04 18:08   ` Vincent Guittot
2018-06-05 14:18     ` Peter Zijlstra
2018-06-05 15:03       ` Juri Lelli
2018-06-05 15:38       ` Patrick Bellasi
2018-06-05 22:27         ` Peter Zijlstra
2018-06-06  9:44       ` Quentin Perret
2018-06-06  9:59         ` Vincent Guittot
2018-06-06 10:02           ` Vincent Guittot
2018-06-06 10:12           ` Quentin Perret
2018-06-05  8:36 ` Vincent Guittot
2018-06-05 10:57   ` Quentin Perret
2018-06-05 11:59     ` Vincent Guittot
2018-06-05 13:12       ` Quentin Perret
2018-06-05 13:18         ` Vincent Guittot
2018-06-05 13:52           ` Quentin Perret
2018-06-05 13:55             ` Vincent Guittot
2018-06-05 14:09               ` Quentin Perret
2018-06-05 14:21                 ` Quentin Perret
2018-06-05 12:11     ` Juri Lelli
2018-06-05 13:05       ` Quentin Perret
2018-06-05 13:15         ` Juri Lelli
2018-06-05 14:01           ` Quentin Perret
2018-06-05 14:13             ` Juri Lelli
2018-06-06 13:05               ` Claudio Scordino
2018-06-06 13:20                 ` Quentin Perret
2018-06-06 13:53                   ` Claudio Scordino
2018-06-06 14:10                     ` Quentin Perret
2018-06-06 21:05                   ` luca abeni
2018-06-07  8:25                     ` Quentin Perret
2018-06-06 20:53                 ` luca abeni
