From: Catalin Marinas <catalin.marinas@arm.com>
To: mark gross <markgross@thegnar.org>
Cc: Morten Rasmussen <Morten.Rasmussen@arm.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Subject: Re: [0/11] Energy-aware scheduling use-cases and scheduler issues
Date: Mon, 13 Jan 2014 12:04:18 +0000
Message-ID: <20140113120418.GD11805@arm.com>
In-Reply-To: <20140112164759.GB5008@mgross-Lenovo-Yoga-2-Pro>

On Sun, Jan 12, 2014 at 04:47:59PM +0000, mark gross wrote:
> On Mon, Dec 30, 2013 at 12:10:10PM +0000, Morten Rasmussen wrote:
> > I was hoping that we could come up with a fairly simplistic energy model
> > that could guide the scheduling decisions based on data provided by the
> > vendor. I would start with something very simple and see how far we can
> > get and which data is necessary.
>
> I keep flip flopping in my mind over what is more important.  Energy modeling or
> latency performance measuring.

Both ;)

> I mean, one way to look at the world is given a workload with minimal latency
> and throughput expectations we need to deliver those first and then optimize power.

I agree, which is why I don't think blindly packing tasks to the left
would work for either power or performance.

> With poor load balancing we do not deliver on performance expectations typically
> in the areas of latencies.  Note, Linux does well on throughput IMO because that
> is easier to measure with kstats and other sampling.
> 
> What sorts of missing things are needed to measure and understand when wrong
> choices are getting made?  What basic information do we need to capture to know
> if we are doing a good job or not?

I think whatever power awareness we add to the scheduler should aim to
optimise the power consumption (based on some simple model that measures
idle time/states and transitions on a given platform and estimates the
energy) but with *minimal* effect on latency and throughput.
Standard latency/performance benchmarks should always be run to ensure
there are no regressions. Morten's use-cases try to describe scenarios
where the scheduler can do better from a power perspective but without
(drastically) affecting other parameters.
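
To make the "simple model" idea concrete, here is a minimal sketch in
Python. It is an illustration only, not a model proposed in this thread.
It assumes the platform provides, per idle state, a relative power
number and a relative cost per entry/exit; the names IdleState and
estimate_energy and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    power: float              # relative power while resident in this state
    transition_energy: float  # relative energy for one entry/exit pair


def estimate_energy(busy_time, busy_power, residency, entries, states):
    """Estimate the relative energy of one CPU over an observation window.

    residency maps idle state name -> time spent in that state,
    entries maps idle state name -> number of times it was entered.
    """
    energy = busy_time * busy_power
    for state in states:
        energy += residency.get(state.name, 0) * state.power
        energy += entries.get(state.name, 0) * state.transition_energy
    return energy


# Hypothetical platform data in relative units (not real W or J).
STATES = [IdleState("WFI", 20, 1), IdleState("off", 0, 50)]
```

Comparing such estimates before and after a scheduler change, next to
the usual latency/throughput benchmarks, would show whether a change
saves energy at all.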

If you have a predictable workload, the scheduler can make the right
decision to optimise for power while keeping the latency under control.
The problem is that when the workload changes, latency is affected if
tasks need to migrate or the CPU frequency needs to be increased (for
the latter we currently rely on a cpufreq governor or driver to detect
the workload change, which introduces additional latency). Given these
pretty independent cpufreq decisions, the best heuristic for now with
respect to latency is probably to spread the workload among all the
CPUs and leave enough room for workload changes.
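
The "spread and leave room" heuristic can be sketched roughly as
follows. This is not how the scheduler's load balancer is implemented;
the function name, the utilisation-as-fraction representation and the
20% default headroom are all assumptions made for illustration.

```python
def pick_cpu_spread(cpu_util, task_util, capacity=1.0, headroom=0.2):
    """Return the least-loaded CPU that keeps `headroom` spare capacity.

    cpu_util is a list of per-CPU utilisations (fractions of capacity).
    Returns None if no CPU can take the task and still leave the room.
    """
    limit = capacity * (1.0 - headroom)
    best = None
    for cpu, util in enumerate(cpu_util):
        if util + task_util > limit:
            continue  # placing the task here would eat into the headroom
        if best is None or util < cpu_util[best]:
            best = cpu
    return best
```

The headroom is what absorbs a workload change while the cpufreq side
is still catching up.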

But even with latency kept within certain limits, you may have, for
example, small threads (like audio decoding) that could still fit on a
CPU running at the minimal P-state, at the risk of a big sudden change
in the workload of such a thread. That's a trade-off between optimising
for performance and power. A power-aware scheduler does not aim to trade
latency or throughput for power; what it trades is how well it copes
with workload unpredictability, i.e. what margins are guaranteed.
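
A rough sketch of the "does it still fit at the minimal P-state" check,
with the margin made explicit. It assumes utilisation is expressed as a
fraction of the CPU's capacity at maximum frequency and that capacity
scales linearly with frequency; both are simplifications, and the
function name and parameters are hypothetical.

```python
def fits_at_min_pstate(task_util, cpu_util, min_freq, max_freq, margin):
    """Check whether a task fits on a CPU held at its minimal P-state.

    task_util and cpu_util are fractions of capacity at max_freq.
    margin is the fraction of the min-P-state capacity kept free to
    absorb a sudden increase in the task's workload.
    """
    capacity_at_min = min_freq / max_freq  # assumed linear scaling
    return cpu_util + task_util <= capacity_at_min * (1.0 - margin)
```

A larger margin copes better with an unpredictable thread but allows
less packing at the low P-state, which is the trade-off described
above.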

IMHO, adding power awareness to the scheduler could be done in two
(main) ways:

1. Heuristics like packing small tasks, with tunables such as what
   "small" actually means and how many such "small" tasks to pack. Such
   parameters would be specific to each SoC.

2. Power model in the scheduler (I proposed a simplistic one at the end
   of last year) where the scheduler can associate an energy cost with
   its actions (e.g. migrating a task to a CPU) and it would try to
   optimise the overall system energy consumption while preserving the
   latency and throughput.
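
The difference between the two approaches can be sketched as below.
Neither function is taken from an actual patch: the names, the tunables
(small_threshold, max_small_per_cpu) and the energy_cost callback are
hypothetical, and "preserving latency and throughput" is reduced to a
plain capacity check.

```python
def pick_cpu_packing(task_util, cpu_util, small_tasks_on_cpu,
                     small_threshold, max_small_per_cpu):
    """Approach 1: pack "small" tasks onto the lowest-numbered CPU.

    Returns None when the task is not small or nothing fits, so the
    caller can fall back to normal load balancing.
    """
    if task_util >= small_threshold:
        return None
    for cpu, util in enumerate(cpu_util):
        if (small_tasks_on_cpu[cpu] < max_small_per_cpu
                and util + task_util <= 1.0):
            return cpu
    return None


def pick_cpu_energy(task_util, cpu_util, energy_cost):
    """Approach 2: pick the CPU with the lowest estimated energy increase.

    energy_cost(cpu, util) returns a relative energy figure for running
    `cpu` at utilisation `util`; it stands in for platform data.
    """
    candidates = [cpu for cpu, util in enumerate(cpu_util)
                  if util + task_util <= 1.0]
    if not candidates:
        return None

    def delta(cpu):
        before = energy_cost(cpu, cpu_util[cpu])
        after = energy_cost(cpu, cpu_util[cpu] + task_util)
        return after - before

    return min(candidates, key=delta)


def example_cost(cpu, util):
    """Hypothetical cost: an idle CPU is free, waking it costs a fixed 10."""
    return 0.0 if util == 0 else 10.0 + 20.0 * util
```

With example_cost, a small task lands on the already-busy CPU because
waking an idle one is more expensive. Here the packing behaviour falls
out of the cost numbers rather than out of per-SoC tunables.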

I consider the second approach better, as you can extend it to other
things like power budgets. But it doesn't always go down well with
hardware people who don't want to expose real numbers (they don't even
need to be real W or J, just some relative numbers).

-- 
Catalin
