Re: [1/11] issue 1: Missing power topology information in scheduler


From: Morten Rasmussen <morten.rasmussen@arm.com>
To: mark gross <markgross@thegnar.org>
Cc: "peterz@infradead.org" <peterz@infradead.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Subject: Re: [1/11] issue 1: Missing power topology information in scheduler
Date: Mon, 30 Dec 2013 14:00:25 +0000
Message-ID: <20131230140024.GB2936@e103034-lin>
In-Reply-To: <20131222151905.GA3250@mgross-Lenovo-Yoga-2-Pro>

On Sun, Dec 22, 2013 at 03:19:05PM +0000, mark gross wrote:
> On Fri, Dec 20, 2013 at 04:45:41PM +0000, Morten Rasmussen wrote:
> > The current mainline scheduler has no power topology information
> > available to enable it to make energy-aware decisions. The energy cost
> > of running a cpu at different frequencies and the energy cost of waking
> > up another cpu are needed.
> > 
> > One example where this could be useful is audio on Android. With the
> > current mainline scheduler it would utilize three cpus when active. Due
> > to the size of the tasks it is still possible to meet the performance
> > criteria when execution is serialized on a single cpu. Depending on the
> > power topology leaving two cpus idle and running one longer may lead to
> > energy savings if the cpus can be power-gated individually.
> > 
> > The audio performance requirements can be satisfied by most cpus at the
> > lowest frequency. Video is a more interesting use-case due to its higher
> > performance requirements. Running all tasks on a single cpu is likely to
> > require a higher frequency than if the tasks are spread out across
> > more cpus.
> > 
> > Running Android video playback on an ARM Cortex-A7 platform with 1, 2,
> > and 4 cpus online has led to the following power measurements
> > (normalized):
> > 
> > video 720p (Android)
> > cpus	power
> > 1	1.59
> > 2	1.00
> > 4	1.10
> 
> I wonder what 3 CPUs would show?  Also, is this "display-on" power measured from
> the battery?  The variance seems too big for a display-on measurement.

These are cpus-only measurements excluding gpu, dram, and other
peripherals. So yes, the relative total power saving is much smaller.

I don't have numbers for 3 cpus, but I will see if I can get them. Based
on the traces for 2 and 4 cpus my guess is that 3 cpus would be very
close to 2 cpus if not slightly better. The available parallelism seems
limited. The fourth cpu is hardly used and the third is only used for
short periods. 

> 
> Here we seem to have a workload consisting of about 2 threads, where if we
> use more than 2 CPUs we pay a penalty for task migration.  There is no tie to
> cpu L2 or power rail topology in this example.  From this data alone the
> scheduler simply needs to avoid using more CPUs until the workload truly has
> more threads.

We have more threads in this workload (use-case 3), but rarely more than
two of them running (or runnable) simultaneously. I agree that the
scheduler needs to avoid using more cpus than necessary.

The scheduler is particularly bad at this for thread patterns like the
ones observed for audio and video playback. Both have chains of threads
that wake up the next thread and then go to sleep. Since the current
thread continues to run for a moment after the next one has been woken
up, the wakee tends to be placed on a new cpu rather than waiting a few
tens of microseconds for the current cpu to be vacant.

            t0                     t0
cpu 0   ===========            ===========
                 |             ^
                 v   t1        |
cpu 1            ===========   |
                          |    |
                          v t2 |
cpu 2                     =======
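
As a rough illustration of the placement behaviour in the diagram, here is
a toy model in Python. It is not scheduler code. The function name, the
overlap_us and wait_budget_us parameters, and the rule that every migrating
wakee lands on a previously unused cpu are all assumptions made for this
sketch.

```python
# Toy model of a wake-up chain, not kernel code. Each thread wakes the
# next one and keeps running for `overlap_us` before it goes to sleep.

def cpus_used(chain_len, overlap_us, wait_budget_us):
    """Count the cpus touched by one pass through the chain.

    wait_budget_us is a hypothetical knob: how long a wakee may wait for
    the waker's cpu to become vacant instead of moving to another cpu.
    """
    current_cpu = 0
    used = {current_cpu}
    next_unused_cpu = 1
    for _ in range(chain_len - 1):
        if overlap_us > wait_budget_us:
            # The waker is still running for longer than the wakee may
            # wait, so the wakee is placed on a new cpu (assumption: a
            # previously unused one, as in the diagram).
            current_cpu = next_unused_cpu
            next_unused_cpu += 1
            used.add(current_cpu)
        # Otherwise the wakee waits and reuses the waker's cpu.
    return len(used)


# Three chained threads with 30 us of waker/wakee overlap.
spread = cpus_used(chain_len=3, overlap_us=30, wait_budget_us=0)
packed = cpus_used(chain_len=3, overlap_us=30, wait_budget_us=50)
print(spread, packed)
```

With no willingness to wait, the three-thread chain touches three cpus;
with a wait budget larger than the overlap it stays on one.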

> 
> What data do you have on the actual video workload in terms of threads?  My
> guess is we are looking at audio decode and video decode processing.  Does this
> video playback measurement include any rendering power, or is it all CPU?

As said above, these are cpus-only power numbers, so any gpu rendering is
excluded. The cpu workload includes both partial video decoding and
audio decoding. I'm not sure how much is offloaded to the gpu. Use-case
3 has a short description of the main threads.

What sort of data are you looking for?

> 
> I guess what I'm calling out is that it is not clear what the right thing for
> the scheduler to do is, as there is no physical model coupling power to SoC
> topology and workload characteristics.
> 
> I'll see if the folks at work who are hands on with similar KPI measuring can
> share similar data.  (they read this list too) It may be easier for them to
> share if we can agree on a normalization of the power data.  Say 100 "lumps"
> (of coal) measured from the battery or psu output rails as the power burned on
> a workload if run by booting with the maxcpus=1 kernel command line?  (or
> should it be measured from the SoC power rails?) That way we don't need to
> worry as much about exposing competitively sensitive data in physical units.
> FWIW I would go with display off measurements (in "airplane mode"?) from the
> battery or equivalent.

Good question. System power (battery) seems right if you have your SoC
on a board which is fairly similar to the end product. That is not always
the case, so I tend to look at SoC power instead. How large is the
difference if the display is off and in airplane mode? Would it work if
we just state the measurement method when posting numbers?

Normalizing to a single cpu (maxcpus=1 or equivalent) would work, I think.
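
To make the proposed normalization concrete, here is a small Python sketch
that rescales the video numbers quoted earlier in this mail so that the
single-cpu run equals 100 units. The function name and the choice of one
decimal place are assumptions for illustration only; the input values are
the ones from the table above.

```python
# Rescale power figures so that the single-cpu run equals `baseline`
# units (the "100 lumps" suggested in the quoted text).

def normalize_to_single_cpu(power_by_cpus, baseline=100.0):
    """Return power figures relative to the 1-cpu measurement."""
    reference = power_by_cpus[1]
    return {
        cpus: round(baseline * power / reference, 1)
        for cpus, power in power_by_cpus.items()
    }


# video 720p (Android), cpus-only power, normalized to the 2-cpu case.
measured = {1: 1.59, 2: 1.00, 4: 1.10}
lumps = normalize_to_single_cpu(measured)
print(lumps)  # {1: 100.0, 2: 62.9, 4: 69.2}
```

On this scale the 2-cpu run costs about 63 units and the 4-cpu run about
69 units against a single-cpu baseline of 100.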

> 
> BTW remember my comment about power measuring being a path to hell?  Agreeing
> on what workloads to measure, how to normalize, what to measure and from where
> on a device, and how to report it (statistical data across multiple runs) is a
> pain.  Details on screen on vs. off and reproducibility of data by a third party
> quickly come into play.

Yes :) I don't think we necessarily have to have a fully specified test
suite. As long as each interested party makes sure to test whatever
patches eventually come out of this, we will have proof of whether they
work or not. Third party reproducibility is more difficult.

Morten
