Re: [RFC PATCH 00/12 v2] A new CPU load metric for power-efficient scheduler: CPU ConCurrency


From: Krzysztof Kozlowski <k.kozlowski@samsung.com>
To: Yuyang Du <yuyang.du@intel.com>
Cc: mingo@redhat.com, peterz@infradead.org,
	rafael.j.wysocki@intel.com, linux-kernel@vger.kernel.org,
	linux-pm@vger.kernel.org, arjan.van.de.ven@intel.com,
	len.brown@intel.com, alan.cox@intel.com, mark.gross@intel.com,
	morten.rasmussen@arm.com, vincent.guittot@linaro.org,
	rajeev.d.muralidhar@intel.com, vishwesh.m.rudramuni@intel.com,
	nicole.chalhoub@intel.com, ajaya.durg@intel.com,
	harinarayanan.seshadri@intel.com, jacob.jun.pan@linux.intel.com,
	fengguang.wu@intel.com
Subject: Re: [RFC PATCH 00/12 v2] A new CPU load metric for power-efficient scheduler: CPU ConCurrency
Date: Tue, 13 May 2014 15:23:33 +0200
Message-ID: <1399987413.16665.4.camel@AMDC1943>
In-Reply-To: <1399832221-8314-1-git-send-email-yuyang.du@intel.com>

On Mon, 2014-05-12 at 02:16 +0800, Yuyang Du wrote:
> Hi Ingo, PeterZ, Rafael, and others,
> 
> The current scheduler’s load balancing is completely work-conserving. In some
> workload, generally low CPU utilization but immersed with CPU bursts of
> transient tasks, migrating task to engage all available CPUs for
> work-conserving can lead to significant overhead: cache locality loss,
> idle/active HW state transitional latency and power, shallower idle state,
> etc, which are both power and performance inefficient especially for today’s
> low power processors in mobile. 
> 
> This RFC introduces a sense of idleness-conserving into work-conserving (by
> all means, we really don’t want to be overwhelming in only one way). But to
> what extent the idleness-conserving should be, bearing in mind that we don’t
> want to sacrifice performance? We first need a load/idleness indicator to that
> end.
> 
> Thanks to CFS’s “model an ideal, precise multi-tasking CPU”, tasks can be seen
> as concurrently running (the tasks in the runqueue). So it is natural to use
> task concurrency as load indicator. Having said that, we do two things:
> 
> 1) Divide continuous time into periods of time, and average task concurrency
> in period, for tolerating the transient bursts:
> a = sum(concurrency * time) / period
> 2) Exponentially decay past periods, and synthesize them all, for hysteresis
> to load drops or resilience to load rises (let f be decaying factor, and a_x
> the xth period average since period 0):
> s = a_n + f^1 * a_n-1 + f^2 * a_n-2 +, ..., + f^(n-1) * a_1 + f^n * a_0
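A minimal sketch of the two steps quoted above, for readers who want to check the arithmetic. It is not taken from the patch set: the function names, the (concurrency, duration) segment representation, and the sample numbers are all assumptions made for illustration.

```python
def period_average(segments, period):
    """a = sum(concurrency * time) / period for a single period.

    segments: (concurrency, duration) pairs that together cover the period
    (a hypothetical representation, not the patch's data structure).
    """
    return sum(c * t for c, t in segments) / period


def decayed_sum(averages, f):
    """s = a_n + f*a_(n-1) + ... + f^n*a_0, with averages ordered a_0..a_n."""
    s = 0.0
    for a in averages:
        # Horner form: every new period multiplies all older terms by f.
        s = s * f + a
    return s


# Assumed 32 ms period: 2 runnable tasks for 8 ms, 1 for 8 ms, idle for 16 ms.
a = period_average([(2, 8), (1, 8), (0, 16)], period=32)  # 0.75

# Two such periods followed by one fully idle period, with f = 0.5:
# 0.0 + 0.5 * 0.75 + 0.25 * 0.75 = 0.5625
s = decayed_sum([a, a, 0.0], f=0.5)
```

The incremental form means only the running value of s has to be kept per CPU, rather than the whole history of period averages.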
> 
> We name this load indicator as CPU ConCurrency (CC): task concurrency
> determines how many CPUs are needed to be running concurrently.
> 
> Another two ways of how to interpret CC:
> 
> 1) the current work-conserving load balance also uses CC, but instantaneous
> CC.
> 
> 2) CC vs. CPU utilization. CC is runqueue-length-weighted CPU utilization. If
> we change: "a = sum(concurrency * time) / period" to "a' = sum(1 * time) /
> period". Then a' is just about the CPU utilization. And the way we weight
> runqueue-length is the simplest one (excluding the exponential decays, and you
> may have other ways).
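The difference between a and a' can be shown on one period. This is only an illustrative sketch: the segment representation and the numbers are assumed, and "busy" is taken to mean a runqueue length above zero.

```python
def average_concurrency(segments, period):
    # a = sum(concurrency * time) / period
    return sum(c * t for c, t in segments) / period


def utilization(segments, period):
    # a' = sum(1 * time) / period: every busy interval counts with weight 1,
    # no matter how many tasks were runnable during it.
    return sum(t for c, t in segments if c > 0) / period


# Assumed 32 ms period: 3 runnable tasks for 8 ms, 1 for 8 ms, idle for 16 ms.
segments = [(3, 8), (1, 8), (0, 16)]

cc = average_concurrency(segments, period=32)  # (24 + 8) / 32 = 1.0
util = utilization(segments, period=32)        # 16 / 32 = 0.5
```

Both values come from the same busy time; only the weighting by runqueue length separates them.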
> 
> To track CC, we intercept the scheduler in 1) enqueue, 2) dequeue, 3)
> scheduler tick, and 4) enter/exit idle.
> 
> After CC, in the consolidation part, we do 1) attach the CPU topology to be
> adaptive beyond our experimental platforms, and 2) intercept the current load
> balance for load and load balancing containment.
> 
> Currently, CC is per CPU. To consolidate, the formula is based on a heuristic.
> Suppose we have 2 CPUs, their task concurrency over time is ('-' means no
> task, 'x' having tasks):
> 
> 1)
> CPU0: ---xxxx---------- (CC[0])
> CPU1: ---------xxxx---- (CC[1])
> 
> 2)
> CPU0: ---xxxx---------- (CC[0])
> CPU1: ---xxxx---------- (CC[1])
> 
> If we consolidate CPU0 and CPU1, the consolidated CC will be: CC' = CC[0] +
> CC[1] for case 1 and CC'' = (CC[0] + CC[1]) * 2 for case 2. For the cases in
> between case 1 and 2 in terms of how xxx overlaps, the CC should be between
> CC' and CC''. So, we uniformly use this condition for consolidation (suppose
> we consolidate m CPUs to n CPUs, m > n):
> 
> (CC[0] + CC[1] + ... + CC[m-2] + CC[m-1]) * (n + log(m-n)) >=<? (1 * n) * n *
> consolidating_coefficient
> 
> The consolidating_coefficient could be like 100% or more or less.
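The condition above can be evaluated as follows. This sketch makes several assumptions that the cover letter does not settle: the logarithm is taken as the natural log, the comparison (written ">=<?" above) is read as "consolidate when the left side does not exceed the right side", and the function names and sample CC values are hypothetical.

```python
import math


def consolidation_sides(cc, n, coefficient=1.0):
    """Return (lhs, rhs) of the quoted condition for consolidating
    m = len(cc) CPUs onto n CPUs (requires 0 < n < m)."""
    m = len(cc)
    if not 0 < n < m:
        raise ValueError("need 0 < n < m")
    # Natural log is an assumption; the base is not stated in the message.
    lhs = sum(cc) * (n + math.log(m - n))
    rhs = (1 * n) * n * coefficient
    return lhs, rhs


def may_consolidate(cc, n, coefficient=1.0):
    # Assumed direction of the comparison: low aggregate CC permits consolidation.
    lhs, rhs = consolidation_sides(cc, n, coefficient)
    return lhs <= rhs


# Assumed per-CPU CC values on a quad-core, consolidating 4 CPUs onto 2.
cc = [0.3, 0.2, 0.1, 0.1]
lhs, rhs = consolidation_sides(cc, n=2)
```

Raising the coefficient above 1.0 (100%) makes the right side larger, so under the assumed comparison direction consolidation is allowed at higher aggregate CC.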
> 
> By CC, we implemented a Workload Consolidation patch on two Intel mobile
> platforms (a quad-core composed of two dual-core modules): contain load and
> load balancing in the first dual-core when aggregated CC low, and if not in
> the full quad-core. Results show that we got power savings and no substantial
> performance regression (even gains for some). The workloads we used to
> evaluate the Workload Consolidation include 1) 50+ perf/ux benchmarks (almost
> all of the magazine ones), and 2) ~10 power workloads, of course, they are the
> easiest ones, such as browsing, audio, video, recording, imaging, etc. The
> current half-life is 1 period, and the period was 32ms, and now 64ms for more
> aggressive consolidation.
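A half-life of 1 period corresponds to a decaying factor f = 0.5 in the sum s defined earlier. The sketch below only works through that arithmetic; the helper names are made up for illustration, and the steady-state value assumes the same period average a repeats indefinitely.

```python
def decay_factor(half_life_periods):
    # f such that a period's weight halves every `half_life_periods` periods.
    return 0.5 ** (1.0 / half_life_periods)


def steady_state(a, f):
    # Limit of a + f*a + f^2*a + ... (geometric series), valid for 0 <= f < 1.
    return a / (1.0 - f)


f = decay_factor(1)               # 0.5
weight_after_3 = f ** 3           # 0.125: weight of a period that is 3 periods old
s_limit = steady_state(1.0, f)    # 2.0
```

With a 64 ms period, a period's contribution is therefore down to one eighth after 192 ms, and s saturates at twice the per-period average under constant load.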

Hi,

Could you share some more numbers for energy savings and impact on
performance? I am also interested in these 10 power workloads - what
are they exactly?

Best regards,
Krzysztof




> v2:
> - Data type defined in formation
> 
> Yuyang Du (12):
>   CONFIG for CPU ConCurrency
>   Init CPU ConCurrency
>   CPU ConCurrency calculation
>   CPU ConCurrency tracking
>   CONFIG for Workload Consolidation
>   Attach CPU topology to specify each sched_domain's workload
>     consolidation
>   CPU ConCurrency API for Workload Consolidation
>   Intercept wakeup/fork/exec load balancing
>   Intercept idle balancing
>   Intercept periodic nohz idle balancing
>   Intercept periodic load balancing
>   Intercept RT scheduler
> 
>  arch/x86/Kconfig             |   21 +
>  include/linux/sched.h        |   13 +
>  include/linux/sched/sysctl.h |    8 +
>  include/linux/topology.h     |   16 +
>  kernel/sched/Makefile        |    1 +
>  kernel/sched/concurrency.c   |  928 ++++++++++++++++++++++++++++++++++++++++++
>  kernel/sched/core.c          |   46 +++
>  kernel/sched/fair.c          |  131 +++++-
>  kernel/sched/rt.c            |   25 ++
>  kernel/sched/sched.h         |   36 ++
>  kernel/sysctl.c              |   16 +
>  11 files changed, 1232 insertions(+), 9 deletions(-)
>  create mode 100644 kernel/sched/concurrency.c
>


Thread overview: 17+ messages
2014-05-11 18:16 [RFC PATCH 00/12 v2] A new CPU load metric for power-efficient scheduler: CPU ConCurrency Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 01/12 v2] CONFIG for " Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 02/12 v2] Init " Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 03/12 v2] CPU ConCurrency calculation Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 04/12 v2] CPU ConCurrency tracking Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 05/12 v2] CONFIG for Workload Consolidation Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 06/12 v2] Attach CPU topology to specify each sched_domain's workload consolidation Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 07/12 v2] CPU ConCurrency API for Workload Consolidation Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 08/12 v2] Intercept wakeup/fork/exec load balancing Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 09/12 v2] Intercept idle balancing Yuyang Du
2014-05-11 18:16 ` [RFC PATCH 10/12 v2] Intercept periodic nohz " Yuyang Du
2014-05-11 18:17 ` [RFC PATCH 11/12 v2] Intercept periodic load balancing Yuyang Du
2014-05-11 18:17 ` [RFC PATCH 12/12 v2] Intercept RT scheduler Yuyang Du
2014-05-12  6:45 ` [RFC PATCH 00/12 v2] A new CPU load metric for power-efficient scheduler: CPU ConCurrency Peter Zijlstra
2014-05-12  1:28   ` Yuyang Du
2014-05-12 15:15     ` Peter Zijlstra
2014-05-13 13:23 ` Krzysztof Kozlowski [this message]
