From: a.p.zijlstra@chello.nl (Peter Zijlstra)
To: linux-arm-kernel@lists.infradead.org
Subject: sched: ARM: arch_scale_freq_power
Date: Tue, 11 Oct 2011 09:57:32 +0200
Message-ID: <1318319852.14400.65.camel@laptop>
In-Reply-To: <CAP245DX8b45Nj5SAwtLivp_vMKpwfeUrqnR2sqjnVyqiRd61gg@mail.gmail.com>
On Tue, 2011-10-11 at 12:46 +0530, Amit Kucheria wrote:
> Adding Peter to the discussion..
Right, CCing the folks who actually wrote the code you're asking
questions about always helps ;-)
> On Thu, Oct 6, 2011 at 5:06 PM, Vincent Guittot
> <vincent.guittot@linaro.org> wrote:
> > I work to link the cpu_power of ARM cores to their frequency by using
> > arch_scale_freq_power.
Why and how? In particular note that if you're using something like the
on-demand cpufreq governor this isn't going to work.
> > It's explained in the kernel that cpu_power is
> > used to distribute load on cpus and a cpu with more cpu_power will
> > pick up more load. The default value is SCHED_POWER_SCALE and I
> > increase the value if I want a cpu to have more load than another one.
> > Is there an advised range for cpu_power value as well as some time
> > scale constraints for updating the cpu_power value ?
Basically 1024 is the unit and denotes the capacity of a full core at
'normal' speed.
Typically cpufreq would down-clock a core and thus you'd end up with a
smaller number (linearly proportional to the freq ratio etc. although if
you want to go really fancy you could determine the actual
throughput/freq curves).
Things like x86 turbo mode would result in a >1024 value.
Things like SMT would typically result in <1024 and the SMT sum over the
core >1024 (if you're lucky).
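To make the linear case above concrete, here is a minimal user-space sketch, not kernel code. SCHED_POWER_SHIFT and SCHED_POWER_SCALE mirror the scheduler's constants; the helper name scale_power_by_freq() and its kHz parameters are hypothetical, and it assumes capacity scales exactly with the frequency ratio (no measured throughput/freq curve).

```c
#define SCHED_POWER_SHIFT 10
#define SCHED_POWER_SCALE (1UL << SCHED_POWER_SHIFT)	/* 1024: full core at 'normal' speed */

/*
 * Hypothetical helper: scale cpu_power linearly with frequency.
 * cur_khz is the current clock, nominal_khz the 'normal' clock.
 * Assumes throughput is proportional to frequency.
 */
static unsigned long scale_power_by_freq(unsigned long cur_khz,
					 unsigned long nominal_khz)
{
	unsigned long long scaled;

	if (nominal_khz == 0)		/* no reference clock: report the default */
		return SCHED_POWER_SCALE;

	/* widen before shifting so large kHz values cannot overflow */
	scaled = (unsigned long long)cur_khz << SCHED_POWER_SHIFT;
	return (unsigned long)(scaled / nominal_khz);
}
```

With a nominal clock of 1.2 GHz, a core down-clocked to 600 MHz would report 512, and a core boosted to 1.5 GHz would report 1280, i.e. a turbo-style value above 1024.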
> > I'm also wondering why this scheduler feature is currently disable by default ?
Because the only implementation in existence (x86) is broken and I
haven't gotten around to fixing it. Arguably we should disable that for
the time being, see below.
> In discussions with Vincent regarding this, I've wondered whether
> cpu_power wouldn't be better renamed to cpu_capacity since that is
> what it really seems to describe.
Possibly, but it's been cpu_power for ages and we use capacity to
describe something else.
---
arch/x86/kernel/cpu/sched.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kernel/cpu/sched.c b/arch/x86/kernel/cpu/sched.c
index a640ae5..90ae68c 100644
--- a/arch/x86/kernel/cpu/sched.c
+++ b/arch/x86/kernel/cpu/sched.c
@@ -6,7 +6,14 @@
 #include <asm/cpufeature.h>
 #include <asm/processor.h>
-#ifdef CONFIG_SMP
+#if 0 /* def CONFIG_SMP */
+
+/*
+ * Currently broken, we need to filter out idle time because the aperf/mperf
+ * ratio measures actual throughput, not capacity. This means that if a logical
+ * cpu idles it will report less capacity and receive less work, which isn't
+ * what we want.
+ */
static DEFINE_PER_CPU(struct aperfmperf, old_perf_sched);
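
For readers unfamiliar with the aperf/mperf ratio mentioned in the comment above, the following is a simplified user-space sketch of how such a ratio could be mapped onto the 1024 scale. It is an illustration under stated assumptions, not the x86 implementation: the function name ratio_from_deltas() is hypothetical, the inputs are assumed to be counter deltas taken between two samples, and returning 1024 when mperf did not advance is a choice made only for this sketch.

```c
#define RATIO_SHIFT 10	/* assumption: same 1024-based fixed point as cpu_power */

/*
 * Hypothetical helper: turn aperf/mperf deltas into a 1024-based ratio.
 * delta_aperf counts cycles at the actual clock, delta_mperf counts
 * cycles at the reference clock over the same sampling window.
 */
static unsigned long ratio_from_deltas(unsigned long long delta_aperf,
				       unsigned long long delta_mperf)
{
	if (delta_mperf == 0)	/* nothing measured: fall back to the unit value */
		return 1UL << RATIO_SHIFT;

	return (unsigned long)((delta_aperf << RATIO_SHIFT) / delta_mperf);
}
```

Equal deltas give 1024, an aperf delta at half the mperf delta gives 512, and an aperf delta 25% above the mperf delta gives 1280. The sketch only shows the arithmetic; it does not attempt the idle-time filtering the comment says is needed.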