From: dino@in.ibm.com
To: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
John Stultz <johnstul@us.ibm.com>,
Darren Hart <dvhltc@us.ibm.com>, John Kacur <jkacur@redhat.com>
Subject: [patch -rt 12/17] x86: sched: provide arch implementations using aperf/mperf
Date: Thu, 22 Oct 2009 18:07:55 +0530 [thread overview]
Message-ID: <20091022124112.184099152@spinlock.in.ibm.com> (raw)
In-Reply-To: 20091022123743.506956796@spinlock.in.ibm.com
[-- Attachment #1: sched-lb-11.patch --]
[-- Type: text/plain, Size: 3637 bytes --]
APERF/MPERF support for cpu_power.
APERF/MPERF is arch defined to be a relative scale of work capacity
per logical cpu, this is assumed to include SMT and Turbo mode.
APERF/MPERF are specified to both reset to 0 when either counter
wraps, which is highly inconvenient, since that'll give a blimp when
that happens. The manual specifies writing 0 to the counters after
each read, but that's 1) too expensive, and 2) destroys the
possibility of sharing these counters with other users, so we live
with the blimp - the other existing user does too.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>
---
arch/x86/kernel/cpu/Makefile | 2 -
arch/x86/kernel/cpu/sched.c | 58 +++++++++++++++++++++++++++++++++++++++++++
include/linux/sched.h | 4 ++
3 files changed, 63 insertions(+), 1 deletion(-)
Index: linux-2.6.31.4-rt14-lb1/arch/x86/kernel/cpu/Makefile
===================================================================
--- linux-2.6.31.4-rt14-lb1.orig/arch/x86/kernel/cpu/Makefile 2009-10-21 10:47:15.000000000 -0400
+++ linux-2.6.31.4-rt14-lb1/arch/x86/kernel/cpu/Makefile 2009-10-21 10:49:00.000000000 -0400
@@ -13,7 +13,7 @@
obj-y := intel_cacheinfo.o addon_cpuid_features.o
obj-y += proc.o capflags.o powerflags.o common.o
-obj-y += vmware.o hypervisor.o
+obj-y += vmware.o hypervisor.o sched.o
obj-$(CONFIG_X86_32) += bugs.o cmpxchg.o
obj-$(CONFIG_X86_64) += bugs_64.o
Index: linux-2.6.31.4-rt14-lb1/arch/x86/kernel/cpu/sched.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.31.4-rt14-lb1/arch/x86/kernel/cpu/sched.c 2009-10-21 10:49:00.000000000 -0400
@@ -0,0 +1,58 @@
+#include <linux/sched.h>
+#include <linux/math64.h>
+#include <linux/percpu.h>
+#include <linux/irqflags.h>
+
+#include <asm/cpufeature.h>
+#include <asm/processor.h>
+
+static DEFINE_PER_CPU(struct aperfmperf, old_aperfmperf);
+
+static unsigned long scale_aperfmperf(void)
+{
+ struct aperfmperf cur, val, *old = &__get_cpu_var(old_aperfmperf);
+ unsigned long ratio = SCHED_LOAD_SCALE;
+ unsigned long flags;
+
+ local_irq_save(flags);
+ get_aperfmperf(&val);
+ local_irq_restore(flags);
+
+ cur = val;
+ cur.aperf -= old->aperf;
+ cur.mperf -= old->mperf;
+ *old = val;
+
+ cur.mperf >>= SCHED_LOAD_SHIFT;
+ if (cur.mperf)
+ ratio = div_u64(cur.aperf, cur.mperf);
+
+ return ratio;
+}
+
+unsigned long arch_scale_freq_power(struct sched_domain *sd, int cpu)
+{
+ /*
+ * do aperf/mperf on the cpu level because it includes things
+ * like turbo mode, which are relevant to full cores.
+ */
+ if (boot_cpu_has(X86_FEATURE_APERFMPERF))
+ return scale_aperfmperf();
+
+ /*
+ * maybe have something cpufreq here
+ */
+
+ return default_scale_freq_power(sd, cpu);
+}
+
+unsigned long arch_scale_smt_power(struct sched_domain *sd, int cpu)
+{
+ /*
+ * aperf/mperf already includes the smt gain
+ */
+ if (boot_cpu_has(X86_FEATURE_APERFMPERF))
+ return SCHED_LOAD_SCALE;
+
+ return default_scale_smt_power(sd, cpu);
+}
Index: linux-2.6.31.4-rt14-lb1/include/linux/sched.h
===================================================================
--- linux-2.6.31.4-rt14-lb1.orig/include/linux/sched.h 2009-10-21 10:47:15.000000000 -0400
+++ linux-2.6.31.4-rt14-lb1/include/linux/sched.h 2009-10-21 10:49:00.000000000 -0400
@@ -1047,6 +1047,10 @@
}
#endif /* !CONFIG_SMP */
+
+unsigned long default_scale_freq_power(struct sched_domain *sd, int cpu);
+unsigned long default_scale_smt_power(struct sched_domain *sd, int cpu);
+
struct io_context; /* See blkdev.h */
--
next prev parent reply other threads:[~2009-10-22 12:41 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-10-22 12:37 [patch -rt 00/17] [patch -rt] Sched load balance backport dino
2009-10-22 12:37 ` [patch -rt 01/17] sched: restore __cpu_power to a straight sum of power dino
2009-10-22 12:37 ` [patch -rt 02/17] sched: SD_PREFER_SIBLING dino
2009-10-22 12:37 ` [patch -rt 03/17] sched: update the cpu_power sum during load-balance dino
2009-10-22 12:37 ` [patch -rt 04/17] sched: add smt_gain dino
2009-10-22 12:37 ` [patch -rt 05/17] sched: dynamic cpu_power dino
2009-10-22 12:37 ` [patch -rt 06/17] sched: scale down cpu_power due to RT tasks dino
2009-10-22 12:37 ` [patch -rt 07/17] sched: try to deal with low capacity dino
2009-10-22 12:37 ` [patch -rt 08/17] sched: remove reciprocal for cpu_power dino
2009-10-22 12:37 ` [patch -rt 09/17] x86: move APERF/MPERF into a X86_FEATURE dino
2009-10-22 12:37 ` [patch -rt 10/17] x86: Add generic aperf/mperf code dino
2009-10-22 12:37 ` [patch -rt 11/17] Provide an arch specific hook for cpufreq based scaling of cpu_power dino
2009-10-22 12:37 ` dino [this message]
2009-10-22 12:37 ` [patch -rt 13/17] sched: cleanup wake_idle power saving dino
2009-10-22 12:37 ` [patch -rt 14/17] sched: cleanup wake_idle dino
2009-10-22 12:37 ` [patch -rt 15/17] sched: Add a missing = dino
2009-10-22 12:37 ` [patch -rt 16/17] sched: Deal with low-load in wake_affine() dino
2009-10-22 12:38 ` [patch -rt 17/17] sched: Fix dynamic power-balancing crash dino
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091022124112.184099152@spinlock.in.ibm.com \
--to=dino@in.ibm.com \
--cc=a.p.zijlstra@chello.nl \
--cc=dvhltc@us.ibm.com \
--cc=jkacur@redhat.com \
--cc=johnstul@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rt-users@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox