linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 0/2]: cpuidle: Introducing cpuidle infrastructure to powerpc.
@ 2009-08-19 12:57 Arun R Bharadwaj
  2009-08-19 12:58 ` [PATCH 1/2]: pSeries: Enable cpuidle for pSeries Arun R Bharadwaj
  2009-08-19 12:59 ` [PATCH 2/2]: pSeries: Implement Thermal & Power Management Devices(TPMD) idle module Arun R Bharadwaj
  0 siblings, 2 replies; 3+ messages in thread
From: Arun R Bharadwaj @ 2009-08-19 12:57 UTC (permalink / raw)
  To: Joel Schopp, Benjamin Herrenschmidt, Shaohua Li,
	Venkatesh Pallipadi, Adam Belay, Peter Zijlstra, Ingo Molnar,
	Vaidyanathan Srinivasan, Dipankar Sarma, Balbir Singh,
	Gautham R Shenoy, Arun R Bharadwaj
  Cc: linuxppc-dev, linux-kernel

Hi,


**** RFC not for inclusion ****

"Cpuidle" is a CPU power management infrastructure which helps manage
idle CPUs in a clean and efficient manner. An architecture registers
its driver (in this case, the tpmd_idle driver) to subscribe to the
cpuidle feature. Cpuidle has a set of governors (ladder and menu) which
use heuristics to calculate the expected residency time and decide the
best idle state for the current situation. Based on this decision, the
cpu is put into the right idle state.
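
To make the governor's role concrete, here is a minimal userspace sketch of
the kind of decision described above. It is not the ladder or menu governor
code; the struct and function names (sketch_idle_state, sketch_select_state)
are hypothetical, and the selection rule (deepest state whose target
residency and exit latency both fit) is a simplifying assumption. The two
table entries mirror the values registered in PATCH 2/2.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct cpuidle_state. */
struct sketch_idle_state {
	const char *desc;
	unsigned int exit_latency;     /* microseconds */
	unsigned int target_residency; /* microseconds */
};

/* Values mirror the two states set up by tpmd_setup_cpuidle() in PATCH 2/2. */
static const struct sketch_idle_state sketch_states[] = {
	{ "idle", 0, 0 },   /* snooze: polling loop */
	{ "nap",  1, 100 }, /* nap: cede the processor */
};

/*
 * Return the index of the deepest state whose target residency fits the
 * predicted idle time and whose exit latency fits the latency limit.
 * State 0 is always the fallback. Assumes states are ordered shallow to deep.
 */
static size_t sketch_select_state(const struct sketch_idle_state *states,
				  size_t count,
				  unsigned int predicted_us,
				  unsigned int latency_limit_us)
{
	size_t i, best = 0;

	for (i = 1; i < count; i++) {
		if (states[i].target_residency > predicted_us)
			continue;
		if (states[i].exit_latency > latency_limit_us)
			continue;
		best = i;
	}
	return best;
}
```

With these values, a predicted idle time below 100 us selects state 0
(snooze), and anything at or above 100 us selects state 1 (nap), provided
the latency limit allows an exit latency of 1 us.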

Currently, the cpuidle infrastructure is used by ACPI to choose between
the available ACPI C-states. This patch-set aims to enable cpuidle for
powerpc and provides a sample implementation for pseries.

Currently, in pseries_dedicated_idle_sleep(), the processor polls for a
time period, called the snooze, and only then is it ceded, which puts
the processor in the nap state. Cpuidle aims at separating these into 2
different idle states. Based on the expected residency time predicted
by the cpuidle governor, the idle state is chosen directly. So,
choosing to enter the nap state directly based on the decision made by
cpuidle would avoid unnecessary snoozing before entering nap.
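
The difference between the two flows can be sketched as a small userspace
model of the time spent polling. This is only an illustration: the function
names and the snooze_us parameter are hypothetical, and the model assumes
the prediction is compared against the nap target residency, as in a simple
governor.

```c
/*
 * Existing flow: always snooze (poll) first for up to snooze_us, and
 * only cede afterwards. Returns the time spent polling, in microseconds.
 */
static unsigned int poll_time_snooze_then_nap(unsigned int idle_us,
					      unsigned int snooze_us)
{
	return idle_us < snooze_us ? idle_us : snooze_us;
}

/*
 * Proposed flow: the state is chosen up front. If the predicted idle
 * time reaches the nap target residency, nap is entered directly and no
 * time is spent polling. Otherwise the cpu snoozes for the whole idle
 * period.
 */
static unsigned int poll_time_direct(unsigned int idle_us,
				     unsigned int predicted_us,
				     unsigned int nap_target_residency_us)
{
	if (predicted_us >= nap_target_residency_us)
		return 0;
	return idle_us;
}
```

For a long idle period that is predicted correctly, the direct flow spends
no time polling, whereas the existing flow always pays the snooze period
before ceding.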

This patch-set tries to achieve the above objective by introducing a
Thermal and Power Management Device module called tpmd_idle in
arch/powerpc/platforms/pseries/tpmd_idle.c. It implements a cpuidle
idle loop which replaces pseries_dedicated_idle_sleep() when cpuidle
is enabled.

Patches included in this set:
PATCH 1/2 - Enable cpuidle for pSeries.
PATCH 2/2 - Implement Thermal & Power Management Devices(TPMD) idle module


Any feedback on the overall design and idea is immensely valuable.

--arun


* [PATCH 1/2]: pSeries: Enable cpuidle for pSeries.
  2009-08-19 12:57 [PATCH 0/2]: cpuidle: Introducing cpuidle infrastructure to powerpc Arun R Bharadwaj
@ 2009-08-19 12:58 ` Arun R Bharadwaj
  2009-08-19 12:59 ` [PATCH 2/2]: pSeries: Implement Thermal & Power Management Devices(TPMD) idle module Arun R Bharadwaj
  1 sibling, 0 replies; 3+ messages in thread
From: Arun R Bharadwaj @ 2009-08-19 12:58 UTC (permalink / raw)
  To: Joel Schopp, Benjamin Herrenschmidt, Shaohua Li,
	Venkatesh Pallipadi, Adam Belay, Peter Zijlstra, Ingo Molnar,
	Vaidyanathan Srinivasan, Dipankar Sarma, Balbir Singh,
	Gautham R Shenoy, Arun Bharadwaj
  Cc: linuxppc-dev, linux-kernel

* Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-08-19 18:27:16]:

This patch enables the cpuidle option in Kconfig for pSeries.
It also adds the routine cpu_idle_wait.

Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                   |   18 ++++++++++++++++++
 arch/powerpc/include/asm/system.h      |    2 ++
 arch/powerpc/platforms/pseries/setup.c |   20 ++++++++++++++++++++
 drivers/cpuidle/cpuidle.c              |    1 +
 4 files changed, 41 insertions(+)

Index: linux.trees.git/arch/powerpc/Kconfig
===================================================================
--- linux.trees.git.orig/arch/powerpc/Kconfig
+++ linux.trees.git/arch/powerpc/Kconfig
@@ -88,6 +88,9 @@ config ARCH_HAS_ILOG2_U64
 	bool
 	default y if 64BIT
 
+config ARCH_HAS_CPU_IDLE_WAIT
+	def_bool y
+
 config GENERIC_HWEIGHT
 	bool
 	default y
@@ -243,6 +246,21 @@ source "kernel/Kconfig.freezer"
 source "arch/powerpc/sysdev/Kconfig"
 source "arch/powerpc/platforms/Kconfig"
 
+menu "Power management options"
+
+source "drivers/cpuidle/Kconfig"
+
+config TPMD
+	tristate "TPMD power management support"
+	depends on PPC_PSERIES && CPU_IDLE
+	default y
+	help
+	  Thermal and Power Management Devices (TPMD). This hooks onto cpuidle
+	  infrastructure to help in idle cpu power management. Currently this
+	  is enabled only for pSeries.
+
+endmenu
+
 menu "Kernel options"
 
 config HIGHMEM
Index: linux.trees.git/drivers/cpuidle/cpuidle.c
===================================================================
--- linux.trees.git.orig/drivers/cpuidle/cpuidle.c
+++ linux.trees.git/drivers/cpuidle/cpuidle.c
@@ -17,6 +17,7 @@
 #include <linux/cpuidle.h>
 #include <linux/ktime.h>
 #include <linux/hrtimer.h>
+#include <linux/pm.h>
 
 #include "cpuidle.h"
 
Index: linux.trees.git/arch/powerpc/platforms/pseries/setup.c
===================================================================
--- linux.trees.git.orig/arch/powerpc/platforms/pseries/setup.c
+++ linux.trees.git/arch/powerpc/platforms/pseries/setup.c
@@ -278,6 +278,26 @@ static struct notifier_block pci_dn_reco
 	.notifier_call = pci_dn_reconfig_notifier,
 };
 
+static void do_nothing(void *unused)
+{
+}
+
+/*
+ * cpu_idle_wait - Used to ensure that all the CPUs discard old value of
+ * pm_idle and update to new pm_idle value. Required while changing pm_idle
+ * handler on SMP systems.
+ *
+ * Caller must have changed pm_idle to the new value before the call. Old
+ * pm_idle value will not be used by any CPU after the return of this function.
+ */
+void cpu_idle_wait(void)
+{
+	smp_mb();
+	/* kick all the CPUs so that they exit out of pm_idle */
+	smp_call_function(do_nothing, NULL, 1);
+}
+EXPORT_SYMBOL_GPL(cpu_idle_wait);
+
 static void __init pSeries_setup_arch(void)
 {
 	/* Discover PIC type and setup ppc_md accordingly */
Index: linux.trees.git/arch/powerpc/include/asm/system.h
===================================================================
--- linux.trees.git.orig/arch/powerpc/include/asm/system.h
+++ linux.trees.git/arch/powerpc/include/asm/system.h
@@ -546,5 +546,7 @@ extern void account_system_vtime(struct 
 
 extern struct dentry *powerpc_debugfs_root;
 
+void cpu_idle_wait(void);
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_SYSTEM_H */


* [PATCH 2/2]: pSeries: Implement Thermal & Power Management Devices(TPMD) idle module.
  2009-08-19 12:57 [PATCH 0/2]: cpuidle: Introducing cpuidle infrastructure to powerpc Arun R Bharadwaj
  2009-08-19 12:58 ` [PATCH 1/2]: pSeries: Enable cpuidle for pSeries Arun R Bharadwaj
@ 2009-08-19 12:59 ` Arun R Bharadwaj
  1 sibling, 0 replies; 3+ messages in thread
From: Arun R Bharadwaj @ 2009-08-19 12:59 UTC (permalink / raw)
  To: Joel Schopp, Benjamin Herrenschmidt, Shaohua Li,
	Venkatesh Pallipadi, Adam Belay, Peter Zijlstra, Ingo Molnar,
	Vaidyanathan Srinivasan, Dipankar Sarma, Balbir Singh,
	Gautham R Shenoy
  Cc: linuxppc-dev, linux-kernel

* Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-08-19 18:27:16]:

This patch creates the Thermal & Power Management Devices module, tpmd_idle,
which implements the cpuidle infrastructure for pseries.
It implements tpmd_idle_loop(), which becomes the main idle loop called
from cpu_idle(). It enters either the snooze or the nap state based on the
decision taken by the cpuidle governor.

Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/Makefile    |    1 
 arch/powerpc/platforms/pseries/tpmd.h      |   10 +
 arch/powerpc/platforms/pseries/tpmd_idle.c |  192 +++++++++++++++++++++++++++++
 3 files changed, 203 insertions(+)

Index: linux.trees.git/arch/powerpc/platforms/pseries/tpmd_idle.c
===================================================================
--- /dev/null
+++ linux.trees.git/arch/powerpc/platforms/pseries/tpmd_idle.c
@@ -0,0 +1,192 @@
+
+/*
+ * tpmd_idle - idle state submodule to the tpmd driver
+ *
+ *  Copyright (C) 2009 Arun R Bharadwaj <arun@linux.vnet.ibm.com>
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or (at
+ *  your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but
+ *  WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, write to the Free Software Foundation, Inc.,
+ *  59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
+#include <linux/cpuidle.h>
+#include <linux/cpu.h>
+
+#include <asm/paca.h>
+#include <asm/machdep.h>
+
+#include "plpar_wrappers.h"
+#include "tpmd.h"
+
+MODULE_AUTHOR("Arun R Bharadwaj");
+MODULE_DESCRIPTION("TPMD Idle State Driver");
+MODULE_LICENSE("GPL");
+
+struct cpuidle_driver tpmd_idle_driver = {
+	.name =		"tpmd_idle",
+	.owner =	THIS_MODULE,
+};
+
+void (*pm_idle)(void);
+EXPORT_SYMBOL(pm_idle);
+
+static void (*old_idle_power_save)(void);
+
+DEFINE_PER_CPU(struct tpmd_processor_power, power);
+
+#define	IDLE_STATE_COUNT	2
+
+static int tpmd_idle_init(struct tpmd_processor_power *power)
+{
+	return cpuidle_register_device(&power->dev);
+}
+
+void tpmd_idle_exit(struct tpmd_processor_power *power)
+{
+	cpuidle_unregister_device(&power->dev);
+}
+
+static void snooze(void)
+{
+	local_irq_enable();
+	set_thread_flag(TIF_POLLING_NRFLAG);
+	while (!need_resched()) {
+		HMT_low();
+		HMT_very_low();
+	}
+	clear_thread_flag(TIF_POLLING_NRFLAG);
+	local_irq_disable();
+	smp_mb();
+}
+
+static void nap(void)
+{
+	HMT_medium();
+	smp_mb();
+	cede_processor();
+}
+
+static int tpmd_idle_loop(struct cpuidle_device *dev, struct cpuidle_state *st)
+{
+	ktime_t t1, t2;
+	s64 diff;
+	int ret;
+
+	get_lppaca()->idle = 1;
+	get_lppaca()->donate_dedicated_cpu = 1;
+
+	t1 = ktime_get();
+
+	if (strcmp(st->desc, "idle") == 0)
+		snooze();
+	else
+		nap();
+
+	t2 = ktime_get();
+	diff = ktime_to_us(ktime_sub(t2, t1));
+	if (diff > INT_MAX)
+		diff = INT_MAX;
+
+	ret = (int) diff;
+
+	get_lppaca()->idle = 0;
+	get_lppaca()->donate_dedicated_cpu = 0;
+
+	return ret;
+}
+
+static int tpmd_setup_cpuidle(struct tpmd_processor_power *power)
+{
+	int i;
+	struct cpuidle_state *state;
+	struct cpuidle_device *dev = &power->dev;
+
+	dev->cpu = power->id;
+
+	dev->enabled = 0;
+	for (i = 0; i < IDLE_STATE_COUNT; i++) {
+		state = &dev->states[i];
+
+		snprintf(state->name, CPUIDLE_NAME_LEN, "TPM%d", i);
+
+		switch (i) {
+		case 0:
+			strncpy(state->desc, "idle", CPUIDLE_DESC_LEN);
+			state->exit_latency = 0;
+			state->target_residency = 0;
+			state->enter = tpmd_idle_loop;
+			break;
+
+		case 1:
+			strncpy(state->desc, "nap", CPUIDLE_DESC_LEN);
+			state->exit_latency = 1;
+			state->target_residency = 100;
+			state->enter = tpmd_idle_loop;
+			break;
+		}
+	}
+
+	power->dev.state_count = i;
+	return 0;
+}
+
+static int tpmd_processor_get_power_info(struct tpmd_processor_power *power,
+					int cpu)
+{
+	power->id = cpu;
+	power->count = 2;
+	return 0;
+}
+
+static int __init tpmd_processor_init(void)
+{
+	int cpu;
+	int result = cpuidle_register_driver(&tpmd_idle_driver);
+
+	if (result < 0)
+		return result;
+
+	printk(KERN_DEBUG "TPMD idle driver registered\n");
+
+	for_each_online_cpu(cpu) {
+		tpmd_processor_get_power_info(&per_cpu(power, cpu), cpu);
+		tpmd_setup_cpuidle(&per_cpu(power, cpu));
+		tpmd_idle_init(&per_cpu(power, cpu));
+	}
+
+	printk(KERN_DEBUG "Using cpuidle idle loop\n");
+	old_idle_power_save = ppc_md.power_save;
+	ppc_md.power_save = pm_idle;
+	return 0;
+}
+
+static void __exit tpmd_processor_exit(void)
+{
+	int cpu;
+
+	ppc_md.power_save = old_idle_power_save;
+	for_each_online_cpu(cpu)
+		tpmd_idle_exit(&per_cpu(power, cpu));
+	cpuidle_unregister_driver(&tpmd_idle_driver);
+	printk(KERN_DEBUG "TPMD idle driver removed\n");
+}
+
+module_init(tpmd_processor_init);
+module_exit(tpmd_processor_exit);
Index: linux.trees.git/arch/powerpc/platforms/pseries/tpmd.h
===================================================================
--- /dev/null
+++ linux.trees.git/arch/powerpc/platforms/pseries/tpmd.h
@@ -0,0 +1,10 @@
+#include <linux/kernel.h>
+#include <linux/cpuidle.h>
+
+struct tpmd_processor_power {
+	struct cpuidle_device dev;
+	int count;
+	int id;
+};
+
+extern struct cpuidle_driver tpmd_idle_driver;
Index: linux.trees.git/arch/powerpc/platforms/pseries/Makefile
===================================================================
--- linux.trees.git.orig/arch/powerpc/platforms/pseries/Makefile
+++ linux.trees.git/arch/powerpc/platforms/pseries/Makefile
@@ -26,3 +26,4 @@ obj-$(CONFIG_HCALL_STATS)	+= hvCall_inst
 obj-$(CONFIG_PHYP_DUMP)	+= phyp_dump.o
 obj-$(CONFIG_CMM)		+= cmm.o
 obj-$(CONFIG_DTL)		+= dtl.o
+obj-$(CONFIG_TPMD)		+= tpmd_idle.o
