LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v3] watchdog: mpc8xxx_wdt convert to watchdog core
From: Wim Van Sebroeck @ 2014-01-14  8:32 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: scottwood, linuxppc-dev, linux-kernel, Guenter Roeck,
	linux-watchdog
In-Reply-To: <20131204063214.D1DB01A2BEA@localhost.localdomain>

Hi Christophe,

> Convert mpc8xxx_wdt.c to the new watchdog API.
> 
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

This patch has been added to linux-watchdog-next.

Kind regards,
Wim.

^ permalink raw reply

* [PATCH] powerpc: dma-mapping: Return dma_direct_ops variable when dev == NULL
From: Chunhe Lan @ 2014-01-14  9:44 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Chunhe Lan

Without this patch, kind of below error will be dumped if
'insmod ixgbevf.ko' is executed:

    ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function
             Network Driver - version 2.7.12-k
    ixgbevf: Copyright (c) 2009 - 2012 Intel Corporation.
    ixgbevf 0000:01:10.0: enabling device (0000 -> 0002)
    ixgbevf 0000:01:10.0: No usable DMA configuration, aborting
    ixgbevf: probe of 0000:01:10.0 failed with error -5
    ......
    ......

Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Chunhe Lan <Chunhe.Lan@freescale.com>
---
 arch/powerpc/include/asm/dma-mapping.h |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index e27e9ad..b8c10de 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -84,10 +84,15 @@ static inline struct dma_map_ops *get_dma_ops(struct device *dev)
 	 * only ISA DMA device we support is the floppy and we have a hack
 	 * in the floppy driver directly to get a device for us.
 	 */
-	if (unlikely(dev == NULL))
-		return NULL;
-
-	return dev->archdata.dma_ops;
+	if (dev && dev->archdata.dma_ops)
+		return dev->archdata.dma_ops;
+	/*
+	 * In some cases (for example, use the Intel(R) 10 Gigabit PCI
+	 * expression Virtual Function Network Driver -- ixgbevf.ko),
+	 * their value of dev is the NULL. If return NULL, the driver is
+	 * aborting. So return dma_direct_ops variable when dev == NULL.
+	 */
+	return &dma_direct_ops;
 }
 
 static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
-- 
1.7.6.5

^ permalink raw reply related

* Re: [PATCH] powerpc: dma-mapping: Return dma_direct_ops variable when dev == NULL
From: Benjamin Herrenschmidt @ 2014-01-14 10:14 UTC (permalink / raw)
  To: Chunhe Lan; +Cc: linux-pci, linuxppc-dev
In-Reply-To: <1389692656-27758-1-git-send-email-Chunhe.Lan@freescale.com>

On Tue, 2014-01-14 at 17:44 +0800, Chunhe Lan wrote:
> Without this patch, kind of below error will be dumped if
> 'insmod ixgbevf.ko' is executed:
> 
>     ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function
>              Network Driver - version 2.7.12-k
>     ixgbevf: Copyright (c) 2009 - 2012 Intel Corporation.
>     ixgbevf 0000:01:10.0: enabling device (0000 -> 0002)
>     ixgbevf 0000:01:10.0: No usable DMA configuration, aborting
>     ixgbevf: probe of 0000:01:10.0 failed with error -5
>     ......
>     ......

That's not right. The DMA ops must be set properly for the VF somewhere
in the arch code instead. When creating VFs, is there a hook allowing
the arch to fix things up ?

(Also adding linux-pci on CC)

Ben.

> Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Tested-by: Chunhe Lan <Chunhe.Lan@freescale.com>
> ---
>  arch/powerpc/include/asm/dma-mapping.h |   13 +++++++++----
>  1 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
> index e27e9ad..b8c10de 100644
> --- a/arch/powerpc/include/asm/dma-mapping.h
> +++ b/arch/powerpc/include/asm/dma-mapping.h
> @@ -84,10 +84,15 @@ static inline struct dma_map_ops *get_dma_ops(struct device *dev)
>  	 * only ISA DMA device we support is the floppy and we have a hack
>  	 * in the floppy driver directly to get a device for us.
>  	 */
> -	if (unlikely(dev == NULL))
> -		return NULL;
> -
> -	return dev->archdata.dma_ops;
> +	if (dev && dev->archdata.dma_ops)
> +		return dev->archdata.dma_ops;
> +	/*
> +	 * In some cases (for example, use the Intel(R) 10 Gigabit PCI
> +	 * expression Virtual Function Network Driver -- ixgbevf.ko),
> +	 * their value of dev is the NULL. If return NULL, the driver is
> +	 * aborting. So return dma_direct_ops variable when dev == NULL.
> +	 */
> +	return &dma_direct_ops;
>  }
>  
>  static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)

^ permalink raw reply

* [PATCH v2] Move precessing of MCE queued event out from syscall exit path.
From: Mahesh J Salgaonkar @ 2014-01-14 10:15 UTC (permalink / raw)
  To: linuxppc-dev, Benjamin Herrenschmidt; +Cc: Hugh Dickins

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Huge Dickins reported an issue that b5ff4211a829
"powerpc/book3s: Queue up and process delayed MCE events" breaks the
PowerMac G5 boot. This patch fixes it by moving the mce even processing
away from syscall exit, which was wrong to do that in first place, and
using irq work framework to delay processing of mce event.

Reported-by: Hugh Dickins <hughd@google.com
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/mce.h |    1 -
 arch/powerpc/kernel/entry_64.S |    5 -----
 arch/powerpc/kernel/mce.c      |   13 ++++++++++---
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 2257d1e..f97d8cb 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -192,7 +192,6 @@ extern void save_mce_event(struct pt_regs *regs, long handled,
 extern int get_mce_event(struct machine_check_event *mce, bool release);
 extern void release_mce_event(void);
 extern void machine_check_queue_event(void);
-extern void machine_check_process_queued_event(void);
 extern void machine_check_print_event_info(struct machine_check_event *evt);
 extern uint64_t get_mce_fault_addr(struct machine_check_event *evt);
 
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 770d6d6..bbfb029 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -184,11 +184,6 @@ syscall_exit:
 	bl	.do_show_syscall_exit
 	ld	r3,RESULT(r1)
 #endif
-#ifdef CONFIG_PPC_BOOK3S_64
-BEGIN_FTR_SECTION
-	bl	.machine_check_process_queued_event
-END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
-#endif
 	CURRENT_THREAD_INFO(r12, r1)
 
 	ld	r8,_MSR(r1)
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index d6edf2b..a7fd4cb 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -26,6 +26,7 @@
 #include <linux/ptrace.h>
 #include <linux/percpu.h>
 #include <linux/export.h>
+#include <linux/irq_work.h>
 #include <asm/mce.h>
 
 static DEFINE_PER_CPU(int, mce_nest_count);
@@ -35,6 +36,11 @@ static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event);
 static DEFINE_PER_CPU(int, mce_queue_count);
 static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event_queue);
 
+static void machine_check_process_queued_event(struct irq_work *work);
+struct irq_work mce_event_process_work = {
+        .func = machine_check_process_queued_event,
+};
+
 static void mce_set_error_info(struct machine_check_event *mce,
 			       struct mce_error_info *mce_err)
 {
@@ -185,17 +191,19 @@ void machine_check_queue_event(void)
 		return;
 	}
 	__get_cpu_var(mce_event_queue[index]) = evt;
+
+	/* Queue irq work to process this event later. */
+	irq_work_queue(&mce_event_process_work);
 }
 
 /*
  * process pending MCE event from the mce event queue. This function will be
  * called during syscall exit.
  */
-void machine_check_process_queued_event(void)
+static void machine_check_process_queued_event(struct irq_work *work)
 {
 	int index;
 
-	preempt_disable();
 	/*
 	 * For now just print it to console.
 	 * TODO: log this error event to FSP or nvram.
@@ -206,7 +214,6 @@ void machine_check_process_queued_event(void)
 				&__get_cpu_var(mce_event_queue[index]));
 		__get_cpu_var(mce_queue_count)--;
 	}
-	preempt_enable();
 }
 
 void machine_check_print_event_info(struct machine_check_event *evt)

^ permalink raw reply related

* [PATCH v1 0/6] pseries/cpuidle: pseries cpuidle backend driver clean-ups.
From: Deepthi Dharwar @ 2014-01-14 10:55 UTC (permalink / raw)
  To: linux-pm, benh, daniel.lezcano, linux-kernel, srivatsa.bhat,
	preeti, svaidy, linuxppc-dev

The following patch series includes a bunch of clean-ups for the
pseries cpuidle backend driver. This includes,
moving the driver from arch/powerpc/platforms/pseries to
driver/cpuidle, refactoring of the code, making it a non-module,
removing smt-snooze-delay update and dead code around it.

After a number of attempts to consolidate the backend cpuidle driver 
for pSeries and powernv platforms, it seems best to have separate 
idle drivers for both the platforms as any kind of  code duplication 
seizes to exist beyond snooze loop and a hot plug notifier.  
Also going further, with addition of device tree parsing to setup
idle states and related changes just for powernv platform, it is
best to keep these drivers separate without adding complexity 
and thus improving readabilty for both the platform drivers.  

The clean-up undertaken here was posted earlier as part of
generic powerpc cpuidle driver clean-up.

V1 -> http://lkml.org/lkml/2013/7/23/143
V2 -> https://lkml.org/lkml/2013/7/30/872
V3 -> http://comments.gmane.org/gmane.linux.ports.ppc.embedded/63093
V4 -> https://lkml.org/lkml/2013/8/22/25
V5 -> http://lkml.org/lkml/2013/8/22/184
V6 -> https://lkml.org/lkml/2013/8/27/432
V7 -> https://lkml.org/lkml/2013/10/29/216
V8 -> https://lkml.org/lkml/2013/11/11/29

 Deepthi Dharwar (5):
      pseries/cpuidle: Move processor_idle.c to drivers/cpuidle.
      pseries/cpuidle: Use cpuidle_register() for initialisation.
      pseries/cpuidle: Make cpuidle-pseries backend driver a non-module.
      pseries/cpuidle: Remove MAX_IDLE_STATE macro.
      pseries/cpuidle: smt-snooze-delay cleanup.

Preeti U Murthy (1):
      pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines


 arch/powerpc/include/asm/processor.h            |    7 
 arch/powerpc/kernel/sysfs.c                     |    2 
 arch/powerpc/platforms/pseries/Kconfig          |    9 -
 arch/powerpc/platforms/pseries/Makefile         |    1 
 arch/powerpc/platforms/pseries/processor_idle.c |  364 -----------------------
 drivers/cpuidle/Kconfig                         |    5 
 drivers/cpuidle/Kconfig.powerpc                 |   11 +
 drivers/cpuidle/Makefile                        |    4 
 drivers/cpuidle/cpuidle-pseries.c               |  267 +++++++++++++++++
 9 files changed, 287 insertions(+), 383 deletions(-)
 delete mode 100644 arch/powerpc/platforms/pseries/processor_idle.c
 create mode 100644 drivers/cpuidle/Kconfig.powerpc
 create mode 100644 drivers/cpuidle/cpuidle-pseries.c


-- Deepthi

^ permalink raw reply

* [PATCH v1 1/6] pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines
From: Deepthi Dharwar @ 2014-01-14 10:55 UTC (permalink / raw)
  To: linux-pm, benh, daniel.lezcano, linux-kernel, srivatsa.bhat,
	preeti, svaidy, linuxppc-dev
In-Reply-To: <20140114105525.3064.52013.stgit@deepthi.in.ibm.com>

From: Preeti U Murthy <preeti@linux.vnet.ibm.com>

Commit fbd7740fdfdf9475f(powerpc: Simplify pSeries idle loop) switched pseries
cpu idle handling from complete idle loops to ppc_md.powersave functions.
Earlier to this switch, ppc64_runlatch_off() had to be called in each of the
idle routines. But after the switch, this call is handled in arch_cpu_idle(),
just before the call to ppc_md.powersave, where platform specific idle
routines are called.

As a consequence, the call to ppc64_runlatch_off() got duplicated in the
arch_cpu_idle() routine as well as in the some of the idle routines in
pseries and commit fbd7740fdfdf9475f missed to get rid of these redundant
calls. These calls were carried over subsequent enhancements to the pseries
cpuidle routines.

Although multiple calls to ppc64_runlatch_off() is harmless, there is still some
overhead due to it. Besides that, these calls could also make way for a
misunderstanding that it is *necessary* to call ppc64_runlatch_off() multiple
times, when that is not the case. Hence this patch takes care of eliminating
this redundancy.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/processor_idle.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/processor_idle.c b/arch/powerpc/platforms/pseries/processor_idle.c
index a166e38..09e4f56 100644
--- a/arch/powerpc/platforms/pseries/processor_idle.c
+++ b/arch/powerpc/platforms/pseries/processor_idle.c
@@ -17,7 +17,6 @@
 #include <asm/reg.h>
 #include <asm/machdep.h>
 #include <asm/firmware.h>
-#include <asm/runlatch.h>
 #include <asm/plpar_wrappers.h>
 
 struct cpuidle_driver pseries_idle_driver = {
@@ -63,7 +62,6 @@ static int snooze_loop(struct cpuidle_device *dev,
 	set_thread_flag(TIF_POLLING_NRFLAG);
 
 	while ((!need_resched()) && cpu_online(cpu)) {
-		ppc64_runlatch_off();
 		HMT_low();
 		HMT_very_low();
 	}
@@ -103,7 +101,6 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 	idle_loop_prolog(&in_purr);
 	get_lppaca()->donate_dedicated_cpu = 1;
 
-	ppc64_runlatch_off();
 	HMT_medium();
 	check_and_cede_processor();
 

^ permalink raw reply related

* [PATCH v1 2/6] pseries/cpuidle: Move processor_idle.c to drivers/cpuidle.
From: Deepthi Dharwar @ 2014-01-14 10:56 UTC (permalink / raw)
  To: linux-pm, benh, daniel.lezcano, linux-kernel, srivatsa.bhat,
	preeti, svaidy, linuxppc-dev
In-Reply-To: <20140114105525.3064.52013.stgit@deepthi.in.ibm.com>

Move the file from arch specific pseries/processor_idle.c
to drivers/cpuidle/cpuidle-pseries.c
Make the relevant Makefile and Kconfig changes.
Also, introduce Kconfig.powerpc in drivers/cpuidle
for all powerpc cpuidle drivers.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/processor.h            |    2 
 arch/powerpc/platforms/pseries/Kconfig          |    9 -
 arch/powerpc/platforms/pseries/Makefile         |    1 
 arch/powerpc/platforms/pseries/processor_idle.c |  361 -----------------------
 drivers/cpuidle/Kconfig                         |    5 
 drivers/cpuidle/Kconfig.powerpc                 |   11 +
 drivers/cpuidle/Makefile                        |    4 
 drivers/cpuidle/cpuidle-pseries.c               |  361 +++++++++++++++++++++++
 8 files changed, 382 insertions(+), 372 deletions(-)
 delete mode 100644 arch/powerpc/platforms/pseries/processor_idle.c
 create mode 100644 drivers/cpuidle/Kconfig.powerpc
 create mode 100644 drivers/cpuidle/cpuidle-pseries.c

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index fc14a38..fa98fdf 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -445,7 +445,7 @@ enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
 extern int powersave_nap;	/* set if nap mode can be used in idle loop */
 extern void power7_nap(void);
 
-#ifdef CONFIG_PSERIES_IDLE
+#ifdef CONFIG_PSERIES_CPUIDLE
 extern void update_smt_snooze_delay(int cpu, int residency);
 #else
 static inline void update_smt_snooze_delay(int cpu, int residency) {}
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 62b4f80..bb59bb0 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -119,12 +119,3 @@ config DTL
 	  which are accessible through a debugfs file.
 
 	  Say N if you are unsure.
-
-config PSERIES_IDLE
-	bool "Cpuidle driver for pSeries platforms"
-	depends on CPU_IDLE
-	depends on PPC_PSERIES
-	default y
-	help
-	  Select this option to enable processor idle state management
-	  through cpuidle subsystem.
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index fbccac9..0348079 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -21,7 +21,6 @@ obj-$(CONFIG_HCALL_STATS)	+= hvCall_inst.o
 obj-$(CONFIG_CMM)		+= cmm.o
 obj-$(CONFIG_DTL)		+= dtl.o
 obj-$(CONFIG_IO_EVENT_IRQ)	+= io_event_irq.o
-obj-$(CONFIG_PSERIES_IDLE)	+= processor_idle.o
 obj-$(CONFIG_LPARCFG)		+= lparcfg.o
 
 ifeq ($(CONFIG_PPC_PSERIES),y)
diff --git a/arch/powerpc/platforms/pseries/processor_idle.c b/arch/powerpc/platforms/pseries/processor_idle.c
deleted file mode 100644
index 09e4f56..0000000
--- a/arch/powerpc/platforms/pseries/processor_idle.c
+++ /dev/null
@@ -1,361 +0,0 @@
-/*
- *  processor_idle - idle state cpuidle driver.
- *  Adapted from drivers/idle/intel_idle.c and
- *  drivers/acpi/processor_idle.c
- *
- */
-
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/moduleparam.h>
-#include <linux/cpuidle.h>
-#include <linux/cpu.h>
-#include <linux/notifier.h>
-
-#include <asm/paca.h>
-#include <asm/reg.h>
-#include <asm/machdep.h>
-#include <asm/firmware.h>
-#include <asm/plpar_wrappers.h>
-
-struct cpuidle_driver pseries_idle_driver = {
-	.name             = "pseries_idle",
-	.owner            = THIS_MODULE,
-};
-
-#define MAX_IDLE_STATE_COUNT	2
-
-static int max_idle_state = MAX_IDLE_STATE_COUNT - 1;
-static struct cpuidle_device __percpu *pseries_cpuidle_devices;
-static struct cpuidle_state *cpuidle_state_table;
-
-static inline void idle_loop_prolog(unsigned long *in_purr)
-{
-	*in_purr = mfspr(SPRN_PURR);
-	/*
-	 * Indicate to the HV that we are idle. Now would be
-	 * a good time to find other work to dispatch.
-	 */
-	get_lppaca()->idle = 1;
-}
-
-static inline void idle_loop_epilog(unsigned long in_purr)
-{
-	u64 wait_cycles;
-
-	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
-	wait_cycles += mfspr(SPRN_PURR) - in_purr;
-	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
-	get_lppaca()->idle = 0;
-}
-
-static int snooze_loop(struct cpuidle_device *dev,
-			struct cpuidle_driver *drv,
-			int index)
-{
-	unsigned long in_purr;
-	int cpu = dev->cpu;
-
-	idle_loop_prolog(&in_purr);
-	local_irq_enable();
-	set_thread_flag(TIF_POLLING_NRFLAG);
-
-	while ((!need_resched()) && cpu_online(cpu)) {
-		HMT_low();
-		HMT_very_low();
-	}
-
-	HMT_medium();
-	clear_thread_flag(TIF_POLLING_NRFLAG);
-	smp_mb();
-
-	idle_loop_epilog(in_purr);
-
-	return index;
-}
-
-static void check_and_cede_processor(void)
-{
-	/*
-	 * Ensure our interrupt state is properly tracked,
-	 * also checks if no interrupt has occurred while we
-	 * were soft-disabled
-	 */
-	if (prep_irq_for_idle()) {
-		cede_processor();
-#ifdef CONFIG_TRACE_IRQFLAGS
-		/* Ensure that H_CEDE returns with IRQs on */
-		if (WARN_ON(!(mfmsr() & MSR_EE)))
-			__hard_irq_enable();
-#endif
-	}
-}
-
-static int dedicated_cede_loop(struct cpuidle_device *dev,
-				struct cpuidle_driver *drv,
-				int index)
-{
-	unsigned long in_purr;
-
-	idle_loop_prolog(&in_purr);
-	get_lppaca()->donate_dedicated_cpu = 1;
-
-	HMT_medium();
-	check_and_cede_processor();
-
-	get_lppaca()->donate_dedicated_cpu = 0;
-
-	idle_loop_epilog(in_purr);
-
-	return index;
-}
-
-static int shared_cede_loop(struct cpuidle_device *dev,
-			struct cpuidle_driver *drv,
-			int index)
-{
-	unsigned long in_purr;
-
-	idle_loop_prolog(&in_purr);
-
-	/*
-	 * Yield the processor to the hypervisor.  We return if
-	 * an external interrupt occurs (which are driven prior
-	 * to returning here) or if a prod occurs from another
-	 * processor. When returning here, external interrupts
-	 * are enabled.
-	 */
-	check_and_cede_processor();
-
-	idle_loop_epilog(in_purr);
-
-	return index;
-}
-
-/*
- * States for dedicated partition case.
- */
-static struct cpuidle_state dedicated_states[MAX_IDLE_STATE_COUNT] = {
-	{ /* Snooze */
-		.name = "snooze",
-		.desc = "snooze",
-		.flags = CPUIDLE_FLAG_TIME_VALID,
-		.exit_latency = 0,
-		.target_residency = 0,
-		.enter = &snooze_loop },
-	{ /* CEDE */
-		.name = "CEDE",
-		.desc = "CEDE",
-		.flags = CPUIDLE_FLAG_TIME_VALID,
-		.exit_latency = 10,
-		.target_residency = 100,
-		.enter = &dedicated_cede_loop },
-};
-
-/*
- * States for shared partition case.
- */
-static struct cpuidle_state shared_states[MAX_IDLE_STATE_COUNT] = {
-	{ /* Shared Cede */
-		.name = "Shared Cede",
-		.desc = "Shared Cede",
-		.flags = CPUIDLE_FLAG_TIME_VALID,
-		.exit_latency = 0,
-		.target_residency = 0,
-		.enter = &shared_cede_loop },
-};
-
-void update_smt_snooze_delay(int cpu, int residency)
-{
-	struct cpuidle_driver *drv = cpuidle_get_driver();
-	struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu);
-
-	if (cpuidle_state_table != dedicated_states)
-		return;
-
-	if (residency < 0) {
-		/* Disable the Nap state on that cpu */
-		if (dev)
-			dev->states_usage[1].disable = 1;
-	} else
-		if (drv)
-			drv->states[1].target_residency = residency;
-}
-
-static int pseries_cpuidle_add_cpu_notifier(struct notifier_block *n,
-			unsigned long action, void *hcpu)
-{
-	int hotcpu = (unsigned long)hcpu;
-	struct cpuidle_device *dev =
-			per_cpu_ptr(pseries_cpuidle_devices, hotcpu);
-
-	if (dev && cpuidle_get_driver()) {
-		switch (action) {
-		case CPU_ONLINE:
-		case CPU_ONLINE_FROZEN:
-			cpuidle_pause_and_lock();
-			cpuidle_enable_device(dev);
-			cpuidle_resume_and_unlock();
-			break;
-
-		case CPU_DEAD:
-		case CPU_DEAD_FROZEN:
-			cpuidle_pause_and_lock();
-			cpuidle_disable_device(dev);
-			cpuidle_resume_and_unlock();
-			break;
-
-		default:
-			return NOTIFY_DONE;
-		}
-	}
-	return NOTIFY_OK;
-}
-
-static struct notifier_block setup_hotplug_notifier = {
-	.notifier_call = pseries_cpuidle_add_cpu_notifier,
-};
-
-/*
- * pseries_cpuidle_driver_init()
- */
-static int pseries_cpuidle_driver_init(void)
-{
-	int idle_state;
-	struct cpuidle_driver *drv = &pseries_idle_driver;
-
-	drv->state_count = 0;
-
-	for (idle_state = 0; idle_state < MAX_IDLE_STATE_COUNT; ++idle_state) {
-
-		if (idle_state > max_idle_state)
-			break;
-
-		/* is the state not enabled? */
-		if (cpuidle_state_table[idle_state].enter == NULL)
-			continue;
-
-		drv->states[drv->state_count] =	/* structure copy */
-			cpuidle_state_table[idle_state];
-
-		drv->state_count += 1;
-	}
-
-	return 0;
-}
-
-/* pseries_idle_devices_uninit(void)
- * unregister cpuidle devices and de-allocate memory
- */
-static void pseries_idle_devices_uninit(void)
-{
-	int i;
-	struct cpuidle_device *dev;
-
-	for_each_possible_cpu(i) {
-		dev = per_cpu_ptr(pseries_cpuidle_devices, i);
-		cpuidle_unregister_device(dev);
-	}
-
-	free_percpu(pseries_cpuidle_devices);
-	return;
-}
-
-/* pseries_idle_devices_init()
- * allocate, initialize and register cpuidle device
- */
-static int pseries_idle_devices_init(void)
-{
-	int i;
-	struct cpuidle_driver *drv = &pseries_idle_driver;
-	struct cpuidle_device *dev;
-
-	pseries_cpuidle_devices = alloc_percpu(struct cpuidle_device);
-	if (pseries_cpuidle_devices == NULL)
-		return -ENOMEM;
-
-	for_each_possible_cpu(i) {
-		dev = per_cpu_ptr(pseries_cpuidle_devices, i);
-		dev->state_count = drv->state_count;
-		dev->cpu = i;
-		if (cpuidle_register_device(dev)) {
-			printk(KERN_DEBUG \
-				"cpuidle_register_device %d failed!\n", i);
-			return -EIO;
-		}
-	}
-
-	return 0;
-}
-
-/*
- * pseries_idle_probe()
- * Choose state table for shared versus dedicated partition
- */
-static int pseries_idle_probe(void)
-{
-
-	if (!firmware_has_feature(FW_FEATURE_SPLPAR))
-		return -ENODEV;
-
-	if (cpuidle_disable != IDLE_NO_OVERRIDE)
-		return -ENODEV;
-
-	if (max_idle_state == 0) {
-		printk(KERN_DEBUG "pseries processor idle disabled.\n");
-		return -EPERM;
-	}
-
-	if (lppaca_shared_proc(get_lppaca()))
-		cpuidle_state_table = shared_states;
-	else
-		cpuidle_state_table = dedicated_states;
-
-	return 0;
-}
-
-static int __init pseries_processor_idle_init(void)
-{
-	int retval;
-
-	retval = pseries_idle_probe();
-	if (retval)
-		return retval;
-
-	pseries_cpuidle_driver_init();
-	retval = cpuidle_register_driver(&pseries_idle_driver);
-	if (retval) {
-		printk(KERN_DEBUG "Registration of pseries driver failed.\n");
-		return retval;
-	}
-
-	retval = pseries_idle_devices_init();
-	if (retval) {
-		pseries_idle_devices_uninit();
-		cpuidle_unregister_driver(&pseries_idle_driver);
-		return retval;
-	}
-
-	register_cpu_notifier(&setup_hotplug_notifier);
-	printk(KERN_DEBUG "pseries_idle_driver registered\n");
-
-	return 0;
-}
-
-static void __exit pseries_processor_idle_exit(void)
-{
-
-	unregister_cpu_notifier(&setup_hotplug_notifier);
-	pseries_idle_devices_uninit();
-	cpuidle_unregister_driver(&pseries_idle_driver);
-
-	return;
-}
-
-module_init(pseries_processor_idle_init);
-module_exit(pseries_processor_idle_exit);
-
-MODULE_AUTHOR("Deepthi Dharwar <deepthi@linux.vnet.ibm.com>");
-MODULE_DESCRIPTION("Cpuidle driver for POWER");
-MODULE_LICENSE("GPL");
diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index b3fb81d..f04e25f 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -35,6 +35,11 @@ depends on ARM
 source "drivers/cpuidle/Kconfig.arm"
 endmenu
 
+menu "POWERPC CPU Idle Drivers"
+depends on PPC
+source "drivers/cpuidle/Kconfig.powerpc"
+endmenu
+
 endif
 
 config ARCH_NEEDS_CPU_IDLE_COUPLED
diff --git a/drivers/cpuidle/Kconfig.powerpc b/drivers/cpuidle/Kconfig.powerpc
new file mode 100644
index 0000000..8147de5
--- /dev/null
+++ b/drivers/cpuidle/Kconfig.powerpc
@@ -0,0 +1,11 @@
+#
+# POWERPC CPU Idle Drivers
+#
+config PSERIES_CPUIDLE
+	bool "Cpuidle driver for pSeries platforms"
+	depends on CPU_IDLE
+	depends on PPC_PSERIES
+	default y
+	help
+	  Select this option to enable processor idle state management
+	  through cpuidle subsystem.
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index 527be28..a6331ad 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -13,3 +13,7 @@ obj-$(CONFIG_ARM_KIRKWOOD_CPUIDLE)	+= cpuidle-kirkwood.o
 obj-$(CONFIG_ARM_ZYNQ_CPUIDLE)		+= cpuidle-zynq.o
 obj-$(CONFIG_ARM_U8500_CPUIDLE)         += cpuidle-ux500.o
 obj-$(CONFIG_ARM_AT91_CPUIDLE)          += cpuidle-at91.o
+
+###############################################################################
+# POWERPC drivers
+obj-$(CONFIG_PSERIES_CPUIDLE)		+= cpuidle-pseries.o
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
new file mode 100644
index 0000000..2115478
--- /dev/null
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -0,0 +1,361 @@
+/*
+ *  cpuidle-pseries - idle state cpuidle driver.
+ *  Adapted from drivers/idle/intel_idle.c and
+ *  drivers/acpi/processor_idle.c
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
+#include <linux/cpuidle.h>
+#include <linux/cpu.h>
+#include <linux/notifier.h>
+
+#include <asm/paca.h>
+#include <asm/reg.h>
+#include <asm/machdep.h>
+#include <asm/firmware.h>
+#include <asm/plpar_wrappers.h>
+
+struct cpuidle_driver pseries_idle_driver = {
+	.name             = "pseries_idle",
+	.owner            = THIS_MODULE,
+};
+
+#define MAX_IDLE_STATE_COUNT	2
+
+static int max_idle_state = MAX_IDLE_STATE_COUNT - 1;
+static struct cpuidle_device __percpu *pseries_cpuidle_devices;
+static struct cpuidle_state *cpuidle_state_table;
+
+static inline void idle_loop_prolog(unsigned long *in_purr)
+{
+	*in_purr = mfspr(SPRN_PURR);
+	/*
+	 * Indicate to the HV that we are idle. Now would be
+	 * a good time to find other work to dispatch.
+	 */
+	get_lppaca()->idle = 1;
+}
+
+static inline void idle_loop_epilog(unsigned long in_purr)
+{
+	u64 wait_cycles;
+
+	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
+	wait_cycles += mfspr(SPRN_PURR) - in_purr;
+	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+	get_lppaca()->idle = 0;
+}
+
+static int snooze_loop(struct cpuidle_device *dev,
+			struct cpuidle_driver *drv,
+			int index)
+{
+	unsigned long in_purr;
+	int cpu = dev->cpu;
+
+	idle_loop_prolog(&in_purr);
+	local_irq_enable();
+	set_thread_flag(TIF_POLLING_NRFLAG);
+
+	while ((!need_resched()) && cpu_online(cpu)) {
+		HMT_low();
+		HMT_very_low();
+	}
+
+	HMT_medium();
+	clear_thread_flag(TIF_POLLING_NRFLAG);
+	smp_mb();
+
+	idle_loop_epilog(in_purr);
+
+	return index;
+}
+
+static void check_and_cede_processor(void)
+{
+	/*
+	 * Ensure our interrupt state is properly tracked,
+	 * also checks if no interrupt has occurred while we
+	 * were soft-disabled
+	 */
+	if (prep_irq_for_idle()) {
+		cede_processor();
+#ifdef CONFIG_TRACE_IRQFLAGS
+		/* Ensure that H_CEDE returns with IRQs on */
+		if (WARN_ON(!(mfmsr() & MSR_EE)))
+			__hard_irq_enable();
+#endif
+	}
+}
+
+static int dedicated_cede_loop(struct cpuidle_device *dev,
+				struct cpuidle_driver *drv,
+				int index)
+{
+	unsigned long in_purr;
+
+	idle_loop_prolog(&in_purr);
+	get_lppaca()->donate_dedicated_cpu = 1;
+
+	HMT_medium();
+	check_and_cede_processor();
+
+	get_lppaca()->donate_dedicated_cpu = 0;
+
+	idle_loop_epilog(in_purr);
+
+	return index;
+}
+
+static int shared_cede_loop(struct cpuidle_device *dev,
+			struct cpuidle_driver *drv,
+			int index)
+{
+	unsigned long in_purr;
+
+	idle_loop_prolog(&in_purr);
+
+	/*
+	 * Yield the processor to the hypervisor.  We return if
+	 * an external interrupt occurs (which are driven prior
+	 * to returning here) or if a prod occurs from another
+	 * processor. When returning here, external interrupts
+	 * are enabled.
+	 */
+	check_and_cede_processor();
+
+	idle_loop_epilog(in_purr);
+
+	return index;
+}
+
+/*
+ * States for dedicated partition case.
+ */
+static struct cpuidle_state dedicated_states[MAX_IDLE_STATE_COUNT] = {
+	{ /* Snooze */
+		.name = "snooze",
+		.desc = "snooze",
+		.flags = CPUIDLE_FLAG_TIME_VALID,
+		.exit_latency = 0,
+		.target_residency = 0,
+		.enter = &snooze_loop },
+	{ /* CEDE */
+		.name = "CEDE",
+		.desc = "CEDE",
+		.flags = CPUIDLE_FLAG_TIME_VALID,
+		.exit_latency = 10,
+		.target_residency = 100,
+		.enter = &dedicated_cede_loop },
+};
+
+/*
+ * States for shared partition case.
+ */
+static struct cpuidle_state shared_states[MAX_IDLE_STATE_COUNT] = {
+	{ /* Shared Cede */
+		.name = "Shared Cede",
+		.desc = "Shared Cede",
+		.flags = CPUIDLE_FLAG_TIME_VALID,
+		.exit_latency = 0,
+		.target_residency = 0,
+		.enter = &shared_cede_loop },
+};
+
+void update_smt_snooze_delay(int cpu, int residency)
+{
+	struct cpuidle_driver *drv = cpuidle_get_driver();
+	struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu);
+
+	if (cpuidle_state_table != dedicated_states)
+		return;
+
+	if (residency < 0) {
+		/* Disable the Nap state on that cpu */
+		if (dev)
+			dev->states_usage[1].disable = 1;
+	} else
+		if (drv)
+			drv->states[1].target_residency = residency;
+}
+
+static int pseries_cpuidle_add_cpu_notifier(struct notifier_block *n,
+			unsigned long action, void *hcpu)
+{
+	int hotcpu = (unsigned long)hcpu;
+	struct cpuidle_device *dev =
+			per_cpu_ptr(pseries_cpuidle_devices, hotcpu);
+
+	if (dev && cpuidle_get_driver()) {
+		switch (action) {
+		case CPU_ONLINE:
+		case CPU_ONLINE_FROZEN:
+			cpuidle_pause_and_lock();
+			cpuidle_enable_device(dev);
+			cpuidle_resume_and_unlock();
+			break;
+
+		case CPU_DEAD:
+		case CPU_DEAD_FROZEN:
+			cpuidle_pause_and_lock();
+			cpuidle_disable_device(dev);
+			cpuidle_resume_and_unlock();
+			break;
+
+		default:
+			return NOTIFY_DONE;
+		}
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block setup_hotplug_notifier = {
+	.notifier_call = pseries_cpuidle_add_cpu_notifier,
+};
+
+/*
+ * pseries_cpuidle_driver_init()
+ */
+static int pseries_cpuidle_driver_init(void)
+{
+	int idle_state;
+	struct cpuidle_driver *drv = &pseries_idle_driver;
+
+	drv->state_count = 0;
+
+	for (idle_state = 0; idle_state < MAX_IDLE_STATE_COUNT; ++idle_state) {
+
+		if (idle_state > max_idle_state)
+			break;
+
+		/* is the state not enabled? */
+		if (cpuidle_state_table[idle_state].enter == NULL)
+			continue;
+
+		drv->states[drv->state_count] =	/* structure copy */
+			cpuidle_state_table[idle_state];
+
+		drv->state_count += 1;
+	}
+
+	return 0;
+}
+
+/* pseries_idle_devices_uninit(void)
+ * unregister cpuidle devices and de-allocate memory
+ */
+static void pseries_idle_devices_uninit(void)
+{
+	int i;
+	struct cpuidle_device *dev;
+
+	for_each_possible_cpu(i) {
+		dev = per_cpu_ptr(pseries_cpuidle_devices, i);
+		cpuidle_unregister_device(dev);
+	}
+
+	free_percpu(pseries_cpuidle_devices);
+	return;
+}
+
+/* pseries_idle_devices_init()
+ * allocate, initialize and register cpuidle device
+ */
+static int pseries_idle_devices_init(void)
+{
+	int i;
+	struct cpuidle_driver *drv = &pseries_idle_driver;
+	struct cpuidle_device *dev;
+
+	pseries_cpuidle_devices = alloc_percpu(struct cpuidle_device);
+	if (pseries_cpuidle_devices == NULL)
+		return -ENOMEM;
+
+	for_each_possible_cpu(i) {
+		dev = per_cpu_ptr(pseries_cpuidle_devices, i);
+		dev->state_count = drv->state_count;
+		dev->cpu = i;
+		if (cpuidle_register_device(dev)) {
+			printk(KERN_DEBUG \
+				"cpuidle_register_device %d failed!\n", i);
+			return -EIO;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * pseries_idle_probe()
+ * Choose state table for shared versus dedicated partition
+ */
+static int pseries_idle_probe(void)
+{
+
+	if (!firmware_has_feature(FW_FEATURE_SPLPAR))
+		return -ENODEV;
+
+	if (cpuidle_disable != IDLE_NO_OVERRIDE)
+		return -ENODEV;
+
+	if (max_idle_state == 0) {
+		printk(KERN_DEBUG "pseries processor idle disabled.\n");
+		return -EPERM;
+	}
+
+	if (lppaca_shared_proc(get_lppaca()))
+		cpuidle_state_table = shared_states;
+	else
+		cpuidle_state_table = dedicated_states;
+
+	return 0;
+}
+
+static int __init pseries_processor_idle_init(void)
+{
+	int retval;
+
+	retval = pseries_idle_probe();
+	if (retval)
+		return retval;
+
+	pseries_cpuidle_driver_init();
+	retval = cpuidle_register_driver(&pseries_idle_driver);
+	if (retval) {
+		printk(KERN_DEBUG "Registration of pseries driver failed.\n");
+		return retval;
+	}
+
+	retval = pseries_idle_devices_init();
+	if (retval) {
+		pseries_idle_devices_uninit();
+		cpuidle_unregister_driver(&pseries_idle_driver);
+		return retval;
+	}
+
+	register_cpu_notifier(&setup_hotplug_notifier);
+	printk(KERN_DEBUG "pseries_idle_driver registered\n");
+
+	return 0;
+}
+
+static void __exit pseries_processor_idle_exit(void)
+{
+
+	unregister_cpu_notifier(&setup_hotplug_notifier);
+	pseries_idle_devices_uninit();
+	cpuidle_unregister_driver(&pseries_idle_driver);
+
+	return;
+}
+
+module_init(pseries_processor_idle_init);
+module_exit(pseries_processor_idle_exit);
+
+MODULE_AUTHOR("Deepthi Dharwar <deepthi@linux.vnet.ibm.com>");
+MODULE_DESCRIPTION("Cpuidle driver for POWER");
+MODULE_LICENSE("GPL");

^ permalink raw reply related

* [PATCH v1 3/6] pseries/cpuidle: Use cpuidle_register() for initialisation.
From: Deepthi Dharwar @ 2014-01-14 10:56 UTC (permalink / raw)
  To: linux-pm, benh, daniel.lezcano, linux-kernel, srivatsa.bhat,
	preeti, svaidy, linuxppc-dev
In-Reply-To: <20140114105525.3064.52013.stgit@deepthi.in.ibm.com>

This patch replaces the cpuidle driver and devices initialisation
calls with a single generic cpuidle_register() call
and also includes minor refactoring of the code around it.

Remove the cpu online check in snooze loop, as this code can
only locally run on a cpu only if it is online. Therefore,
this check is not required.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
---
 drivers/cpuidle/cpuidle-pseries.c |   78 +++++--------------------------------
 1 file changed, 11 insertions(+), 67 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 2115478..32d86bc 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -27,7 +27,6 @@ struct cpuidle_driver pseries_idle_driver = {
 #define MAX_IDLE_STATE_COUNT	2
 
 static int max_idle_state = MAX_IDLE_STATE_COUNT - 1;
-static struct cpuidle_device __percpu *pseries_cpuidle_devices;
 static struct cpuidle_state *cpuidle_state_table;
 
 static inline void idle_loop_prolog(unsigned long *in_purr)
@@ -55,13 +54,12 @@ static int snooze_loop(struct cpuidle_device *dev,
 			int index)
 {
 	unsigned long in_purr;
-	int cpu = dev->cpu;
 
 	idle_loop_prolog(&in_purr);
 	local_irq_enable();
 	set_thread_flag(TIF_POLLING_NRFLAG);
 
-	while ((!need_resched()) && cpu_online(cpu)) {
+	while (!need_resched()) {
 		HMT_low();
 		HMT_very_low();
 	}
@@ -188,7 +186,7 @@ static int pseries_cpuidle_add_cpu_notifier(struct notifier_block *n,
 {
 	int hotcpu = (unsigned long)hcpu;
 	struct cpuidle_device *dev =
-			per_cpu_ptr(pseries_cpuidle_devices, hotcpu);
+				per_cpu(cpuidle_devices, hotcpu);
 
 	if (dev && cpuidle_get_driver()) {
 		switch (action) {
@@ -245,50 +243,6 @@ static int pseries_cpuidle_driver_init(void)
 	return 0;
 }
 
-/* pseries_idle_devices_uninit(void)
- * unregister cpuidle devices and de-allocate memory
- */
-static void pseries_idle_devices_uninit(void)
-{
-	int i;
-	struct cpuidle_device *dev;
-
-	for_each_possible_cpu(i) {
-		dev = per_cpu_ptr(pseries_cpuidle_devices, i);
-		cpuidle_unregister_device(dev);
-	}
-
-	free_percpu(pseries_cpuidle_devices);
-	return;
-}
-
-/* pseries_idle_devices_init()
- * allocate, initialize and register cpuidle device
- */
-static int pseries_idle_devices_init(void)
-{
-	int i;
-	struct cpuidle_driver *drv = &pseries_idle_driver;
-	struct cpuidle_device *dev;
-
-	pseries_cpuidle_devices = alloc_percpu(struct cpuidle_device);
-	if (pseries_cpuidle_devices == NULL)
-		return -ENOMEM;
-
-	for_each_possible_cpu(i) {
-		dev = per_cpu_ptr(pseries_cpuidle_devices, i);
-		dev->state_count = drv->state_count;
-		dev->cpu = i;
-		if (cpuidle_register_device(dev)) {
-			printk(KERN_DEBUG \
-				"cpuidle_register_device %d failed!\n", i);
-			return -EIO;
-		}
-	}
-
-	return 0;
-}
-
 /*
  * pseries_idle_probe()
  * Choose state table for shared versus dedicated partition
@@ -296,9 +250,6 @@ static int pseries_idle_devices_init(void)
 static int pseries_idle_probe(void)
 {
 
-	if (!firmware_has_feature(FW_FEATURE_SPLPAR))
-		return -ENODEV;
-
 	if (cpuidle_disable != IDLE_NO_OVERRIDE)
 		return -ENODEV;
 
@@ -307,10 +258,13 @@ static int pseries_idle_probe(void)
 		return -EPERM;
 	}
 
-	if (lppaca_shared_proc(get_lppaca()))
-		cpuidle_state_table = shared_states;
-	else
-		cpuidle_state_table = dedicated_states;
+	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
+		if (lppaca_shared_proc(get_lppaca()))
+			cpuidle_state_table = shared_states;
+		else
+			cpuidle_state_table = dedicated_states;
+	} else
+		return -ENODEV;
 
 	return 0;
 }
@@ -324,22 +278,14 @@ static int __init pseries_processor_idle_init(void)
 		return retval;
 
 	pseries_cpuidle_driver_init();
-	retval = cpuidle_register_driver(&pseries_idle_driver);
+	retval = cpuidle_register(&pseries_idle_driver, NULL);
 	if (retval) {
 		printk(KERN_DEBUG "Registration of pseries driver failed.\n");
 		return retval;
 	}
 
-	retval = pseries_idle_devices_init();
-	if (retval) {
-		pseries_idle_devices_uninit();
-		cpuidle_unregister_driver(&pseries_idle_driver);
-		return retval;
-	}
-
 	register_cpu_notifier(&setup_hotplug_notifier);
 	printk(KERN_DEBUG "pseries_idle_driver registered\n");
-
 	return 0;
 }
 
@@ -347,9 +293,7 @@ static void __exit pseries_processor_idle_exit(void)
 {
 
 	unregister_cpu_notifier(&setup_hotplug_notifier);
-	pseries_idle_devices_uninit();
-	cpuidle_unregister_driver(&pseries_idle_driver);
-
+	cpuidle_unregister(&pseries_idle_driver);
 	return;
 }
 

^ permalink raw reply related

* [PATCH v1 4/6] pseries/cpuidle: Make cpuidle-pseries backend driver a non-module.
From: Deepthi Dharwar @ 2014-01-14 10:56 UTC (permalink / raw)
  To: linux-pm, benh, daniel.lezcano, linux-kernel, srivatsa.bhat,
	preeti, svaidy, linuxppc-dev
In-Reply-To: <20140114105525.3064.52013.stgit@deepthi.in.ibm.com>

Currently cpuidle-pseries backend driver cannot be
built as a module due to dependencies wrt cpuidle framework.
This patch removes all the module related code in the driver.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
---
 drivers/cpuidle/cpuidle-pseries.c |   15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 32d86bc..5e13f6c 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -289,17 +289,4 @@ static int __init pseries_processor_idle_init(void)
 	return 0;
 }
 
-static void __exit pseries_processor_idle_exit(void)
-{
-
-	unregister_cpu_notifier(&setup_hotplug_notifier);
-	cpuidle_unregister(&pseries_idle_driver);
-	return;
-}
-
-module_init(pseries_processor_idle_init);
-module_exit(pseries_processor_idle_exit);
-
-MODULE_AUTHOR("Deepthi Dharwar <deepthi@linux.vnet.ibm.com>");
-MODULE_DESCRIPTION("Cpuidle driver for POWER");
-MODULE_LICENSE("GPL");
+device_initcall(pseries_processor_idle_init);

^ permalink raw reply related

* [PATCH v1 5/6] pseries/cpuidle: Remove MAX_IDLE_STATE macro.
From: Deepthi Dharwar @ 2014-01-14 10:56 UTC (permalink / raw)
  To: linux-pm, benh, daniel.lezcano, linux-kernel, srivatsa.bhat,
	preeti, svaidy, linuxppc-dev
In-Reply-To: <20140114105525.3064.52013.stgit@deepthi.in.ibm.com>

This patch removes the usage of MAX_IDLE_STATE macro
and dead code around it. The number of states
are determined at run time based on the cpuidle
state table selected on a given platform

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
---
 drivers/cpuidle/cpuidle-pseries.c |   28 ++++++++++------------------
 1 file changed, 10 insertions(+), 18 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 5e13f6c..bb56091 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -24,9 +24,7 @@ struct cpuidle_driver pseries_idle_driver = {
 	.owner            = THIS_MODULE,
 };
 
-#define MAX_IDLE_STATE_COUNT	2
-
-static int max_idle_state = MAX_IDLE_STATE_COUNT - 1;
+static int max_idle_state;
 static struct cpuidle_state *cpuidle_state_table;
 
 static inline void idle_loop_prolog(unsigned long *in_purr)
@@ -134,7 +132,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 /*
  * States for dedicated partition case.
  */
-static struct cpuidle_state dedicated_states[MAX_IDLE_STATE_COUNT] = {
+static struct cpuidle_state dedicated_states[] = {
 	{ /* Snooze */
 		.name = "snooze",
 		.desc = "snooze",
@@ -154,7 +152,7 @@ static struct cpuidle_state dedicated_states[MAX_IDLE_STATE_COUNT] = {
 /*
  * States for shared partition case.
  */
-static struct cpuidle_state shared_states[MAX_IDLE_STATE_COUNT] = {
+static struct cpuidle_state shared_states[] = {
 	{ /* Shared Cede */
 		.name = "Shared Cede",
 		.desc = "Shared Cede",
@@ -225,12 +223,8 @@ static int pseries_cpuidle_driver_init(void)
 
 	drv->state_count = 0;
 
-	for (idle_state = 0; idle_state < MAX_IDLE_STATE_COUNT; ++idle_state) {
-
-		if (idle_state > max_idle_state)
-			break;
-
-		/* is the state not enabled? */
+	for (idle_state = 0; idle_state < max_idle_state; ++idle_state) {
+		/* Is the state not enabled? */
 		if (cpuidle_state_table[idle_state].enter == NULL)
 			continue;
 
@@ -253,16 +247,14 @@ static int pseries_idle_probe(void)
 	if (cpuidle_disable != IDLE_NO_OVERRIDE)
 		return -ENODEV;
 
-	if (max_idle_state == 0) {
-		printk(KERN_DEBUG "pseries processor idle disabled.\n");
-		return -EPERM;
-	}
-
 	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		if (lppaca_shared_proc(get_lppaca()))
+		if (lppaca_shared_proc(get_lppaca())) {
 			cpuidle_state_table = shared_states;
-		else
+			max_idle_state = ARRAY_SIZE(shared_states);
+		} else {
 			cpuidle_state_table = dedicated_states;
+			max_idle_state = ARRAY_SIZE(dedicated_states);
+		}
 	} else
 		return -ENODEV;
 

^ permalink raw reply related

* [PATCH v1 6/6] pseries/cpuidle: smt-snooze-delay cleanup.
From: Deepthi Dharwar @ 2014-01-14 10:56 UTC (permalink / raw)
  To: linux-pm, benh, daniel.lezcano, linux-kernel, srivatsa.bhat,
	preeti, svaidy, linuxppc-dev
In-Reply-To: <20140114105525.3064.52013.stgit@deepthi.in.ibm.com>

smt-snooze-delay was designed to disable NAP state or delay the entry
to the NAP state prior to adoption of cpuidle framework. This
is per-cpu variable. With the coming of CPUIDLE framework,
states can be disabled on per-cpu basis using the cpuidle/enable
sysfs entry.

Also, with the coming of cpuidle driver each state's target residency
is per-driver unlike earlier which was per-device. Therefore,
the per-cpu sysfs smt-snooze-delay which decides the target residency
of the idle state on a particular cpu causes more confusion to the user
as we cannot have different smt-snooze-delay (target residency)
values for each cpu.

In the current code, smt-snooze-delay functionality is completely broken.
It makes sense to remove smt-snooze-delay from idle driver with the
coming of cpuidle framework.
However, sysfs files are retained as ppc64_util currently
utilises it. Once we fix ppc64_util, propose to clean
up the kernel code.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/processor.h |    7 -------
 arch/powerpc/kernel/sysfs.c          |    2 --
 drivers/cpuidle/cpuidle-pseries.c    |   17 -----------------
 3 files changed, 26 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index fa98fdf..027fefd 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -444,13 +444,6 @@ enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
 
 extern int powersave_nap;	/* set if nap mode can be used in idle loop */
 extern void power7_nap(void);
-
-#ifdef CONFIG_PSERIES_CPUIDLE
-extern void update_smt_snooze_delay(int cpu, int residency);
-#else
-static inline void update_smt_snooze_delay(int cpu, int residency) {}
-#endif
-
 extern void flush_instruction_cache(void);
 extern void hard_reset_now(void);
 extern void poweroff_now(void);
diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index b4e6676..7f9e130 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -51,8 +51,6 @@ static ssize_t store_smt_snooze_delay(struct device *dev,
 		return -EINVAL;
 
 	per_cpu(smt_snooze_delay, cpu->dev.id) = snooze;
-	update_smt_snooze_delay(cpu->dev.id, snooze);
-
 	return count;
 }
 
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index bb56091..7ab564a 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -162,23 +162,6 @@ static struct cpuidle_state shared_states[] = {
 		.enter = &shared_cede_loop },
 };
 
-void update_smt_snooze_delay(int cpu, int residency)
-{
-	struct cpuidle_driver *drv = cpuidle_get_driver();
-	struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu);
-
-	if (cpuidle_state_table != dedicated_states)
-		return;
-
-	if (residency < 0) {
-		/* Disable the Nap state on that cpu */
-		if (dev)
-			dev->states_usage[1].disable = 1;
-	} else
-		if (drv)
-			drv->states[1].target_residency = residency;
-}
-
 static int pseries_cpuidle_add_cpu_notifier(struct notifier_block *n,
 			unsigned long action, void *hcpu)
 {

^ permalink raw reply related

* [PATCH v1] powernv/cpuidle: Back-end cpuidle driver for powernv platform for idle state management.
From: Deepthi Dharwar @ 2014-01-14 11:02 UTC (permalink / raw)
  To: linux-pm, benh, daniel.lezcano, linux-kernel, srivatsa.bhat,
	preeti, svaidy, linuxppc-dev

Following patch ports the cpuidle framework for powernv
platform and also implements a cpuidle back-end powernv 
idle driver calling on to power7_nap and snooze idle states.

Moving the idle states over to cpuidle framework can take advantage 
of advanced heuristics, tunables and features provided by cpuidle 
framework. Additional idle states can be exploited using the cpuidle 
framework. The statistics and tracing infrastructure provided by 
the cpuidle framework also helps in enabling power management 
related tools and help tune the system and applications.

This series aims to maintain compatibility and functionality to
existing powernv idle cpu management code.  There are no new functions
or idle states added as part of this series. This can be extended by 
adding more states to this existing framework.

For POWERNV platform to hook into CPUIDLE framework, one
needs to enable CONFIG_POWERNV_IDLE. 

This patch series applies on pseries cpuidle backend driver
fixes patchset posted earlier.
pseries/cpuidle: pseries cpuidle backend driver clean-ups.

 Deepthi Dharwar (1):
      powernv/cpuidle: Back-end cpuidle driver for powernv platform.


 arch/powerpc/platforms/powernv/setup.c |   13 ++
 drivers/cpuidle/Kconfig.powerpc        |    9 ++
 drivers/cpuidle/Makefile               |    1 
 drivers/cpuidle/cpuidle-powernv.c      |  169 ++++++++++++++++++++++++++++++++
 4 files changed, 191 insertions(+), 1 deletion(-)
 create mode 100644 drivers/cpuidle/cpuidle-powernv.c


-- Deepthi

^ permalink raw reply

* [PATCH v1] powernv/cpuidle: Back-end cpuidle driver for powernv platform.
From: Deepthi Dharwar @ 2014-01-14 11:02 UTC (permalink / raw)
  To: linux-pm, benh, daniel.lezcano, linux-kernel, srivatsa.bhat,
	preeti, svaidy, linuxppc-dev
In-Reply-To: <20140114110157.4091.80684.stgit@deepthi.in.ibm.com>

Following patch ports the cpuidle framework for powernv
platform and also implements a cpuidle back-end powernv
idle driver calling on to power7_nap and snooze idle states.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/setup.c |   13 ++
 drivers/cpuidle/Kconfig.powerpc        |    9 ++
 drivers/cpuidle/Makefile               |    1 
 drivers/cpuidle/cpuidle-powernv.c      |  169 ++++++++++++++++++++++++++++++++
 4 files changed, 191 insertions(+), 1 deletion(-)
 create mode 100644 drivers/cpuidle/cpuidle-powernv.c

diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index 19884b2..764a14e 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -26,6 +26,7 @@
 #include <linux/of_fdt.h>
 #include <linux/interrupt.h>
 #include <linux/bug.h>
+#include <linux/cpuidle.h>
 
 #include <asm/machdep.h>
 #include <asm/firmware.h>
@@ -214,6 +215,16 @@ static int __init pnv_probe(void)
 	return 1;
 }
 
+void powernv_idle(void)
+{
+	/* Hook to cpuidle framework if available, else
+	 * call on default platform idle code
+	 */
+	if (cpuidle_idle_call()) {
+		power7_idle();
+	}
+}
+
 define_machine(powernv) {
 	.name			= "PowerNV",
 	.probe			= pnv_probe,
@@ -223,7 +234,7 @@ define_machine(powernv) {
 	.show_cpuinfo		= pnv_show_cpuinfo,
 	.progress		= pnv_progress,
 	.machine_shutdown	= pnv_shutdown,
-	.power_save             = power7_idle,
+	.power_save             = powernv_idle,
 	.calibrate_decr		= generic_calibrate_decr,
 #ifdef CONFIG_KEXEC
 	.kexec_cpu_down		= pnv_kexec_cpu_down,
diff --git a/drivers/cpuidle/Kconfig.powerpc b/drivers/cpuidle/Kconfig.powerpc
index 8147de5..66c3a09 100644
--- a/drivers/cpuidle/Kconfig.powerpc
+++ b/drivers/cpuidle/Kconfig.powerpc
@@ -9,3 +9,12 @@ config PSERIES_CPUIDLE
 	help
 	  Select this option to enable processor idle state management
 	  through cpuidle subsystem.
+
+config POWERNV_CPUIDLE
+	bool "Cpuidle driver for powernv platforms"
+	depends on CPU_IDLE
+	depends on PPC_POWERNV
+	default y
+	help
+	  Select this option to enable processor idle state management
+	  through cpuidle subsystem.
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index a6331ad..f71ae1b 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_ARM_AT91_CPUIDLE)          += cpuidle-at91.o
 ###############################################################################
 # POWERPC drivers
 obj-$(CONFIG_PSERIES_CPUIDLE)		+= cpuidle-pseries.o
+obj-$(CONFIG_POWERNV_CPUIDLE)		+= cpuidle-powernv.o
diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
new file mode 100644
index 0000000..78fd174
--- /dev/null
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -0,0 +1,169 @@
+/*
+ *  cpuidle-powernv - idle state cpuidle driver.
+ *  Adapted from drivers/cpuidle/cpuidle-pseries
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
+#include <linux/cpuidle.h>
+#include <linux/cpu.h>
+#include <linux/notifier.h>
+
+#include <asm/machdep.h>
+#include <asm/firmware.h>
+
+struct cpuidle_driver powernv_idle_driver = {
+	.name             = "powernv_idle",
+	.owner            = THIS_MODULE,
+};
+
+static int max_idle_state;
+static struct cpuidle_state *cpuidle_state_table;
+
+static int snooze_loop(struct cpuidle_device *dev,
+			struct cpuidle_driver *drv,
+			int index)
+{
+	local_irq_enable();
+	set_thread_flag(TIF_POLLING_NRFLAG);
+
+	while (!need_resched()) {
+		HMT_low();
+		HMT_very_low();
+	}
+
+	HMT_medium();
+	clear_thread_flag(TIF_POLLING_NRFLAG);
+	smp_mb();
+	return index;
+}
+
+static int nap_loop(struct cpuidle_device *dev,
+			struct cpuidle_driver *drv,
+			int index)
+{
+	power7_idle();
+	return index;
+}
+
+/*
+ * States for dedicated partition case.
+ */
+static struct cpuidle_state powernv_states[] = {
+	{ /* Snooze */
+		.name = "snooze",
+		.desc = "snooze",
+		.flags = CPUIDLE_FLAG_TIME_VALID,
+		.exit_latency = 0,
+		.target_residency = 0,
+		.enter = &snooze_loop },
+	{ /* NAP */
+		.name = "NAP",
+		.desc = "NAP",
+		.flags = CPUIDLE_FLAG_TIME_VALID,
+		.exit_latency = 10,
+		.target_residency = 100,
+		.enter = &nap_loop },
+};
+
+static int powernv_cpuidle_add_cpu_notifier(struct notifier_block *n,
+			unsigned long action, void *hcpu)
+{
+	int hotcpu = (unsigned long)hcpu;
+	struct cpuidle_device *dev =
+				per_cpu(cpuidle_devices, hotcpu);
+
+	if (dev && cpuidle_get_driver()) {
+		switch (action) {
+		case CPU_ONLINE:
+		case CPU_ONLINE_FROZEN:
+			cpuidle_pause_and_lock();
+			cpuidle_enable_device(dev);
+			cpuidle_resume_and_unlock();
+			break;
+
+		case CPU_DEAD:
+		case CPU_DEAD_FROZEN:
+			cpuidle_pause_and_lock();
+			cpuidle_disable_device(dev);
+			cpuidle_resume_and_unlock();
+			break;
+
+		default:
+			return NOTIFY_DONE;
+		}
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block setup_hotplug_notifier = {
+	.notifier_call = powernv_cpuidle_add_cpu_notifier,
+};
+
+/*
+ * powernv_cpuidle_driver_init()
+ */
+static int powernv_cpuidle_driver_init(void)
+{
+	int idle_state;
+	struct cpuidle_driver *drv = &powernv_idle_driver;
+
+	drv->state_count = 0;
+
+	for (idle_state = 0; idle_state < max_idle_state; ++idle_state) {
+		/* Is the state not enabled? */
+		if (cpuidle_state_table[idle_state].enter == NULL)
+			continue;
+
+		drv->states[drv->state_count] =	/* structure copy */
+			cpuidle_state_table[idle_state];
+
+		drv->state_count += 1;
+	}
+
+	return 0;
+}
+
+/*
+ * powernv_idle_probe()
+ * Choose state table for shared versus dedicated partition
+ */
+static int powernv_idle_probe(void)
+{
+
+	if (cpuidle_disable != IDLE_NO_OVERRIDE)
+		return -ENODEV;
+
+	if (firmware_has_feature(FW_FEATURE_OPALv3)) {
+		cpuidle_state_table = powernv_states;
+		max_idle_state = ARRAY_SIZE(powernv_states);
+ 	} else
+ 		return -ENODEV;
+
+	return 0;
+}
+
+static int __init powernv_processor_idle_init(void)
+{
+	int retval;
+
+	retval = powernv_idle_probe();
+	if (retval)
+		return retval;
+
+	powernv_cpuidle_driver_init();
+	retval = cpuidle_register(&powernv_idle_driver, NULL);
+	if (retval) {
+		printk(KERN_DEBUG "Registration of powernv driver failed.\n");
+		return retval;
+	}
+
+	register_cpu_notifier(&setup_hotplug_notifier);
+	printk(KERN_DEBUG "powernv_idle_driver registered\n");
+	return 0;
+}
+
+device_initcall(powernv_processor_idle_init);

^ permalink raw reply related

* Re: [PATCH] cpuidle/menu: Fail cpuidle_idle_call() if no idle state is acceptable
From: Preeti U Murthy @ 2014-01-14 11:02 UTC (permalink / raw)
  To: Srivatsa S. Bhat, svaidy
  Cc: deepthi, linux-pm, daniel.lezcano, rjw, linux-kernel, paulmck,
	linuxppc-dev, tuukka.tikkanen
In-Reply-To: <52D4E93C.3050503@linux.vnet.ibm.com>

On 01/14/2014 01:07 PM, Srivatsa S. Bhat wrote:
> On 01/14/2014 12:30 PM, Srivatsa S. Bhat wrote:
>> On 01/14/2014 11:35 AM, Preeti U Murthy wrote:
>>> On PowerPC, in a particular test scenario, all the cpu idle states were disabled.
>>> Inspite of this it was observed that the idle state count of the shallowest
>>> idle state, snooze, was increasing.
>>>
>>> This is because the governor returns the idle state index as 0 even in
>>> scenarios when no idle state can be chosen. These scenarios could be when the
>>> latency requirement is 0 or as mentioned above when the user wants to disable
>>> certain cpu idle states at runtime. In the latter case, its possible that no
>>> cpu idle state is valid because the suitable states were disabled
>>> and the rest did not match the menu governor criteria to be chosen as the
>>> next idle state.
>>>
>>> This patch adds the code to indicate that a valid cpu idle state could not be
>>> chosen by the menu governor and reports back to arch so that it can take some
>>> default action.
>>>
>>
>> That sounds fair enough. However, the "default" action of pseries idle loop
>> (pseries_lpar_idle()) surprises me. It enters Cede, which is _deeper_ than doing
>> a snooze! IOW, a user might "disable" cpuidle or set the PM_QOS_CPU_DMA_LATENCY
>> to 0 hoping to prevent the CPUs from going to deep idle states, but then the
>> machine would still end up going to Cede, even though that wont get reflected
>> in the idle state counts. IMHO that scenario needs some thought as well...
>>
> 
> I checked the git history and found that the default idle was changed (on purpose)
> to cede the processor, in order to speed up booting.. Hmm..
> 
> commit 363edbe2614aa90df706c0f19ccfa2a6c06af0be
> Author: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> Date:   Fri Sep 6 00:25:06 2013 +0530
> 
>     powerpc: Default arch idle could cede processor on pseries

This issue is not powerpc specific as I observed on digging a bit into
the default idle routines of the common archs. The way that archs
perceive the call to cpuidle framework today is that if it fails, it
means that cpuidle backend driver fails to *function* due to some reason
(as is mentioned in the above commit: either since cpuidle driver is not
registered or it does not work on some specific platforms) and that
therefore the archs should decide on an idle state themselves. They
therefore end up choosing a convenient idle state which could very well
be one of the idle states in the cpuidle state table.

The archs do not see failed call to cpuidle driver as "cpuidle driver
says no idle state can be entered now because there are strict latency
requirements or the idle states are disabled". IOW, the call to cpuidle
driver is currently based on if cpuidle driver exists rather than if it
agrees on entry into any of the idle states.

This patch brings in the need for the archs to incorporate this
additional check of "did cpuidle_idle_call() fail because it did not
find it wise to enter any of the idle states". In which case they should
simply exit without taking any *default action*.

Need to give this some thought and reconsider the patch.

Regards
Preeti U Murthy
> 
> 
> Regards,
> Srivatsa S. Bhat
> 

^ permalink raw reply

* Disable sleep states on P7+
From: Steven Pratt @ 2014-01-14 14:36 UTC (permalink / raw)
  To: linuxppc-dev

I am looking for info on when and how we are able to disable power saving features of current (P7, P7+) chips in order to reduce latency. This is often done in latency sensitive applications when power consumption is not an issue. On Intel boxes we can disable P-state frequency changes as well as disabling C-State or sleep state changes. In fact we can control how deep a sleep the processor can go into.  I know we have control Dynamic Processor Scaling and Idle Power Savings, but what states do these really affect?  Can I really disable Nap mode of a processor? If so how?  Can I disable even the lightest winkle mode?  Looking for current information (read RHEL 6 and SLES11), future changes are interesting.

Steve

^ permalink raw reply

* Re: Disable sleep states on P7+
From: Preeti U Murthy @ 2014-01-14 16:10 UTC (permalink / raw)
  To: Steven Pratt; +Cc: linuxppc-dev
In-Reply-To: <52D54B60.4090807@austin.ibm.com>

Hi Steven,

On 01/14/2014 08:06 PM, Steven Pratt wrote:
> I am looking for info on when and how we are able to disable power saving features of current (P7, P7+) chips in order to reduce latency. This is often done in latency sensitive applications when power consumption is not an issue. On Intel boxes we can disable P-state frequency changes as well as disabling C-State or sleep state changes. In fact we can control how deep a sleep the processor can go into.  I know we have control Dynamic Processor Scaling and Idle Power Savings, but what states do these really affect?  Can I really disable Nap mode of a processor? If so how?  Can I disable even the lightest winkle mode?  Looking for current information (read RHEL 6 and SLES11), future changes are interesting.
> 
> Steve

I can answer this question with respect to cpuidle on PowerNV platforms.

1. In order to disable cpuidle states management altogether, one can
pass the powersave=off kernel cmd line parameter during boot up of the
kernel. This will ensure that each time a CPU has nothing to do, it can
enter low thread priority which could lower power consumption to some
extent but is not expected to hit latency of applications noticeably.

2. In order to exactly control the cpuidle states into which idle CPUs
can enter into during runtime, one can make use of the sysfs files under:
/sys/devices/system/cpu/cpux/cpuidle/statex/disable option to
selectively disable any state.

However if one is using the menu cpuidle governor, disabling an idle
state does not disable the idle states which are deeper than it. They
continue to remain active unless they are specifically disabled. What
this means is that one cannot control the depth of the idle states
available for a CPU, although we can control the exact idle states
available for a processor.

But if the ladder governor is used, one can control the depth of the
idle states that a CPU can enter into. The governor can be chosen by
echoing either menu/ladder to
/sys/devices/system/cpu/cpuidle/current_governor_ro. The cpuidle
governor takes decisions about the idle state for a cpu to enter into
depending on its idle history. The popular governor used by most archs
is the menu governor.

Hence nap/sleep/winkle any of these states can be disabled. The code
which enables the above mentioned functionalities on powernv is yet to
go upstream although the same is already upstream and can be used for
the pseries platform to disable/enable the idle states on it.

Today on powernv the default idle state nap is entered into all the
time. One can disable it by echoing 0 to powersave_nap under
/proc/sys/kernel/powersave_nap, in which case the cpu enters low thread
priority.

Thanks

Regards
Preeti U Murthy

> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 

^ permalink raw reply

* Re: Disable sleep states on P7+
From: Steven Pratt @ 2014-01-14 17:04 UTC (permalink / raw)
  To: Preeti U Murthy; +Cc: linuxppc-dev
In-Reply-To: <52D56186.70807@linux.vnet.ibm.com>

On 01/14/2014 10:10 AM, Preeti U Murthy wrote:
> Hi Steven,
>
> On 01/14/2014 08:06 PM, Steven Pratt wrote:
>> I am looking for info on when and how we are able to disable power saving features of current (P7, P7+) chips in order to reduce latency. This is often done in latency sensitive applications when power consumption is not an issue. On Intel boxes we can disable P-state frequency changes as well as disabling C-State or sleep state changes. In fact we can control how deep a sleep the processor can go into.  I know we have control Dynamic Processor Scaling and Idle Power Savings, but what states do these really affect?  Can I really disable Nap mode of a processor? If so how?  Can I disable even the lightest winkle mode?  Looking for current information (read RHEL 6 and SLES11), future changes are interesting.
>>
>> Steve
> I can answer this question with respect to cpuidle on PowerNV platforms.
>
> 1. In order to disable cpuidle states management altogether, one can
> pass the powersave=off kernel cmd line parameter during boot up of the
> kernel. This will ensure that each time a CPU has nothing to do, it can
> enter low thread priority which could lower power consumption to some
> extent but is not expected to hit latency of applications noticeably.
>
> 2. In order to exactly control the cpuidle states into which idle CPUs
> can enter into during runtime, one can make use of the sysfs files under:
> /sys/devices/system/cpu/cpux/cpuidle/statex/disable option to
> selectively disable any state.
>
> However if one is using the menu cpuidle governor, disabling an idle
> state does not disable the idle states which are deeper than it. They
> continue to remain active unless they are specifically disabled. What
> this means is that one cannot control the depth of the idle states
> available for a CPU, although we can control the exact idle states
> available for a processor.
>
> But if the ladder governor is used, one can control the depth of the
> idle states that a CPU can enter into. The governor can be chosen by
> echoing either menu/ladder to
> /sys/devices/system/cpu/cpuidle/current_governor_ro. The cpuidle
> governor takes decisions about the idle state for a cpu to enter into
> depending on its idle history. The popular governor used by most archs
> is the menu governor.
>
> Hence nap/sleep/winkle any of these states can be disabled. The code
> which enables the above mentioned functionalities on powernv is yet to
> go upstream although the same is already upstream and can be used for
> the pseries platform to disable/enable the idle states on it.
>
> Today on powernv the default idle state nap is entered into all the
> time. One can disable it by echoing 0 to powersave_nap under
> /proc/sys/kernel/powersave_nap, in which case the cpu enters low thread

Thanks, that is great information going forward, now I just need info on what works today in PowerVM.

Steve

> priority.
>
> Thanks
>
> Regards
> Preeti U Murthy
>
>> _______________________________________________
>> Linuxppc-dev mailing list
>> Linuxppc-dev@lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/linuxppc-dev
>>

^ permalink raw reply

* Re: [PATCH v2] Move precessing of MCE queued event out from syscall exit path.
From: Hugh Dickins @ 2014-01-14 19:48 UTC (permalink / raw)
  To: Mahesh J Salgaonkar; +Cc: linuxppc-dev
In-Reply-To: <20140114101450.32385.65506.stgit@mars.in.ibm.com>

On Tue, 14 Jan 2014, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> Huge Dickins reported an issue that b5ff4211a829
> "powerpc/book3s: Queue up and process delayed MCE events" breaks the
> PowerMac G5 boot. This patch fixes it by moving the mce even processing
> away from syscall exit, which was wrong to do that in first place, and
> using irq work framework to delay processing of mce event.
> 
> Reported-by: Hugh Dickins <hughd@google.com
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

This version also boots and runs fine for me on the G5
(but of course, I'm probably not testing delayed MCE events at all).

Hugh

> ---
>  arch/powerpc/include/asm/mce.h |    1 -
>  arch/powerpc/kernel/entry_64.S |    5 -----
>  arch/powerpc/kernel/mce.c      |   13 ++++++++++---
>  3 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
> index 2257d1e..f97d8cb 100644
> --- a/arch/powerpc/include/asm/mce.h
> +++ b/arch/powerpc/include/asm/mce.h
> @@ -192,7 +192,6 @@ extern void save_mce_event(struct pt_regs *regs, long handled,
>  extern int get_mce_event(struct machine_check_event *mce, bool release);
>  extern void release_mce_event(void);
>  extern void machine_check_queue_event(void);
> -extern void machine_check_process_queued_event(void);
>  extern void machine_check_print_event_info(struct machine_check_event *evt);
>  extern uint64_t get_mce_fault_addr(struct machine_check_event *evt);
>  
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 770d6d6..bbfb029 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -184,11 +184,6 @@ syscall_exit:
>  	bl	.do_show_syscall_exit
>  	ld	r3,RESULT(r1)
>  #endif
> -#ifdef CONFIG_PPC_BOOK3S_64
> -BEGIN_FTR_SECTION
> -	bl	.machine_check_process_queued_event
> -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
> -#endif
>  	CURRENT_THREAD_INFO(r12, r1)
>  
>  	ld	r8,_MSR(r1)
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index d6edf2b..a7fd4cb 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -26,6 +26,7 @@
>  #include <linux/ptrace.h>
>  #include <linux/percpu.h>
>  #include <linux/export.h>
> +#include <linux/irq_work.h>
>  #include <asm/mce.h>
>  
>  static DEFINE_PER_CPU(int, mce_nest_count);
> @@ -35,6 +36,11 @@ static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event);
>  static DEFINE_PER_CPU(int, mce_queue_count);
>  static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event_queue);
>  
> +static void machine_check_process_queued_event(struct irq_work *work);
> +struct irq_work mce_event_process_work = {
> +        .func = machine_check_process_queued_event,
> +};
> +
>  static void mce_set_error_info(struct machine_check_event *mce,
>  			       struct mce_error_info *mce_err)
>  {
> @@ -185,17 +191,19 @@ void machine_check_queue_event(void)
>  		return;
>  	}
>  	__get_cpu_var(mce_event_queue[index]) = evt;
> +
> +	/* Queue irq work to process this event later. */
> +	irq_work_queue(&mce_event_process_work);
>  }
>  
>  /*
>   * process pending MCE event from the mce event queue. This function will be
>   * called during syscall exit.
>   */
> -void machine_check_process_queued_event(void)
> +static void machine_check_process_queued_event(struct irq_work *work)
>  {
>  	int index;
>  
> -	preempt_disable();
>  	/*
>  	 * For now just print it to console.
>  	 * TODO: log this error event to FSP or nvram.
> @@ -206,7 +214,6 @@ void machine_check_process_queued_event(void)
>  				&__get_cpu_var(mce_event_queue[index]));
>  		__get_cpu_var(mce_queue_count)--;
>  	}
> -	preempt_enable();
>  }
>  
>  void machine_check_print_event_info(struct machine_check_event *evt)
> 
> 

^ permalink raw reply

* Re: [PATCH v2] Move precessing of MCE queued event out from syscall exit path.
From: Benjamin Herrenschmidt @ 2014-01-14 20:17 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Mahesh J Salgaonkar, linuxppc-dev
In-Reply-To: <alpine.LSU.2.11.1401141135240.3762@eggly.anvils>

On Tue, 2014-01-14 at 11:48 -0800, Hugh Dickins wrote:
> On Tue, 14 Jan 2014, Mahesh J Salgaonkar wrote:
> > From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> > 
> > Huge Dickins reported an issue that b5ff4211a829
> > "powerpc/book3s: Queue up and process delayed MCE events" breaks the
> > PowerMac G5 boot. This patch fixes it by moving the mce even processing
> > away from syscall exit, which was wrong to do that in first place, and
> > using irq work framework to delay processing of mce event.
> > 
> > Reported-by: Hugh Dickins <hughd@google.com
> > Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> This version also boots and runs fine for me on the G5
> (but of course, I'm probably not testing delayed MCE events at all).

Thanks Hugh !

Cheers,
Ben.

> Hugh
> 
> > ---
> >  arch/powerpc/include/asm/mce.h |    1 -
> >  arch/powerpc/kernel/entry_64.S |    5 -----
> >  arch/powerpc/kernel/mce.c      |   13 ++++++++++---
> >  3 files changed, 10 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
> > index 2257d1e..f97d8cb 100644
> > --- a/arch/powerpc/include/asm/mce.h
> > +++ b/arch/powerpc/include/asm/mce.h
> > @@ -192,7 +192,6 @@ extern void save_mce_event(struct pt_regs *regs, long handled,
> >  extern int get_mce_event(struct machine_check_event *mce, bool release);
> >  extern void release_mce_event(void);
> >  extern void machine_check_queue_event(void);
> > -extern void machine_check_process_queued_event(void);
> >  extern void machine_check_print_event_info(struct machine_check_event *evt);
> >  extern uint64_t get_mce_fault_addr(struct machine_check_event *evt);
> >  
> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> > index 770d6d6..bbfb029 100644
> > --- a/arch/powerpc/kernel/entry_64.S
> > +++ b/arch/powerpc/kernel/entry_64.S
> > @@ -184,11 +184,6 @@ syscall_exit:
> >  	bl	.do_show_syscall_exit
> >  	ld	r3,RESULT(r1)
> >  #endif
> > -#ifdef CONFIG_PPC_BOOK3S_64
> > -BEGIN_FTR_SECTION
> > -	bl	.machine_check_process_queued_event
> > -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
> > -#endif
> >  	CURRENT_THREAD_INFO(r12, r1)
> >  
> >  	ld	r8,_MSR(r1)
> > diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> > index d6edf2b..a7fd4cb 100644
> > --- a/arch/powerpc/kernel/mce.c
> > +++ b/arch/powerpc/kernel/mce.c
> > @@ -26,6 +26,7 @@
> >  #include <linux/ptrace.h>
> >  #include <linux/percpu.h>
> >  #include <linux/export.h>
> > +#include <linux/irq_work.h>
> >  #include <asm/mce.h>
> >  
> >  static DEFINE_PER_CPU(int, mce_nest_count);
> > @@ -35,6 +36,11 @@ static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event);
> >  static DEFINE_PER_CPU(int, mce_queue_count);
> >  static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event_queue);
> >  
> > +static void machine_check_process_queued_event(struct irq_work *work);
> > +struct irq_work mce_event_process_work = {
> > +        .func = machine_check_process_queued_event,
> > +};
> > +
> >  static void mce_set_error_info(struct machine_check_event *mce,
> >  			       struct mce_error_info *mce_err)
> >  {
> > @@ -185,17 +191,19 @@ void machine_check_queue_event(void)
> >  		return;
> >  	}
> >  	__get_cpu_var(mce_event_queue[index]) = evt;
> > +
> > +	/* Queue irq work to process this event later. */
> > +	irq_work_queue(&mce_event_process_work);
> >  }
> >  
> >  /*
> >   * process pending MCE event from the mce event queue. This function will be
> >   * called during syscall exit.
> >   */
> > -void machine_check_process_queued_event(void)
> > +static void machine_check_process_queued_event(struct irq_work *work)
> >  {
> >  	int index;
> >  
> > -	preempt_disable();
> >  	/*
> >  	 * For now just print it to console.
> >  	 * TODO: log this error event to FSP or nvram.
> > @@ -206,7 +214,6 @@ void machine_check_process_queued_event(void)
> >  				&__get_cpu_var(mce_event_queue[index]));
> >  		__get_cpu_var(mce_queue_count)--;
> >  	}
> > -	preempt_enable();
> >  }
> >  
> >  void machine_check_print_event_info(struct machine_check_event *evt)
> > 
> > 

^ permalink raw reply

* Re: [PATCH 3/4] powerpc: use subsys_initcall for Freescale Local Bus
From: Scott Wood @ 2014-01-14 22:56 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: Paul Mackerras, linuxppc-dev
In-Reply-To: <1389630113-7919-4-git-send-email-paul.gortmaker@windriver.com>

On Mon, 2014-01-13 at 11:21 -0500, Paul Gortmaker wrote:
> The FSL_SOC option is bool, and hence this code is either
> present or absent.  It will never be modular, so using
> module_init as an alias for __initcall is rather misleading.
> 
> Fix this up now, so that we can relocate module_init from
> init.h into module.h in the future.  If we don't do this, we'd
> have to add module.h to obviously non-modular code, and that
> would be a worse thing.
> 
> Note that direct use of __initcall is discouraged, vs. one
> of the priority categorized subgroups.  As __initcall gets
> mapped onto device_initcall, our use of subsys_initcall (which
> makes sense for bus code) will thus change this registration
> from level 6-device to level 4-subsys (i.e. slightly earlier).
> However no observable impact of that small difference has
> been observed during testing, or is expected.
> 
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> ---
>  arch/powerpc/sysdev/fsl_lbc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/sysdev/fsl_lbc.c b/arch/powerpc/sysdev/fsl_lbc.c
> index 6bc5a546d49f..9f00e5f84abe 100644
> --- a/arch/powerpc/sysdev/fsl_lbc.c
> +++ b/arch/powerpc/sysdev/fsl_lbc.c
> @@ -388,4 +388,4 @@ static int __init fsl_lbc_init(void)
>  {
>  	return platform_driver_register(&fsl_lbc_ctrl_driver);
>  }
> -module_init(fsl_lbc_init);
> +subsys_initcall(fsl_lbc_init);

Acked-by: Scott Wood <scottwood@freescale.com>

-Scott

^ permalink raw reply

* Re: [PATCH 3/3] powerpc/fsl: Use the new interface to save or restore registers
From: Scott Wood @ 2014-01-14 23:30 UTC (permalink / raw)
  To: Dongsheng Wang; +Cc: anton, linuxppc-dev, chenhui.zhao
In-Reply-To: <1389686397-46555-3-git-send-email-dongsheng.wang@freescale.com>

On Tue, 2014-01-14 at 15:59 +0800, Dongsheng Wang wrote:
> From: Wang Dongsheng <dongsheng.wang@freescale.com>
> 
> Use fsl_cpu_state_save/fsl_cpu_state_restore to save/restore registers.
> Use the functions to save/restore registers, so we don't need to
> maintain the code.
> 
> Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>

Is there any functional change with this patchset (e.g. suspend
supported on chips where it wasn't before), or is it just cleanup?  A
cover letter would be useful to describe the purpose of the overall
patchset when it isn't obvious.

> 
> diff --git a/arch/powerpc/kernel/swsusp_booke.S b/arch/powerpc/kernel/swsusp_booke.S
> index 553c140..b5992db 100644
> --- a/arch/powerpc/kernel/swsusp_booke.S
> +++ b/arch/powerpc/kernel/swsusp_booke.S
> @@ -4,92 +4,28 @@
>   * Copyright (c) 2009-2010 MontaVista Software, LLC.
>   */
>  
> -#include <linux/threads.h>
> -#include <asm/processor.h>
>  #include <asm/page.h>
> -#include <asm/cputable.h>
> -#include <asm/thread_info.h>
>  #include <asm/ppc_asm.h>
>  #include <asm/asm-offsets.h>
>  #include <asm/mmu.h>
> -
> -/*
> - * Structure for storing CPU registers on the save area.
> - */
> -#define SL_SP		0
> -#define SL_PC		4
> -#define SL_MSR		8
> -#define SL_TCR		0xc
> -#define SL_SPRG0	0x10
> -#define SL_SPRG1	0x14
> -#define SL_SPRG2	0x18
> -#define SL_SPRG3	0x1c
> -#define SL_SPRG4	0x20
> -#define SL_SPRG5	0x24
> -#define SL_SPRG6	0x28
> -#define SL_SPRG7	0x2c
> -#define SL_TBU		0x30
> -#define SL_TBL		0x34
> -#define SL_R2		0x38
> -#define SL_CR		0x3c
> -#define SL_LR		0x40
> -#define SL_R12		0x44	/* r12 to r31 */
> -#define SL_SIZE		(SL_R12 + 80)
> -
> -	.section .data
> -	.align	5
> -
> -_GLOBAL(swsusp_save_area)
> -	.space	SL_SIZE
> -
> +#include <asm/fsl_sleep.h>
>  
>  	.section .text
>  	.align	5
>  
>  _GLOBAL(swsusp_arch_suspend)
> -	lis	r11,swsusp_save_area@h
> -	ori	r11,r11,swsusp_save_area@l
> -
> -	mflr	r0
> -	stw	r0,SL_LR(r11)
> -	mfcr	r0
> -	stw	r0,SL_CR(r11)
> -	stw	r1,SL_SP(r11)
> -	stw	r2,SL_R2(r11)
> -	stmw	r12,SL_R12(r11)
> -
> -	/* Save MSR & TCR */
> -	mfmsr	r4
> -	stw	r4,SL_MSR(r11)
> -	mfspr	r4,SPRN_TCR
> -	stw	r4,SL_TCR(r11)
> -
> -	/* Get a stable timebase and save it */
> -1:	mfspr	r4,SPRN_TBRU
> -	stw	r4,SL_TBU(r11)
> -	mfspr	r5,SPRN_TBRL
> -	stw	r5,SL_TBL(r11)
> -	mfspr	r3,SPRN_TBRU
> -	cmpw	r3,r4
> -	bne	1b
> +	mflr	r15
> +	lis	r3, core_registers_save_area@h
> +	ori	r3, r3, core_registers_save_area@l
> +
> +	/* Save base register */
> +	li	r4, 0
> +	bl	fsl_cpu_state_save
>  
> -	/* Save SPRGs */
> -	mfspr	r4,SPRN_SPRG0
> -	stw	r4,SL_SPRG0(r11)
> -	mfspr	r4,SPRN_SPRG1
> -	stw	r4,SL_SPRG1(r11)
> -	mfspr	r4,SPRN_SPRG2
> -	stw	r4,SL_SPRG2(r11)
> -	mfspr	r4,SPRN_SPRG3
> -	stw	r4,SL_SPRG3(r11)
> -	mfspr	r4,SPRN_SPRG4
> -	stw	r4,SL_SPRG4(r11)
> -	mfspr	r4,SPRN_SPRG5
> -	stw	r4,SL_SPRG5(r11)
> -	mfspr	r4,SPRN_SPRG6
> -	stw	r4,SL_SPRG6(r11)
> -	mfspr	r4,SPRN_SPRG7
> -	stw	r4,SL_SPRG7(r11)
> +	/* Save LR */
> +	lis	r3, core_registers_save_area@h
> +	ori	r3, r3, core_registers_save_area@l
> +	stw	r15, SR_LR(r3)
>  
>  	/* Call the low level suspend stuff (we should probably have made
>  	 * a stackframe...
> @@ -97,11 +33,12 @@ _GLOBAL(swsusp_arch_suspend)
>  	bl	swsusp_save
>  
>  	/* Restore LR from the save area */
> -	lis	r11,swsusp_save_area@h
> -	ori	r11,r11,swsusp_save_area@l
> -	lwz	r0,SL_LR(r11)
> -	mtlr	r0
> +	lis	r3, core_registers_save_area@h
> +	ori	r3, r3, core_registers_save_area@l
> +	lwz	r15, SR_LR(r3)
> +	mtlr	r15
>  
> +	li	r3, 0
>  	blr
>  
>  _GLOBAL(swsusp_arch_resume)
> @@ -138,9 +75,6 @@ _GLOBAL(swsusp_arch_resume)
>  	bl flush_dcache_L1
>  	bl flush_instruction_cache
>  
> -	lis	r11,swsusp_save_area@h
> -	ori	r11,r11,swsusp_save_area@l
> -
>  	/*
>  	 * Mappings from virtual addresses to physical addresses may be
>  	 * different than they were prior to restoring hibernation state. 
> @@ -149,53 +83,12 @@ _GLOBAL(swsusp_arch_resume)
>  	 */
>  	bl	_tlbil_all
>  
> -	lwz	r4,SL_SPRG0(r11)
> -	mtspr	SPRN_SPRG0,r4
> -	lwz	r4,SL_SPRG1(r11)
> -	mtspr	SPRN_SPRG1,r4
> -	lwz	r4,SL_SPRG2(r11)
> -	mtspr	SPRN_SPRG2,r4
> -	lwz	r4,SL_SPRG3(r11)
> -	mtspr	SPRN_SPRG3,r4
> -	lwz	r4,SL_SPRG4(r11)
> -	mtspr	SPRN_SPRG4,r4
> -	lwz	r4,SL_SPRG5(r11)
> -	mtspr	SPRN_SPRG5,r4
> -	lwz	r4,SL_SPRG6(r11)
> -	mtspr	SPRN_SPRG6,r4
> -	lwz	r4,SL_SPRG7(r11)
> -	mtspr	SPRN_SPRG7,r4
> -
> -	/* restore the MSR */
> -	lwz	r3,SL_MSR(r11)
> -	mtmsr	r3
> -
> -	/* Restore TB */
> -	li	r3,0
> -	mtspr	SPRN_TBWL,r3
> -	lwz	r3,SL_TBU(r11)
> -	lwz	r4,SL_TBL(r11)
> -	mtspr	SPRN_TBWU,r3
> -	mtspr	SPRN_TBWL,r4
> -
> -	/* Restore TCR and clear any pending bits in TSR. */
> -	lwz	r4,SL_TCR(r11)
> -	mtspr	SPRN_TCR,r4
> -	lis	r4, (TSR_ENW | TSR_WIS | TSR_DIS | TSR_FIS)@h
> -	mtspr	SPRN_TSR,r4
> -
> -	/* Kick decrementer */
> -	li	r0,1
> -	mtdec	r0
> -
> -	/* Restore the callee-saved registers and return */
> -	lwz	r0,SL_CR(r11)
> -	mtcr	r0
> -	lwz	r2,SL_R2(r11)
> -	lmw	r12,SL_R12(r11)
> -	lwz	r1,SL_SP(r11)
> -	lwz	r0,SL_LR(r11)
> -	mtlr	r0
> +	lis	r3, core_registers_save_area@h
> +	ori	r3, r3, core_registers_save_area@l
> +
> +	/* Restore base register */
> +	li	r4, 0
> +	bl	fsl_cpu_state_restore

Why are you calling anything with "fsl" in the name from code that is
supposed to be for all booke?

-Scott

^ permalink raw reply

* Re: [PATCH 2/3] powerpc/85xx: Provide two functions to save/restore the core registers
From: Scott Wood @ 2014-01-14 23:50 UTC (permalink / raw)
  To: Dongsheng Wang; +Cc: anton, linuxppc-dev, chenhui.zhao
In-Reply-To: <1389686397-46555-2-git-send-email-dongsheng.wang@freescale.com>

On Tue, 2014-01-14 at 15:59 +0800, Dongsheng Wang wrote:
> From: Wang Dongsheng <dongsheng.wang@freescale.com>
> 
> Add fsl_cpu_state_save/fsl_cpu_state_restore functions, used for deep
> sleep and hibernation to save/restore core registers. We abstract out
> save/restore code for use in various modules, to make them don't need
> to maintain.
> 
> Currently supported processors type are E6500, E5500, E500MC, E500v2 and
> E500v1.
> 
> Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>

What is there that is specfic to a particular core type that can't be
handled from C code?

> +	/*
> +	 * Need to save float-point registers if MSR[FP] = 1.
> +	 */
> +	mfmsr	r12
> +	andi.	r12, r12, MSR_FP
> +	beq	1f
> +	do_sr_fpr_regs(save)

C code should have already ensured that MSR[FP] is not 1 (and thus the
FP context has been saved).

> +/*
> + * r3 = the virtual address of buffer
> + * r4 = suspend type, 0-BASE_SAVE, 1-ALL_SAVE

#define these magic numbers, and define what is meant by "base save"
versus "all save".

-Scott

^ permalink raw reply

* Re: [v6,2/5] powerpc/book3e: store crit/mc/dbg exception thread info
From: Scott Wood @ 2014-01-15  1:27 UTC (permalink / raw)
  To: Tiejun Chen; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1382520685-11609-3-git-send-email-tiejun.chen@windriver.com>

On Wed, Oct 23, 2013 at 05:31:22PM +0800, Tiejun Chen wrote:
> We need to store thread info to these exception thread info like something
> we already did for PPC32.
> 
> Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
> 
> ---
> arch/powerpc/kernel/exceptions-64e.S |   22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> index 68d74b4..a55cf62 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -36,6 +36,19 @@
>   */
>  #define	SPECIAL_EXC_FRAME_SIZE	INT_FRAME_SIZE
>  
> +/* Now we only store something to exception thread info */

Now as opposed to when?  Only as opposed to what else?

> +#define	EXC_LEVEL_EXCEPTION_PROLOG(type)				\

I'd prefer .macro over #define.

> +	ld	r14,PACAKSAVE(r13);					\
> +	CURRENT_THREAD_INFO(r14, r14);					\
> +	CURRENT_THREAD_INFO(r15, r1);					\
> +	ld	r10,TI_FLAGS(r14);		     			\
> +	std	r10,TI_FLAGS(r15);			     		\
> +	ld	r10,TI_PREEMPT(r14);		     			\
> +	std	r10,TI_PREEMPT(r15);		     			\
> +	ld	r10,TI_TASK(r14);			     		\
> +	std	r10,TI_TASK(r15);

This is a start, but we'll also need to save some more context to allow
TLB misses from within the exception (e.g. if a machine check handler or
GDB stub writes to a serial port, and the I/O registers aren't in the
TLB).  At a minimum I think we need to save SRR0, SRR1,
SPRN_SPRG_GEN_SCRATCH, SPRN_SPRG_TLB_SCRATCH, and the MAS registers. 
We'll also need to make the bolted TLB miss handlers capable of pointing
to different extables (though they won't need to auto-advance as the
original TLB miss handlers do -- we would advance SPRN_SPRG_TLB_EXFRAME
from this code), and the original TLB miss handlers will now need to
support more than 3 levels of nesting.

For the e6500 tablewalk TLB miss handler, we'll need to do something
special if we interrupt it when the lock is held, to revoke the lock and
return to code that retries.

Is there anything else I'm missing?

-Scott

^ permalink raw reply

* RE: [PATCH 3/3] powerpc/fsl: Use the new interface to save or restore registers
From: Dongsheng.Wang @ 2014-01-15  2:57 UTC (permalink / raw)
  To: Scott Wood
  Cc: anton@enomsg.org, linuxppc-dev@lists.ozlabs.org,
	chenhui.zhao@freescale.com
In-Reply-To: <1389742224.24905.140.camel@snotra.buserror.net>

DQoNCj4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gRnJvbTogV29vZCBTY290dC1CMDc0
MjENCj4gU2VudDogV2VkbmVzZGF5LCBKYW51YXJ5IDE1LCAyMDE0IDc6MzAgQU0NCj4gVG86IFdh
bmcgRG9uZ3NoZW5nLUI0MDUzNA0KPiBDYzogYmVuaEBrZXJuZWwuY3Jhc2hpbmcub3JnOyBaaGFv
IENoZW5odWktQjM1MzM2OyBhbnRvbkBlbm9tc2cub3JnOyBsaW51eHBwYy0NCj4gZGV2QGxpc3Rz
Lm96bGFicy5vcmcNCj4gU3ViamVjdDogUmU6IFtQQVRDSCAzLzNdIHBvd2VycGMvZnNsOiBVc2Ug
dGhlIG5ldyBpbnRlcmZhY2UgdG8gc2F2ZSBvciByZXN0b3JlDQo+IHJlZ2lzdGVycw0KPiANCj4g
T24gVHVlLCAyMDE0LTAxLTE0IGF0IDE1OjU5ICswODAwLCBEb25nc2hlbmcgV2FuZyB3cm90ZToN
Cj4gPiBGcm9tOiBXYW5nIERvbmdzaGVuZyA8ZG9uZ3NoZW5nLndhbmdAZnJlZXNjYWxlLmNvbT4N
Cj4gPg0KPiA+IFVzZSBmc2xfY3B1X3N0YXRlX3NhdmUvZnNsX2NwdV9zdGF0ZV9yZXN0b3JlIHRv
IHNhdmUvcmVzdG9yZSByZWdpc3RlcnMuDQo+ID4gVXNlIHRoZSBmdW5jdGlvbnMgdG8gc2F2ZS9y
ZXN0b3JlIHJlZ2lzdGVycywgc28gd2UgZG9uJ3QgbmVlZCB0bw0KPiA+IG1haW50YWluIHRoZSBj
b2RlLg0KPiA+DQo+ID4gU2lnbmVkLW9mZi1ieTogV2FuZyBEb25nc2hlbmcgPGRvbmdzaGVuZy53
YW5nQGZyZWVzY2FsZS5jb20+DQo+IA0KPiBJcyB0aGVyZSBhbnkgZnVuY3Rpb25hbCBjaGFuZ2Ug
d2l0aCB0aGlzIHBhdGNoc2V0IChlLmcuIHN1c3BlbmQNCj4gc3VwcG9ydGVkIG9uIGNoaXBzIHdo
ZXJlIGl0IHdhc24ndCBiZWZvcmUpLCBvciBpcyBpdCBqdXN0IGNsZWFudXA/ICBBDQo+IGNvdmVy
IGxldHRlciB3b3VsZCBiZSB1c2VmdWwgdG8gZGVzY3JpYmUgdGhlIHB1cnBvc2Ugb2YgdGhlIG92
ZXJhbGwNCj4gcGF0Y2hzZXQgd2hlbiBpdCBpc24ndCBvYnZpb3VzLg0KPiANCg0KWWVzLCBqdXN0
IGNsZWFudXAuLg0KDQo+ID4gKw0KPiA+ICsJLyogUmVzdG9yZSBiYXNlIHJlZ2lzdGVyICovDQo+
ID4gKwlsaQlyNCwgMA0KPiA+ICsJYmwJZnNsX2NwdV9zdGF0ZV9yZXN0b3JlDQo+IA0KPiBXaHkg
YXJlIHlvdSBjYWxsaW5nIGFueXRoaW5nIHdpdGggImZzbCIgaW4gdGhlIG5hbWUgZnJvbSBjb2Rl
IHRoYXQgaXMNCj4gc3VwcG9zZWQgdG8gYmUgZm9yIGFsbCBib29rZT8NCj4gDQpFMjAwLCBFMzAw
IG5vdCBzdXBwb3J0Lg0KU3VwcG9ydCBFNTAwLCBFNTAwdjIsIEU1MDBNQywgRTU1MDAsIEU2NTAw
Lg0KDQpEbyB5b3UgaGF2ZSBhbnkgc3VnZ2VzdGlvbnMgYWJvdXQgdGhpcz8NCg0KVGhhbmtzLA0K
LURvbmdzaGVuZw0KDQo=

^ permalink raw reply

* RE: [PATCH 2/3] powerpc/85xx: Provide two functions to save/restore the core registers
From: Dongsheng.Wang @ 2014-01-15  3:30 UTC (permalink / raw)
  To: Scott Wood
  Cc: anton@enomsg.org, linuxppc-dev@lists.ozlabs.org,
	chenhui.zhao@freescale.com
In-Reply-To: <1389743456.24905.143.camel@snotra.buserror.net>

DQoNCj4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gRnJvbTogV29vZCBTY290dC1CMDc0
MjENCj4gU2VudDogV2VkbmVzZGF5LCBKYW51YXJ5IDE1LCAyMDE0IDc6NTEgQU0NCj4gVG86IFdh
bmcgRG9uZ3NoZW5nLUI0MDUzNA0KPiBDYzogYmVuaEBrZXJuZWwuY3Jhc2hpbmcub3JnOyBaaGFv
IENoZW5odWktQjM1MzM2OyBhbnRvbkBlbm9tc2cub3JnOyBsaW51eHBwYy0NCj4gZGV2QGxpc3Rz
Lm96bGFicy5vcmcNCj4gU3ViamVjdDogUmU6IFtQQVRDSCAyLzNdIHBvd2VycGMvODV4eDogUHJv
dmlkZSB0d28gZnVuY3Rpb25zIHRvIHNhdmUvcmVzdG9yZSB0aGUNCj4gY29yZSByZWdpc3RlcnMN
Cj4gDQo+IE9uIFR1ZSwgMjAxNC0wMS0xNCBhdCAxNTo1OSArMDgwMCwgRG9uZ3NoZW5nIFdhbmcg
d3JvdGU6DQo+ID4gRnJvbTogV2FuZyBEb25nc2hlbmcgPGRvbmdzaGVuZy53YW5nQGZyZWVzY2Fs
ZS5jb20+DQo+ID4NCj4gPiBBZGQgZnNsX2NwdV9zdGF0ZV9zYXZlL2ZzbF9jcHVfc3RhdGVfcmVz
dG9yZSBmdW5jdGlvbnMsIHVzZWQgZm9yIGRlZXANCj4gPiBzbGVlcCBhbmQgaGliZXJuYXRpb24g
dG8gc2F2ZS9yZXN0b3JlIGNvcmUgcmVnaXN0ZXJzLiBXZSBhYnN0cmFjdCBvdXQNCj4gPiBzYXZl
L3Jlc3RvcmUgY29kZSBmb3IgdXNlIGluIHZhcmlvdXMgbW9kdWxlcywgdG8gbWFrZSB0aGVtIGRv
bid0IG5lZWQNCj4gPiB0byBtYWludGFpbi4NCj4gPg0KPiA+IEN1cnJlbnRseSBzdXBwb3J0ZWQg
cHJvY2Vzc29ycyB0eXBlIGFyZSBFNjUwMCwgRTU1MDAsIEU1MDBNQywgRTUwMHYyDQo+ID4gYW5k
IEU1MDB2MS4NCj4gPg0KPiA+IFNpZ25lZC1vZmYtYnk6IFdhbmcgRG9uZ3NoZW5nIDxkb25nc2hl
bmcud2FuZ0BmcmVlc2NhbGUuY29tPg0KPiANCj4gV2hhdCBpcyB0aGVyZSB0aGF0IGlzIHNwZWNm
aWMgdG8gYSBwYXJ0aWN1bGFyIGNvcmUgdHlwZSB0aGF0IGNhbid0IGJlIGhhbmRsZWQNCj4gZnJv
bSBDIGNvZGU/DQo+IA0KDQpJbiB0aGUgY29udGV4dCBvZiB0aGUgY2FsbGluZywgbWF5YmUgbm90
IGluIEMgZW52aXJvbm1lbnQuKERlZXAgc2xlZXAgd2l0aG91dA0KQyBlbnZpcm9ubWVudCB3aGVu
IGNhbGxpbmcgdGhvc2UgaW50ZXJmYWNlcykNCg0KPiA+ICsJLyoNCj4gPiArCSAqIE5lZWQgdG8g
c2F2ZSBmbG9hdC1wb2ludCByZWdpc3RlcnMgaWYgTVNSW0ZQXSA9IDEuDQo+ID4gKwkgKi8NCj4g
PiArCW1mbXNyCXIxMg0KPiA+ICsJYW5kaS4JcjEyLCByMTIsIE1TUl9GUA0KPiA+ICsJYmVxCTFm
DQo+ID4gKwlkb19zcl9mcHJfcmVncyhzYXZlKQ0KPiANCj4gQyBjb2RlIHNob3VsZCBoYXZlIGFs
cmVhZHkgZW5zdXJlZCB0aGF0IE1TUltGUF0gaXMgbm90IDEgKGFuZCB0aHVzIHRoZSBGUA0KPiBj
b250ZXh0IGhhcyBiZWVuIHNhdmVkKS4NCj4gDQoNClllcywgcmlnaHQuIEJ1dCBJIG1lYW4gaWYg
dGhlIEZQIHN0aWxsIHVzZSBpbiBjb3JlIHNhdmUgZmxvdywgd2UgbmVlZCB0byBzYXZlIGl0Lg0K
SW4gdGhpcyBwcm9jZXNzLCBpIGRvbid0IGNhcmUgd2hhdCBvdGhlciBjb2RlIGRvLCB3ZSBuZWVk
IHRvIGZvY3VzIG9uIG5vdCBsb3NpbmcNCnZhbHVhYmxlIGRhdGEuDQoNCj4gPiArLyoNCj4gPiAr
ICogcjMgPSB0aGUgdmlydHVhbCBhZGRyZXNzIG9mIGJ1ZmZlcg0KPiA+ICsgKiByNCA9IHN1c3Bl
bmQgdHlwZSwgMC1CQVNFX1NBVkUsIDEtQUxMX1NBVkUNCj4gDQo+ICNkZWZpbmUgdGhlc2UgbWFn
aWMgbnVtYmVycywgYW5kIGRlZmluZSB3aGF0IGlzIG1lYW50IGJ5ICJiYXNlIHNhdmUiDQo+IHZl
cnN1cyAiYWxsIHNhdmUiLg0KDQpPaywgdGhhbmtzLg0KDQotRG9uZ3NoZW5nDQoNCg==

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox