Linux Power Management development
 help / color / mirror / Atom feed
* idle patch queue for Linux 3.1
From: Len Brown @ 2011-08-03 19:44 UTC (permalink / raw)
  To: linux-pm, linux-kernel

Here is the idle patch queue for Linux 3.1
Please let me know if you see troubles with
any of these patches.

thanks,
-Len Brown, Intel Open Source Technology Center

^ permalink raw reply

* Re: [PATCH v4 1/2] Input: enable i8042-level wakeup control
From: Dmitry Torokhov @ 2011-08-03 19:38 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-pm, dilinger, linux-input
In-Reply-To: <CAMLZHHQAjarobktUFiZF7aEVvLuaaAFSBuzrMj4pT_67BFssjA@mail.gmail.com>

On Wed, Aug 03, 2011 at 08:24:46PM +0100, Daniel Drake wrote:
> On Wed, Aug 3, 2011 at 7:43 PM, Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
> > No, I do not believe we can do what you are doing in input core. You are
> > changing behavior for one platform and one (or 2 drivers) that will not
> > be matched (at least I don't think so) by anything else out there. What
> > will happen if you plug a random USB keyboard into OLPC box? Will it keep
> > leds powered?
> 
> In all other cases, behaviour is unchanged. The input device is not
> marked as wakeup capable, therefore the user cannot mark it as
> wakeup-enabled, therefore this code does not execute.
> 
> So a USB keyboard on an OLPC laptop would behave exactly as it does
> before this patch, as would devices on other systems.
> 
> The only way this functionality gets enabled is at the code level (as
> it does in patch 2/2), and then once the user has requested wakeups on
> all devices in the chain.

I believe we can and do mark devices such as USB as wakeup capable on
other arches, you do not have control here. That is why I am uneasy with
doing this in input core.

> 
> >> > Does the controller wakes up the system on key release or only press? My
> >> > concern is with cases when we suspend with a key pressed and wake up
> >> > with it already released.
> >>
> >> It wakes up on key press, but our EC buffers communication, so both
> >> the key press and key release event would be delivered in the above
> >> scenario.
> >
> > Just to confirm, we 3 events will be delivered in this case:
> 
> Sorry, I misunderstood the question and answered "what happens if the
> system is sleeping and you press and quickly release a key before the
> system is fully running" - in which case the system wakes up and
> delivers both events.
> 
> For the question you asked, regarding a key being pressed as the
> system goes into suspend, the system wakes up from suspend
> immediately.
> I just tested this by holding the 'i' key down while going into suspend.
> While the key is being held down, the system receives repeated
> interrupts with data 0x17 ('i' key pressed). The system suspends, but
> then immediately wakes up, and continues the flood of 0x17 interrupts.
> When I release the key a few seconds later, a 0x97 ('i' released)
> interrupt arrives.

Is there any keys that are not autorepeating. For example regular
(non-OLPC) laptops usually do not repeat suspend and other special keys.
In fact, they quite often forget to send release events for them ;)

> 
> I guess the more accurate way to describe the wakeup condition is that
> it wakes up on any i8042 interrupt.
> 
> > I also wonder what will happen with non-PS2 devices...
> 
> Again - other devices are not affected.
> 
> > Instead of wiring it all through input core could we contain this in
> > atkbd and hgpk by registering pm_notifiers and ignoring certain requests
> > from input/serio cores during system state transition on OLPC only?
> 
> I'll look into this, thanks for the suggestion.

Much appreciated, sorry for being a pain.

-- 
Dmitry

^ permalink raw reply

* Re: [PATCH v4 1/2] Input: enable i8042-level wakeup control
From: Daniel Drake @ 2011-08-03 19:24 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: linux-pm, dilinger, linux-input
In-Reply-To: <20110803184332.GA17880@core.coreip.homeip.net>

On Wed, Aug 3, 2011 at 7:43 PM, Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
> No, I do not believe we can do what you are doing in input core. You are
> changing behavior for one platform and one (or 2 drivers) that will not
> be matched (at least I don't think so) by anything else out there. What
> will happen if you plug a random USB keyboard into OLPC box? Will it keep
> leds powered?

In all other cases, behaviour is unchanged. The input device is not
marked as wakeup capable, therefore the user cannot mark it as
wakeup-enabled, therefore this code does not execute.

So a USB keyboard on an OLPC laptop would behave exactly as it does
before this patch, as would devices on other systems.

The only way this functionality gets enabled is at the code level (as
it does in patch 2/2), and then once the user has requested wakeups on
all devices in the chain.

>> > Does the controller wakes up the system on key release or only press? My
>> > concern is with cases when we suspend with a key pressed and wake up
>> > with it already released.
>>
>> It wakes up on key press, but our EC buffers communication, so both
>> the key press and key release event would be delivered in the above
>> scenario.
>
> Just to confirm, we 3 events will be delivered in this case:

Sorry, I misunderstood the question and answered "what happens if the
system is sleeping and you press and quickly release a key before the
system is fully running" - in which case the system wakes up and
delivers both events.

For the question you asked, regarding a key being pressed as the
system goes into suspend, the system wakes up from suspend
immediately.
I just tested this by holding the 'i' key down while going into suspend.
While the key is being held down, the system receives repeated
interrupts with data 0x17 ('i' key pressed). The system suspends, but
then immediately wakes up, and continues the flood of 0x17 interrupts.
When I release the key a few seconds later, a 0x97 ('i' released)
interrupt arrives.

I guess the more accurate way to describe the wakeup condition is that
it wakes up on any i8042 interrupt.

> I also wonder what will happen with non-PS2 devices...

Again - other devices are not affected.

> Instead of wiring it all through input core could we contain this in
> atkbd and hgpk by registering pm_notifiers and ignoring certain requests
> from input/serio cores during system state transition on OLPC only?

I'll look into this, thanks for the suggestion.

Daniel

^ permalink raw reply

* Re: [PATCH v4 1/2] Input: enable i8042-level wakeup control
From: Dmitry Torokhov @ 2011-08-03 18:43 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-pm, dilinger, linux-input
In-Reply-To: <CAMLZHHTTXvThE5bWfyjJu-UFJbaKAhzORKWV8ay4RZW-ZVifLg@mail.gmail.com>

On Wed, Aug 03, 2011 at 09:04:56AM +0100, Daniel Drake wrote:
> On Wed, Aug 3, 2011 at 7:59 AM, Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
> > I am not sure that we should be marking input devices themselves as
> > wakeup capable - they are in no way physical devices. I'd stop at serio
> > level...
> 
> We need a way to make sure the device state is untouched during
> suspend/resume. Would you be happier if an input device looked at the
> device_may_wakeup() of its parent?

I think we should leave input core out of it completely, see below...

> 
> >>
> >> diff --git a/drivers/input/input.c b/drivers/input/input.c
> >> index da38d97..639aba7 100644
> >> --- a/drivers/input/input.c
> >> +++ b/drivers/input/input.c
> >> @@ -1588,6 +1588,9 @@ static int input_dev_suspend(struct device *dev)
> >>  {
> >>       struct input_dev *input_dev = to_input_dev(dev);
> >>
> >> +     if (device_may_wakeup(dev))
> >> +             return 0;
> >> +
> >
> > On suspend should we not try to shut off leds and sound?
> 
> Thats right. The setup is that the system appears to be running as
> normal, our display also remains on during suspend (and wifi, and so
> on). Bar the laptop's power LED which goes from always-on to flashing,
> the user is unaware that the laptop has suspended.

No, I do not believe we can do what you are doing in input core. You are
changing behavior for one platform and one (or 2 drivers) that will not
be matched (at least I don't think so) by anything else out there. What
will happen if you plug a random USB keyboard into OLPC box? Will it keep
leds powered?

> 
> >>       mutex_lock(&input_dev->mutex);
> >>
> >>       if (input_dev->users)
> >> @@ -1602,7 +1605,8 @@ static int input_dev_resume(struct device *dev)
> >>  {
> >>       struct input_dev *input_dev = to_input_dev(dev);
> >>
> >> -     input_reset_device(input_dev);
> >> +     if (!device_may_wakeup(dev))
> >> +             input_reset_device(input_dev);
> >>
> >
> > Does the controller wakes up the system on key release or only press? My
> > concern is with cases when we suspend with a key pressed and wake up
> > with it already released.
> 
> It wakes up on key press, but our EC buffers communication, so both
> the key press and key release event would be delivered in the above
> scenario.

Just to confirm, we 3 events will be delivered in this case:

1 - old key release
2 - wakeup key press
3 - wakeup key release

?

I also wonder what will happen with non-PS2 devices...

Instead of wiring it all through input core could we contain this in
atkbd and hgpk by registering pm_notifiers and ignoring certain requests
from input/serio cores during system state transition on OLPC only?

-- 
Dmitry

^ permalink raw reply

* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: Kevin Hilman @ 2011-08-03 18:33 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, Thomas Gleixner,
	linux-pm
In-Reply-To: <CAJ0PZbTMn_pzFBWbhtMrMo9ZGsnwMNKENgMSDa=oOZCrGh03Gg@mail.gmail.com>

MyungJoo Ham <myungjoo.ham@samsung.com> writes:

> On Wed, Aug 3, 2011 at 7:02 AM, Kevin Hilman <khilman@ti.com> wrote:

[...]

>> Maybe I'm not understanding the usage of it fully, but that seems like
>> hard-coding policy into the framework that might not be appropriate.
>> For example, what if there are other devices with constraints such that
>> they cannot currently scale frequency/voltage?
>>
>> Mabye MyungJoo can explain in more detail the usecases for tickle?
>
> Tickle is not for QoS between devices. It is for faster reaction to
> (human) user inputs at DVFS side where waiting for DVFS's reaction
> takes too much time and reducing polling interval costs too much. 

This is exactly what quality of service (QoS) is about.

The user (whether it's a human user input or another device) has low
quality and expects higher quality.  It wants to request better quality,
so it needs a way to request it.   

The proposed "tickle" approach proposed here is simply a "request max
frequency for duration X" QoS request.

Kevin

^ permalink raw reply

* Re: [RFC PATCH V1 3/7] cpuidle: stop using pm_idle
From: Len Brown @ 2011-08-03 17:45 UTC (permalink / raw)
  To: Trinabh Gupta; +Cc: linuxppc-dev, linux-pm, linux-kernel
In-Reply-To: <20110607162947.6848.79430.stgit@tringupt.in.ibm.com>

On Tue, 7 Jun 2011, Trinabh Gupta wrote:

> From: Len Brown <len.brown@intel.com>
> 
> pm_idle does not scale as an idle handler registration mechanism.
> Don't use it for cpuidle.  Instead, call cpuidle directly, and
> allow architectures to use pm_idle as an arch-specific default
> if they need it.  ie.
> 
> cpu_idle()
> 	...
> 	if(cpuidle_call_idle())

Looks like you forgot to correct my typo that you pointed out earlier,
s/cpuidle_call_idle/cpuidle_idle_call/

both in the comment here and for arm and sh below.

Thanks for including the From: above, that is correct form.
But note in the future that when you modify somebody else's patch,
you should append a note about what you changed,
and also add your signed-off-by, so we can
track the changes.

thanks,
-Len

> 		pm_idle();
> 
> cc: x86@kernel.org
> cc: Kevin Hilman <khilman@deeprootsystems.com>
> cc: Paul Mundt <lethal@linux-sh.org>
> Signed-off-by: Len Brown <len.brown@intel.com>
> 
> ---
> 
>  arch/arm/kernel/process.c    |    4 +++-
>  arch/sh/kernel/idle.c        |    6 ++++--
>  arch/x86/kernel/process_32.c |    4 +++-
>  arch/x86/kernel/process_64.c |    4 +++-
>  drivers/cpuidle/cpuidle.c    |   39 ++++++++++++++++++---------------------
>  include/linux/cpuidle.h      |    2 ++
>  6 files changed, 33 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
> index 5e1e541..d7ee0d4 100644
> --- a/arch/arm/kernel/process.c
> +++ b/arch/arm/kernel/process.c
> @@ -30,6 +30,7 @@
>  #include <linux/uaccess.h>
>  #include <linux/random.h>
>  #include <linux/hw_breakpoint.h>
> +#include <linux/cpuidle.h>
>  
>  #include <asm/cacheflush.h>
>  #include <asm/leds.h>
> @@ -196,7 +197,8 @@ void cpu_idle(void)
>  				cpu_relax();
>  			} else {
>  				stop_critical_timings();
> -				pm_idle();
> +				if (cpuidle_call_idle())
> +					pm_idle();
>  				start_critical_timings();
>  				/*
>  				 * This will eventually be removed - pm_idle
> diff --git a/arch/sh/kernel/idle.c b/arch/sh/kernel/idle.c
> index 425d604..9c7099e 100644
> --- a/arch/sh/kernel/idle.c
> +++ b/arch/sh/kernel/idle.c
> @@ -16,12 +16,13 @@
>  #include <linux/thread_info.h>
>  #include <linux/irqflags.h>
>  #include <linux/smp.h>
> +#include <linux/cpuidle.h>
>  #include <asm/pgalloc.h>
>  #include <asm/system.h>
>  #include <asm/atomic.h>
>  #include <asm/smp.h>
>  
> -void (*pm_idle)(void) = NULL;
> +static void (*pm_idle)(void);
>  
>  static int hlt_counter;
>  
> @@ -100,7 +101,8 @@ void cpu_idle(void)
>  			local_irq_disable();
>  			/* Don't trace irqs off for idle */
>  			stop_critical_timings();
> -			pm_idle();
> +			if (cpuidle_call_idle())
> +				pm_idle();
>  			/*
>  			 * Sanity check to ensure that pm_idle() returns
>  			 * with IRQs enabled
> diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
> index 8d12878..61fadbe 100644
> --- a/arch/x86/kernel/process_32.c
> +++ b/arch/x86/kernel/process_32.c
> @@ -38,6 +38,7 @@
>  #include <linux/uaccess.h>
>  #include <linux/io.h>
>  #include <linux/kdebug.h>
> +#include <linux/cpuidle.h>
>  
>  #include <asm/pgtable.h>
>  #include <asm/system.h>
> @@ -109,7 +110,8 @@ void cpu_idle(void)
>  			local_irq_disable();
>  			/* Don't trace irqs off for idle */
>  			stop_critical_timings();
> -			pm_idle();
> +			if (cpuidle_idle_call())
> +				pm_idle();
>  			start_critical_timings();
>  		}
>  		tick_nohz_restart_sched_tick();
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index 6c9dd92..62c219a 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -37,6 +37,7 @@
>  #include <linux/uaccess.h>
>  #include <linux/io.h>
>  #include <linux/ftrace.h>
> +#include <linux/cpuidle.h>
>  
>  #include <asm/pgtable.h>
>  #include <asm/system.h>
> @@ -136,7 +137,8 @@ void cpu_idle(void)
>  			enter_idle();
>  			/* Don't trace irqs off for idle */
>  			stop_critical_timings();
> -			pm_idle();
> +			if (cpuidle_idle_call())
> +				pm_idle();
>  			start_critical_timings();
>  
>  			/* In many cases the interrupt that ended idle
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 8d7303b..304e378 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -25,10 +25,10 @@ DEFINE_PER_CPU(struct cpuidle_device *, cpuidle_devices);
>  
>  DEFINE_MUTEX(cpuidle_lock);
>  LIST_HEAD(cpuidle_detected_devices);
> -static void (*pm_idle_old)(void);
>  
>  static int enabled_devices;
>  static int off __read_mostly;
> +static int initialized __read_mostly;
>  
>  int cpuidle_disabled(void)
>  {
> @@ -56,27 +56,24 @@ static int __cpuidle_register_device(struct cpuidle_device *dev);
>   * cpuidle_idle_call - the main idle loop
>   *
>   * NOTE: no locks or semaphores should be used here
> + * return non-zero on failure
>   */
> -static void cpuidle_idle_call(void)
> +int cpuidle_idle_call(void)
>  {
>  	struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices);
>  	struct cpuidle_driver *drv = cpuidle_get_driver();
>  	struct cpuidle_state *target_state;
>  	int next_state, entered_state;
>  
> -	/* check if the device is ready */
> -	if (!dev || !dev->enabled) {
> -		if (pm_idle_old)
> -			pm_idle_old();
> -		else
> -#if defined(CONFIG_ARCH_HAS_DEFAULT_IDLE)
> -			default_idle();
> -#else
> -			local_irq_enable();
> -#endif
> -		return;
> -	}
> +	if (off)
> +		return -ENODEV;
> +
> +	if (!initialized)
> +		return -ENODEV;
>  
> +	/* check if the device is ready */
> +	if (!dev || !dev->enabled)
> +		return -EBUSY;
>  #if 0
>  	/* shows regressions, re-enable for 2.6.29 */
>  	/*
> @@ -90,7 +87,7 @@ static void cpuidle_idle_call(void)
>  	next_state = cpuidle_curr_governor->select(drv, dev);
>  	if (need_resched()) {
>  		local_irq_enable();
> -		return;
> +		return 0;
>  	}
>  
>  	target_state = &drv->states[next_state];
> @@ -116,6 +113,8 @@ static void cpuidle_idle_call(void)
>  	/* give the governor an opportunity to reflect on the outcome */
>  	if (cpuidle_curr_governor->reflect)
>  		cpuidle_curr_governor->reflect(dev, entered_state);
> +
> +	return 0;
>  }
>  
>  /**
> @@ -123,10 +122,10 @@ static void cpuidle_idle_call(void)
>   */
>  void cpuidle_install_idle_handler(void)
>  {
> -	if (enabled_devices && (pm_idle != cpuidle_idle_call)) {
> +	if (enabled_devices) {
>  		/* Make sure all changes finished before we switch to new idle */
>  		smp_wmb();
> -		pm_idle = cpuidle_idle_call;
> +		initialized = 1;
>  	}
>  }
>  
> @@ -135,8 +134,8 @@ void cpuidle_install_idle_handler(void)
>   */
>  void cpuidle_uninstall_idle_handler(void)
>  {
> -	if (enabled_devices && pm_idle_old && (pm_idle != pm_idle_old)) {
> -		pm_idle = pm_idle_old;
> +	if (enabled_devices) {
> +		initialized = 0;
>  		cpuidle_kick_cpus();
>  	}
>  }
> @@ -410,8 +409,6 @@ static int __init cpuidle_init(void)
>  	if (cpuidle_disabled())
>  		return -ENODEV;
>  
> -	pm_idle_old = pm_idle;
> -
>  	ret = cpuidle_add_class_sysfs(&cpu_sysdev_class);
>  	if (ret)
>  		return ret;
> diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
> index 2786787..c904188 100644
> --- a/include/linux/cpuidle.h
> +++ b/include/linux/cpuidle.h
> @@ -128,6 +128,7 @@ struct cpuidle_driver {
>  
>  #ifdef CONFIG_CPU_IDLE
>  extern void disable_cpuidle(void);
> +extern int cpuidle_idle_call(void);
>  
>  extern int cpuidle_register_driver(struct cpuidle_driver *drv);
>  struct cpuidle_driver *cpuidle_get_driver(void);
> @@ -142,6 +143,7 @@ extern void cpuidle_disable_device(struct cpuidle_device *dev);
>  
>  #else
>  static inline void disable_cpuidle(void) { }
> +static inline int cpuidle_idle_call(void) { return -ENODEV; }
>  
>  static inline int cpuidle_register_driver(struct cpuidle_driver *drv)
>  {return -ENODEV; }
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

^ permalink raw reply

* Re: [PATCH] PM: add statistics sysfs file for suspend to ram
From: Pavel Machek @ 2011-08-03 17:37 UTC (permalink / raw)
  To: Greg KH; +Cc: Liu, ShuoX, linux-pm@lists.linux-foundation.org, Brown, Len
In-Reply-To: <20110804051745.GB12336@suse.de>



> > +   last_index = suspend_stats.last_failed + REC_FAILED_DEV_NUM - 1;
> > +   last_index %= REC_FAILED_DEV_NUM;
> > +   s += sprintf(s, "%s: %d\n%s: %d\n%s: %d\n%s: %d\n"
> > +           "%s: %d\n%s: %d\n%s: %d\n%s: %d\n",
> > +           "success", suspend_stats.success,
> > +           "fail", suspend_stats.fail,
> > +           "failed_freeze", suspend_stats.failed_freeze,
> > +           "failed_prepare", suspend_stats.failed_prepare,
> > +           "failed_suspend", suspend_stats.failed_suspend,
> > +           "failed_suspend_noirq",
> > +               suspend_stats.failed_suspend_noirq,
> > +           "failed_resume", suspend_stats.failed_resume,
> > +           "failed_resume_noirq",
> > +               suspend_stats.failed_resume_noirq);
> > +   s += sprintf(s, "failed_devs:\n  last_failed:\t%s\n",
> > +           suspend_stats.failed_devs[last_index]);
> > +   for (i = 1; i < REC_FAILED_DEV_NUM; i++) {
> > +       index = last_index + REC_FAILED_DEV_NUM - i;
> > +       index %= REC_FAILED_DEV_NUM;
> > +       s += sprintf(s, "\t\t%s\n",
> > +           suspend_stats.failed_devs[index]);
> > +   }
> 
> And, to top it all of, this is NOT allowed in sysfs at all.
> 
> Remember, sysfs is one text field per file.  Not something huge like
> this.
> 
> Perhaps you should use debugfs instead?

Or perhaps dmesg. NAK.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply

* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: Turquette, Mike @ 2011-08-03 17:31 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, Thomas Gleixner,
	linux-pm
In-Reply-To: <CAJ0PZbTMn_pzFBWbhtMrMo9ZGsnwMNKENgMSDa=oOZCrGh03Gg@mail.gmail.com>

On Wed, Aug 3, 2011 at 12:03 AM, MyungJoo Ham <myungjoo.ham@samsung.com> wrote:
> On Wed, Aug 3, 2011 at 7:02 AM, Kevin Hilman <khilman@ti.com> wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>>
>>> On Friday, July 15, 2011, MyungJoo Ham wrote:
>>>> For a usage example, please look at
>>>> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>>>>
>>>> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
>>>> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
>>>> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
>>>> and other related clocks simply follow the determined DDR RAM clock.
>>>>
>>>> The DEVFREQ driver for Exynos4210 memory bus is at
>>>> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
>>>>
>>>> MyungJoo Ham (3):
>>>>   PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>>>>     OPPs
>>>>   PM / DEVFREQ: add example governors
>>>>   PM / DEVFREQ: add sysfs interface (including user tickling)
>>>
>>> OK, I'm going to take the patches for 3.2.
>>
>> Sorry for being late to the discussion, but personally I don't think
>> this should be merged for v3.2.
>>
>> First, I think the governor part of this series is basically fine, and
>> can see some cases where it would be useful, but as Mike has pointed
>> out, there is still a majority of devices for which a governor like this
>> would be overkill.
>>
>> My main problem is with the QoS aspects.  There is significant overlap
>> between this approach and the per-device PM QoS approach currently being
>> proposed by Jean Pihet, and I think any sort of per-device DVFS should
>> be built on top of a more generic per-device QoS layer (such as Jean's.)
>
> Then, what about adding a governor? Besides, DEVFREQ is designed to

No one is saying that you can't have a governor to control DVFS
transitions for your device.  The point is that the governor should
make a call to the per-device PM QoS DVFS API in it's ->target()
function.

The fundamental difference is this: you want devfreq to be the final
place where DVFS transitions actually occur (meaning clocks and
voltage rails get scaled).  Kevin and I are both arguing for devfreq
to be policy built on top of a dedicated DVFS API for managing those
transitions.

(Kevin, let me know if I'm putting words in your mouth)

> allow each device to create its own governors. The governors in the
> devfreq.c are meant to be used commonly by "many" devices.
>
> For PM QoS, we may have several approaches:
> 1. Adding a governor aware of PM QoS taking PM QoS parameters.
> 2. Modifying OPP to be aware of PM QoS so that only OPPs meeting the
> requirement are enabled if the PM QoS requirement specifies frequency
> or voltage conditions explicitly.
> 3. Modifying DEVFREQ to be aware of PM QoS and let DEVFREQ ignore
> frequencies that do not satisfy the PM QoS requirement. As in approach
> 2, the requirement also needs to be explicit about frequency / voltage
> conditions or we need to interpret devfreq_dev_status result in terms
> of PM QoS.
>
>>
>> This series currently provides a *very* basic QoS mechanism (e.g. fixed
>> duration frequency constraint) in the form of "tickle", which BTW I seem
>> to having a hard time understanding (more on that below...)
>>
>> More importantly though, this series also introduces a sysfs layer for
>> doing its QoS-like stuff, so adding this and then adding a more generic
>> per-device QoS is asking for confusion about how userspace is to do QoS.
>> And adding a sysfs interface may be turn out to be difficult to remove.
>>
>> Basically, without a more general constraints mechanism in place, I
>> don't see how this can be generally useful since there are too many
>> assumptions made with the current "tickle" approach, and as Mike has
>> pointed out, it cannot cleanly handle cases where there might be
>> multiple DVFS-related constraints on a given device.
>>
>> OK, back to "tickle"...  I haven't yet fully understood how that
>> interface is intended to be used, or who the potential users might be
>> and it is not documented in the code or changelog.  I also didn't see
>> any users of that API (except the sysfs code.)
>>
>> IIUC, tickle is just basically a way to set a frequency constraint on a
>> device for a fixed duration.  However, if tickle has been requested, any
>> OPP change will also force a change to the highest performance OPP
>> temporarily before changing to the target OPP.
>>
>> Maybe I'm not understanding the usage of it fully, but that seems like
>> hard-coding policy into the framework that might not be appropriate.
>> For example, what if there are other devices with constraints such that
>> they cannot currently scale frequency/voltage?
>>
>> Mabye MyungJoo can explain in more detail the usecases for tickle?
>
> Tickle is not for QoS between devices. It is for faster reaction to
> (human) user inputs at DVFS side where waiting for DVFS's reaction
> takes too much time and reducing polling interval costs too much. In
> fact, this tickling method was quite effective with CPUFREQ's ondemand
> governor (not upstreamed). We may tune DVFS constants to let it react
> faster with lower threshold; however, this results in higher power
> consumption with small load.

Do you think that your "tickle" method would go upstream for CPUfreq?
devfreq is analogous to CPUfreq is almost every way, so if that
solution would not be accepted for CPUfreq, then it should not be
accepted for devfreq either.

> Here goes more detailed description about the issue intended to be
> tackled by tickling, the response time of GUI. With DVFS (CPUFREQ), we
> have been suffering from slower user response time (e.g., dragging
> touch screen of mobile devices at "application drawer" or "web
> browsers"). Let's assume a system with a CPU that runs at the range of
> 100MHz ~ 2GHz and a GUI that requires at least1.5GHz for smooth
> transitions. If we are going to use 60Hz based display and 20ms
> CPUFREQ polling interval, a sudden user input requires 30ms of delay
> in average to get CPU to work at high speed, which loses 2 frames
> (often, it takes more time/frames for CPUFREQ to react; I don't know
> why). If a user repeatedly drags and stops the touch screen, such
> delay and lost frames become very noticeable and the screen follows
> user's finger retarded. With CPUFREQ-ondemand, tickling has been made
> such test to have almost same result with that of "performance"
> governor.

This can be achieved entirely with a DVFS framework that devices can
use to scale the CPU's frequency.  It requires no changes to CPUfreq
core or to the ondemand governor.  The CPUfreq driver makes a call to
the DVFS scaling API in it's ->target function, as do any other
devices which needs to hold a constraint against the CPU's clock speed
(in your case it sounds like the touchscreen driver would hold this
constraint).  Again, the "tickle" concept here is made redundant.

Regards,
Mike

> For the sysfs interface... we actually do not need sysfs interface for
> tickling if we are going to allow tickle function to the interrupt
> handler of HCI devices (touch screen, keyboards, mice, ...). However,
> it seems that many people don't like that idea; thus, I thought I need
> to allow userspace user-input handlers to tickle.
>
>>
>> Thanks,
>>
>> Kevin
>> _______________________________________________
>> linux-pm mailing list
>> linux-pm@lists.linux-foundation.org
>> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>>
>
>
>
> --
> MyungJoo Ham (함명주), Ph.D.
> Mobile Software Platform Lab,
> Digital Media and Communications (DMC) Business
> Samsung Electronics
> cell: 82-10-6714-2858
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

^ permalink raw reply

* Re: [RFC][PATCH] PM / Freezer: Freeze filesystems along with freezing processes (was: Re: PM / hibernate xfs lock up / xfs_reclaim_inodes_ag)
From: Pavel Machek @ 2011-08-03 17:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Christoph, Dave Chinner, LKML, xfs, Christoph Hellwig,
	Linux PM mailing list
In-Reply-To: <201108032315.06012.rjw@sisk.pl>

Hi!

> Freeze all filesystems during the freezing of tasks by calling
> freeze_bdev() for each of them and thaw them during the thawing
> of tasks with the help of thaw_bdev().
> 
> This is needed by hibernation, because some filesystems (e.g. XFS)
> deadlock with the preallocation of memory used by it if the memory
> pressure caused by it is too heavy.
> 
> The additional benefit of this change is that, if something goes
> wrong after filesystems have been frozen, they will stay in a
> consistent state and journal replays won't be necessary (e.g. after
> a failing suspend or resume).  In particular, this should help to
> solve a long-standing issue that in some cases during resume from
> hibernation the boot loader causes the journal to be replied for the
> filesystem containing the kernel image and initrd causing it to
> become inconsistent with the information stored in the hibernation
> image.

> +/**
> + * freeze_filesystems - Force all filesystems into a consistent state.
> + */
> +void freeze_filesystems(void)
> +{
> +	struct super_block *sb;
> +
> +	lockdep_off();

Ouch. So... why do we need to silence this?

> +	/*
> +	 * Freeze in reverse order so filesystems dependant upon others are
> +	 * frozen in the right order (eg. loopback on ext3).
> +	 */
> +	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
> +		if (!sb->s_root || !sb->s_bdev ||
> +		    (sb->s_frozen == SB_FREEZE_TRANS) ||
> +		    (sb->s_flags & MS_RDONLY) ||
> +		    (sb->s_flags & MS_FROZEN))
> +			continue;

Should we stop NFS from modifying remote server, too?

Plus... ext3 writes to read-only filesystems on mount; not sure if it
does it later. But RDONLY means 'user cant write to it' not 'bdev will
not be modified'. Should we freeze all?

How can 'already frozen' happen?

> +	list_for_each_entry(sb, &super_blocks, s_list)
> +		if (sb->s_flags & MS_FROZEN) {
> +			sb->s_flags &= ~MS_FROZEN;
> +			thaw_bdev(sb->s_bdev, sb);
> +		}

...because we'll unfreeze it even if we did not freeze it...

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply

* Re: [PATCH v2] mrst_pmu: driver for Intel Moorestown Power Management Unit
From: Alan Cox @ 2011-08-03 12:48 UTC (permalink / raw)
  To: Len Brown; +Cc: linux-pm, x86, linux-kernel, Jesse Barnes
In-Reply-To: <alpine.LFD.2.02.1108021916430.25237@x980>

> This driver is essentially platform-speciif idle code.
> It has been in linux-next via my idle-test tree,
> and I can push it via that tree route w/ appropriate acks.

Looks good to me

Acked-by: Alan Cox <alan@linux.intel.com>

^ permalink raw reply

* Re: [PATCH v4 1/2] Input: enable i8042-level wakeup control
From: Wanlong Gao @ 2011-08-03  8:12 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: linux-pm, dilinger, Daniel Drake, linux-input
In-Reply-To: <20110803065920.GC23399@core.coreip.homeip.net>

On 08/03/2011 02:59 PM, Dmitry Torokhov wrote:
> Hi Daniel,
>
> On Tue, Aug 02, 2011 at 04:49:15PM +0100, Daniel Drake wrote:
>> The OLPC XO laptop is able to use the PS/2 controller as a wakeup source.
>> When used as a wakeup source, the key press/mouse motion must not be lost.
>>
>> Add infrastructure to allow controllers to be marked as wakeup-capable,
>> and avoid doing power control/reset on the i8042/serio/input devices when
>> running in this mode. For systems where this functionality is available,
>> you are expected to enable wakeups on the i8042 device, the serio
>> devices, and the relevant input devices, to ensure that the hardware is
>> left powered and untouched throughout the suspend/resume.
>>
>> Signed-off-by: Daniel Drake<dsd@laptop.org>
>> ---
>>   drivers/input/input.c                 |    6 +++-
>>   drivers/input/keyboard/atkbd.c        |    4 ++-
>>   drivers/input/mouse/hgpk.c            |    2 +
>>   drivers/input/mouse/psmouse-base.c    |    4 ++-
>>   drivers/input/serio/i8042-io.h        |    4 ++
>>   drivers/input/serio/i8042-ip22io.h    |    4 ++
>>   drivers/input/serio/i8042-jazzio.h    |    4 ++
>>   drivers/input/serio/i8042-ppcio.h     |    4 ++
>>   drivers/input/serio/i8042-snirm.h     |    4 ++
>>   drivers/input/serio/i8042-sparcio.h   |    4 ++
>>   drivers/input/serio/i8042-x86ia64io.h |    4 ++
>>   drivers/input/serio/i8042.c           |   62 +++++++++++++++++++++++++++++---
>>   drivers/input/serio/serio.c           |   29 +++++++++++++--
>>   13 files changed, 122 insertions(+), 13 deletions(-)
>>

>>   	serio->port_data	= port;
>> -	serio->dev.parent	=&i8042_platform_device->dev;
>> +	serio->dev.parent	= parent;
>> +	device_set_wakeup_capable(&serio->dev, device_can_wakeup(parent));
>>   	if (idx<  0) {
>>   		strlcpy(serio->name, "i8042 AUX port", sizeof(serio->name));
>>   		strlcpy(serio->phys, I8042_AUX_PHYS_DESC, sizeof(serio->phys));
>> @@ -1403,6 +1450,9 @@ static int __init i8042_probe(struct platform_device *dev)
>>   		i8042_dritek_enable();
>>   #endif
>>
>> +	if (i8042_enable_wakeup)
>> +		device_set_wakeup_capable(&dev->dev, true);
>> +
>
> 	device_set_wakeup_capable(&dev->dev, i8042_enable_wakeup); ?
>
Hi Daniel:
	May be you missed this comment by Dmitry?

Thanks
Best Regards
Wanlong Gao

^ permalink raw reply

* Re: [PATCH v4 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
From: MyungJoo Ham @ 2011-08-03  8:06 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Kyungmin Park, Len Brown, linux-pm, Greg Kroah-Hartman,
	Thomas Gleixner
In-Reply-To: <87mxfrire4.fsf@ti.com>

On Wed, Aug 3, 2011 at 3:45 AM, Kevin Hilman <khilman@ti.com> wrote:
> MyungJoo Ham <myungjoo.ham@samsung.com> writes:
>
>> With OPPs, a device may have multiple operable frequency and voltage
>> sets. However, there can be multiple possible operable sets and a system
>> will need to choose one from them. In order to reduce the power
>> consumption (by reducing frequency and voltage) without affecting the
>> performance too much, a Dynamic Voltage and Frequency Scaling (DVFS)
>> scheme may be used.
>>
>> This patch introduces the DVFS capability to non-CPU devices with OPPs.
>> DVFS is a techique whereby the frequency and supplied voltage of a
>> device is adjusted on-the-fly. DVFS usually sets the frequency as low
>> as possible with given conditions (such as QoS assurance) and adjusts
>> voltage according to the chosen frequency in order to reduce power
>> consumption and heat dissipation.
>>
>> The generic DVFS for devices, DEVFREQ, may appear quite similar with
>> /drivers/cpufreq.  However, CPUFREQ does not allow to have multiple
>> devices registered and is not suitable to have multiple heterogenous
>> devices with different (but simple) governors.
>>
>> Normally, DVFS mechanism controls frequency based on the demand for
>> the device, and then, chooses voltage based on the chosen frequency.
>> DEVFREQ also controls the frequency based on the governor's frequency
>> recommendation and let OPP pick up the pair of frequency and voltage
>> based on the recommended frequency. Then, the chosen OPP is passed to
>> device driver's "target" callback.
>>
>> Tested with memory bus of Exynos4-NURI board.
>>
>> The test code with board support for Exynos4-NURI is at
>> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>>
>> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>
>> --
>> Thank you for your valuable comments, Rafael, Greg, Pavel, and Colin.
>>
>> Changed from v3
>> - In kerneldoc comments, DEVFREQ has ben replaced by devfreq
>
> FYI... there are still lots of kerneldoc comments in this version with
> DEVFREQ instead of devfreq, particularily in devfreq.h.
>
> Kevin
> _______________________________________________

Ah.. I'll correct them.

Besides, I've found that the private data for governor is better
located at struct devfreq than at struct devfreq_governor as such
private data is "per-device" and the same governor (with same memory
location) may be shared between different devices.


Cheers!
MyungJoo

-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

^ permalink raw reply

* Re: [PATCH v4 1/2] Input: enable i8042-level wakeup control
From: Daniel Drake @ 2011-08-03  8:04 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: linux-pm, dilinger, linux-input
In-Reply-To: <20110803065920.GC23399@core.coreip.homeip.net>

On Wed, Aug 3, 2011 at 7:59 AM, Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
> I am not sure that we should be marking input devices themselves as
> wakeup capable - they are in no way physical devices. I'd stop at serio
> level...

We need a way to make sure the device state is untouched during
suspend/resume. Would you be happier if an input device looked at the
device_may_wakeup() of its parent?

>>
>> diff --git a/drivers/input/input.c b/drivers/input/input.c
>> index da38d97..639aba7 100644
>> --- a/drivers/input/input.c
>> +++ b/drivers/input/input.c
>> @@ -1588,6 +1588,9 @@ static int input_dev_suspend(struct device *dev)
>>  {
>>       struct input_dev *input_dev = to_input_dev(dev);
>>
>> +     if (device_may_wakeup(dev))
>> +             return 0;
>> +
>
> On suspend should we not try to shut off leds and sound?

Thats right. The setup is that the system appears to be running as
normal, our display also remains on during suspend (and wifi, and so
on). Bar the laptop's power LED which goes from always-on to flashing,
the user is unaware that the laptop has suspended.

>>       mutex_lock(&input_dev->mutex);
>>
>>       if (input_dev->users)
>> @@ -1602,7 +1605,8 @@ static int input_dev_resume(struct device *dev)
>>  {
>>       struct input_dev *input_dev = to_input_dev(dev);
>>
>> -     input_reset_device(input_dev);
>> +     if (!device_may_wakeup(dev))
>> +             input_reset_device(input_dev);
>>
>
> Does the controller wakes up the system on key release or only press? My
> concern is with cases when we suspend with a key pressed and wake up
> with it already released.

It wakes up on key press, but our EC buffers communication, so both
the key press and key release event would be delivered in the above
scenario.

>> @@ -87,6 +87,7 @@ MODULE_PARM_DESC(debug, "Turn i8042 debugging mode on and off");
>>  #endif
>>
>>  static bool i8042_bypass_aux_irq_test;
>> +static bool i8042_enable_wakeup;
>>
>>  #include "i8042.h"
>>
>> @@ -1082,10 +1083,17 @@ static void i8042_dritek_enable(void)
>>   * before suspending.
>>   */
>>
>> -static int i8042_controller_resume(bool force_reset)
>> +static int i8042_controller_resume(bool force_reset, bool soft_resume)
>>  {
>>       int error;
>>
>> +     /*
>> +      * If device is selected as a wakeup source, it was not powered down
>> +      * or reset during suspend, so we have very little to do.
>> +      */
>> +     if (soft_resume)
>> +             goto soft;
>
> Maybe instead of adding a parameter simply not call the function and
> move i8042_interrupt as well?

OK

>>  static int i8042_pm_resume(struct device *dev)
>>  {
>>       /*
>>        * On resume from S2R we always try to reset the controller
>>        * to bring it in a sane state. (In case of S2D we expect
>>        * BIOS to reset the controller for us.)
>> +      * This function call will also handle any pending keypress event
>> +      * (perhaps the system wakeup reason)
>> +      */
>> +     int r = i8042_controller_resume(true, device_may_wakeup(dev));
>> +
>> +     /* If the device was left running during suspend, enable IRQs again
>> +      * now. Must be done last to avoid races with interrupt processing
>> +      * inside i8042_controller_resume.
>>        */
>
> Hmm, we have proper locking so I am not sure how true this statement is.

What happens is that the event used to wake up the system is not
delivered as an interrupt (consistent with the fact that the resume
handler already calls i8042_interrupt() manually). However, there is a
race window. If the system was woken up with a key press, and then the
key was released during early resume, the key press event will not be
presented as an interrupt, but the key release event will. That
interrupt can arrive before i8042_controller_resume() reaches
i8042_interrupt(), which would result in those 2 events being
processed in the wrong order. I'll make this clearer in the comment.

Thanks for the fast review comments!
Daniel

^ permalink raw reply

* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: MyungJoo Ham @ 2011-08-03  7:03 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, Thomas Gleixner,
	linux-pm
In-Reply-To: <87livba2w6.fsf@ti.com>

On Wed, Aug 3, 2011 at 7:02 AM, Kevin Hilman <khilman@ti.com> wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>
>> On Friday, July 15, 2011, MyungJoo Ham wrote:
>>> For a usage example, please look at
>>> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>>>
>>> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
>>> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
>>> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
>>> and other related clocks simply follow the determined DDR RAM clock.
>>>
>>> The DEVFREQ driver for Exynos4210 memory bus is at
>>> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
>>>
>>> MyungJoo Ham (3):
>>>   PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>>>     OPPs
>>>   PM / DEVFREQ: add example governors
>>>   PM / DEVFREQ: add sysfs interface (including user tickling)
>>
>> OK, I'm going to take the patches for 3.2.
>
> Sorry for being late to the discussion, but personally I don't think
> this should be merged for v3.2.
>
> First, I think the governor part of this series is basically fine, and
> can see some cases where it would be useful, but as Mike has pointed
> out, there is still a majority of devices for which a governor like this
> would be overkill.
>
> My main problem is with the QoS aspects.  There is significant overlap
> between this approach and the per-device PM QoS approach currently being
> proposed by Jean Pihet, and I think any sort of per-device DVFS should
> be built on top of a more generic per-device QoS layer (such as Jean's.)

Then, what about adding a governor? Besides, DEVFREQ is designed to
allow each device to create its own governors. The governors in the
devfreq.c are meant to be used commonly by "many" devices.

For PM QoS, we may have several approaches:
1. Adding a governor aware of PM QoS taking PM QoS parameters.
2. Modifying OPP to be aware of PM QoS so that only OPPs meeting the
requirement are enabled if the PM QoS requirement specifies frequency
or voltage conditions explicitly.
3. Modifying DEVFREQ to be aware of PM QoS and let DEVFREQ ignore
frequencies that do not satisfy the PM QoS requirement. As in approach
2, the requirement also needs to be explicit about frequency / voltage
conditions or we need to interpret devfreq_dev_status result in terms
of PM QoS.

>
> This series currently provides a *very* basic QoS mechanism (e.g. fixed
> duration frequency constraint) in the form of "tickle", which BTW I seem
> to having a hard time understanding (more on that below...)
>
> More importantly though, this series also introduces a sysfs layer for
> doing its QoS-like stuff, so adding this and then adding a more generic
> per-device QoS is asking for confusion about how userspace is to do QoS.
> And adding a sysfs interface may be turn out to be difficult to remove.
>
> Basically, without a more general constraints mechanism in place, I
> don't see how this can be generally useful since there are too many
> assumptions made with the current "tickle" approach, and as Mike has
> pointed out, it cannot cleanly handle cases where there might be
> multiple DVFS-related constraints on a given device.
>
> OK, back to "tickle"...  I haven't yet fully understood how that
> interface is intended to be used, or who the potential users might be
> and it is not documented in the code or changelog.  I also didn't see
> any users of that API (except the sysfs code.)
>
> IIUC, tickle is just basically a way to set a frequency constraint on a
> device for a fixed duration.  However, if tickle has been requested, any
> OPP change will also force a change to the highest performance OPP
> temporarily before changing to the target OPP.
>
> Maybe I'm not understanding the usage of it fully, but that seems like
> hard-coding policy into the framework that might not be appropriate.
> For example, what if there are other devices with constraints such that
> they cannot currently scale frequency/voltage?
>
> Mabye MyungJoo can explain in more detail the usecases for tickle?

Tickle is not for QoS between devices. It is for faster reaction to
(human) user inputs at DVFS side where waiting for DVFS's reaction
takes too much time and reducing polling interval costs too much. In
fact, this tickling method was quite effective with CPUFREQ's ondemand
governor (not upstreamed). We may tune DVFS constants to let it react
faster with lower threshold; however, this results in higher power
consumption with small load.


Here goes more detailed description about the issue intended to be
tackled by tickling, the response time of GUI. With DVFS (CPUFREQ), we
have been suffering from slower user response time (e.g., dragging
touch screen of mobile devices at "application drawer" or "web
browsers"). Let's assume a system with a CPU that runs at the range of
100MHz ~ 2GHz and a GUI that requires at least1.5GHz for smooth
transitions. If we are going to use 60Hz based display and 20ms
CPUFREQ polling interval, a sudden user input requires 30ms of delay
in average to get CPU to work at high speed, which loses 2 frames
(often, it takes more time/frames for CPUFREQ to react; I don't know
why). If a user repeatedly drags and stops the touch screen, such
delay and lost frames become very noticeable and the screen follows
user's finger retarded. With CPUFREQ-ondemand, tickling has been made
such test to have almost same result with that of "performance"
governor.


For the sysfs interface... we actually do not need sysfs interface for
tickling if we are going to allow tickle function to the interrupt
handler of HCI devices (touch screen, keyboards, mice, ...). However,
it seems that many people don't like that idea; thus, I thought I need
to allow userspace user-input handlers to tickle.

>
> Thanks,
>
> Kevin
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>



-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

^ permalink raw reply

* Re: [PATCH v4 1/2] Input: enable i8042-level wakeup control
From: Dmitry Torokhov @ 2011-08-03  6:59 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-pm, dilinger, linux-input
In-Reply-To: <20110802154916.624619D401C@zog.reactivated.net>

Hi Daniel,

On Tue, Aug 02, 2011 at 04:49:15PM +0100, Daniel Drake wrote:
> The OLPC XO laptop is able to use the PS/2 controller as a wakeup source.
> When used as a wakeup source, the key press/mouse motion must not be lost.
> 
> Add infrastructure to allow controllers to be marked as wakeup-capable,
> and avoid doing power control/reset on the i8042/serio/input devices when
> running in this mode. For systems where this functionality is available,
> you are expected to enable wakeups on the i8042 device, the serio
> devices, and the relevant input devices, to ensure that the hardware is
> left powered and untouched throughout the suspend/resume.
> 
> Signed-off-by: Daniel Drake <dsd@laptop.org>
> ---
>  drivers/input/input.c                 |    6 +++-
>  drivers/input/keyboard/atkbd.c        |    4 ++-
>  drivers/input/mouse/hgpk.c            |    2 +
>  drivers/input/mouse/psmouse-base.c    |    4 ++-
>  drivers/input/serio/i8042-io.h        |    4 ++
>  drivers/input/serio/i8042-ip22io.h    |    4 ++
>  drivers/input/serio/i8042-jazzio.h    |    4 ++
>  drivers/input/serio/i8042-ppcio.h     |    4 ++
>  drivers/input/serio/i8042-snirm.h     |    4 ++
>  drivers/input/serio/i8042-sparcio.h   |    4 ++
>  drivers/input/serio/i8042-x86ia64io.h |    4 ++
>  drivers/input/serio/i8042.c           |   62 +++++++++++++++++++++++++++++---
>  drivers/input/serio/serio.c           |   29 +++++++++++++--
>  13 files changed, 122 insertions(+), 13 deletions(-)
> 
> On original submission, Dmitry was worried about this functionality not
> working at all on other platforms. I agree, it will only work where the
> hardware has been specifically designed with this consideration. v2 of
> the patch therefore removes the module param option, meaning that it
> will only be activated on platforms that explicitly enable it at the
> code level.
> 
> v2 also performs a more extensive job. We avoid resetting the device
> at the input_device level during suspend/resume. We also disable i8042
> interrupts when going into suspend, to avoid races handling interrupts
> in the wrong order during resume.
> 
> v3 includes a cleanup suggested by Rafael, and explains the corner case
> that we have to handle in the input layer.
> 
> v4 changes direction a little: each layer now just looks at the wakeup
> capability of its own device, avoiding some of the earlier tree
> traversal. Userspace must now enable wakeups on 5 devices:
> 	i8042
> 	i8042/serio0
> 	i8042/serio0/input*
> 	i8042/serio1
> 	i8042/serio1/input*
> This matches the behaviour of USB devices, where both the device and the
> parent controller must have wakeups enabled for wakeup functionality
> to work. Suggested by Rafael J. Wysocki.

I am not sure that we should be marking input devices themselves as
wakeup capable - they are in no way physical devices. I'd stop at serio
level...

> 
> diff --git a/drivers/input/input.c b/drivers/input/input.c
> index da38d97..639aba7 100644
> --- a/drivers/input/input.c
> +++ b/drivers/input/input.c
> @@ -1588,6 +1588,9 @@ static int input_dev_suspend(struct device *dev)
>  {
>  	struct input_dev *input_dev = to_input_dev(dev);
>  
> +	if (device_may_wakeup(dev))
> +		return 0;
> +

On suspend should we not try to shut off leds and sound?

>  	mutex_lock(&input_dev->mutex);
>  
>  	if (input_dev->users)
> @@ -1602,7 +1605,8 @@ static int input_dev_resume(struct device *dev)
>  {
>  	struct input_dev *input_dev = to_input_dev(dev);
>  
> -	input_reset_device(input_dev);
> +	if (!device_may_wakeup(dev))
> +		input_reset_device(input_dev);
>  

Does the controller wakes up the system on key release or only press? My
concern is with cases when we suspend with a key pressed and wake up
with it already released.

>  	return 0;
>  }
> diff --git a/drivers/input/keyboard/atkbd.c b/drivers/input/keyboard/atkbd.c
> index 19cfc0c..4bb81c2 100644
> --- a/drivers/input/keyboard/atkbd.c
> +++ b/drivers/input/keyboard/atkbd.c
> @@ -1027,6 +1027,7 @@ static void atkbd_set_keycode_table(struct atkbd *atkbd)
>  static void atkbd_set_device_attrs(struct atkbd *atkbd)
>  {
>  	struct input_dev *input_dev = atkbd->dev;
> +	struct device *parent = &atkbd->ps2dev.serio->dev;
>  	int i;
>  
>  	if (atkbd->extra)
> @@ -1047,7 +1048,8 @@ static void atkbd_set_device_attrs(struct atkbd *atkbd)
>  	input_dev->id.product = atkbd->translated ? 1 : atkbd->set;
>  	input_dev->id.version = atkbd->id;
>  	input_dev->event = atkbd_event;
> -	input_dev->dev.parent = &atkbd->ps2dev.serio->dev;
> +	input_dev->dev.parent = parent;
> +	device_set_wakeup_capable(&input_dev->dev, device_can_wakeup(parent));
>  
>  	input_set_drvdata(input_dev, atkbd);
>  
> diff --git a/drivers/input/mouse/hgpk.c b/drivers/input/mouse/hgpk.c
> index 95577c1..d5dd990 100644
> --- a/drivers/input/mouse/hgpk.c
> +++ b/drivers/input/mouse/hgpk.c
> @@ -548,6 +548,8 @@ static void hgpk_setup_input_device(struct input_dev *input,
>  		input->phys = old_input->phys;
>  		input->id = old_input->id;
>  		input->dev.parent = old_input->dev.parent;
> +		device_set_wakeup_capable(&input->dev,
> +			device_can_wakeup(&old_input->dev));
>  	}
>  
>  	memset(input->evbit, 0, sizeof(input->evbit));
> diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c
> index 3f74bae..de8ecd5 100644
> --- a/drivers/input/mouse/psmouse-base.c
> +++ b/drivers/input/mouse/psmouse-base.c
> @@ -1232,8 +1232,10 @@ static int psmouse_switch_protocol(struct psmouse *psmouse,
>  {
>  	const struct psmouse_protocol *selected_proto;
>  	struct input_dev *input_dev = psmouse->dev;
> +	struct device *parent = &psmouse->ps2dev.serio->dev;
>  
> -	input_dev->dev.parent = &psmouse->ps2dev.serio->dev;
> +	input_dev->dev.parent = parent;
> +	device_set_wakeup_capable(&input_dev->dev, device_can_wakeup(parent));
>  
>  	input_dev->evbit[0] = BIT_MASK(EV_KEY) | BIT_MASK(EV_REL);
>  	input_dev->keybit[BIT_WORD(BTN_MOUSE)] =
> diff --git a/drivers/input/serio/i8042-io.h b/drivers/input/serio/i8042-io.h
> index 5d48bb6..296633c 100644
> --- a/drivers/input/serio/i8042-io.h
> +++ b/drivers/input/serio/i8042-io.h
> @@ -92,4 +92,8 @@ static inline void i8042_platform_exit(void)
>  #endif
>  }
>  
> +static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
> +{
> +}

This is ugly but I guess it's the best option without switching to
i8042_ops or something.

> +
>  #endif /* _I8042_IO_H */
> diff --git a/drivers/input/serio/i8042-ip22io.h b/drivers/input/serio/i8042-ip22io.h
> index ee1ad27..c5b76a4 100644
> --- a/drivers/input/serio/i8042-ip22io.h
> +++ b/drivers/input/serio/i8042-ip22io.h
> @@ -73,4 +73,8 @@ static inline void i8042_platform_exit(void)
>  #endif
>  }
>  
> +static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
> +{
> +}
> +
>  #endif /* _I8042_IP22_H */
> diff --git a/drivers/input/serio/i8042-jazzio.h b/drivers/input/serio/i8042-jazzio.h
> index 13fd710..a11913a 100644
> --- a/drivers/input/serio/i8042-jazzio.h
> +++ b/drivers/input/serio/i8042-jazzio.h
> @@ -66,4 +66,8 @@ static inline void i8042_platform_exit(void)
>  #endif
>  }
>  
> +static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
> +{
> +}
> +
>  #endif /* _I8042_JAZZ_H */
> diff --git a/drivers/input/serio/i8042-ppcio.h b/drivers/input/serio/i8042-ppcio.h
> index f708c75..c9f4292 100644
> --- a/drivers/input/serio/i8042-ppcio.h
> +++ b/drivers/input/serio/i8042-ppcio.h
> @@ -52,6 +52,10 @@ static inline void i8042_platform_exit(void)
>  {
>  }
>  
> +static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
> +{
> +}
> +
>  #else
>  
>  #include "i8042-io.h"
> diff --git a/drivers/input/serio/i8042-snirm.h b/drivers/input/serio/i8042-snirm.h
> index 409a934..96d034f 100644
> --- a/drivers/input/serio/i8042-snirm.h
> +++ b/drivers/input/serio/i8042-snirm.h
> @@ -72,4 +72,8 @@ static inline void i8042_platform_exit(void)
>  
>  }
>  
> +static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
> +{
> +}
> +
>  #endif /* _I8042_SNIRM_H */
> diff --git a/drivers/input/serio/i8042-sparcio.h b/drivers/input/serio/i8042-sparcio.h
> index 395a9af..e5381d3 100644
> --- a/drivers/input/serio/i8042-sparcio.h
> +++ b/drivers/input/serio/i8042-sparcio.h
> @@ -154,4 +154,8 @@ static inline void i8042_platform_exit(void)
>  }
>  #endif /* !CONFIG_PCI */
>  
> +static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
> +{
> +}
> +
>  #endif /* _I8042_SPARCIO_H */
> diff --git a/drivers/input/serio/i8042-x86ia64io.h b/drivers/input/serio/i8042-x86ia64io.h
> index bb9f5d3..76b2e58 100644
> --- a/drivers/input/serio/i8042-x86ia64io.h
> +++ b/drivers/input/serio/i8042-x86ia64io.h
> @@ -875,6 +875,10 @@ static inline int i8042_pnp_init(void) { return 0; }
>  static inline void i8042_pnp_exit(void) { }
>  #endif
>  
> +static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
> +{
> +}
> +
>  static int __init i8042_platform_init(void)
>  {
>  	int retval;
> diff --git a/drivers/input/serio/i8042.c b/drivers/input/serio/i8042.c
> index d37a48e..d275130 100644
> --- a/drivers/input/serio/i8042.c
> +++ b/drivers/input/serio/i8042.c
> @@ -87,6 +87,7 @@ MODULE_PARM_DESC(debug, "Turn i8042 debugging mode on and off");
>  #endif
>  
>  static bool i8042_bypass_aux_irq_test;
> +static bool i8042_enable_wakeup;
>  
>  #include "i8042.h"
>  
> @@ -1082,10 +1083,17 @@ static void i8042_dritek_enable(void)
>   * before suspending.
>   */
>  
> -static int i8042_controller_resume(bool force_reset)
> +static int i8042_controller_resume(bool force_reset, bool soft_resume)
>  {
>  	int error;
>  
> +	/*
> +	 * If device is selected as a wakeup source, it was not powered down
> +	 * or reset during suspend, so we have very little to do.
> +	 */
> +	if (soft_resume)
> +		goto soft;

Maybe instead of adding a parameter simply not call the function and
move i8042_interrupt as well?

> +
>  	error = i8042_controller_check();
>  	if (error)
>  		return error;
> @@ -1129,6 +1137,7 @@ static int i8042_controller_resume(bool force_reset)
>  	if (i8042_ports[I8042_KBD_PORT_NO].serio)
>  		i8042_enable_kbd_port();
>  
> +soft:
>  	i8042_interrupt(0, NULL);
>  
>  	return 0;
> @@ -1146,14 +1155,48 @@ static int i8042_pm_reset(struct device *dev)
>  	return 0;
>  }
>  
> +static int i8042_pm_suspend(struct device *dev)
> +{
> +	i8042_platform_suspend(dev, device_may_wakeup(dev));
> +
> +	/*
> +	 * If device is selected as a wakeup source, don't powerdown or reset
> +	 * during suspend. Just disable IRQs to ensure race-free resume.
> +	 */
> +	if (device_may_wakeup(dev)) {
> +		if (i8042_kbd_irq_registered)
> +			disable_irq(I8042_KBD_IRQ);
> +		if (i8042_aux_irq_registered)
> +			disable_irq(I8042_AUX_IRQ);
> +		return 0;
> +	}
> +
> +	return i8042_pm_reset(dev);
> +}
> +
>  static int i8042_pm_resume(struct device *dev)
>  {
>  	/*
>  	 * On resume from S2R we always try to reset the controller
>  	 * to bring it in a sane state. (In case of S2D we expect
>  	 * BIOS to reset the controller for us.)
> +	 * This function call will also handle any pending keypress event
> +	 * (perhaps the system wakeup reason)
> +	 */
> +	int r = i8042_controller_resume(true, device_may_wakeup(dev));
> +
> +	/* If the device was left running during suspend, enable IRQs again
> +	 * now. Must be done last to avoid races with interrupt processing
> +	 * inside i8042_controller_resume.
>  	 */

Hmm, we have proper locking so I am not sure how true this statement is.

> -	return i8042_controller_resume(true);
> +	if (device_may_wakeup(dev)) {
> +		if (i8042_kbd_irq_registered)
> +			enable_irq(I8042_KBD_IRQ);
> +		if (i8042_aux_irq_registered)
> +			enable_irq(I8042_AUX_IRQ);
> +	}
> +
> +	return r;
>  }
>  
>  static int i8042_pm_thaw(struct device *dev)
> @@ -1165,11 +1208,11 @@ static int i8042_pm_thaw(struct device *dev)
>  
>  static int i8042_pm_restore(struct device *dev)
>  {
> -	return i8042_controller_resume(false);
> +	return i8042_controller_resume(false, false);
>  }
>  
>  static const struct dev_pm_ops i8042_pm_ops = {
> -	.suspend	= i8042_pm_reset,
> +	.suspend	= i8042_pm_suspend,
>  	.resume		= i8042_pm_resume,
>  	.thaw		= i8042_pm_thaw,
>  	.poweroff	= i8042_pm_reset,
> @@ -1192,6 +1235,7 @@ static int __init i8042_create_kbd_port(void)
>  {
>  	struct serio *serio;
>  	struct i8042_port *port = &i8042_ports[I8042_KBD_PORT_NO];
> +	struct device *parent = &i8042_platform_device->dev;
>  
>  	serio = kzalloc(sizeof(struct serio), GFP_KERNEL);
>  	if (!serio)
> @@ -1203,7 +1247,8 @@ static int __init i8042_create_kbd_port(void)
>  	serio->stop		= i8042_stop;
>  	serio->close		= i8042_port_close;
>  	serio->port_data	= port;
> -	serio->dev.parent	= &i8042_platform_device->dev;
> +	serio->dev.parent	= parent;
> +	device_set_wakeup_capable(&serio->dev, device_can_wakeup(parent));
>  	strlcpy(serio->name, "i8042 KBD port", sizeof(serio->name));
>  	strlcpy(serio->phys, I8042_KBD_PHYS_DESC, sizeof(serio->phys));
>  
> @@ -1218,6 +1263,7 @@ static int __init i8042_create_aux_port(int idx)
>  	struct serio *serio;
>  	int port_no = idx < 0 ? I8042_AUX_PORT_NO : I8042_MUX_PORT_NO + idx;
>  	struct i8042_port *port = &i8042_ports[port_no];
> +	struct device *parent = &i8042_platform_device->dev;
>  
>  	serio = kzalloc(sizeof(struct serio), GFP_KERNEL);
>  	if (!serio)
> @@ -1228,7 +1274,8 @@ static int __init i8042_create_aux_port(int idx)
>  	serio->start		= i8042_start;
>  	serio->stop		= i8042_stop;
>  	serio->port_data	= port;
> -	serio->dev.parent	= &i8042_platform_device->dev;
> +	serio->dev.parent	= parent;
> +	device_set_wakeup_capable(&serio->dev, device_can_wakeup(parent));
>  	if (idx < 0) {
>  		strlcpy(serio->name, "i8042 AUX port", sizeof(serio->name));
>  		strlcpy(serio->phys, I8042_AUX_PHYS_DESC, sizeof(serio->phys));
> @@ -1403,6 +1450,9 @@ static int __init i8042_probe(struct platform_device *dev)
>  		i8042_dritek_enable();
>  #endif
>  
> +	if (i8042_enable_wakeup)
> +		device_set_wakeup_capable(&dev->dev, true);
> +

	device_set_wakeup_capable(&dev->dev, i8042_enable_wakeup); ?

>  	if (!i8042_noaux) {
>  		error = i8042_setup_aux();
>  		if (error && error != -ENODEV && error != -EBUSY)
> diff --git a/drivers/input/serio/serio.c b/drivers/input/serio/serio.c
> index ba70058..9e2fdb4 100644
> --- a/drivers/input/serio/serio.c
> +++ b/drivers/input/serio/serio.c
> @@ -931,7 +931,7 @@ static int serio_uevent(struct device *dev, struct kobj_uevent_env *env)
>  #endif /* CONFIG_HOTPLUG */
>  
>  #ifdef CONFIG_PM
> -static int serio_suspend(struct device *dev)
> +static int serio_poweroff(struct device *dev)
>  {
>  	struct serio *serio = to_serio_port(dev);
>  
> @@ -940,7 +940,16 @@ static int serio_suspend(struct device *dev)
>  	return 0;
>  }
>  
> -static int serio_resume(struct device *dev)
> +static int serio_suspend(struct device *dev)
> +{
> +	/* If configured as a wakeup source, don't power off. */
> +	if (device_may_wakeup(dev))
> +		return 0;
> +
> +	return serio_poweroff(dev);
> +}
> +
> +static int serio_restore(struct device *dev)
>  {
>  	struct serio *serio = to_serio_port(dev);
>  
> @@ -953,11 +962,23 @@ static int serio_resume(struct device *dev)
>  	return 0;
>  }
>  
> +static int serio_resume(struct device *dev)
> +{
> +	/*
> +	 * If configured as a wakeup source, we didn't power off during
> +	 * suspend, and hence have nothing to do.
> +	 */
> +	if (device_may_wakeup(dev))
> +		return 0;
> +
> +	return serio_restore(dev);
> +}
> +
>  static const struct dev_pm_ops serio_pm_ops = {
>  	.suspend	= serio_suspend,
>  	.resume		= serio_resume,
> -	.poweroff	= serio_suspend,
> -	.restore	= serio_resume,
> +	.poweroff	= serio_poweroff,
> +	.restore	= serio_restore,
>  };
>  #endif /* CONFIG_PM */
>  
> -- 
> 1.7.6
> 

-- 
Dmitry

^ permalink raw reply

* Re: [PATCH v4 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
From: MyungJoo Ham @ 2011-08-03  6:02 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Kyungmin Park, Len Brown, linux-pm, Greg Kroah-Hartman,
	Thomas Gleixner
In-Reply-To: <877h6vbhq1.fsf@ti.com>

On Wed, Aug 3, 2011 at 6:56 AM, Kevin Hilman <khilman@ti.com> wrote:
> MyungJoo Ham <myungjoo.ham@samsung.com> writes:
>
>> With OPPs, a device may have multiple operable frequency and voltage
>> sets. However, there can be multiple possible operable sets and a system
>> will need to choose one from them. In order to reduce the power
>> consumption (by reducing frequency and voltage) without affecting the
>> performance too much, a Dynamic Voltage and Frequency Scaling (DVFS)
>> scheme may be used.
>>
>> This patch introduces the DVFS capability to non-CPU devices with OPPs.
>> DVFS is a techique whereby the frequency and supplied voltage of a
>> device is adjusted on-the-fly. DVFS usually sets the frequency as low
>> as possible with given conditions (such as QoS assurance) and adjusts
>> voltage according to the chosen frequency in order to reduce power
>> consumption and heat dissipation.
>>
>> The generic DVFS for devices, DEVFREQ, may appear quite similar with
>> /drivers/cpufreq.  However, CPUFREQ does not allow to have multiple
>> devices registered and is not suitable to have multiple heterogenous
>> devices with different (but simple) governors.
>>
>> Normally, DVFS mechanism controls frequency based on the demand for
>> the device, and then, chooses voltage based on the chosen frequency.
>> DEVFREQ also controls the frequency based on the governor's frequency
>> recommendation and let OPP pick up the pair of frequency and voltage
>> based on the recommended frequency. Then, the chosen OPP is passed to
>> device driver's "target" callback.
>>
>> Tested with memory bus of Exynos4-NURI board.
>>
>> The test code with board support for Exynos4-NURI is at
>> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>>
>> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>
> [...]
>
>> +int devfreq_update(struct device *dev)
>> +{
>> +     struct devfreq *devfreq;
>> +     int err = 0;
>> +
>> +     mutex_lock(&devfreq_list_lock);
>> +
>> +     devfreq = find_device_devfreq(dev);
>> +     if (IS_ERR(devfreq)) {
>> +             err = PTR_ERR(devfreq);
>> +             goto out;
>> +     }
>> +
>> +     /*
>> +      * If the maximum frequency available is changed either by
>> +      * enabling higher frequency or disabling the current
>> +      * maximum frequency, we need to adjust the frequency
>> +      * (tickle) again if the device has been being tickled.
>> +      */
>> +     if (devfreq->tickle) {
>> +             unsigned long freq = devfreq->profile->max_freq;
>> +             struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
>> +
>> +             if (IS_ERR(opp)) {
>> +                     err = PTR_ERR(opp);
>> +                     goto out;
>> +             }
>> +
>> +             /* Max freq available is not changed */
>> +             if (devfreq->previous_freq == freq)
>> +                     goto out;
>> +
>> +             /* Tickle again. Max freq available is changed */
>> +             err = devfreq->profile->target(devfreq->dev, opp);
>> +             if (!err)
>> +                     devfreq->previous_freq = freq;
>
> This looks an awful lot like _devfreq_tickle_device()... wondering why
> that is not called here.
>

The device is already tickled by a user and we do not need to update
the timing/delay information about the current tickle. We only need to
adjust frequency in case the maximum available frequency has been
changed.

Anyway, in order to reduce redundant code, I will modify
_devfreq_tickle_device to be able to be called by devfreq_update().
For now, if devfreq_update() uses _devfreq_tickle_device without
modification, it will calculate num_tickle incorrectly and setting
delay of _devfreq_tickle_device again is required. I will let
_devfreq_tickle_device take delay = 0 for devfreq_update().

Thanks.

- MyungJoo

>> +     } else {
>> +             /* Reevaluate the proper frequency */
>> +             err = devfreq_do(devfreq);
>> +     }
>> +
>> +out:
>> +     mutex_unlock(&devfreq_list_lock);
>> +     return err;
>> +}
>
> Kevin
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>



-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

^ permalink raw reply

* [PATCH v2] mrst_pmu: driver for Intel Moorestown Power Management Unit
From: Len Brown @ 2011-08-02 23:24 UTC (permalink / raw)
  To: Alan Cox, x86, Jesse Barnes; +Cc: linux-pm, linux-kernel
In-Reply-To: <alpine.LFD.2.02.1108011116160.23581@x980>

From: Len Brown <len.brown@intel.com>

The Moorestown (MRST) Power Management Unit (PMU) driver
directs the SOC power states in the "Langwell" south complex (SCU).

It hooks pci_platform_pm_ops[] and thus observes all PCI ".set_state"
requests.  For devices in the SC, the pmu driver translates those
PCI requests into the appropriate commands for the SCU.

The PMU driver helps implement S0i3, a deep system idle power idle state.
Entry into S0i3 is via cpuidle, just like regular processor c-states.
S0i3 depends on pre-conditions including uni-processor, graphics off,
and certain IO devices in the SC must be off.  If those pre-conditions
are met, then the PMU allows cpuidle to enter S0i3, otherwise such requests
are demoted, either to Atom C4 or Atom C6.

This driver is based on prototype work by Bruce Flemming,
Illyas Mansoor, Rajeev D. Muralidhar, Vishwesh M. Rudramuni,
Hari Seshadri and Sujith Thomas.  The current driver also
includes contributions from H. Peter Anvin, Arjan van de Ven,
Kristen Accardi, and Yong Wang.

Thanks for additional review feedback from Alan Cox and Randy Dunlap.

Signed-off-by: Len Brown <len.brown@intel.com>
---
This driver is essentially platform-speciif idle code.
It has been in linux-next via my idle-test tree,
and I can push it via that tree route w/ appropriate acks.

There are additional bits needed to use this driver,
including platform-specific PCI hooks, and platform-specific
S0i3 assembly language.

 MAINTAINERS                     |    6 +
 arch/x86/platform/mrst/Makefile |    1 +
 arch/x86/platform/mrst/pmu.c    |  817 +++++++++++++++++++++++++++++++++++++++
 arch/x86/platform/mrst/pmu.h    |  234 +++++++++++
 4 files changed, 1058 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/platform/mrst/pmu.c
 create mode 100644 arch/x86/platform/mrst/pmu.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 187282d..d37b387 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3344,6 +3344,12 @@ F:	drivers/net/ixgb/
 F:	drivers/net/ixgbe/
 F:	drivers/net/ixgbevf/
 
+INTEL MRST PMU DRIVER
+M:	Len Brown <len.brown@intel.com>
+L:	linux-pm@lists.linux-foundation.org
+S:	Supported
+F:	arch/x86/platform/mrst/pmu.*
+
 INTEL PRO/WIRELESS 2100 NETWORK CONNECTION SUPPORT
 L:	linux-wireless@vger.kernel.org
 S:	Orphan
diff --git a/arch/x86/platform/mrst/Makefile b/arch/x86/platform/mrst/Makefile
index f61ccdd..1ea3877 100644
--- a/arch/x86/platform/mrst/Makefile
+++ b/arch/x86/platform/mrst/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_X86_MRST)		+= mrst.o
 obj-$(CONFIG_X86_MRST)		+= vrtc.o
 obj-$(CONFIG_EARLY_PRINTK_MRST)	+= early_printk_mrst.o
+obj-$(CONFIG_X86_MRST)		+= pmu.o
diff --git a/arch/x86/platform/mrst/pmu.c b/arch/x86/platform/mrst/pmu.c
new file mode 100644
index 0000000..c0cebfb
--- /dev/null
+++ b/arch/x86/platform/mrst/pmu.c
@@ -0,0 +1,817 @@
+/*
+ * mrst/pmu.c - driver for MRST Power Management Unit
+ *
+ * Copyright (c) 2011, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <linux/cpuidle.h>
+#include <linux/debugfs.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/seq_file.h>
+#include <linux/sfi.h>
+#include <asm/intel_scu_ipc.h>
+#include "pmu.h"
+
+#define IPCMSG_FW_REVISION	0xF4
+
+struct mrst_device {
+	u16 pci_dev_num;	/* DEBUG only */
+	u16 lss;
+	u16 latest_request;
+	unsigned int pci_state_counts[PCI_D3cold + 1]; /* DEBUG only */
+};
+
+/*
+ * comlete list of MRST PCI devices
+ */
+static struct mrst_device mrst_devs[] = {
+/*  0 */ { 0x0800, LSS_SPI0 },		/* Moorestown SPI Ctrl 0 */
+/*  1 */ { 0x0801, LSS_SPI1 },		/* Moorestown SPI Ctrl 1 */
+/*  2 */ { 0x0802, LSS_I2C0 },		/* Moorestown I2C 0 */
+/*  3 */ { 0x0803, LSS_I2C1 },		/* Moorestown I2C 1 */
+/*  4 */ { 0x0804, LSS_I2C2 },		/* Moorestown I2C 2 */
+/*  5 */ { 0x0805, LSS_KBD },		/* Moorestown Keyboard Ctrl */
+/*  6 */ { 0x0806, LSS_USB_HC },	/* Moorestown USB Ctrl */
+/*  7 */ { 0x0807, LSS_SD_HC0 },	/* Moorestown SD Host Ctrl 0 */
+/*  8 */ { 0x0808, LSS_SD_HC1 },	/* Moorestown SD Host Ctrl 1 */
+/*  9 */ { 0x0809, LSS_NAND },		/* Moorestown NAND Ctrl */
+/* 10 */ { 0x080a, LSS_AUDIO },		/* Moorestown Audio Ctrl */
+/* 11 */ { 0x080b, LSS_IMAGING },	/* Moorestown ISP */
+/* 12 */ { 0x080c, LSS_SECURITY },	/* Moorestown Security Controller */
+/* 13 */ { 0x080d, LSS_DISPLAY },	/* Moorestown External Displays */
+/* 14 */ { 0x080e, 0 },			/* Moorestown SCU IPC */
+/* 15 */ { 0x080f, LSS_GPIO },		/* Moorestown GPIO Controller */
+/* 16 */ { 0x0810, 0 },			/* Moorestown Power Management Unit */
+/* 17 */ { 0x0811, LSS_USB_OTG },	/* Moorestown OTG Ctrl */
+/* 18 */ { 0x0812, LSS_SPI2 },		/* Moorestown SPI Ctrl 2 */
+/* 19 */ { 0x0813, 0 },			/* Moorestown SC DMA */
+/* 20 */ { 0x0814, LSS_AUDIO_LPE },	/* Moorestown LPE DMA */
+/* 21 */ { 0x0815, LSS_AUDIO_SSP },	/* Moorestown SSP0 */
+
+/* 22 */ { 0x084F, LSS_SD_HC2 },	/* Moorestown SD Host Ctrl 2 */
+
+/* 23 */ { 0x4102, 0 },			/* Lincroft */
+/* 24 */ { 0x4110, 0 },			/* Lincroft */
+};
+
+/* n.b. We ignore PCI-id 0x815 in LSS9 b/c MeeGo has no driver for it */
+static u16 mrst_lss9_pci_ids[] = {0x080a, 0x0814, 0};
+static u16 mrst_lss10_pci_ids[] = {0x0800, 0x0801, 0x0802, 0x0803,
+					0x0804, 0x0805, 0x080f, 0};
+
+/* handle concurrent SMP invokations of pmu_pci_set_power_state() */
+static spinlock_t mrst_pmu_power_state_lock;
+
+static unsigned int wake_counters[MRST_NUM_LSS];	/* DEBUG only */
+static unsigned int pmu_irq_stats[INT_INVALID + 1];	/* DEBUG only */
+
+static int graphics_is_off;
+static int lss_s0i3_enabled;
+static bool mrst_pmu_s0i3_enable;
+
+/*  debug counters */
+static u32 pmu_wait_ready_calls;
+static u32 pmu_wait_ready_udelays;
+static u32 pmu_wait_ready_udelays_max;
+static u32 pmu_wait_done_calls;
+static u32 pmu_wait_done_udelays;
+static u32 pmu_wait_done_udelays_max;
+static u32 pmu_set_power_state_entry;
+static u32 pmu_set_power_state_send_cmd;
+
+static struct mrst_device *pci_id_2_mrst_dev(u16 pci_dev_num)
+{
+	int index = 0;
+
+	if ((pci_dev_num >= 0x0800) && (pci_dev_num <= 0x815))
+		index = pci_dev_num - 0x800;
+	else if (pci_dev_num == 0x084F)
+		index = 22;
+	else if (pci_dev_num == 0x4102)
+		index = 23;
+	else if (pci_dev_num == 0x4110)
+		index = 24;
+
+	if (pci_dev_num != mrst_devs[index].pci_dev_num) {
+		WARN_ONCE(1, FW_BUG "Unknown PCI device 0x%04X\n", pci_dev_num);
+		return 0;
+	}
+
+	return &mrst_devs[index];
+}
+
+/**
+ * mrst_pmu_validate_cstates
+ * @dev: cpuidle_device
+ *
+ * Certain states are not appropriate for governor to pick in some cases.
+ * This function will be called as cpuidle_device's prepare callback and
+ * thus tells governor to ignore such states when selecting the next state
+ * to enter.
+ */
+
+#define IDLE_STATE4_IS_C6	4
+#define IDLE_STATE5_IS_S0I3	5
+
+int mrst_pmu_invalid_cstates(void)
+{
+	int cpu = smp_processor_id();
+
+	/*
+	 * Demote to C4 if the PMU is busy.
+	 * Since LSS changes leave the busy bit clear...
+	 * busy means either the PMU is waiting for an ACK-C6 that
+	 * isn't coming due to an MWAIT that returned immediately;
+	 * or we returned from S0i3 successfully, and the PMU
+	 * is not done sending us interrupts.
+	 */
+	if (pmu_read_busy_status())
+		return 1 << IDLE_STATE4_IS_C6 | 1 << IDLE_STATE5_IS_S0I3;
+
+	/*
+	 * Disallow S0i3 if: PMU is not initialized, or CPU1 is active,
+	 * or if device LSS is insufficient, or the GPU is active,
+	 * or if it has been explicitly disabled.
+	 */
+	if (!pmu_reg || !cpumask_equal(cpu_online_mask, cpumask_of(cpu)) ||
+	    !lss_s0i3_enabled || !graphics_is_off || !mrst_pmu_s0i3_enable)
+		return 1 << IDLE_STATE5_IS_S0I3;
+	else
+		return 0;
+}
+
+/*
+ * pmu_update_wake_counters(): read PM_WKS, update wake_counters[]
+ * DEBUG only.
+ */
+static void pmu_update_wake_counters(void)
+{
+	int lss;
+	u32 wake_status;
+
+	wake_status = pmu_read_wks();
+
+	for (lss = 0; lss < MRST_NUM_LSS; ++lss) {
+		if (wake_status & (1 << lss))
+			wake_counters[lss]++;
+	}
+}
+
+int mrst_pmu_s0i3_entry(void)
+{
+	int status;
+
+	/* Clear any possible error conditions */
+	pmu_write_ics(0x300);
+
+	/* set wake control to current D-states */
+	pmu_write_wssc(S0I3_SSS_TARGET);
+
+	status = mrst_s0i3_entry(PM_S0I3_COMMAND, &pmu_reg->pm_cmd);
+	pmu_update_wake_counters();
+	return status;
+}
+
+/* poll for maximum of 5ms for busy bit to clear */
+static int pmu_wait_ready(void)
+{
+	int udelays;
+
+	pmu_wait_ready_calls++;
+
+	for (udelays = 0; udelays < 5000; ++udelays) {
+		if (udelays > pmu_wait_ready_udelays_max)
+			pmu_wait_ready_udelays_max = udelays;
+
+		if (pmu_read_busy_status() == 0)
+			return 0;
+
+		udelay(10);
+		pmu_wait_ready_udelays++;
+	}
+
+	/*
+	 * if this fires, observe
+	 * /sys/kernel/debug/mrst_pmu_wait_ready_calls
+	 * /sys/kernel/debug/mrst_pmu_wait_ready_udelays
+	 */
+	WARN_ONCE(1, "SCU not ready for 5ms");
+	return -EBUSY;
+}
+/* poll for maximum of 50ms us for busy bit to clear */
+static int pmu_wait_done(void)
+{
+	int udelays;
+
+	pmu_wait_done_calls++;
+
+	for (udelays = 0; udelays < 5000; ++udelays) {
+		if (udelays > pmu_wait_done_udelays_max)
+			pmu_wait_done_udelays_max = udelays;
+
+		if (pmu_read_busy_status() == 0)
+			return 0;
+
+		udelay(100);
+		pmu_wait_done_udelays++;
+	}
+
+	/*
+	 * if this fires, observe
+	 * /sys/kernel/debug/mrst_pmu_wait_done_calls
+	 * /sys/kernel/debug/mrst_pmu_wait_done_udelays
+	 */
+	WARN_ONCE(1, "SCU not done for 50ms");
+	return -EBUSY;
+}
+
+u32 mrst_pmu_msi_is_disabled(void)
+{
+	return pmu_msi_is_disabled();
+}
+
+void mrst_pmu_enable_msi(void)
+{
+	pmu_msi_enable();
+}
+
+/**
+ * pmu_irq - pmu driver interrupt handler
+ * Context: interrupt context
+ */
+static irqreturn_t pmu_irq(int irq, void *dummy)
+{
+	union pmu_pm_ics pmu_ics;
+
+	pmu_ics.value = pmu_read_ics();
+
+	if (!pmu_ics.bits.pending)
+		return IRQ_NONE;
+
+	switch (pmu_ics.bits.cause) {
+	case INT_SPURIOUS:
+	case INT_CMD_DONE:
+	case INT_CMD_ERR:
+	case INT_WAKE_RX:
+	case INT_SS_ERROR:
+	case INT_S0IX_MISS:
+	case INT_NO_ACKC6:
+		pmu_irq_stats[pmu_ics.bits.cause]++;
+		break;
+	default:
+		pmu_irq_stats[INT_INVALID]++;
+	}
+
+	pmu_write_ics(pmu_ics.value); /* Clear pending interrupt */
+
+	return IRQ_HANDLED;
+}
+
+/*
+ * Translate PCI power management to MRST LSS D-states
+ */
+static int pci_2_mrst_state(int lss, pci_power_t pci_state)
+{
+	switch (pci_state) {
+	case PCI_D0:
+		if (SSMSK(D0i1, lss) & D0I1_ACG_SSS_TARGET)
+			return D0i1;
+		else
+			return D0;
+	case PCI_D1:
+		return D0i1;
+	case PCI_D2:
+		return D0i2;
+	case PCI_D3hot:
+	case PCI_D3cold:
+		return D0i3;
+	default:
+		WARN(1, "pci_state %d\n", pci_state);
+		return 0;
+	}
+}
+
+static int pmu_issue_command(u32 pm_ssc)
+{
+	union pmu_pm_set_cfg_cmd_t command;
+
+	if (pmu_read_busy_status()) {
+		pr_debug("pmu is busy, Operation not permitted\n");
+		return -1;
+	}
+
+	/*
+	 * enable interrupts in PMU so that interrupts are
+	 * propagated when ioc bit for a particular set
+	 * command is set
+	 */
+
+	pmu_irq_enable();
+
+	/* Configure the sub systems for pmu2 */
+
+	pmu_write_ssc(pm_ssc);
+
+	/*
+	 * Send the set config command for pmu its configured
+	 * for mode CM_IMMEDIATE & hence with No Trigger
+	 */
+
+	command.pmu2_params.d_param.cfg_mode = CM_IMMEDIATE;
+	command.pmu2_params.d_param.cfg_delay = 0;
+	command.pmu2_params.d_param.rsvd = 0;
+
+	/* construct the command to send SET_CFG to particular PMU */
+	command.pmu2_params.d_param.cmd = SET_CFG_CMD;
+	command.pmu2_params.d_param.ioc = 0;
+	command.pmu2_params.d_param.mode_id = 0;
+	command.pmu2_params.d_param.sys_state = SYS_STATE_S0I0;
+
+	/* write the value of PM_CMD into particular PMU */
+	pr_debug("pmu command being written %x\n",
+			command.pmu_pm_set_cfg_cmd_value);
+
+	pmu_write_cmd(command.pmu_pm_set_cfg_cmd_value);
+
+	return 0;
+}
+
+static u16 pmu_min_lss_pci_req(u16 *ids, u16 pci_state)
+{
+	u16 existing_request;
+	int i;
+
+	for (i = 0; ids[i]; ++i) {
+		struct mrst_device *mrst_dev;
+
+		mrst_dev = pci_id_2_mrst_dev(ids[i]);
+		if (unlikely(!mrst_dev))
+			continue;
+
+		existing_request = mrst_dev->latest_request;
+		if (existing_request < pci_state)
+			pci_state = existing_request;
+	}
+	return pci_state;
+}
+
+/**
+ * pmu_pci_set_power_state - Callback function is used by all the PCI devices
+ *			for a platform  specific device power on/shutdown.
+ */
+
+int pmu_pci_set_power_state(struct pci_dev *pdev, pci_power_t pci_state)
+{
+	u32 old_sss, new_sss;
+	int status = 0;
+	struct mrst_device *mrst_dev;
+
+	pmu_set_power_state_entry++;
+
+	BUG_ON(pdev->vendor != PCI_VENDOR_ID_INTEL);
+	BUG_ON(pci_state < PCI_D0 || pci_state > PCI_D3cold);
+
+	mrst_dev = pci_id_2_mrst_dev(pdev->device);
+	if (unlikely(!mrst_dev))
+		return -ENODEV;
+
+	mrst_dev->pci_state_counts[pci_state]++;	/* count invocations */
+
+	/* PMU driver calls self as part of PCI initialization, ignore */
+	if (pdev->device == PCI_DEV_ID_MRST_PMU)
+		return 0;
+
+	BUG_ON(!pmu_reg); /* SW bug if called before initialized */
+
+	spin_lock(&mrst_pmu_power_state_lock);
+
+	if (pdev->d3_delay) {
+		dev_dbg(&pdev->dev, "d3_delay %d, should be 0\n",
+			pdev->d3_delay);
+		pdev->d3_delay = 0;
+	}
+	/*
+	 * If Lincroft graphics, simply remember state
+	 */
+	if ((pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY
+		&& !((pdev->class & PCI_SUB_CLASS_MASK) >> 8)) {
+		if (pci_state == PCI_D0)
+			graphics_is_off = 0;
+		else
+			graphics_is_off = 1;
+		goto ret;
+	}
+
+	if (!mrst_dev->lss)
+		goto ret;	/* device with no LSS */
+
+	if (mrst_dev->latest_request == pci_state)
+		goto ret;	/* no change */
+
+	mrst_dev->latest_request = pci_state;	/* record latest request */
+
+	/*
+	 * LSS9 and LSS10 contain multiple PCI devices.
+	 * Use the lowest numbered (highest power) state in the LSS
+	 */
+	if (mrst_dev->lss == 9)
+		pci_state = pmu_min_lss_pci_req(mrst_lss9_pci_ids, pci_state);
+	else if (mrst_dev->lss == 10)
+		pci_state = pmu_min_lss_pci_req(mrst_lss10_pci_ids, pci_state);
+
+	status = pmu_wait_ready();
+	if (status)
+		goto ret;
+
+	old_sss = pmu_read_sss();
+	new_sss = old_sss & ~SSMSK(3, mrst_dev->lss);
+	new_sss |= SSMSK(pci_2_mrst_state(mrst_dev->lss, pci_state),
+			mrst_dev->lss);
+
+	if (new_sss == old_sss)
+		goto ret;	/* nothing to do */
+
+	pmu_set_power_state_send_cmd++;
+
+	status = pmu_issue_command(new_sss);
+
+	if (unlikely(status != 0)) {
+		dev_err(&pdev->dev, "Failed to Issue a PM command\n");
+		goto ret;
+	}
+
+	if (pmu_wait_done())
+		goto ret;
+
+	lss_s0i3_enabled =
+	((pmu_read_sss() & S0I3_SSS_TARGET) == S0I3_SSS_TARGET);
+ret:
+	spin_unlock(&mrst_pmu_power_state_lock);
+	return status;
+}
+
+#ifdef CONFIG_DEBUG_FS
+static char *d0ix_names[] = {"D0", "D0i1", "D0i2", "D0i3"};
+
+static inline const char *d0ix_name(int state)
+{
+	return d0ix_names[(int) state];
+}
+
+static int debug_mrst_pmu_show(struct seq_file *s, void *unused)
+{
+	struct pci_dev *pdev = NULL;
+	u32 cur_pmsss;
+	int lss;
+
+	seq_printf(s, "0x%08X D0I1_ACG_SSS_TARGET\n", D0I1_ACG_SSS_TARGET);
+
+	cur_pmsss = pmu_read_sss();
+
+	seq_printf(s, "0x%08X S0I3_SSS_TARGET\n", S0I3_SSS_TARGET);
+
+	seq_printf(s, "0x%08X Current SSS ", cur_pmsss);
+	seq_printf(s, lss_s0i3_enabled ? "\n" : "[BLOCKS s0i3]\n");
+
+	if (cpumask_equal(cpu_online_mask, cpumask_of(0)))
+		seq_printf(s, "cpu0 is only cpu online\n");
+	else
+		seq_printf(s, "cpu0 is NOT only cpu online [BLOCKS S0i3]\n");
+
+	seq_printf(s, "GFX: %s\n", graphics_is_off ? "" : "[BLOCKS s0i3]");
+
+
+	for_each_pci_dev(pdev) {
+		int pos;
+		u16 pmcsr;
+		struct mrst_device *mrst_dev;
+		int i;
+
+		mrst_dev = pci_id_2_mrst_dev(pdev->device);
+
+		seq_printf(s, "%s %04x/%04X %-16.16s ",
+			dev_name(&pdev->dev),
+			pdev->vendor, pdev->device,
+			dev_driver_string(&pdev->dev));
+
+		if (unlikely (!mrst_dev)) {
+			seq_printf(s, " UNKNOWN\n");
+			continue;
+		}
+
+		if (mrst_dev->lss)
+			seq_printf(s, "LSS %2d %-4s ", mrst_dev->lss,
+				d0ix_name(((cur_pmsss >>
+					(mrst_dev->lss * 2)) & 0x3)));
+		else
+			seq_printf(s, "            ");
+
+		/* PCI PM config space setting */
+		pos = pci_find_capability(pdev, PCI_CAP_ID_PM);
+		if (pos != 0) {
+			pci_read_config_word(pdev, pos + PCI_PM_CTRL, &pmcsr);
+		seq_printf(s, "PCI-%-4s",
+			pci_power_name(pmcsr & PCI_PM_CTRL_STATE_MASK));
+		} else {
+			seq_printf(s, "        ");
+		}
+
+		seq_printf(s, " %s ", pci_power_name(mrst_dev->latest_request));
+		for (i = 0; i <= PCI_D3cold; ++i)
+			seq_printf(s, "%d ", mrst_dev->pci_state_counts[i]);
+
+		if (mrst_dev->lss) {
+			unsigned int lssmask;
+
+			lssmask = SSMSK(D0i3, mrst_dev->lss);
+
+			if ((lssmask & S0I3_SSS_TARGET) &&
+				((lssmask & cur_pmsss) !=
+					(lssmask & S0I3_SSS_TARGET)))
+						seq_printf(s , "[BLOCKS s0i3]");
+		}
+
+		seq_printf(s, "\n");
+	}
+	seq_printf(s, "Wake Counters:\n");
+	for (lss = 0; lss < MRST_NUM_LSS; ++lss)
+		seq_printf(s, "LSS%d %d\n", lss, wake_counters[lss]);
+
+	seq_printf(s, "Interrupt Counters:\n");
+	seq_printf(s,
+		"INT_SPURIOUS \t%8u\n" "INT_CMD_DONE \t%8u\n"
+		"INT_CMD_ERR  \t%8u\n" "INT_WAKE_RX  \t%8u\n"
+		"INT_SS_ERROR \t%8u\n" "INT_S0IX_MISS\t%8u\n"
+		"INT_NO_ACKC6 \t%8u\n" "INT_INVALID  \t%8u\n",
+		pmu_irq_stats[INT_SPURIOUS], pmu_irq_stats[INT_CMD_DONE],
+		pmu_irq_stats[INT_CMD_ERR], pmu_irq_stats[INT_WAKE_RX],
+		pmu_irq_stats[INT_SS_ERROR], pmu_irq_stats[INT_S0IX_MISS],
+		pmu_irq_stats[INT_NO_ACKC6], pmu_irq_stats[INT_INVALID]);
+
+	seq_printf(s, "mrst_pmu_wait_ready_calls          %8d\n",
+			pmu_wait_ready_calls);
+	seq_printf(s, "mrst_pmu_wait_ready_udelays        %8d\n",
+			pmu_wait_ready_udelays);
+	seq_printf(s, "mrst_pmu_wait_ready_udelays_max    %8d\n",
+			pmu_wait_ready_udelays_max);
+	seq_printf(s, "mrst_pmu_wait_done_calls           %8d\n",
+			pmu_wait_done_calls);
+	seq_printf(s, "mrst_pmu_wait_done_udelays         %8d\n",
+			pmu_wait_done_udelays);
+	seq_printf(s, "mrst_pmu_wait_done_udelays_max     %8d\n",
+			pmu_wait_done_udelays_max);
+	seq_printf(s, "mrst_pmu_set_power_state_entry     %8d\n",
+			pmu_set_power_state_entry);
+	seq_printf(s, "mrst_pmu_set_power_state_send_cmd  %8d\n",
+			pmu_set_power_state_send_cmd);
+	seq_printf(s, "SCU busy: %d\n", pmu_read_busy_status());
+
+	return 0;
+}
+
+static int debug_mrst_pmu_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, debug_mrst_pmu_show, NULL);
+}
+
+static const struct file_operations devices_state_operations = {
+	.open		= debug_mrst_pmu_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+#endif	/* DEBUG_FS */
+
+/*
+ * Validate SCU PCI shim PCI vendor capability byte
+ * against LSS hard-coded in mrst_devs[] above.
+ * DEBUG only.
+ */
+static void pmu_scu_firmware_debug(void)
+{
+	struct pci_dev *pdev = NULL;
+
+	for_each_pci_dev(pdev) {
+		struct mrst_device *mrst_dev;
+		u8 pci_config_lss;
+		int pos;
+
+		mrst_dev = pci_id_2_mrst_dev(pdev->device);
+		if (unlikely(!mrst_dev)) {
+			printk(KERN_ERR FW_BUG "pmu: Unknown "
+				"PCI device 0x%04X\n", pdev->device);
+			continue;
+		}
+
+		if (mrst_dev->lss == 0)
+			continue;	 /* no LSS in our table */
+
+		pos = pci_find_capability(pdev, PCI_CAP_ID_VNDR);
+		if (!pos != 0) {
+			printk(KERN_ERR FW_BUG "pmu: 0x%04X "
+				"missing PCI Vendor Capability\n",
+				pdev->device);
+			continue;
+		}
+		pci_read_config_byte(pdev, pos + 4, &pci_config_lss);
+		if (!(pci_config_lss & PCI_VENDOR_CAP_LOG_SS_MASK)) {
+			printk(KERN_ERR FW_BUG "pmu: 0x%04X "
+				"invalid PCI Vendor Capability 0x%x "
+				" expected LSS 0x%X\n",
+				pdev->device, pci_config_lss, mrst_dev->lss);
+			continue;
+		}
+		pci_config_lss &= PCI_VENDOR_CAP_LOG_ID_MASK;
+
+		if (mrst_dev->lss == pci_config_lss)
+			continue;
+
+		printk(KERN_ERR FW_BUG "pmu: 0x%04X LSS = %d, expected %d\n",
+			pdev->device, pci_config_lss, mrst_dev->lss);
+	}
+}
+
+/**
+ * pmu_probe
+ */
+static int __devinit pmu_probe(struct pci_dev *pdev,
+				   const struct pci_device_id *pci_id)
+{
+	int ret;
+	struct mrst_pmu_reg *pmu;
+
+	/* Init the device */
+	ret = pci_enable_device(pdev);
+	if (ret) {
+		dev_err(&pdev->dev, "Unable to Enable PCI device\n");
+		return ret;
+	}
+
+	ret = pci_request_regions(pdev, MRST_PMU_DRV_NAME);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "Cannot obtain PCI resources, aborting\n");
+		goto out_err1;
+	}
+
+	/* Map the memory of PMU reg base */
+	pmu = pci_iomap(pdev, 0, 0);
+	if (!pmu) {
+		dev_err(&pdev->dev, "Unable to map the PMU address space\n");
+		ret = -ENOMEM;
+		goto out_err2;
+	}
+
+#ifdef CONFIG_DEBUG_FS
+	/* /sys/kernel/debug/mrst_pmu */
+	(void) debugfs_create_file("mrst_pmu", S_IFREG | S_IRUGO,
+				NULL, NULL, &devices_state_operations);
+#endif
+	pmu_reg = pmu;	/* success */
+
+	if (request_irq(pdev->irq, pmu_irq, 0, MRST_PMU_DRV_NAME, NULL)) {
+		dev_err(&pdev->dev, "Registering isr has failed\n");
+		ret = -1;
+		goto out_err3;
+	}
+
+	pmu_scu_firmware_debug();
+
+	pmu_write_wkc(S0I3_WAKE_SOURCES);	/* Enable S0i3 wakeup sources */
+
+	pmu_wait_ready();
+
+	pmu_write_ssc(D0I1_ACG_SSS_TARGET);	/* Enable Auto-Clock_Gating */
+	pmu_write_cmd(0x201);
+
+	spin_lock_init(&mrst_pmu_power_state_lock);
+
+	/* Enable the hardware interrupt */
+	pmu_irq_enable();
+	return 0;
+
+out_err3:
+	free_irq(pdev->irq, NULL);
+	pci_iounmap(pdev, pmu_reg);
+	pmu_reg = NULL;
+out_err2:
+	pci_release_region(pdev, 0);
+out_err1:
+	pci_disable_device(pdev);
+	return ret;
+}
+
+static void __devexit pmu_remove(struct pci_dev *pdev)
+{
+	dev_err(&pdev->dev, "Mid PM pmu_remove called\n");
+
+	/* Freeing up the irq */
+	free_irq(pdev->irq, NULL);
+
+	pci_iounmap(pdev, pmu_reg);
+	pmu_reg = NULL;
+
+	/* disable the current PCI device */
+	pci_release_region(pdev, 0);
+	pci_disable_device(pdev);
+}
+
+static DEFINE_PCI_DEVICE_TABLE(pmu_pci_ids) = {
+	{ PCI_VDEVICE(INTEL, PCI_DEV_ID_MRST_PMU), 0 },
+	{ }
+};
+
+MODULE_DEVICE_TABLE(pci, pmu_pci_ids);
+
+static struct pci_driver driver = {
+	.name = MRST_PMU_DRV_NAME,
+	.id_table = pmu_pci_ids,
+	.probe = pmu_probe,
+	.remove = __devexit_p(pmu_remove),
+};
+
+/**
+ * pmu_pci_register - register the PMU driver as PCI device
+ */
+static int __init pmu_pci_register(void)
+{
+	return pci_register_driver(&driver);
+}
+
+/* Register and probe via fs_initcall() to preceed device_initcall() */
+fs_initcall(pmu_pci_register);
+
+static void __exit mid_pci_cleanup(void)
+{
+	pci_unregister_driver(&driver);
+}
+
+static int ia_major;
+static int ia_minor;
+
+static int pmu_sfi_parse_oem(struct sfi_table_header *table)
+{
+	struct sfi_table_simple *sb;
+
+	sb = (struct sfi_table_simple *)table;
+	ia_major = (sb->pentry[1] >> 0) & 0xFFFF;
+	ia_minor = (sb->pentry[1] >> 16) & 0xFFFF;
+	printk(KERN_INFO "mrst_pmu: IA FW version v%x.%x\n",
+		ia_major, ia_minor);
+
+	return 0;
+}
+
+static int __init scu_fw_check(void)
+{
+	int ret;
+	u32 fw_version;
+
+	if (!pmu_reg)
+		return 0;	/* this driver didn't probe-out */
+
+	sfi_table_parse("OEMB", NULL, NULL, pmu_sfi_parse_oem);
+
+	if (ia_major < 0x6005 || ia_minor < 0x1525) {
+		WARN(1, "mrst_pmu: IA FW version too old\n");
+		return -1;
+	}
+
+	ret = intel_scu_ipc_command(IPCMSG_FW_REVISION, 0, NULL, 0,
+					&fw_version, 1);
+
+	if (ret) {
+		WARN(1, "mrst_pmu: IPC FW version? %d\n", ret);
+	} else {
+		int scu_major = (fw_version >> 8) & 0xFF;
+		int scu_minor = (fw_version >> 0) & 0xFF;
+
+		printk(KERN_INFO "mrst_pmu: firmware v%x\n", fw_version);
+
+		if ((scu_major >= 0xC0) && (scu_minor >= 0x49)) {
+			printk(KERN_INFO "mrst_pmu: enabling S0i3\n");
+			mrst_pmu_s0i3_enable = true;
+		} else {
+			WARN(1, "mrst_pmu: S0i3 disabled, old firmware %X.%X",
+					scu_major, scu_minor);
+		}
+	}
+	return 0;
+}
+late_initcall(scu_fw_check);
+module_exit(mid_pci_cleanup);
diff --git a/arch/x86/platform/mrst/pmu.h b/arch/x86/platform/mrst/pmu.h
new file mode 100644
index 0000000..bfbfe64
--- /dev/null
+++ b/arch/x86/platform/mrst/pmu.h
@@ -0,0 +1,234 @@
+/*
+ * mrst/pmu.h - private definitions for MRST Power Management Unit mrst/pmu.c
+ *
+ * Copyright (c) 2011, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#ifndef _MRST_PMU_H_
+#define _MRST_PMU_H_
+
+#define PCI_DEV_ID_MRST_PMU		0x0810
+#define MRST_PMU_DRV_NAME		"mrst_pmu"
+#define	PCI_SUB_CLASS_MASK		0xFF00
+
+#define	PCI_VENDOR_CAP_LOG_ID_MASK	0x7F
+#define PCI_VENDOR_CAP_LOG_SS_MASK	0x80
+
+#define SUB_SYS_ALL_D0I1	0x01155555
+#define S0I3_WAKE_SOURCES	0x00001FFF
+
+#define PM_S0I3_COMMAND					\
+	((0 << 31) |	/* Reserved */			\
+	(0 << 30) |	/* Core must be idle */		\
+	(0xc2 << 22) |	/* ACK C6 trigger */		\
+	(3 << 19) |	/* Trigger on DMI message */	\
+	(3 << 16) |	/* Enter S0i3 */		\
+	(0 << 13) |	/* Numeric mode ID (sw) */	\
+	(3 << 9) |	/* Trigger mode */		\
+	(0 << 8) |	/* Do not interrupt */		\
+	(1 << 0))	/* Set configuration */
+
+#define	LSS_DMI		0
+#define	LSS_SD_HC0	1
+#define	LSS_SD_HC1	2
+#define	LSS_NAND	3
+#define	LSS_IMAGING	4
+#define	LSS_SECURITY	5
+#define	LSS_DISPLAY	6
+#define	LSS_USB_HC	7
+#define	LSS_USB_OTG	8
+#define	LSS_AUDIO	9
+#define	LSS_AUDIO_LPE	9
+#define	LSS_AUDIO_SSP	9
+#define	LSS_I2C0	10
+#define	LSS_I2C1	10
+#define	LSS_I2C2	10
+#define	LSS_KBD		10
+#define	LSS_SPI0	10
+#define	LSS_SPI1	10
+#define	LSS_SPI2	10
+#define	LSS_GPIO	10
+#define	LSS_SRAM	11	/* used by SCU, do not touch */
+#define	LSS_SD_HC2	12
+/* LSS hardware bits 15,14,13 are hardwired to 0, thus unusable */
+#define MRST_NUM_LSS	13
+
+#define MIN(a, b) (((a) < (b)) ? (a) : (b))
+
+#define	SSMSK(mask, lss) ((mask) << ((lss) * 2))
+#define	D0	0
+#define	D0i1	1
+#define	D0i2	2
+#define	D0i3	3
+
+#define S0I3_SSS_TARGET	(		\
+	SSMSK(D0i1, LSS_DMI) |		\
+	SSMSK(D0i3, LSS_SD_HC0) |	\
+	SSMSK(D0i3, LSS_SD_HC1) |	\
+	SSMSK(D0i3, LSS_NAND) |		\
+	SSMSK(D0i3, LSS_SD_HC2) |	\
+	SSMSK(D0i3, LSS_IMAGING) |	\
+	SSMSK(D0i3, LSS_SECURITY) |	\
+	SSMSK(D0i3, LSS_DISPLAY) |	\
+	SSMSK(D0i3, LSS_USB_HC) |	\
+	SSMSK(D0i3, LSS_USB_OTG) |	\
+	SSMSK(D0i3, LSS_AUDIO) |	\
+	SSMSK(D0i1, LSS_I2C0))
+
+/*
+ * D0i1 on Langwell is Autonomous Clock Gating (ACG).
+ * Enable ACG on every LSS except camera and audio
+ */
+#define D0I1_ACG_SSS_TARGET	 \
+	(SUB_SYS_ALL_D0I1 & ~SSMSK(D0i1, LSS_IMAGING) & ~SSMSK(D0i1, LSS_AUDIO))
+
+enum cm_mode {
+	CM_NOP,			/* ignore the config mode value */
+	CM_IMMEDIATE,
+	CM_DELAY,
+	CM_TRIGGER,
+	CM_INVALID
+};
+
+enum sys_state {
+	SYS_STATE_S0I0,
+	SYS_STATE_S0I1,
+	SYS_STATE_S0I2,
+	SYS_STATE_S0I3,
+	SYS_STATE_S3,
+	SYS_STATE_S5
+};
+
+#define SET_CFG_CMD	1
+
+enum int_status {
+	INT_SPURIOUS = 0,
+	INT_CMD_DONE = 1,
+	INT_CMD_ERR = 2,
+	INT_WAKE_RX = 3,
+	INT_SS_ERROR = 4,
+	INT_S0IX_MISS = 5,
+	INT_NO_ACKC6 = 6,
+	INT_INVALID = 7,
+};
+
+/* PMU register interface */
+static struct mrst_pmu_reg {
+	u32 pm_sts;		/* 0x00 */
+	u32 pm_cmd;		/* 0x04 */
+	u32 pm_ics;		/* 0x08 */
+	u32 _resv1;		/* 0x0C */
+	u32 pm_wkc[2];		/* 0x10 */
+	u32 pm_wks[2];		/* 0x18 */
+	u32 pm_ssc[4];		/* 0x20 */
+	u32 pm_sss[4];		/* 0x30 */
+	u32 pm_wssc[4];		/* 0x40 */
+	u32 pm_c3c4;		/* 0x50 */
+	u32 pm_c5c6;		/* 0x54 */
+	u32 pm_msi_disable;	/* 0x58 */
+} *pmu_reg;
+
+static inline u32 pmu_read_sts(void) { return readl(&pmu_reg->pm_sts); }
+static inline u32 pmu_read_ics(void) { return readl(&pmu_reg->pm_ics); }
+static inline u32 pmu_read_wks(void) { return readl(&pmu_reg->pm_wks[0]); }
+static inline u32 pmu_read_sss(void) { return readl(&pmu_reg->pm_sss[0]); }
+
+static inline void pmu_write_cmd(u32 arg) { writel(arg, &pmu_reg->pm_cmd); }
+static inline void pmu_write_ics(u32 arg) { writel(arg, &pmu_reg->pm_ics); }
+static inline void pmu_write_wkc(u32 arg) { writel(arg, &pmu_reg->pm_wkc[0]); }
+static inline void pmu_write_ssc(u32 arg) { writel(arg, &pmu_reg->pm_ssc[0]); }
+static inline void pmu_write_wssc(u32 arg)
+					{ writel(arg, &pmu_reg->pm_wssc[0]); }
+
+static inline void pmu_msi_enable(void) { writel(0, &pmu_reg->pm_msi_disable); }
+static inline u32 pmu_msi_is_disabled(void)
+				{ return readl(&pmu_reg->pm_msi_disable); }
+
+union pmu_pm_ics {
+	struct {
+		u32 cause:8;
+		u32 enable:1;
+		u32 pending:1;
+		u32 reserved:22;
+	} bits;
+	u32 value;
+};
+
+static inline void pmu_irq_enable(void)
+{
+	union pmu_pm_ics pmu_ics;
+
+	pmu_ics.value = pmu_read_ics();
+	pmu_ics.bits.enable = 1;
+	pmu_write_ics(pmu_ics.value);
+}
+
+union pmu_pm_status {
+	struct {
+		u32 pmu_rev:8;
+		u32 pmu_busy:1;
+		u32 mode_id:4;
+		u32 Reserved:19;
+	} pmu_status_parts;
+	u32 pmu_status_value;
+};
+
+static inline int pmu_read_busy_status(void)
+{
+	union pmu_pm_status result;
+
+	result.pmu_status_value = pmu_read_sts();
+
+	return result.pmu_status_parts.pmu_busy;
+}
+
+/* pmu set config parameters */
+struct cfg_delay_param_t {
+	u32 cmd:8;
+	u32 ioc:1;
+	u32 cfg_mode:4;
+	u32 mode_id:3;
+	u32 sys_state:3;
+	u32 cfg_delay:8;
+	u32 rsvd:5;
+};
+
+struct cfg_trig_param_t {
+	u32 cmd:8;
+	u32 ioc:1;
+	u32 cfg_mode:4;
+	u32 mode_id:3;
+	u32 sys_state:3;
+	u32 cfg_trig_type:3;
+	u32 cfg_trig_val:8;
+	u32 cmbi:1;
+	u32 rsvd1:1;
+};
+
+union pmu_pm_set_cfg_cmd_t {
+	union {
+		struct cfg_delay_param_t d_param;
+		struct cfg_trig_param_t t_param;
+	} pmu2_params;
+	u32 pmu_pm_set_cfg_cmd_value;
+};
+
+#ifdef FUTURE_PATCH
+extern int mrst_s0i3_entry(u32 regval, u32 *regaddr);
+#else
+static inline int mrst_s0i3_entry(u32 regval, u32 *regaddr) { return -1; }
+#endif
+#endif
-- 
1.7.6.396.ge0613

^ permalink raw reply related

* [GIT PATCH] tools/power patches for Linux 3.1
From: Len Brown @ 2011-08-02 22:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-pm, linux-kernel

Hi Linus,

please pull from: 

git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6.git tools-release

This will update the files shown below.

thanks!

Len Brown
Intel Open Source Technology Center

ps. individual patches are available on linux-pm@lists.linux-foundation.org

 tools/power/x86/turbostat/turbostat.c              |   46 ++++++++++----------
 .../x86_energy_perf_policy.c                       |    5 +-
 2 files changed, 25 insertions(+), 26 deletions(-)

through these commits:

Len Brown (2):
      tools/power x86_energy_perf_policy: fix print of uninitialized string
      tools/power turbostat: fit output into 80 columns on snb-ep

with this log:

commit d30c4b7a87e8b19583d5ef1402d9b38f51e30f44
Author: Len Brown <len.brown@intel.com>
Date:   Sun Jul 31 18:19:33 2011 -0400

    tools/power turbostat: fit output into 80 columns on snb-ep
    
    Reduce columns for package number to 1.
    If you can afford more than 9 packages,
    you can also afford a terminal with more than 80 columns:-)
    
    Also shave a column also off the package C-states
    
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e4c0d0e22ce5cf84b3993b5af6c2dac08d52f06b
Author: Len Brown <len.brown@intel.com>
Date:   Fri Jul 15 23:39:00 2011 -0400

    tools/power x86_energy_perf_policy: fix print of uninitialized string
    
    Looks like I was going to stick the brand string
    in the verbose ouput, but didn't get around to it.
    
    Signed-off-by: Len Brown <len.brown@intel.com>

^ permalink raw reply

* [PATCH 2/2] tools/power turbostat: fit output into 80 columns on snb-ep
From: Len Brown @ 2011-08-02 22:36 UTC (permalink / raw)
  To: linux-pm; +Cc: Len Brown
In-Reply-To: <e4c0d0e22ce5cf84b3993b5af6c2dac08d52f06b.1312324497.git.len.brown@intel.com>

From: Len Brown <len.brown@intel.com>

Reduce columns for package number to 1.
If you can afford more than 9 packages,
you can also afford a terminal with more than 80 columns:-)

Also shave a column also off the package C-states

Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c |   46 ++++++++++++++++----------------
 1 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 6d8ef4a..8b2d37b 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -128,34 +128,34 @@ unsigned long long get_msr(int cpu, off_t offset)
 void print_header(void)
 {
 	if (show_pkg)
-		fprintf(stderr, "pkg ");
+		fprintf(stderr, "pk");
 	if (show_core)
-		fprintf(stderr, "core");
+		fprintf(stderr, " cr");
 	if (show_cpu)
 		fprintf(stderr, " CPU");
 	if (do_nhm_cstates)
-		fprintf(stderr, "   %%c0 ");
+		fprintf(stderr, "    %%c0 ");
 	if (has_aperf)
-		fprintf(stderr, "  GHz");
+		fprintf(stderr, " GHz");
 	fprintf(stderr, "  TSC");
 	if (do_nhm_cstates)
-		fprintf(stderr, "   %%c1 ");
+		fprintf(stderr, "    %%c1");
 	if (do_nhm_cstates)
-		fprintf(stderr, "   %%c3 ");
+		fprintf(stderr, "    %%c3");
 	if (do_nhm_cstates)
-		fprintf(stderr, "   %%c6 ");
+		fprintf(stderr, "    %%c6");
 	if (do_snb_cstates)
-		fprintf(stderr, "   %%c7 ");
+		fprintf(stderr, "    %%c7");
 	if (do_snb_cstates)
-		fprintf(stderr, "  %%pc2 ");
+		fprintf(stderr, "  %%pc2");
 	if (do_nhm_cstates)
-		fprintf(stderr, "  %%pc3 ");
+		fprintf(stderr, "  %%pc3");
 	if (do_nhm_cstates)
-		fprintf(stderr, "  %%pc6 ");
+		fprintf(stderr, "  %%pc6");
 	if (do_snb_cstates)
-		fprintf(stderr, "  %%pc7 ");
+		fprintf(stderr, "  %%pc7");
 	if (extra_msr_offset)
-		fprintf(stderr, "       MSR 0x%x ", extra_msr_offset);
+		fprintf(stderr, "        MSR 0x%x ", extra_msr_offset);
 
 	putc('\n', stderr);
 }
@@ -194,14 +194,14 @@ void print_cnt(struct counters *p)
 	/* topology columns, print blanks on 1st (average) line */
 	if (p == cnt_average) {
 		if (show_pkg)
-			fprintf(stderr, "    ");
+			fprintf(stderr, " ");
 		if (show_core)
 			fprintf(stderr, "    ");
 		if (show_cpu)
 			fprintf(stderr, "    ");
 	} else {
 		if (show_pkg)
-			fprintf(stderr, "%4d", p->pkg);
+			fprintf(stderr, "%d", p->pkg);
 		if (show_core)
 			fprintf(stderr, "%4d", p->core);
 		if (show_cpu)
@@ -241,22 +241,22 @@ void print_cnt(struct counters *p)
 		if (!skip_c1)
 			fprintf(stderr, "%7.2f", 100.0 * p->c1/p->tsc);
 		else
-			fprintf(stderr, "   ****");
+			fprintf(stderr, "  ****");
 	}
 	if (do_nhm_cstates)
-		fprintf(stderr, "%7.2f", 100.0 * p->c3/p->tsc);
+		fprintf(stderr, " %6.2f", 100.0 * p->c3/p->tsc);
 	if (do_nhm_cstates)
-		fprintf(stderr, "%7.2f", 100.0 * p->c6/p->tsc);
+		fprintf(stderr, " %6.2f", 100.0 * p->c6/p->tsc);
 	if (do_snb_cstates)
-		fprintf(stderr, "%7.2f", 100.0 * p->c7/p->tsc);
+		fprintf(stderr, " %6.2f", 100.0 * p->c7/p->tsc);
 	if (do_snb_cstates)
-		fprintf(stderr, "%7.2f", 100.0 * p->pc2/p->tsc);
+		fprintf(stderr, " %5.2f", 100.0 * p->pc2/p->tsc);
 	if (do_nhm_cstates)
-		fprintf(stderr, "%7.2f", 100.0 * p->pc3/p->tsc);
+		fprintf(stderr, " %5.2f", 100.0 * p->pc3/p->tsc);
 	if (do_nhm_cstates)
-		fprintf(stderr, "%7.2f", 100.0 * p->pc6/p->tsc);
+		fprintf(stderr, " %5.2f", 100.0 * p->pc6/p->tsc);
 	if (do_snb_cstates)
-		fprintf(stderr, "%7.2f", 100.0 * p->pc7/p->tsc);
+		fprintf(stderr, " %5.2f", 100.0 * p->pc7/p->tsc);
 	if (extra_msr_offset)
 		fprintf(stderr, "  0x%016llx", p->extra_msr);
 	putc('\n', stderr);
-- 
1.7.6.396.ge0613

^ permalink raw reply related

* [PATCH 1/2] tools/power x86_energy_perf_policy: fix print of uninitialized string
From: Len Brown @ 2011-08-02 22:36 UTC (permalink / raw)
  To: linux-pm; +Cc: Len Brown
In-Reply-To: <1312324567-4374-1-git-send-email-lenb@kernel.org>

From: Len Brown <len.brown@intel.com>

Looks like I was going to stick the brand string
in the verbose ouput, but didn't get around to it.

Signed-off-by: Len Brown <len.brown@intel.com>
---
 .../x86_energy_perf_policy.c                       |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c
index 2618ef2..33c5c7e 100644
--- a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c
+++ b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c
@@ -137,7 +137,6 @@ void cmdline(int argc, char **argv)
 void validate_cpuid(void)
 {
 	unsigned int eax, ebx, ecx, edx, max_level;
-	char brand[16];
 	unsigned int fms, family, model, stepping;
 
 	eax = ebx = ecx = edx = 0;
@@ -160,8 +159,8 @@ void validate_cpuid(void)
 		model += ((fms >> 16) & 0xf) << 4;
 
 	if (verbose > 1)
-		printf("CPUID %s %d levels family:model:stepping "
-			"0x%x:%x:%x (%d:%d:%d)\n", brand, max_level,
+		printf("CPUID %d levels family:model:stepping "
+			"0x%x:%x:%x (%d:%d:%d)\n", max_level,
 			family, model, stepping, family, model, stepping);
 
 	if (!(edx & (1 << 5))) {
-- 
1.7.6.396.ge0613

^ permalink raw reply related

* tools/power patch queue for Linux 3.1
From: Len Brown @ 2011-08-02 22:36 UTC (permalink / raw)
  To: linux-pm

Let me know if you have troubles with
any of these patches.

thanks,
Len Brown, Intel Open Source Technology Center

^ permalink raw reply

* Re: [PATCH 02/11] PM: extend PM QoS with per-device wake-up constraints
From: Rafael J. Wysocki @ 2011-08-02 22:16 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: markgross, Mark Brown, Linux PM mailing list, linux-omap,
	Jean Pihet
In-Reply-To: <8762mfecdk.fsf@ti.com>

On Tuesday, August 02, 2011, Kevin Hilman wrote:
> [adding Mark Brown as we discussed similar topics a couple plumbers ago]
> 
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> [...]
> 
> >> >> The new class is only available from kernel drivers and so is not exported
> >> >> to user space.
> >> >
> >> > It should be available to user space, however, because in many cases drivers
> >> > simply have no idea what values to use (after all, the use decides if he
> >> > wants to trade worse video playback quality for better battery life, for
> >> > example).
> >> >
> >> 
> >> FWIW, I think it's wrong to expose the raw per-device constraints
> >> directly to userspace.
> >> 
> >> I think it's the responsibility of the subsystems (video, audio, input,
> >> etc.) to expose QoS knobs to userspace as they see fit and now allow
> >> userspace to tinker directly with QoS constraints.
> >
> > This assumes that those "subsystems" or rather "frameworks" (a bus type or
> > a device class is a subsystem in the terminology used throughout the PM
> > documentation) will (a) know about PM QoS and (b) will care to handle it.
> > Both (a) and (b) seem to be unrealistic IMHO.
> 
> I disagree and think that both are quite realistic (mainly because they
> exist today, albiet mostly out of tree because no generic QoS framework
> exist.  e.g. on OMAP, we have OMAP-specific *kernel* APIs for requesting
> per-device wakeup latencies, and drivers and frameworks are using them.)

I'm sure there are frameworks using such things.  I'm also sure there
are frameworks that don't.  BTW, the "we have it out of the tree" argument is
not very useful, so I'd appreciate it if you didn't use it.

> Most of these frameworks already have QoS constraints/requirements but
> have no generic way to express them.  That's why we're pushing for a
> generic constraints framework.
> 
> Consider video for example.  It's the kernel-side drivers, not user
> space apps, that know about the latency or throughput constraints based
> on e.g. frame rate, bytes/pixel, double/triple buffering, PIP, multiple
> displays, etc. etc.   
> 
> In this case, the video framework (V4L2) might not want any knobs
> exposed to userspace because userspace simply doesn't have the knowledge
> to set appropriate constraints.  I'm less familiar with audio, but I
> believe audio would be similar (sample rate, number of channels, mixing
> with other concurrent audio streams, etc. etc. are all known by the
> kernel-side code.)
> 
> On the other hand, consider touchscreen.  Touchscreens have a
> configurable sample rate which allows a trade-off between power savings
> and accuracy.  For example, low accuracy (and thus low power) would be
> fine for a UI which is only taking finger gestures, but if the
> application was doing handwriting recognition with a stylus, it would
> likely want higher accuracy (and consume more power.)
> 
> In this case, the kernel driver has no way of knowing what the
> application is doing, so some way for touchscreen apps to request this
> kind of constraint would be required.
> 
> My point is it should be up to each framework (audio, video,
> input/touchscreen) to expose a userspace interface to their users that
> makes sense for the *specific needs* of the framework.
> 
> Using the above examples, audio and video might not need (or want) to
> expose anything to userspace, where touchscreen would.  IMO, it would be
> much more obvious for a touchscreen app to use a new API in tslib (which
> it is already using) to set its constraints rather than having to use
> tslib for most things but a sysfs file for QoS.
> 
> > We already export wakeup and runtime PM knobs per device via sysfs and
> > I'm not so sure why PM QoS is different in that respect.
> 
> As stated above, because for many frameworks userspace simply does not
> have all (or any) of the knowledge to set the right constraints.

I still don't understand what's wrong with allowing user space to _add_
requirements.  The will only override the drivers' or frameworks' requirements
if they are stronger, so the functionality shouldn't be hurt.  They may cause
some more energy to be used, but if user space wants that, it's pretty much
fine by me.

Thanks,
Rafael

^ permalink raw reply

* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: Kevin Hilman @ 2011-08-02 22:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, MyungJoo Ham,
	Thomas Gleixner, linux-pm
In-Reply-To: <201107290010.53788.rjw@sisk.pl>

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Friday, July 15, 2011, MyungJoo Ham wrote:
>> For a usage example, please look at
>> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>> 
>> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
>> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
>> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
>> and other related clocks simply follow the determined DDR RAM clock.
>> 
>> The DEVFREQ driver for Exynos4210 memory bus is at
>> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
>> 
>> MyungJoo Ham (3):
>>   PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>>     OPPs
>>   PM / DEVFREQ: add example governors
>>   PM / DEVFREQ: add sysfs interface (including user tickling)
>
> OK, I'm going to take the patches for 3.2.

Sorry for being late to the discussion, but personally I don't think
this should be merged for v3.2.

First, I think the governor part of this series is basically fine, and
can see some cases where it would be useful, but as Mike has pointed
out, there is still a majority of devices for which a governor like this
would be overkill.

My main problem is with the QoS aspects.  There is significant overlap
between this approach and the per-device PM QoS approach currently being
proposed by Jean Pihet, and I think any sort of per-device DVFS should
be built on top of a more generic per-device QoS layer (such as Jean's.)

This series currently provides a *very* basic QoS mechanism (e.g. fixed
duration frequency constraint) in the form of "tickle", which BTW I seem
to having a hard time understanding (more on that below...)

More importantly though, this series also introduces a sysfs layer for
doing its QoS-like stuff, so adding this and then adding a more generic
per-device QoS is asking for confusion about how userspace is to do QoS.
And adding a sysfs interface may be turn out to be difficult to remove.

Basically, without a more general constraints mechanism in place, I
don't see how this can be generally useful since there are too many
assumptions made with the current "tickle" approach, and as Mike has
pointed out, it cannot cleanly handle cases where there might be
multiple DVFS-related constraints on a given device.

OK, back to "tickle"...  I haven't yet fully understood how that
interface is intended to be used, or who the potential users might be
and it is not documented in the code or changelog.  I also didn't see
any users of that API (except the sysfs code.)

IIUC, tickle is just basically a way to set a frequency constraint on a
device for a fixed duration.  However, if tickle has been requested, any
OPP change will also force a change to the highest performance OPP
temporarily before changing to the target OPP.

Maybe I'm not understanding the usage of it fully, but that seems like
hard-coding policy into the framework that might not be appropriate.
For example, what if there are other devices with constraints such that
they cannot currently scale frequency/voltage?

Mabye MyungJoo can explain in more detail the usecases for tickle?

Thanks,

Kevin

^ permalink raw reply

* Re: [PATCH v4 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
From: Kevin Hilman @ 2011-08-02 21:56 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: Len Brown, linux-pm, Greg Kroah-Hartman, Thomas Gleixner,
	Kyungmin Park
In-Reply-To: <1310717510-19002-2-git-send-email-myungjoo.ham@samsung.com>

MyungJoo Ham <myungjoo.ham@samsung.com> writes:

> With OPPs, a device may have multiple operable frequency and voltage
> sets. However, there can be multiple possible operable sets and a system
> will need to choose one from them. In order to reduce the power
> consumption (by reducing frequency and voltage) without affecting the
> performance too much, a Dynamic Voltage and Frequency Scaling (DVFS)
> scheme may be used.
>
> This patch introduces the DVFS capability to non-CPU devices with OPPs.
> DVFS is a techique whereby the frequency and supplied voltage of a
> device is adjusted on-the-fly. DVFS usually sets the frequency as low
> as possible with given conditions (such as QoS assurance) and adjusts
> voltage according to the chosen frequency in order to reduce power
> consumption and heat dissipation.
>
> The generic DVFS for devices, DEVFREQ, may appear quite similar with
> /drivers/cpufreq.  However, CPUFREQ does not allow to have multiple
> devices registered and is not suitable to have multiple heterogenous
> devices with different (but simple) governors.
>
> Normally, DVFS mechanism controls frequency based on the demand for
> the device, and then, chooses voltage based on the chosen frequency.
> DEVFREQ also controls the frequency based on the governor's frequency
> recommendation and let OPP pick up the pair of frequency and voltage
> based on the recommended frequency. Then, the chosen OPP is passed to
> device driver's "target" callback.
>
> Tested with memory bus of Exynos4-NURI board.
>
> The test code with board support for Exynos4-NURI is at
> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>
> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>

[...]

> +int devfreq_update(struct device *dev)
> +{
> +	struct devfreq *devfreq;
> +	int err = 0;
> +
> +	mutex_lock(&devfreq_list_lock);
> +
> +	devfreq = find_device_devfreq(dev);
> +	if (IS_ERR(devfreq)) {
> +		err = PTR_ERR(devfreq);
> +		goto out;
> +	}
> +
> +	/*
> +	 * If the maximum frequency available is changed either by
> +	 * enabling higher frequency or disabling the current
> +	 * maximum frequency, we need to adjust the frequency
> +	 * (tickle) again if the device has been being tickled.
> +	 */
> +	if (devfreq->tickle) {
> +		unsigned long freq = devfreq->profile->max_freq;
> +		struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
> +
> +		if (IS_ERR(opp)) {
> +			err = PTR_ERR(opp);
> +			goto out;
> +		}
> +
> +		/* Max freq available is not changed */
> +		if (devfreq->previous_freq == freq)
> +			goto out;
> +
> +		/* Tickle again. Max freq available is changed */
> +		err = devfreq->profile->target(devfreq->dev, opp);
> +		if (!err)
> +			devfreq->previous_freq = freq;

This looks an awful lot like _devfreq_tickle_device()... wondering why
that is not called here.

> +	} else {
> +		/* Reevaluate the proper frequency */
> +		err = devfreq_do(devfreq);
> +	}
> +
> +out:
> +	mutex_unlock(&devfreq_list_lock);
> +	return err;
> +}

Kevin

^ permalink raw reply

* [GIT PATCH] ACPI patches for Linux 3.1
From: Len Brown @ 2011-08-02 21:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-acpi, linux-pm, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 22580 bytes --]

Hi Linus,

please pull from: 

git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git release

This will update the files shown below.

thanks!

Len Brown
Intel Open Source Technology Center

ps. individual patches are available on linux-acpi@vger.kernel.org

 Documentation/feature-removal-schedule.txt |    9 --
 Documentation/kernel-parameters.txt        |    5 +
 drivers/acpi/acpica/acglobal.h             |    6 +
 drivers/acpi/acpica/aclocal.h              |    1 +
 drivers/acpi/acpica/acpredef.h             |    1 +
 drivers/acpi/acpica/nspredef.c             |   19 +++-
 drivers/acpi/acpica/nsrepair2.c            |   15 +++
 drivers/acpi/acpica/tbinstal.c             |   27 +++++-
 drivers/acpi/battery.c                     |   82 ++++++++++------
 drivers/acpi/dock.c                        |    4 +-
 drivers/acpi/ec_sys.c                      |    2 +-
 drivers/acpi/fan.c                         |    2 +-
 drivers/acpi/osl.c                         |   25 +++++-
 drivers/acpi/pci_irq.c                     |   58 +++++++++++
 drivers/acpi/pci_root.c                    |    3 +-
 drivers/acpi/processor_thermal.c           |    2 +-
 drivers/acpi/sbs.c                         |   13 ++-
 drivers/acpi/sleep.c                       |   16 +++
 drivers/acpi/sysfs.c                       |    4 +-
 drivers/acpi/thermal.c                     |    2 +-
 drivers/acpi/video.c                       |    2 +-
 drivers/ata/libata-acpi.c                  |    4 +-
 drivers/pci/hotplug/acpiphp_glue.c         |    2 +-
 drivers/thermal/Kconfig                    |    8 +-
 drivers/thermal/thermal_sys.c              |  142 ++++++++++++++++++++++------
 include/acpi/acpi_drivers.h                |    2 +-
 include/acpi/acpixf.h                      |    3 +-
 include/acpi/processor.h                   |    2 +-
 include/linux/acpi.h                       |    1 -
 include/linux/thermal.h                    |   22 -----
 30 files changed, 352 insertions(+), 132 deletions(-)

through these commits:

Bob Moore (5):
      ACPICA: Load operator: re-instate most restrictions on incoming table signature
      ACPICA: Add missing _TDL to list of predefined names
      ACPICA: Update to version 20110527
      ACPICA: Add option to disable method return value validation and repair
      ACPICA: Update to version 20110623

David Rientjes (1):
      ACPI: remove NID_INVAL

Fenghua Yu (1):
      ACPICA: Do not repair _TSS return package if _PSS is present

Jean Delvare (3):
      thermal: hide CONFIG_THERMAL_HWMON
      thermal: split hwmon lookup to a separate function
      thermal: make THERMAL_HWMON implementation fully internal

Jon Mason (1):
      ACPI: fix 80 char overflow

Lan Tianyu (8):
      ACPI / SBS: Add getting state operation in the acpi_sbs_battery_get_property()
      ACPI / SBS: Correct the value of power_now and power_avg in the sysfs
      ACPI / Battery: Add the power unit macro
      ACPI / Battery: Change 16-bit signed negative battery current into correct value
      ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks
      ACPI / Battery: Add the hibernation process in the battery_notify()
      ACPI / Battery: Add the check before refresh sysfs in the battery_notify()
      ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()

Len Brown (2):
      ACPI print OSI(Linux) warning only once
      ACPI:  delete stale reference in kernel-parameters.txt

Shaohua Li (1):
      ACPI: add missing _OSI strings

Stefan Assmann (1):
      ACPI: fix CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS

Stefan Hajnoczi (2):
      ACPI / Battery: avoid acpi_battery_add() use-after-free
      ACPI / Battery: propagate sysfs error in acpi_battery_add()

Takao Indoh (1):
      ACPI: introduce "acpi_rsdp=" parameter for kdump

Vasiliy Kulikov (1):
      ACPI: constify ops structs

Zhang Rui (1):
      ACPI: DMI workaround for Asus A8N-SLI Premium and Asus A8N-SLI DELUX

with this log:

commit 4a8f5058bde15d737abe39b5bed3f21dcb6599d2
Merge: 3eb208f eb03cb0 d7f6169 e410829 bb0c5ed aa16597 4996c02 6c8111e
Author: Len Brown <len.brown@intel.com>
Date:   Tue Aug 2 17:22:09 2011 -0400

    Merge branches 'acpica', 'battery', 'boot-irqs', 'bz-24492', 'bz-9528', 'from-akpm', 'kexec-param' and 'misc' into release
    
    Conflicts:
    	Documentation/kernel-parameters.txt
    
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 6c8111e9a0e73ef1e58a1bf0a10c23ee1512e7a2
Author: Len Brown <len.brown@intel.com>
Date:   Tue Aug 2 17:15:33 2011 -0400

    ACPI:  delete stale reference in kernel-parameters.txt
    
    Says for acpi=
                            See also Documentation/power/pm.txt, pci=noacpi
    
    but this file does not exist
    
    Reported-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit aa165971c2923d05988f920c978e438dbc7b0de6
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Thu Jul 28 13:48:43 2011 -0700

    ACPI: add missing _OSI strings
    
    Linux supports some optional features, but it should notify the BIOS about
    them via the _OSI method.  Currently Linux doesn't notify any, which might
    make such features not work because the BIOS doesn't know about them.
    
    Jarosz has a system which needs this to make ACPI processor aggregator
    device work.
    
    Reported-by: "Jarosz, Sebastian" <sebastian.jarosz@intel.com>
    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Acked-by: Matthew Garrett <mjg@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit f52e00c668669c9c290e84adf859c76db6d92a5a
Author: David Rientjes <rientjes@google.com>
Date:   Thu Jul 28 13:48:43 2011 -0700

    ACPI: remove NID_INVAL
    
    b552a8c56db8 ("ACPI: remove NID_INVAL") removed the left over uses of
    NID_INVAL, but didn't actually remove the definition.  Remove it.
    
    Signed-off-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 31f5396ad3bde23c8416e8d23ba425e27f413314
Author: Jean Delvare <khali@linux-fr.org>
Date:   Thu Jul 28 13:48:42 2011 -0700

    thermal: make THERMAL_HWMON implementation fully internal
    
    THERMAL_HWMON is implemented inside the thermal_sys driver and has no
    effect on drivers implementing thermal zones, so they shouldn't see
    anything related to it in <linux/thermal.h>.  Making the THERMAL_HWMON
    implementation fully internal has two advantages beyond the cleaner
    design:
    
    * This avoids rebuilding all thermal drivers if the THERMAL_HWMON
      implementation changes, or if CONFIG_THERMAL_HWMON gets enabled or
      disabled.
    
    * This avoids breaking the thermal kABI in these cases too, which should
      make distributions happy.
    
    The only drawback I can see is slightly higher memory fragmentation, as
    the number of kzalloc() calls will increase by one per thermal zone.  But
    I doubt it will be a problem in practice, as I've never seen a system with
    more than two thermal zones.
    
    Signed-off-by: Jean Delvare <khali@linux-fr.org>
    Cc: Rene Herman <rene.herman@gmail.com>
    Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0d97d7a494d43be77f57e688369be0aae33d1ade
Author: Jean Delvare <khali@linux-fr.org>
Date:   Thu Jul 28 13:48:41 2011 -0700

    thermal: split hwmon lookup to a separate function
    
    We'll soon need to reuse it.
    
    Signed-off-by: Jean Delvare <khali@linux-fr.org>
    Cc: Rene Herman <rene.herman@gmail.com>
    Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit ab92402af04c151c26541eb28e1c26286a429db5
Author: Jean Delvare <khali@linux-fr.org>
Date:   Thu Jul 28 13:48:40 2011 -0700

    thermal: hide CONFIG_THERMAL_HWMON
    
    It's about time to revert 16d752397301b9 ("thermal: Create
    CONFIG_THERMAL_HWMON=n").  Anybody running a kernel >= 2.6.40 would also
    be running a recent enough version of lm-sensors.
    
    Actually having CONFIG_THERMAL_HWMON is pretty convenient so instead of
    dropping it, we keep it but hide it.
    
    Signed-off-by: Jean Delvare <khali@linux-fr.org>
    Cc: Rene Herman <rene.herman@gmail.com>
    Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8997621bb2daaf19a4e9d82f118224159d8054e2
Author: Len Brown <len.brown@intel.com>
Date:   Tue Aug 2 00:45:48 2011 -0400

    ACPI print OSI(Linux) warning only once
    
    This message gets repeated on some machines:
    https://bugzilla.kernel.org/show_bug.cgi?id=29292
    
    Signed-off-by: Len Brown <len.brown@intel.com>

commit bb0c5ed6ec523199e34e81dcef8e987507553b63
Author: Zhang Rui <rui.zhang@intel.com>
Date:   Sat Jul 30 01:40:48 2011 -0400

    ACPI: DMI workaround for Asus A8N-SLI Premium and Asus A8N-SLI DELUX
    
    DMI workaround for A8N-SLI Premium and A8N-SLI DELUXE
    to enable the s3 suspend old ordering.
    http://bugzilla.kernel.org/show_bug.cgi?id=9528
    
    Tested-by: Heiko Ettelbrück <hbruckynews@gmx.de>
    Tested-by: Brian Beardall <brian@rapsure.net>
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit eb03cb02b74df6dd0b653d5f6d976f16a434dfaf
Author: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Date:   Tue Jul 12 09:03:29 2011 +0100

    ACPI / Battery: propagate sysfs error in acpi_battery_add()
    
    Make sure the error return from sysfs_add_battery() is checked and
    propagated out from acpi_battery_add().
    
    Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e80bba4b5108c6479379740201b0a5d9da5ffbac
Author: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Date:   Tue Jul 12 09:03:28 2011 +0100

    ACPI / Battery: avoid acpi_battery_add() use-after-free
    
    When acpi_battery_add_fs() fails the error handling code does not clean
    up completely.  Moreover, it does not return resulting in a
    use-after-free.
    
    Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 4996c02306a25def1d352ec8e8f48895bbc7dea9
Author: Takao Indoh <indou.takao@jp.fujitsu.com>
Date:   Thu Jul 14 18:05:21 2011 -0400

    ACPI: introduce "acpi_rsdp=" parameter for kdump
    
    There is a problem with putting the first kernel in EFI virtual mode,
    it is that when the second kernel comes up it tries to initialize the
    EFI again and once we have put EFI in virtual mode we can not really
    do that.
    
    Actually, EFI is not necessary for kdump, we can boot the second kernel
    with "noefi" parameter, but the boot will mostly fail because 2nd kernel
    cannot find RSDP.
    
    In this situation, we introduced "acpi_rsdp=" kernel parameter, so that
    kexec-tools can pass the "noefi acpi_rsdp=X" to the second kernel to
    make kdump works. The physical address of the RSDP can be got from
    sysfs(/sys/firmware/efi/systab).
    
    Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
    Reviewed-by: WANG Cong <amwang@redhat.com>
    Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 9c8b04be443b33939f374a811c82abeebe0a61d1
Author: Vasiliy Kulikov <segoon@openwall.com>
Date:   Sat Jun 25 21:07:52 2011 +0400

    ACPI: constify ops structs
    
    Structs battery_file, acpi_dock_ops, file_operations,
    thermal_cooling_device_ops, thermal_zone_device_ops, kernel_param_ops
    are not changed in runtime.  It is safe to make them const.
    register_hotplug_dock_device() was altered to take const "ops" argument
    to respect acpi_dock_ops' const notion.
    
    Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
    Acked-by: Jeff Garzik <jgarzik@redhat.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d7f6169a0d32002657886fee561c641acddb9a75
Author: Stefan Assmann <sassmann@kpanic.de>
Date:   Fri Jul 15 14:52:30 2011 +0200

    ACPI: fix CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS
    
    The following was observed by Steve Rostedt on 3.0.0-rc5
    Backtrace:
    irq 16: nobody cared (try booting with the "irqpoll" option)
    Pid: 65, comm: irq/16-uhci_hcd Not tainted 3.0.0-rc5-test+ #94
    Call Trace:
     [<ffffffff810aa643>] __report_bad_irq+0x37/0xc1
     [<ffffffff810aaa2d>] note_interrupt+0x14e/0x1c9
     [<ffffffff810a9a05>] ? irq_thread_fn+0x3c/0x3c
     [<ffffffff810a990e>] irq_thread+0xf6/0x1b1
     [<ffffffff810a9818>] ? irq_finalize_oneshot+0xb3/0xb3
     [<ffffffff8106b4d6>] kthread+0x9f/0xa7
     [<ffffffff814f1f04>] kernel_thread_helper+0x4/0x10
     [<ffffffff8103ca09>] ? finish_task_switch+0x7b/0xc0
     [<ffffffff814eac78>] ? retint_restore_args+0x13/0x13
     [<ffffffff8106b437>] ? __init_kthread_worker+0x5a/0x5a
     [<ffffffff814f1f00>] ? gs_change+0x13/0x13
    handlers:
    [<ffffffff810a912d>] irq_default_primary_handler threaded [<ffffffff8135eaa6>] usb_hcd_irq
    [<ffffffff810a912d>] irq_default_primary_handler threaded [<ffffffff8135eaa6>] usb_hcd_irq
    Disabling IRQ #16
    
    The problem being that a device triggers boot interrupts (due to threaded
    interrupt handling and masking of the IO-APIC), which are forwarded
    to the PIRQ line of the device. These interrupts are not handled on the PIRQ
    line because the interrupt handler is not present there.
    This should have already been fixed by CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS.
    However some parts of the quirk got lost in the ACPI merge. This is a resent of
    the patch proposed in 2009.
    See http://lkml.org/lkml/2009/9/7/192
    
    Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
    Tested-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e545b55a1e980cbb6a158886286106bbf39722b1
Author: Jon Mason <jdmason@kudzu.us>
Date:   Sun Jun 19 18:51:37 2011 -0500

    ACPI: fix 80 char overflow
    
    Trivial fix for 80 char line overflow in drivers/acpi/pci_root.c
    
    Signed-off-by: Jon Mason <jdmason@kudzu.us>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 9c921c22a7f33397a6774d7fa076db9b6a0fd669
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:34:12 2011 +0800

    ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()
    
    Use battery->lock in sysfs_remove_battery() to make
    checking, removing, and clearing bat.dev atomic.
    This is necessary because sysfs_remove_battery() may
    be invoked concurrently from different paths.
    
        https://bugzilla.kernel.org/show_bug.cgi?id=35642
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 6e17fb6aa1a67afa1827ae317c3594040f055730
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:33:58 2011 +0800

    ACPI / Battery: Add the check before refresh sysfs in the battery_notify()
    
    In the commit 25be5821, add the refresh sysfs when system resumes
    from suspending. But it didn't check that the battery exists. This
    will cause battery sysfs files added when the battery doesn't exist.
    This patch add the check before refreshing.
    
    	https://bugzilla.kernel.org/show_bug.cgi?id=35642
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d5a5911b3278bad6515a9958f7318f74d534ef64
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:33:40 2011 +0800

    ACPI / Battery: Add the hibernation process in the battery_notify()
    
    The Commit 25be58215 has added a PM notifier to refresh the sys in order
    to deal with the unit change of the Battery Present Rate. But it just
    consided the suspend situation. The problem also will happen during the
    hibernation according the bug 28192.
    
        https://bugzilla.kernel.org/show_bug.cgi?id=28192
    
    This patch adds the hibernation process and fix the bug.
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 7b78622d0f9df9f274a319eea1494240119d3734
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:33:27 2011 +0800

    ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks
    
    This patch is cosmetic only, and makes no functional change.
    Since the acpi_battery_quirks has been deleted, rename
    acpi_battery_quirks2 with acpi_battery_quirks to clean the code.
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 55003b2105a4578736f3e868fbaa889bb1ff3ce0
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:33:12 2011 +0800

    ACPI / Battery: Change 16-bit signed negative battery current into correct value
    
    This patch is for some machines which report the battery current
    as a 16-bit signed negative when it is charging. This is caused
    by DSDT bug. The commit bc76f90b8a5cf4aceedf210d08d5e8292f820cec
    has resolved the problem for Acer laptops. But some other machines
    also have such problem.
    
        https://bugzilla.kernel.org/show_bug.cgi?id=33722
    
    Since it is improper that the current is above 32A on laptops
    whether on AC or on battery, this patch is to check the current and
     take its absolute value as current and producing a message when it
    is negative in s16.
    
    Remove Acer quirk, as this workaround handles Acer too.
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit ae6f61870490c10a0b0436e5afffa00c9dacffef
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:32:40 2011 +0800

    ACPI / Battery: Add the power unit macro
    
    This patch is cosmetic only, and makes no functional change.
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e4108292cc5b5ca07abc83af31a78338362810ca
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Fri Jul 1 16:03:39 2011 +0800

    ACPI / SBS: Correct the value of power_now and power_avg in the sysfs
    
    https://bugzilla.kernel.org/show_bug.cgi?id=24492
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 1dd5c715e5b7524da8c1030f5cf1ea903e45c457
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Fri Jul 1 16:03:15 2011 +0800

    ACPI / SBS: Add getting state operation in the acpi_sbs_battery_get_property()
    
    https://bugzilla.kernel.org/show_bug.cgi?id=24492
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 3eb208f0a36cf7a86953afa7a4eb6776294e6768
Author: Bob Moore <robert.moore@intel.com>
Date:   Mon Jul 4 08:38:51 2011 +0000

    ACPICA: Update to version 20110623
    
    Version 20110623.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8f9c91273e36e5762c617c23e4fd48d5172e0dac
Author: Fenghua Yu <fenghua.yu@intel.com>
Date:   Mon Jul 4 08:36:16 2011 +0000

    ACPICA: Do not repair _TSS return package if _PSS is present
    
    We can only sort the _TSS return package if there is no _PSS
    in the same scope. This is because if _PSS is present, the ACPI
    specification dictates that the _TSS Power Dissipation field is
    to be ignored, and therefore some BIOSs leave garbage values in
    the _TSS Power field(s).  In this case, it is best to just return
    the _TSS package as-is.
    
    Reported-by: Fenghua Yu <fenghua.yu@intel.com>
    Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d57b23ad0ca7a46931e4d98eb6b4b73b112f0969
Author: Bob Moore <robert.moore@intel.com>
Date:   Mon Jul 4 08:24:03 2011 +0000

    ACPICA: Add option to disable method return value validation and repair
    
    Runtime option can be used to disable return value repair if this
    is causing a problem on a particular machine.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 80f40ce0f10e87d9a27f1e29325c6f39245fbcb1
Author: Bob Moore <robert.moore@intel.com>
Date:   Tue Jun 14 10:45:57 2011 +0800

    ACPICA: Update to version 20110527
    
    Version 20110527.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit f631f9cafbc47bbfe5eb2aa831cd323175298448
Author: Bob Moore <robert.moore@intel.com>
Date:   Tue Jun 14 10:44:51 2011 +0800

    ACPICA: Add missing _TDL to list of predefined names
    
    Affects both iASL and core ACPICA.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit c8cefe307d79c57b32ca1d4c9ebff528f8dd914c
Author: Bob Moore <robert.moore@intel.com>
Date:   Tue Jun 14 10:42:53 2011 +0800

    ACPICA: Load operator: re-instate most restrictions on incoming table signature
    
    Now, only allow "SSDT" "OEM", and a null signature.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox