* [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906100057.04473.rjw@sisk.pl>
@ 2009-06-10 8:29 ` Rafael J. Wysocki
[not found] ` <200906101029.27529.rjw@sisk.pl>
1 sibling, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-10 8:29 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, linux-pm, LKML
On Wednesday 10 June 2009, Rafael J. Wysocki wrote:
> On Tuesday 09 June 2009, Alan Stern wrote:
> > On Mon, 8 Jun 2009, Rafael J. Wysocki wrote:
> >
> > > > Use of the RPM_UNKNOWN state isn't good. A bus may have valid reasons
> > > > of its own for not carrying out an autosuspend. When this happens the
> > > > device's state isn't unknown.
> > >
> > > I'm not sure what you mean exactly.
> > >
> > > If ->autosuspend() fails, the device power state may be known, but the core
> > > can't be sure if the device is active. This information is available to the
> > > driver and/or the bus type, which should change the status to whatever is
> > > appropriate.
> >
> > But no matter what the driver or bus type sets the state to, your
> > pm_autosuspend() will change it to one of RPM_UNKNOWN or RPM_SUSPENDED.
> > Neither might be right.
>
> The idea is that if ->autosuspend() or ->autoresume() returns an error code,
> this is a situation the PM core cannot recover from by itself, so it shouldn't
> pretend it knows what's happened. Instead, it marks the device as "I don't
> know if it is safe to touch this" and won't handle it until the device driver
> or bus type clears the status.
>
> > > The name of this constant may be confusing, but I didn't have any better ideas.
> >
> > It's not clear what RPM_ACTIVE, RPM_IDLE, and RPM_SUSPENDED are
> > supposed to mean; this should be documented in the code. Also, why
> > isn't there RPM_RESUMING?
>
> Yes, there should be. In fact it's in the current version of the patch, which
> is appended. Also, there's a comment explaining the meaning of the RPM_*
> constants in pm.h .
>
> > By the way, a legitimate reason for aborting an autosuspend is if the
> > device's driver requires remote wakeup to be enabled during suspend but
> > the user has disabled it.
>
> Do you mean the user has disabled the remote wakeup?
>
> > > > The scheme doesn't include any mechanism for communicating runtime
> > > > power information up the device tree. When a device is autosuspended,
> > > > its parent's driver should be told so that the driver can consider
> > > > autosuspending the parent.
> > >
> > > I thought the bus type's ->autosuspend() callback could take care of this.
> >
> > Shouldn't this happen after the device's state has changed to
> > RPM_SUSPENDED? That's not until after the callback returns.
>
> OK, I tried to address the issue of parent suspend/resume in the new
> version of the patch below (I'm not sure if I did the nesting of spinlocks in
> pm_request_resume() correctly).
>
> > > > There should be a sysfs interface (like the one in USB) to allow
> > > > userspace to prevent a device from being autosuspended -- and perhaps
> > > > also to force it to be suspended.
> > >
> > > To prevent a device from being suspended - yes. To force it to stay suspended
> > > - I'm not sure.
> >
> > I'm not sure either. Oliver Neukum requested it originally and it has
> > been useful for debugging, but I haven't seen many places where it
> > would come in useful in practice.
>
> The problem with it is that the user space may not know if it is safe to keep
> a device suspended and if it is not, the kernel will have to ignore the setting
> anyway, so I'm not sure what's the point (except for debugging).
>
> > > > What about devices that have more than two runtime power states? For
> > > > example, you can't squeeze PCI's {D0,D1,D2,D3hot} range into {running,
> > > > suspended}.
> > >
> > > That has to be bus type-specific.
> > >
> > > In the case of PCI all of the low power states (D1-D3) are in fact substates of
> > > "suspended", because we generally need to quiesce the device before putting
> > > it into any of these states.
> > >
> > > I'm not sure if we can introduce more "levels of suspension", so to speak, at
> > > the core level, but in any case we can easily distinguish between "device
> > > quiesced and in a low power state" and "device fully active".
> > >
> > > So, in this picture the device is "suspended" from the core's point of view
> > > once it's bus type's ->autosuspend() callback has been successfully executed.
> >
> > This too should be documented in the code. Or in a Documentation file.
>
> OK
>
> I tried to address your comments and the Oliver's comments too in the new
> version of the patch below. Please have a look and tell me what you think.
Argh, I forgot about some important things.
First, there are devices with no parent (actually, it would be much easier if
they had a default dummy parent, but that's a separate issue).
Second, the parent has to be taken into account in the asynchronous resume
path too (which BTW is more complicated).
Finally, I decided to follow the Oliver's suggestion that some error codes returned
by ->autosuspend() and ->autoresume() may be regarded as "go back to the
previous state" information. I chose to use -EAGAIN and -EBUSY for this
purpose.
Updated patch follows, sorry for the confusion.
Best,
Rafael
---
drivers/base/power/Makefile | 1
drivers/base/power/main.c | 2
drivers/base/power/runtime.c | 393 +++++++++++++++++++++++++++++++++++++++++++
include/linux/pm.h | 78 ++++++++
include/linux/pm_runtime.h | 50 +++++
kernel/power/Kconfig | 14 +
kernel/power/main.c | 17 +
7 files changed, 553 insertions(+), 2 deletions(-)
Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
random kernel OOPSes or reboots that don't seem to be related to
anything, try disabling/enabling this option (or disabling/enabling
APM in your BIOS).
+
+config PM_RUNTIME
+ bool "Run-time PM core functionality"
+ depends on PM
+ ---help---
+ Enable functionality allowing I/O devices to be put into energy-saving
+ (low power) states at run time (or autosuspended) after a specified
+ period of inactivity and woken up in response to a hardware-generated
+ wake-up event or a driver's request.
+
+ Hardware support is generally required for this functionality to work
+ and the bus type drivers of the buses the devices are on are
+ responsibile for the actual handling of the autosuspend requests and
+ wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
#include <linux/kobject.h>
#include <linux/string.h>
#include <linux/resume-trace.h>
+#include <linux/workqueue.h>
#include "power.h"
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
.attrs = g,
};
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+ pm_wq = create_freezeable_workqueue("pm");
+
+ return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
static int __init pm_init(void)
{
+ int error = pm_start_workqueue();
+ if (error)
+ return error;
power_kobj = kobject_create_and_add("power", NULL);
if (!power_kobj)
return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,9 @@
#define _LINUX_PM_H
#include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/completion.h>
/*
* Callbacks for platform drivers to implement.
@@ -165,6 +168,15 @@ typedef struct pm_message {
* It is allowed to unregister devices while the above callbacks are being
* executed. However, it is not allowed to unregister a device from within any
* of its own callbacks.
+ *
+ * There also are two callbacks related to run-time power management of devices:
+ *
+ * @autosuspend: Save the device registers and put it into an energy-saving (low
+ * power) state at run-time, enable wake-up events as appropriate.
+ *
+ * @autoresume: Put the device into the full power state and restore its
+ * registers (if applicable) at run time, in response to a wake-up event
+ * generated by hardware or at a request of software.
*/
struct dev_pm_ops {
@@ -182,6 +194,10 @@ struct dev_pm_ops {
int (*thaw_noirq)(struct device *dev);
int (*poweroff_noirq)(struct device *dev);
int (*restore_noirq)(struct device *dev);
+#ifdef CONFIG_PM_RUNTIME
+ int (*autosuspend)(struct device *dev);
+ int (*autoresume)(struct device *dev);
+#endif
};
/**
@@ -315,14 +331,72 @@ enum dpm_state {
DPM_OFF_IRQ,
};
+/**
+ * Device run-time power management state.
+ *
+ * These state labels are used internally by the PM core to indicate the current
+ * status of a device with respect to the PM core operations. They do not
+ * reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE Device is fully operational, no run-time PM requests are
+ * pending for it.
+ *
+ * RPM_IDLE It has been requested that the device be suspended.
+ * Suspend request has been put into the run-time PM
+ * workqueue and it's pending execution.
+ *
+ * RPM_SUSPENDING Device bus type's ->autosuspend() callback is being
+ * executed.
+ *
+ * RPM_SUSPENDED Device bus type's ->autosuspend() callback has completed
+ * successfully. The device is regarded as suspended.
+ *
+ * RPM_WAKE It has been requested that the device be woken up.
+ * Resume request has been put into the run-time PM
+ * workqueue and it's pending execution.
+ *
+ * RPM_RESUMING Device bus type's ->autoresume() callback is being
+ * executed.
+ *
+ * RPM_ERROR Represents a condition from which the PM core cannot
+ * recover by itself. If the device's run-time PM status
+ * field has this value, all of the run-time PM operations
+ * carried out for the device by the core will fail, until
+ * the status field is changed to either RPM_ACTIVE or
+ * RPM_SUSPENDED (it is not valid to use the other values
+ * in such a situation) by the device's driver or bus type.
+ * This happens when the device bus type's ->autosuspend()
+ * or ->autoresume() callback returns error code other than
+ * -EAGAIN or -EBUSY.
+ */
+
+enum rpm_state {
+ RPM_ERROR = -1,
+ RPM_ACTIVE,
+ RPM_IDLE,
+ RPM_SUSPENDING,
+ RPM_SUSPENDED,
+ RPM_WAKE,
+ RPM_RESUMING,
+};
+
struct dev_pm_info {
pm_message_t power_state;
- unsigned can_wakeup:1;
- unsigned should_wakeup:1;
+ unsigned int can_wakeup:1;
+ unsigned int should_wakeup:1;
enum dpm_state status; /* Owned by the PM core */
#ifdef CONFIG_PM_SLEEP
struct list_head entry;
#endif
+#ifdef CONFIG_PM_RUNTIME
+ struct delayed_work suspend_work;
+ unsigned int suspend_aborted:1;
+ struct work_struct resume_work;
+ struct completion work_done;
+ enum rpm_state runtime_status;
+ spinlock_t lock;
+#endif
};
/*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
obj-$(CONFIG_PM) += sysfs.o
obj-$(CONFIG_PM_SLEEP) += main.o
+obj-$(CONFIG_PM_RUNTIME) += runtime.o
obj-$(CONFIG_PM_TRACE_RTC) += trace.o
ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,393 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/pm_runtime.h>
+
+/**
+ * pm_runtime_reset - Clear all of the device run-time PM flags.
+ * @dev: Device object to clear the flags for.
+ */
+static void pm_runtime_reset(struct device *dev)
+{
+ dev->power.suspend_aborted = false;
+ dev->power.runtime_status = RPM_ACTIVE;
+}
+
+/**
+ * pm_device_suspended - Check if given device has been suspended at run time.
+ * @dev: Device to check.
+ * @data: Ignored.
+ *
+ * Returns 0 if the device has been suspended or -EBUSY otherwise.
+ */
+static int pm_device_suspended(struct device *dev, void *data)
+{
+ int ret;
+
+ spin_lock(&dev->power.lock);
+
+ ret = dev->power.runtime_status == RPM_SUSPENDED ? 0 : -EBUSY;
+
+ spin_unlock(&dev->power.lock);
+
+ return ret;
+}
+
+/**
+ * pm_check_children - Check if all children of a device have been suspended.
+ * @dev: Device to check.
+ *
+ * Returns 0 if all children of the device have been suspended or -EBUSY
+ * otherwise.
+ */
+static int pm_check_children(struct device *dev)
+{
+ return device_for_each_child(dev, NULL, pm_device_suspended);
+}
+
+/**
+ * pm_autosuspend - Run autosuspend callback of given device object's bus type.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the suspend has been scheduled for,
+ * check if the suspend request hasn't been cancelled and run the
+ * ->autosuspend() callback from the device's bus type driver. Update the
+ * run-time PM flags in the device object to reflect the current status of the
+ * device.
+ */
+static void pm_autosuspend(struct work_struct *work)
+{
+ struct delayed_work *dw = to_delayed_work(work);
+ struct device *dev = suspend_work_to_device(dw);
+ int error = 0;
+
+ spin_lock(&dev->power.lock);
+
+ if (dev->power.suspend_aborted) {
+ dev->power.runtime_status = RPM_ACTIVE;
+ goto out;
+ } else if (dev->power.runtime_status != RPM_IDLE) {
+ goto out;
+ } else if (pm_check_children(dev)) {
+ /*
+ * We can only suspend the device if all of its children have
+ * been suspended.
+ */
+ goto out;
+ }
+
+ dev->power.runtime_status = RPM_SUSPENDING;
+ init_completion(&dev->power.work_done);
+
+ spin_unlock(&dev->power.lock);
+
+ if (dev && dev->bus && dev->bus->pm && dev->bus->pm->autosuspend)
+ error = dev->bus->pm->autosuspend(dev);
+
+ spin_lock(&dev->power.lock);
+
+ switch (error) {
+ case 0:
+ dev->power.runtime_status = RPM_SUSPENDED;
+ break;
+ case -EAGAIN:
+ case -EBUSY:
+ dev->power.runtime_status = RPM_ACTIVE;
+ break;
+ default:
+ dev->power.runtime_status = RPM_ERROR;
+ }
+ complete(&dev->power.work_done);
+
+ out:
+ spin_unlock(&dev->power.lock);
+}
+
+/**
+ * pm_request_suspend - Schedule run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before attempting to suspend the device.
+ */
+void pm_request_suspend(struct device *dev, unsigned long delay)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&dev->power.lock, flags);
+
+ if (dev->power.runtime_status != RPM_ACTIVE)
+ goto out;
+
+ dev->power.runtime_status = RPM_IDLE;
+ dev->power.suspend_aborted = false;
+ queue_delayed_work(pm_wq, &dev->power.suspend_work, delay);
+
+ out:
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_cancel_suspend - Cancel a pending suspend request for given device.
+ * @dev: Device to cancel the suspend request for.
+ *
+ * Should be called under pm_lock_device() and only if we are sure that the
+ * ->autosuspend() callback hasn't started to yet.
+ */
+static void pm_cancel_suspend(struct device *dev)
+{
+ dev->power.suspend_aborted = true;
+ cancel_delayed_work(&dev->power.suspend_work);
+ dev->power.runtime_status = RPM_ACTIVE;
+}
+
+/**
+ * pm_autoresume - Run autoresume callback of given device object's bus type.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the resume has been scheduled for,
+ * check if the device is really suspended and run the ->autoresume() callback
+ * from the device's bus type driver. Update the run-time PM flags in the
+ * device object to reflect the current status of the device.
+ */
+static void pm_autoresume(struct work_struct *work)
+{
+ struct device *dev = resume_work_to_device(work);
+ int error = 0;
+
+ if (dev->parent)
+ spin_lock(&dev->parent->power.lock);
+ spin_lock(&dev->power.lock);
+
+ repeat:
+ if (dev->power.runtime_status != RPM_WAKE) {
+ if (dev->parent)
+ spin_unlock(&dev->parent->power.lock);
+ goto out;
+ } else if (dev->parent
+ && dev->parent->power.runtime_status != RPM_ACTIVE) {
+ if (dev->parent->power.runtime_status == RPM_RESUMING) {
+ spin_unlock(&dev->power.lock);
+ spin_unlock(&dev->parent->power.lock);
+
+ wait_for_completion(&dev->parent->power.work_done);
+
+ spin_lock(&dev->parent->power.lock);
+ spin_lock(&dev->power.lock);
+ }
+ if (dev->parent->power.runtime_status != RPM_ACTIVE) {
+ spin_unlock(&dev->parent->power.lock);
+ goto out;
+ }
+ goto repeat;
+ }
+
+ dev->power.runtime_status = RPM_RESUMING;
+ init_completion(&dev->power.work_done);
+
+ spin_unlock(&dev->power.lock);
+ if (dev->parent)
+ spin_unlock(&dev->parent->power.lock);
+
+ if (dev->bus && dev->bus->pm && dev->bus->pm->autoresume)
+ error = dev->bus->pm->autoresume(dev);
+
+ spin_lock(&dev->power.lock);
+
+ switch (error) {
+ case 0:
+ dev->power.runtime_status = RPM_ACTIVE;
+ break;
+ case -EAGAIN:
+ case -EBUSY:
+ dev->power.runtime_status = RPM_SUSPENDED;
+ break;
+ default:
+ dev->power.runtime_status = RPM_ERROR;
+ }
+ complete(&dev->power.work_done);
+
+ out:
+ spin_unlock(&dev->power.lock);
+}
+
+/**
+ * pm_request_resume - Schedule run-time resume of given device.
+ * @dev: Device to resume.
+ */
+void pm_request_resume(struct device *dev)
+{
+ unsigned long parent_flags = 0, flags;
+
+ repeat:
+ if (dev->parent)
+ spin_lock_irqsave(&dev->parent->power.lock, parent_flags);
+ spin_lock_irqsave(&dev->power.lock, flags);
+
+ if (dev->power.runtime_status == RPM_IDLE) {
+ /* Autosuspend request is pending, no need to resume. */
+ pm_cancel_suspend(dev);
+ goto out;
+ } else if (dev->power.runtime_status != RPM_SUSPENDING
+ && dev->power.runtime_status != RPM_SUSPENDED) {
+ goto out;
+ } else if (dev->parent
+ && (dev->parent->power.runtime_status == RPM_IDLE
+ || dev->parent->power.runtime_status == RPM_SUSPENDING
+ || dev->parent->power.runtime_status == RPM_SUSPENDED)) {
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+ spin_unlock_irqrestore(&dev->parent->power.lock, parent_flags);
+
+ /* We have to resume the parent first. */
+ pm_request_resume(dev->parent);
+
+ goto repeat;
+ }
+
+ dev->power.runtime_status = RPM_WAKE;
+ queue_work(pm_wq, &dev->power.resume_work);
+
+ out:
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+ if (dev->parent)
+ spin_unlock_irqrestore(&dev->parent->power.lock, parent_flags);
+}
+
+/**
+ * pm_resume_sync - Resume given device waiting for the operation to complete.
+ * @dev: Device to resume.
+ *
+ * Resume the device synchronously, waiting for the operation to complete. If
+ * autosuspend is in progress while this function is being run, wait for it to
+ * finish before resuming the device. If the autosuspend is scheduled, but it
+ * hasn't started yet, cancel it and we're done.
+ */
+int pm_resume_sync(struct device *dev)
+{
+ int error = 0;
+
+ spin_lock(&dev->power.lock);
+
+ if (dev->power.runtime_status == RPM_ACTIVE) {
+ goto out;
+ } if (dev->power.runtime_status == RPM_IDLE) {
+ /* ->autosuspend() hasn't started yet, no need to resume. */
+ pm_cancel_suspend(dev);
+ goto out;
+ }
+
+ if (dev->power.runtime_status == RPM_SUSPENDING) {
+ spin_unlock(&dev->power.lock);
+
+ /*
+ * The ->autosuspend() callback is being executed right now,
+ * wait for it to complete.
+ */
+ wait_for_completion(&dev->power.work_done);
+ } else if (dev->power.runtime_status == RPM_SUSPENDED && dev->parent) {
+ spin_unlock(&dev->power.lock);
+
+ /* The device's parent may also be suspended. Resume it. */
+ error = pm_resume_sync(dev->parent);
+ if (error)
+ return error;
+ } else {
+ spin_unlock(&dev->power.lock);
+ }
+
+ if (dev->parent)
+ spin_lock(&dev->parent->power.lock);
+ spin_lock(&dev->power.lock);
+
+ if (dev->power.runtime_status == RPM_RESUMING)
+ /* There's another resume running in parallel with us. */
+ error = -EAGAIN;
+ else if (dev->power.runtime_status != RPM_SUSPENDED)
+ error = -EINVAL;
+ if (error) {
+ if (dev->parent)
+ spin_unlock(&dev->parent->power.lock);
+ goto out;
+ }
+
+ dev->power.runtime_status = RPM_RESUMING;
+ init_completion(&dev->power.work_done);
+
+ spin_unlock(&dev->power.lock);
+ if (dev->parent)
+ spin_unlock(&dev->parent->power.lock);
+
+ if (dev->bus && dev->bus->pm && dev->bus->pm->autoresume)
+ error = dev->bus->pm->autoresume(dev);
+
+ spin_lock(&dev->power.lock);
+
+ switch (error) {
+ case 0:
+ dev->power.runtime_status = RPM_ACTIVE;
+ break;
+ case -EAGAIN:
+ case -EBUSY:
+ dev->power.runtime_status = RPM_SUSPENDED;
+ break;
+ default:
+ dev->power.runtime_status = RPM_ERROR;
+ }
+ complete(&dev->power.work_done);
+
+ out:
+ spin_unlock(&dev->power.lock);
+
+ return error;
+}
+
+/**
+ * pm_cancel_autosuspend - Cancel a pending autosuspend request for given device
+ * @dev: Device to handle.
+ *
+ * This routine is only supposed to be called when the run-time PM workqueue is
+ * frozen (i.e. during system-wide suspend or hibernation) when it is guaranteed
+ * that no work items are being executed.
+ */
+void pm_cancel_autosuspend(struct device *dev)
+{
+ spin_lock(&dev->power.lock);
+
+ cancel_delayed_work(&dev->power.suspend_work);
+ pm_runtime_reset(dev);
+
+ spin_unlock(&dev->power.lock);
+}
+
+/**
+ * pm_cancel_autoresume - Cancel a pending autoresume request for given device
+ * @dev: Device to handle.
+ *
+ * This routine is only supposed to be called when the run-time PM workqueue is
+ * frozen (i.e. during system-wide suspend or hibernation) when it is guaranteed
+ * that no work items are being executed.
+ */
+void pm_cancel_autoresume(struct device *dev)
+{
+ spin_lock(&dev->power.lock);
+
+ work_clear_pending(&dev->power.resume_work);
+ pm_runtime_reset(dev);
+
+ spin_unlock(&dev->power.lock);
+}
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to handle.
+ */
+void pm_runtime_init(struct device *dev)
+{
+ pm_runtime_reset(dev);
+ spin_lock_init(&dev->power.lock);
+ INIT_DELAYED_WORK(&dev->power.suspend_work, pm_autosuspend);
+ INIT_WORK(&dev->power.resume_work, pm_autoresume);
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,50 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+extern struct workqueue_struct *pm_wq;
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_request_suspend(struct device *dev, unsigned long delay);
+extern void pm_request_resume(struct device *dev);
+extern int pm_resume_sync(struct device *dev);
+extern void pm_cancel_autosuspend(struct device *dev);
+extern void pm_cancel_autoresume(struct device *dev);
+
+static inline struct device *suspend_work_to_device(struct delayed_work *work)
+{
+ struct dev_pm_info *dpi;
+
+ dpi = container_of(work, struct dev_pm_info, suspend_work);
+ return container_of(dpi, struct device, power);
+}
+
+static inline struct device *resume_work_to_device(struct work_struct *work)
+{
+ struct dev_pm_info *dpi;
+
+ dpi = container_of(work, struct dev_pm_info, resume_work);
+ return container_of(dpi, struct device, power);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_request_suspend(struct device *dev, unsigned long delay);
+static inline void pm_request_resume(struct device *dev) {}
+static inline int pm_resume_sync(struct device *dev) { return -ENOSYS; }
+static inline void pm_cancel_autosuspend(struct device *dev) {}
+static inline void pm_cancel_autoresume(struct device *dev) {}
+#endif /* !CONFIG_PM_RUNTIME */
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
#include <linux/kallsyms.h>
#include <linux/mutex.h>
#include <linux/pm.h>
+#include <linux/pm_runtime.h>
#include <linux/resume-trace.h>
#include <linux/rwsem.h>
#include <linux/interrupt.h>
@@ -88,6 +89,7 @@ void device_pm_add(struct device *dev)
}
list_add_tail(&dev->power.entry, &dpm_list);
+ pm_runtime_init(dev);
mutex_unlock(&dpm_list_mtx);
}
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906101029.27529.rjw@sisk.pl>
@ 2009-06-10 14:20 ` Oliver Neukum
[not found] ` <200906101620.26559.oliver@neukum.org>
` (3 subsequent siblings)
4 siblings, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-10 14:20 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, linux-pm, LKML
Am Mittwoch, 10. Juni 2009 10:29:26 schrieb Rafael J. Wysocki:
> Argh, I forgot about some important things.
>
> First, there are devices with no parent (actually, it would be much easier
> if they had a default dummy parent, but that's a separate issue).
>
> Second, the parent has to be taken into account in the asynchronous resume
> path too (which BTW is more complicated).
What happens if the parent's parent is also suspended? It seems to me that
you must code this recursively.
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906101620.26559.oliver@neukum.org>
@ 2009-06-10 19:27 ` Rafael J. Wysocki
[not found] ` <200906102127.57136.rjw@sisk.pl>
1 sibling, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-10 19:27 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, linux-pm, LKML
On Wednesday 10 June 2009, Oliver Neukum wrote:
> Am Mittwoch, 10. Juni 2009 10:29:26 schrieb Rafael J. Wysocki:
> > Argh, I forgot about some important things.
> >
> > First, there are devices with no parent (actually, it would be much easier
> > if they had a default dummy parent, but that's a separate issue).
> >
> > Second, the parent has to be taken into account in the asynchronous resume
> > path too (which BTW is more complicated).
>
> What happens if the parent's parent is also suspended? It seems to me that
> you must code this recursively.
Hmm, I thought I did.
[Looks]
pm_request_resume(dev) will call pm_request_resume(dev->parent), if necessary,
and that will call pm_request_resume(dev->parent->parent) and so on. Each of
them will queue a work item and the one for the topmost parent will be queued
first. So, the resume requests for all parents will be executed before the
one for the device, due to the fact that the workqueue is singlethread.
Well, there is a bug related to it, namely pm_autosuspend() may change the
status to RPM_SUSPENDED after pm_request_resume() has changed it to
RPM_WAKE, that needs fixing.
Best,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906101029.27529.rjw@sisk.pl>
2009-06-10 14:20 ` Oliver Neukum
[not found] ` <200906101620.26559.oliver@neukum.org>
@ 2009-06-10 21:14 ` Alan Stern
2009-06-11 5:18 ` Magnus Damm
[not found] ` <aec7e5c30906102218j2f8f38dbt5d05a3928ca46c15@mail.gmail.com>
4 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-10 21:14 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Wed, 10 Jun 2009, Rafael J. Wysocki wrote:
> > The idea is that if ->autosuspend() or ->autoresume() returns an error code,
> > this is a situation the PM core cannot recover from by itself, so it shouldn't
> > pretend it knows what's happened. Instead, it marks the device as "I don't
> > know if it is safe to touch this" and won't handle it until the device driver
> > or bus type clears the status.
I'm still not sure this is a good idea. When would the device driver
clear the status? The autosuspend and autoresume methods run
asynchronously, so after they're done the driver doesn't get a chance
to do anything.
It might be best just to set the status to RPM_ACTIVE if a runtime
suspend fails and RPM_SUSPENDED if a runtime resume fails.
> Finally, I decided to follow the Oliver's suggestion that some error codes returned
> by ->autosuspend() and ->autoresume() may be regarded as "go back to the
> previous state" information. I chose to use -EAGAIN and -EBUSY for this
> purpose.
Maybe...
> struct dev_pm_info {
> pm_message_t power_state;
> - unsigned can_wakeup:1;
> - unsigned should_wakeup:1;
> + unsigned int can_wakeup:1;
> + unsigned int should_wakeup:1;
> enum dpm_state status; /* Owned by the PM core */
> #ifdef CONFIG_PM_SLEEP
> struct list_head entry;
> #endif
> +#ifdef CONFIG_PM_RUNTIME
> + struct delayed_work suspend_work;
> + unsigned int suspend_aborted:1;
> + struct work_struct resume_work;
> + struct completion work_done;
> + enum rpm_state runtime_status;
> + spinlock_t lock;
> +#endif
> };
You know, it doesn't make any sense to have a suspend and a resume
both pending at the same time. So you could add only a delayed_work
structure and use its embedded work_struct for resume requests.
Also, you might borrow a trick from Dave Brownell. Define the RPM_*
values so that the individual bits have meanings. Then instead of
testing for multiple possible values of runtime_status, you could do a
simple bit test.
> +/**
> + * pm_device_suspended - Check if given device has been suspended at run time.
> + * @dev: Device to check.
> + * @data: Ignored.
> + *
> + * Returns 0 if the device has been suspended or -EBUSY otherwise.
> + */
> +static int pm_device_suspended(struct device *dev, void *data)
> +{
> + int ret;
> +
> + spin_lock(&dev->power.lock);
> +
> + ret = dev->power.runtime_status == RPM_SUSPENDED ? 0 : -EBUSY;
> +
> + spin_unlock(&dev->power.lock);
How does acquiring the lock help here?
> +/**
> + * pm_check_children - Check if all children of a device have been suspended.
> + * @dev: Device to check.
> + *
> + * Returns 0 if all children of the device have been suspended or -EBUSY
> + * otherwise.
> + */
We might want to do a runtime suspend even if the device's children
aren't already suspended. For example, you could suspend a link while
leaving the device on the other end of the link at full power --
especially if powering down the device is slow but changing the link's
power level is fast.
> +/**
> + * pm_autosuspend - Run autosuspend callback of given device object's bus type.
> + * @work: Work structure used for scheduling the execution of this function.
> + *
> + * Use @work to get the device object the suspend has been scheduled for,
> + * check if the suspend request hasn't been cancelled and run the
> + * ->autosuspend() callback from the device's bus type driver. Update the
> + * run-time PM flags in the device object to reflect the current status of the
> + * device.
> + */
> +static void pm_autosuspend(struct work_struct *work)
Can we call this something else? "Autosuspend" implies that the
suspend originated from within the kernel. How about "pm_suspend_work"
or "pm_runtime_suspend"? Likewise for the resume routines.
I haven't checked the details of the code yet. More later...
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906101656100.2589-100000@iolanthe.rowland.org>
@ 2009-06-10 21:31 ` Rafael J. Wysocki
[not found] ` <200906102331.14267.rjw@sisk.pl>
1 sibling, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-10 21:31 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Wednesday 10 June 2009, Alan Stern wrote:
> On Wed, 10 Jun 2009, Rafael J. Wysocki wrote:
>
> > > The idea is that if ->autosuspend() or ->autoresume() returns an error code,
> > > this is a situation the PM core cannot recover from by itself, so it shouldn't
> > > pretend it knows what's happened. Instead, it marks the device as "I don't
> > > know if it is safe to touch this" and won't handle it until the device driver
> > > or bus type clears the status.
>
> I'm still not sure this is a good idea. When would the device driver
> clear the status? The autosuspend and autoresume methods run
> asynchronously, so after they're done the driver doesn't get a chance
> to do anything.
>
> It might be best just to set the status to RPM_ACTIVE if a runtime
> suspend fails and RPM_SUSPENDED if a runtime resume fails.
>
> > Finally, I decided to follow the Oliver's suggestion that some error codes returned
> > by ->autosuspend() and ->autoresume() may be regarded as "go back to the
> > previous state" information. I chose to use -EAGAIN and -EBUSY for this
> > purpose.
>
> Maybe...
>
>
> > struct dev_pm_info {
> > pm_message_t power_state;
> > - unsigned can_wakeup:1;
> > - unsigned should_wakeup:1;
> > + unsigned int can_wakeup:1;
> > + unsigned int should_wakeup:1;
> > enum dpm_state status; /* Owned by the PM core */
> > #ifdef CONFIG_PM_SLEEP
> > struct list_head entry;
> > #endif
> > +#ifdef CONFIG_PM_RUNTIME
> > + struct delayed_work suspend_work;
> > + unsigned int suspend_aborted:1;
> > + struct work_struct resume_work;
> > + struct completion work_done;
> > + enum rpm_state runtime_status;
> > + spinlock_t lock;
> > +#endif
> > };
>
> You know, it doesn't make any sense to have a suspend and a resume
> both pending at the same time.
>
> So you could add only a delayed_work structure and use its embedded
> work_struct for resume requests.
I thought so too, but I was wrong. ;-)
If resume is requested while the suspend hasn't completed yet, we should
queue it (it's totally valid to request a suspending device to resume IMO), but
the delayed work is still being used by the workqueue code, so we can't modify
it.
> Also, you might borrow a trick from Dave Brownell. Define the RPM_*
> values so that the individual bits have meanings. Then instead of
> testing for multiple possible values of runtime_status, you could do a
> simple bit test.
Yes, I'm seriously considering using this approach.
> > +/**
> > + * pm_device_suspended - Check if given device has been suspended at run time.
> > + * @dev: Device to check.
> > + * @data: Ignored.
> > + *
> > + * Returns 0 if the device has been suspended or -EBUSY otherwise.
> > + */
> > +static int pm_device_suspended(struct device *dev, void *data)
> > +{
> > + int ret;
> > +
> > + spin_lock(&dev->power.lock);
> > +
> > + ret = dev->power.runtime_status == RPM_SUSPENDED ? 0 : -EBUSY;
> > +
> > + spin_unlock(&dev->power.lock);
>
> How does acquiring the lock help here?
OK, it doesn't.
> > +/**
> > + * pm_check_children - Check if all children of a device have been suspended.
> > + * @dev: Device to check.
> > + *
> > + * Returns 0 if all children of the device have been suspended or -EBUSY
> > + * otherwise.
> > + */
>
> We might want to do a runtime suspend even if the device's children
> aren't already suspended. For example, you could suspend a link while
> leaving the device on the other end of the link at full power --
> especially if powering down the device is slow but changing the link's
> power level is fast.
Well, this means that the dependencies between devices in the device tree are
pretty much useless for the run-time PM as far as the core is concerned. In
which case, why did you mention them at all?
> > +/**
> > + * pm_autosuspend - Run autosuspend callback of given device object's bus type.
> > + * @work: Work structure used for scheduling the execution of this function.
> > + *
> > + * Use @work to get the device object the suspend has been scheduled for,
> > + * check if the suspend request hasn't been cancelled and run the
> > + * ->autosuspend() callback from the device's bus type driver. Update the
> > + * run-time PM flags in the device object to reflect the current status of the
> > + * device.
> > + */
> > +static void pm_autosuspend(struct work_struct *work)
>
> Can we call this something else? "Autosuspend" implies that the
> suspend originated from within the kernel. How about "pm_suspend_work"
> or "pm_runtime_suspend"? Likewise for the resume routines.
OK
> I haven't checked the details of the code yet. More later...
OK, thanks.
Best,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906102127.57136.rjw@sisk.pl>
@ 2009-06-10 21:38 ` Oliver Neukum
[not found] ` <200906102338.35183.oliver@neukum.org>
1 sibling, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-10 21:38 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, linux-pm, LKML
Am Mittwoch, 10. Juni 2009 21:27:56 schrieb Rafael J. Wysocki:
> > What happens if the parent's parent is also suspended? It seems to me
> > that you must code this recursively.
>
> Hmm, I thought I did.
>
> [Looks]
>
> pm_request_resume(dev) will call pm_request_resume(dev->parent), if
> necessary, and that will call pm_request_resume(dev->parent->parent) and so
> on. Each of them will queue a work item and the one for the topmost parent
> will be queued first. So, the resume requests for all parents will be
> executed before the one for the device, due to the fact that the workqueue
> is singlethread.
Sneaky, I overlooked that.
> Well, there is a bug related to it, namely pm_autosuspend() may change the
> status to RPM_SUSPENDED after pm_request_resume() has changed it to
> RPM_WAKE, that needs fixing.
Ok, maybe this is related. You recurse if the parent isn't in RPM_ACTIVE.
But that is not enough. You must ensure that all the nodes higher up stay
in RPM_ACTIVE. It seems to me that you must go up until you find an
active node (or the root) and put it a blocked state.
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906102338.35183.oliver@neukum.org>
@ 2009-06-10 22:01 ` Rafael J. Wysocki
[not found] ` <200906110001.20718.rjw@sisk.pl>
1 sibling, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-10 22:01 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, linux-pm, LKML
On Wednesday 10 June 2009, Oliver Neukum wrote:
> Am Mittwoch, 10. Juni 2009 21:27:56 schrieb Rafael J. Wysocki:
> > > What happens if the parent's parent is also suspended? It seems to me
> > > that you must code this recursively.
> >
> > Hmm, I thought I did.
> >
> > [Looks]
> >
> > pm_request_resume(dev) will call pm_request_resume(dev->parent), if
> > necessary, and that will call pm_request_resume(dev->parent->parent) and so
> > on. Each of them will queue a work item and the one for the topmost parent
> > will be queued first. So, the resume requests for all parents will be
> > executed before the one for the device, due to the fact that the workqueue
> > is singlethread.
>
> Sneaky, I overlooked that.
>
> > Well, there is a bug related to it, namely pm_autosuspend() may change the
> > status to RPM_SUSPENDED after pm_request_resume() has changed it to
> > RPM_WAKE, that needs fixing.
>
> Ok, maybe this is related. You recurse if the parent isn't in RPM_ACTIVE.
> But that is not enough. You must ensure that all the nodes higher up stay
> in RPM_ACTIVE. It seems to me that you must go up until you find an
> active node (or the root) and put it a blocked state.
If you're referring to pm_autoresume(), then this again is tricky.
We have queued up resume requests for the device's parent, its parent etc.,
the topmost one goes first. The workqueue is singlethread, so pm_autoresume()
is going to be run for all parents before the device itself, so if that were the
only resume mechanism, it would be enough to check if the parent is RPM_ACTIVE.
*However*, there also is pm_resume_sync(), which can take the device directly
from RPM_SUSPENDED to RPM_RESUMING and that may be done in parallel with our
pm_autoresume(). That's why I put the wait_for_completion() in there.
Best,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906110001.20718.rjw@sisk.pl>
@ 2009-06-10 23:07 ` Oliver Neukum
[not found] ` <200906110107.23023.oliver@neukum.org>
1 sibling, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-10 23:07 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, linux-pm, LKML
Am Donnerstag, 11. Juni 2009 00:01:20 schrieb Rafael J. Wysocki:
> We have queued up resume requests for the device's parent, its parent etc.,
> the topmost one goes first. The workqueue is singlethread, so
> pm_autoresume() is going to be run for all parents before the device
> itself, so if that were the only resume mechanism, it would be enough to
> check if the parent is RPM_ACTIVE.
A (IDLE)
/ \
B (SUSPENDED) C (SUSPENDED)
Suppose C is to be resumed. This means first in case of A the request
to suspend would be cancelled. Here you drop the locks:
+ && (dev->parent->power.runtime_status == RPM_IDLE
+ || dev->parent->power.runtime_status == RPM_SUSPENDING
+ || dev->parent->power.runtime_status == RPM_SUSPENDED)) {
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+ spin_unlock_irqrestore(&dev->parent->power.lock, parent_flags);
+
+ /* We have to resume the parent first. */
+ pm_request_resume(dev->parent);
But after pm_request_resume() returns there's no means to make sure
nothing alters it back to RPM_SUSPENDED. The workqueue doesn't help
you because you've scheduled nothing by that time. The suspension will
work because C is still in RPM_SUSPENDED.
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906102331.14267.rjw@sisk.pl>
@ 2009-06-10 23:15 ` Oliver Neukum
2009-06-10 23:42 ` Alan Stern
[not found] ` <200906110115.30729.oliver@neukum.org>
2 siblings, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-10 23:15 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
Am Mittwoch, 10. Juni 2009 23:31:13 schrieb Rafael J. Wysocki:
> > > +/**
> > > + * pm_check_children - Check if all children of a device have been
> > > suspended. + * @dev: Device to check.
> > > + *
> > > + * Returns 0 if all children of the device have been suspended or
> > > -EBUSY + * otherwise.
> > > + */
> >
> > We might want to do a runtime suspend even if the device's children
> > aren't already suspended. For example, you could suspend a link while
> > leaving the device on the other end of the link at full power --
> > especially if powering down the device is slow but changing the link's
> > power level is fast.
>
> Well, this means that the dependencies between devices in the device tree
> are pretty much useless for the run-time PM as far as the core is
> concerned. In which case, why did you mention them at all?
Some bussystems need this constraint others don't or only for some nodes.
We need a way to communicate this to the core.
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906102331.14267.rjw@sisk.pl>
2009-06-10 23:15 ` Oliver Neukum
@ 2009-06-10 23:42 ` Alan Stern
[not found] ` <200906110115.30729.oliver@neukum.org>
2 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-10 23:42 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Wed, 10 Jun 2009, Rafael J. Wysocki wrote:
> > You know, it doesn't make any sense to have a suspend and a resume
> > both pending at the same time.
> >
> > So you could add only a delayed_work structure and use its embedded
> > work_struct for resume requests.
>
> I thought so too, but I was wrong. ;-)
>
> If resume is requested while the suspend hasn't completed yet, we should
> queue it (it's totally valid to request a suspending device to resume IMO), but
> the delayed work is still being used by the workqueue code, so we can't modify
> it.
Where is the delayed work still being used? There's even a comment in
run_workqueue() that says a work_struct can be freed by the function it
calls.
> > We might want to do a runtime suspend even if the device's children
> > aren't already suspended. For example, you could suspend a link while
> > leaving the device on the other end of the link at full power --
> > especially if powering down the device is slow but changing the link's
> > power level is fast.
>
> Well, this means that the dependencies between devices in the device tree are
> pretty much useless for the run-time PM as far as the core is concerned. In
> which case, why did you mention them at all?
The dependencies aren't totally useless. It's still true that before
you resume a device, you have to autoresume its parent. And it's still
true that when you suspend a device, the parent should be given a
chance to autosuspend.
I guess the real point is that the decision about whether all children
must be suspended should be made by the driver, not the PM core.
> > I haven't checked the details of the code yet. More later...
One more thought... The autosuspend and autoresume callbacks need to
be mutually exclusive with probe and remove. So somehow the driver
core will need to block runtime PM calls.
It might also be nice to make sure that the driver core autoresumes a
device before probing it and autosuspends a device (after some
reasonable delay) after unbinding its driver.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906110107.23023.oliver@neukum.org>
@ 2009-06-10 23:42 ` Alan Stern
2009-06-11 13:46 ` Rafael J. Wysocki
1 sibling, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-10 23:42 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, linux-pm, LKML
On Thu, 11 Jun 2009, Oliver Neukum wrote:
> Am Donnerstag, 11. Juni 2009 00:01:20 schrieb Rafael J. Wysocki:
> > We have queued up resume requests for the device's parent, its parent etc.,
> > the topmost one goes first. The workqueue is singlethread, so
> > pm_autoresume() is going to be run for all parents before the device
> > itself, so if that were the only resume mechanism, it would be enough to
> > check if the parent is RPM_ACTIVE.
>
> A (IDLE)
> / \
> B (SUSPENDED) C (SUSPENDED)
>
> Suppose C is to be resumed. This means first in case of A the request
> to suspend would be cancelled. Here you drop the locks:
>
> + && (dev->parent->power.runtime_status == RPM_IDLE
> + || dev->parent->power.runtime_status == RPM_SUSPENDING
> + || dev->parent->power.runtime_status == RPM_SUSPENDED)) {
> + spin_unlock_irqrestore(&dev->power.lock, flags);
> + spin_unlock_irqrestore(&dev->parent->power.lock, parent_flags);
> +
> + /* We have to resume the parent first. */
> + pm_request_resume(dev->parent);
>
> But after pm_request_resume() returns there's no means to make sure
> nothing alters it back to RPM_SUSPENDED. The workqueue doesn't help
> you because you've scheduled nothing by that time. The suspension will
> work because C is still in RPM_SUSPENDED.
This is an example where usage counters come in handy.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906101029.27529.rjw@sisk.pl>
` (2 preceding siblings ...)
2009-06-10 21:14 ` Alan Stern
@ 2009-06-11 5:18 ` Magnus Damm
[not found] ` <aec7e5c30906102218j2f8f38dbt5d05a3928ca46c15@mail.gmail.com>
4 siblings, 0 replies; 41+ messages in thread
From: Magnus Damm @ 2009-06-11 5:18 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, linux-pm, LKML
Hi Rafael,
On Wed, Jun 10, 2009 at 5:29 PM, Rafael J. Wysocki<rjw@sisk.pl> wrote:
>> I tried to address your comments and the Oliver's comments too in the new
>> version of the patch below. Please have a look and tell me what you think.
>
> Argh, I forgot about some important things.
>
> First, there are devices with no parent (actually, it would be much easier if
> they had a default dummy parent, but that's a separate issue).
>
> Second, the parent has to be taken into account in the asynchronous resume
> path too (which BTW is more complicated).
>
> Finally, I decided to follow the Oliver's suggestion that some error codes returned
> by ->autosuspend() and ->autoresume() may be regarded as "go back to the
> previous state" information. I chose to use -EAGAIN and -EBUSY for this
> purpose.
>
> Updated patch follows, sorry for the confusion.
Thanks for your work on this. I think the patch looks very good in
general so I will not comment on the code itself. I do however have a
few high level questions:
Q1) Regarding pm_request_suspend(), would it be possible to get a
synchronous version or to make use of the completion somehow?
Q2) As pm_request_suspend() works today, the device is marked as
RPM_IDLE and the delayed work is queued up. There is no real decision
making going on except the time out. I'd like to let the bus code
decide when to autosuspend a buch of devices. Maybe the idle handling
should be broken out into a pm_request_idle() and pm_request_suspend()
can be modified to synchronously suspend devices marked with
pm_request_idle()?
Q3) Have you thought about how device drivers can inform the Runtime
PM that the device are idle and that they need the hardware to be
woken up? I'd like something similar to my "[PATCH 02/04] Driver Core:
Add idle and wakeup functions" patch but maybe not specific to
platform devices. We talked about adding some hooks to the bus_type
for this. Any ideas?
This is how I'm thinking of integrating your Runtime PM code with our
SuperH platform devices:
Device Idle Handling:
1) Device drivers call pm_device_idle() (See Q3) which invokes arch
specific platform bus code in the case of our SoC platform devices.
This bus code marks the device as idle using pm_request_idle(). (See
Q2). At this point light weight power management like clock stopping
may be performed as well.
2) The arch specific bus code knows how the platform devices are
grouped together (thanks to the data area in "[PATCH] Driver Core: Add
platform device arch data V3"), and when all devices in one power
domain are marked as idle the bus code calls the synchronous
pm_request_suspend() (no delay).
3) When all devices in the power domain are suspended the bus code can
turn off the power. The reason why I'd like to only autosuspend when
all devices are idle is simply that we don't get any power savings
from the per device autosuspend() callbacks, only from turning off
power to the entire per-domain. So bindly autosuspending and
autoresuming devices is just pure overhead unless we know we can do it
for all devices in the domain.
4) Over time when the code in 2) should be extended to handle latencies.
Device Wakeup Handling:
1) Device drivers call pm_device_wakeup() (See Q3). This invokes arch
specific bus code for our SoC platform devices. The bus code enables
clocks if needed and also calls pm_resume_sync().
2) After the call to pm_resume_sync() the pm_device_wakeup() call
returns and the device driver can access the hardware as usual.
That's it! Far from perfect but maybe a good start at least.
I have seen quite a few patches from you lately, nice work. To make
tracking of the Runtime PM patches easier, can you pleae consider
including the version of the patch in the subject when you post new
versions?
Thanks!
/ magnus
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906110115.30729.oliver@neukum.org>
@ 2009-06-11 5:27 ` Magnus Damm
0 siblings, 0 replies; 41+ messages in thread
From: Magnus Damm @ 2009-06-11 5:27 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Thu, Jun 11, 2009 at 8:15 AM, Oliver Neukum<oliver@neukum.org> wrote:
> Am Mittwoch, 10. Juni 2009 23:31:13 schrieb Rafael J. Wysocki:
>> > > +/**
>> > > + * pm_check_children - Check if all children of a device have been
>> > > suspended. + * @dev: Device to check.
>> > > + *
>> > > + * Returns 0 if all children of the device have been suspended or
>> > > -EBUSY + * otherwise.
>> > > + */
>> >
>> > We might want to do a runtime suspend even if the device's children
>> > aren't already suspended. For example, you could suspend a link while
>> > leaving the device on the other end of the link at full power --
>> > especially if powering down the device is slow but changing the link's
>> > power level is fast.
>>
>> Well, this means that the dependencies between devices in the device tree
>> are pretty much useless for the run-time PM as far as the core is
>> concerned. In which case, why did you mention them at all?
>
> Some bussystems need this constraint others don't or only for some nodes.
> We need a way to communicate this to the core.
I agree that this depends on the bus.
Our SuperH on-chip SoC platform devices are arranged in a flat fashion
so no real problem there, but if there whould be dependencies then I
think we need to manage it recursively somehow.
Compare that to PM of our I2C driver and the I2C bus hanging off from
that. In that case I'd like to be able to autosuspend the I2C master
driver regardless of the I2C slave devices and their PM state.
/ magnus
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <aec7e5c30906102218j2f8f38dbt5d05a3928ca46c15@mail.gmail.com>
@ 2009-06-11 9:08 ` Oliver Neukum
[not found] ` <200906111108.40785.oliver@neukum.org>
1 sibling, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-11 9:08 UTC (permalink / raw)
To: Magnus Damm; +Cc: ACPI Devel Maling List, linux-pm, LKML
Am Donnerstag, 11. Juni 2009 07:18:46 schrieb Magnus Damm:
> 3) When all devices in the power domain are suspended the bus code can
> turn off the power. The reason why I'd like to only autosuspend when
So you are saying that you have power dependencies independent
of the device tree?
> all devices are idle is simply that we don't get any power savings
> from the per device autosuspend() callbacks, only from turning off
> power to the entire per-domain. So bindly autosuspending and
> autoresuming devices is just pure overhead unless we know we can do it
> for all devices in the domain.
Why can't you do this within the framework? You simply suspend when
all a domain's devices have been autosuspended.
I suppose we could have a helper.
int pm_autosuspend_in_domain(struct device *dev)
{
int err;
mutex_lock(dev->power_domain);
if (! --dev->power_domain.active_devices)
err = dev->power_domain->power_down(dev->power_domain);
else
err = 0;
mutex_unlock(dev->power_domain);
return err;
}
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906110107.23023.oliver@neukum.org>
2009-06-10 23:42 ` Alan Stern
@ 2009-06-11 13:46 ` Rafael J. Wysocki
1 sibling, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-11 13:46 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, linux-pm, LKML
On Thursday 11 June 2009, Oliver Neukum wrote:
> Am Donnerstag, 11. Juni 2009 00:01:20 schrieb Rafael J. Wysocki:
> > We have queued up resume requests for the device's parent, its parent etc.,
> > the topmost one goes first. The workqueue is singlethread, so
> > pm_autoresume() is going to be run for all parents before the device
> > itself, so if that were the only resume mechanism, it would be enough to
> > check if the parent is RPM_ACTIVE.
>
> A (IDLE)
> / \
> B (SUSPENDED) C (SUSPENDED)
>
> Suppose C is to be resumed. This means first in case of A the request
> to suspend would be cancelled. Here you drop the locks:
>
> + && (dev->parent->power.runtime_status == RPM_IDLE
> + || dev->parent->power.runtime_status == RPM_SUSPENDING
> + || dev->parent->power.runtime_status == RPM_SUSPENDED)) {
> + spin_unlock_irqrestore(&dev->power.lock, flags);
> + spin_unlock_irqrestore(&dev->parent->power.lock, parent_flags);
> +
> + /* We have to resume the parent first. */
> + pm_request_resume(dev->parent);
>
> But after pm_request_resume() returns there's no means to make sure
> nothing alters it back to RPM_SUSPENDED. The workqueue doesn't help
> you because you've scheduled nothing by that time. The suspension will
> work because C is still in RPM_SUSPENDED.
That exactly is the bug I told you about in one of the previous messages. :-)
The solution I used in the current version of the patch (appended) is to have
separate bits for RPM_WAKE and RPM_SUSPENDED (and for the other status
constants), so that they both can be set at a time.
Well, there probably still are some bugs lurking in it ...
Best,
Rafael
---
drivers/base/power/Makefile | 1
drivers/base/power/main.c | 2
drivers/base/power/runtime.c | 415 +++++++++++++++++++++++++++++++++++++++++++
include/linux/pm.h | 82 ++++++++
include/linux/pm_runtime.h | 50 +++++
kernel/power/Kconfig | 14 +
kernel/power/main.c | 17 +
7 files changed, 578 insertions(+), 3 deletions(-)
Index: linux-2.6/kernel/power/Kconfig
===================================================================
--- linux-2.6.orig/kernel/power/Kconfig
+++ linux-2.6/kernel/power/Kconfig
@@ -208,3 +208,17 @@ config APM_EMULATION
random kernel OOPSes or reboots that don't seem to be related to
anything, try disabling/enabling this option (or disabling/enabling
APM in your BIOS).
+
+config PM_RUNTIME
+ bool "Run-time PM core functionality"
+ depends on PM
+ ---help---
+ Enable functionality allowing I/O devices to be put into energy-saving
+ (low power) states at run time (or autosuspended) after a specified
+ period of inactivity and woken up in response to a hardware-generated
+ wake-up event or a driver's request.
+
+ Hardware support is generally required for this functionality to work
+ and the bus type drivers of the buses the devices are on are
+ responsibile for the actual handling of the autosuspend requests and
+ wake-up events.
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -11,6 +11,7 @@
#include <linux/kobject.h>
#include <linux/string.h>
#include <linux/resume-trace.h>
+#include <linux/workqueue.h>
#include "power.h"
@@ -217,8 +218,24 @@ static struct attribute_group attr_group
.attrs = g,
};
+#ifdef CONFIG_PM_RUNTIME
+struct workqueue_struct *pm_wq;
+
+static int __init pm_start_workqueue(void)
+{
+ pm_wq = create_freezeable_workqueue("pm");
+
+ return pm_wq ? 0 : -ENOMEM;
+}
+#else
+static inline int pm_start_workqueue(void) { return 0; }
+#endif
+
static int __init pm_init(void)
{
+ int error = pm_start_workqueue();
+ if (error)
+ return error;
power_kobj = kobject_create_and_add("power", NULL);
if (!power_kobj)
return -ENOMEM;
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -22,6 +22,9 @@
#define _LINUX_PM_H
#include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
+#include <linux/completion.h>
/*
* Callbacks for platform drivers to implement.
@@ -165,6 +168,15 @@ typedef struct pm_message {
* It is allowed to unregister devices while the above callbacks are being
* executed. However, it is not allowed to unregister a device from within any
* of its own callbacks.
+ *
+ * There also are two callbacks related to run-time power management of devices:
+ *
+ * @autosuspend: Save the device registers and put it into an energy-saving (low
+ * power) state at run-time, enable wake-up events as appropriate.
+ *
+ * @autoresume: Put the device into the full power state and restore its
+ * registers (if applicable) at run time, in response to a wake-up event
+ * generated by hardware or at a request of software.
*/
struct dev_pm_ops {
@@ -182,6 +194,10 @@ struct dev_pm_ops {
int (*thaw_noirq)(struct device *dev);
int (*poweroff_noirq)(struct device *dev);
int (*restore_noirq)(struct device *dev);
+#ifdef CONFIG_PM_RUNTIME
+ int (*runtime_suspend)(struct device *dev);
+ int (*runtime_resume)(struct device *dev);
+#endif
};
/**
@@ -315,14 +331,74 @@ enum dpm_state {
DPM_OFF_IRQ,
};
+/**
+ * Device run-time power management state.
+ *
+ * These state labels are used internally by the PM core to indicate the current
+ * status of a device with respect to the PM core operations. They do not
+ * reflect the actual power state of the device or its status as seen by the
+ * driver.
+ *
+ * RPM_ACTIVE Device is fully operational, no run-time PM requests are
+ * pending for it.
+ *
+ * RPM_IDLE It has been requested that the device be suspended.
+ * Suspend request has been put into the run-time PM
+ * workqueue and it's pending execution.
+ *
+ * RPM_SUSPENDING Device bus type's ->runtime_suspend() callback is being
+ * executed.
+ *
+ * RPM_SUSPENDED Device bus type's ->runtime_suspend() callback has
+ * completed successfully. The device is regarded as
+ * suspended.
+ *
+ * RPM_WAKE It has been requested that the device be woken up.
+ * Resume request has been put into the run-time PM
+ * workqueue and it's pending execution.
+ *
+ * RPM_RESUMING Device bus type's ->runtime_resume() callback is being
+ * executed.
+ *
+ * RPM_ERROR Represents a condition from which the PM core cannot
+ * recover by itself. If the device's run-time PM status
+ * field has this value, all of the run-time PM operations
+ * carried out for the device by the core will fail, until
+ * the status field is changed to either RPM_ACTIVE or
+ * RPM_SUSPENDED (it is not valid to use the other values
+ * in such a situation) by the device's driver or bus type.
+ * This happens when the device bus type's
+ * ->runtime_suspend() or ->runtime_resume() callback
+ * returns error code different from -EAGAIN or -EBUSY.
+ */
+
+#define RPM_ACTIVE 0
+#define RPM_IDLE 0x01
+#define RPM_SUSPENDING 0x02
+#define RPM_SUSPENDED 0x04
+#define RPM_WAKE 0x08
+#define RPM_RESUMING 0x10
+
+#define RPM_IN_SUSPEND (RPM_SUSPENDING | RPM_SUSPENDED)
+#define RPM_INACTIVE (RPM_IDLE | RPM_IN_SUSPEND)
+#define RPM_ERROR (-1)
+
struct dev_pm_info {
pm_message_t power_state;
- unsigned can_wakeup:1;
- unsigned should_wakeup:1;
+ unsigned int can_wakeup:1;
+ unsigned int should_wakeup:1;
enum dpm_state status; /* Owned by the PM core */
-#ifdef CONFIG_PM_SLEEP
+#ifdef CONFIG_PM_SLEEP
struct list_head entry;
#endif
+#ifdef CONFIG_PM_RUNTIME
+ struct delayed_work suspend_work;
+ struct work_struct resume_work;
+ struct completion work_done;
+ unsigned int suspend_aborted:1;
+ unsigned int runtime_status:5;
+ spinlock_t lock;
+#endif
};
/*
Index: linux-2.6/drivers/base/power/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/power/Makefile
+++ linux-2.6/drivers/base/power/Makefile
@@ -1,5 +1,6 @@
obj-$(CONFIG_PM) += sysfs.o
obj-$(CONFIG_PM_SLEEP) += main.o
+obj-$(CONFIG_PM_RUNTIME) += runtime.o
obj-$(CONFIG_PM_TRACE_RTC) += trace.o
ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/power/runtime.c
@@ -0,0 +1,415 @@
+/*
+ * drivers/base/power/runtime.c - Helper functions for device run-time PM
+ *
+ * Copyright (c) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file is released under the GPLv2.
+ */
+
+#include <linux/pm_runtime.h>
+
+/**
+ * pm_runtime_reset - Clear all of the device run-time PM flags.
+ * @dev: Device object to clear the flags for.
+ */
+static void pm_runtime_reset(struct device *dev)
+{
+ dev->power.suspend_aborted = false;
+ dev->power.runtime_status = RPM_ACTIVE;
+}
+
+/**
+ * pm_device_suspended - Check if given device has been suspended at run time.
+ * @dev: Device to check.
+ * @data: Ignored.
+ *
+ * Returns 0 if the device has been suspended and it hasn't been requested to
+ * resume or -EBUSY otherwise.
+ */
+static int pm_device_suspended(struct device *dev, void *data)
+{
+ return dev->power.runtime_status == RPM_SUSPENDED ? 0 : -EBUSY;
+}
+
+/**
+ * pm_check_children - Check if all children of a device have been suspended.
+ * @dev: Device to check.
+ *
+ * Returns 0 if all children of the device have been suspended or -EBUSY
+ * otherwise.
+ */
+static int pm_check_children(struct device *dev)
+{
+ return device_for_each_child(dev, NULL, pm_device_suspended);
+}
+
+/**
+ * pm_runtime_suspend - Run a device bus type's runtime_suspend() callback.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the suspend has been scheduled for,
+ * check if the suspend request hasn't been cancelled and run the
+ * ->runtime_suspend() callback provided by the device's bus type driver.
+ * Update the run-time PM flags in the device object to reflect the current
+ * status of the device.
+ */
+static void pm_runtime_suspend(struct work_struct *work)
+{
+ struct delayed_work *dw = to_delayed_work(work);
+ struct device *dev = suspend_work_to_device(dw);
+ int error = 0;
+
+ spin_lock(&dev->power.lock);
+
+ if (dev->power.suspend_aborted) {
+ dev->power.runtime_status = RPM_ACTIVE;
+ goto out;
+ } else if (dev->power.runtime_status != RPM_IDLE) {
+ goto out;
+ } else if (pm_check_children(dev)) {
+ /*
+ * We can only suspend the device if all of its children have
+ * been suspended.
+ */
+ goto out;
+ }
+
+ dev->power.runtime_status = RPM_SUSPENDING;
+ init_completion(&dev->power.work_done);
+
+ spin_unlock(&dev->power.lock);
+
+ if (dev && dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend)
+ error = dev->bus->pm->runtime_suspend(dev);
+
+ spin_lock(&dev->power.lock);
+
+ /*
+ * Resume request might have been queued in the meantime, in which case
+ * the RPM_WAKE bit is also set in runtime_status.
+ */
+ dev->power.runtime_status &= ~RPM_SUSPENDING;
+ switch (error) {
+ case 0:
+ dev->power.runtime_status |= RPM_SUSPENDED;
+ break;
+ case -EAGAIN:
+ case -EBUSY:
+ dev->power.runtime_status = RPM_ACTIVE;
+ break;
+ default:
+ dev->power.runtime_status = RPM_ERROR;
+ }
+ complete(&dev->power.work_done);
+
+ out:
+ spin_unlock(&dev->power.lock);
+}
+
+/**
+ * pm_request_suspend - Schedule run-time suspend of given device.
+ * @dev: Device to suspend.
+ * @delay: Time to wait before attempting to suspend the device.
+ */
+void pm_request_suspend(struct device *dev, unsigned long delay)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&dev->power.lock, flags);
+
+ if (dev->power.runtime_status != RPM_ACTIVE)
+ goto out;
+
+ dev->power.runtime_status = RPM_IDLE;
+ dev->power.suspend_aborted = false;
+ queue_delayed_work(pm_wq, &dev->power.suspend_work, delay);
+
+ out:
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+}
+
+/**
+ * pm_cancel_suspend - Cancel a pending suspend request for given device.
+ * @dev: Device to cancel the suspend request for.
+ *
+ * Should be called under pm_lock_device() and only if we are sure that the
+ * ->autosuspend() callback hasn't started to yet.
+ */
+static void pm_cancel_suspend(struct device *dev)
+{
+ dev->power.suspend_aborted = true;
+ cancel_delayed_work(&dev->power.suspend_work);
+ dev->power.runtime_status = RPM_ACTIVE;
+}
+
+/**
+ * pm_runtime_resume - Run a device bus type's runtime_resume() callback.
+ * @work: Work structure used for scheduling the execution of this function.
+ *
+ * Use @work to get the device object the resume has been scheduled for,
+ * check if the device is really suspended and run the ->runtime_resume()
+ * callback provided by the device's bus type driver. Update the run-time PM
+ * flags in the device object to reflect the current status of the device.
+ */
+static void pm_runtime_resume(struct work_struct *work)
+{
+ struct device *dev = resume_work_to_device(work);
+ int error = 0;
+
+ if (dev->parent)
+ spin_lock(&dev->parent->power.lock);
+ spin_lock(&dev->power.lock);
+
+ /*
+ * Since the PM workqueue is singlethread, this function cannot run
+ * in parallel with pm_runtime_suspend(). For this reason it is not
+ * necessary to check if RPM_SUSPENDING is set in runtime_status of the
+ * device.
+ */
+ repeat:
+ if (!(dev->power.runtime_status & RPM_WAKE)) {
+ if (dev->parent)
+ spin_unlock(&dev->parent->power.lock);
+ goto out;
+ } else if (dev->parent
+ && dev->parent->power.runtime_status != RPM_ACTIVE) {
+ /*
+ * Although this function cannot run in parallel with another
+ * instance of itself, it may be running in parallel with the
+ * synchronous resume of another device. In particular, that
+ * may be the device's parent.
+ */
+ if (dev->parent->power.runtime_status & RPM_RESUMING) {
+ spin_unlock(&dev->power.lock);
+ spin_unlock(&dev->parent->power.lock);
+
+ wait_for_completion(&dev->parent->power.work_done);
+
+ spin_lock(&dev->parent->power.lock);
+ spin_lock(&dev->power.lock);
+ }
+ if (dev->parent->power.runtime_status != RPM_ACTIVE) {
+ spin_unlock(&dev->parent->power.lock);
+ goto out;
+ }
+ goto repeat;
+ }
+
+ dev->power.runtime_status = RPM_RESUMING;
+ init_completion(&dev->power.work_done);
+
+ spin_unlock(&dev->power.lock);
+ if (dev->parent)
+ spin_unlock(&dev->parent->power.lock);
+
+ if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume)
+ error = dev->bus->pm->runtime_resume(dev);
+
+ spin_lock(&dev->power.lock);
+
+ switch (error) {
+ case 0:
+ dev->power.runtime_status = RPM_ACTIVE;
+ break;
+ case -EAGAIN:
+ case -EBUSY:
+ dev->power.runtime_status = RPM_SUSPENDED;
+ break;
+ default:
+ dev->power.runtime_status = RPM_ERROR;
+ }
+ complete(&dev->power.work_done);
+
+ out:
+ spin_unlock(&dev->power.lock);
+}
+
+/**
+ * pm_request_resume - Schedule run-time resume of given device.
+ * @dev: Device to resume.
+ */
+void pm_request_resume(struct device *dev)
+{
+ unsigned long parent_flags = 0, flags;
+
+ repeat:
+ if (dev->parent)
+ spin_lock_irqsave(&dev->parent->power.lock, parent_flags);
+ spin_lock_irqsave(&dev->power.lock, flags);
+
+ if (dev->power.runtime_status == RPM_IDLE) {
+ /* Autosuspend request is pending, no need to resume. */
+ pm_cancel_suspend(dev);
+ goto out;
+ } else if (!(dev->power.runtime_status & RPM_IN_SUSPEND)) {
+ goto out;
+ } else if (dev->parent
+ && (dev->parent->power.runtime_status & RPM_INACTIVE)) {
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+ spin_unlock_irqrestore(&dev->parent->power.lock, parent_flags);
+
+ /* We have to resume the parent first. */
+ pm_request_resume(dev->parent);
+
+ goto repeat;
+ }
+
+ /*
+ * The device may be suspending at the moment and we can't clear the
+ * RPM_SUSPENDING bit in its runtime_status just yet.
+ */
+ dev->power.runtime_status |= RPM_WAKE;
+ queue_work(pm_wq, &dev->power.resume_work);
+
+ out:
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+ if (dev->parent)
+ spin_unlock_irqrestore(&dev->parent->power.lock, parent_flags);
+}
+
+/**
+ * pm_resume_sync - Resume given device waiting for the operation to complete.
+ * @dev: Device to resume.
+ *
+ * Resume the device synchronously, waiting for the operation to complete. If
+ * autosuspend is in progress while this function is being run, wait for it to
+ * finish before resuming the device. If the autosuspend is scheduled, but it
+ * hasn't started yet, cancel it and we're done.
+ */
+int pm_resume_sync(struct device *dev)
+{
+ int error = 0;
+
+ spin_lock(&dev->power.lock);
+
+ if (dev->power.runtime_status == RPM_ACTIVE) {
+ goto out;
+ } if (dev->power.runtime_status == RPM_IDLE) {
+ /* ->runtime_suspend() hasn't started yet, no need to resume. */
+ pm_cancel_suspend(dev);
+ goto out;
+ }
+
+ if (dev->power.runtime_status & RPM_SUSPENDING) {
+ spin_unlock(&dev->power.lock);
+
+ /*
+ * The ->runtime_suspend() callback is being executed right now,
+ * wait for it to complete.
+ */
+ wait_for_completion(&dev->power.work_done);
+ } else if (dev->power.runtime_status == RPM_SUSPENDED && dev->parent) {
+ spin_unlock(&dev->power.lock);
+
+ /* The device's parent may also be suspended. Resume it. */
+ error = pm_resume_sync(dev->parent);
+ if (error)
+ return error;
+ } else {
+ spin_unlock(&dev->power.lock);
+ }
+
+ if (dev->parent)
+ spin_lock(&dev->parent->power.lock);
+ spin_lock(&dev->power.lock);
+
+ if (dev->power.runtime_status & RPM_WAKE) {
+ /* There's a pending resume request that can be cancelled. */
+ work_clear_pending(&dev->power.resume_work);
+ } else if (dev->power.runtime_status == RPM_RESUMING) {
+ spin_unlock(&dev->power.lock);
+ if (dev->parent)
+ spin_unlock(&dev->parent->power.lock);
+
+ /*
+ * There's another resume running in parallel with us. Wait for
+ * it to complete.
+ */
+ wait_for_completion(&dev->power.work_done);
+
+ return dev->power.runtime_status == RPM_ACTIVE ? 0 : -EAGAIN;
+ } else if (!(dev->power.runtime_status & RPM_SUSPENDED)) {
+ error = -EINVAL;
+ if (dev->parent)
+ spin_unlock(&dev->parent->power.lock);
+ goto out;
+ }
+
+ dev->power.runtime_status = RPM_RESUMING;
+ init_completion(&dev->power.work_done);
+
+ spin_unlock(&dev->power.lock);
+ if (dev->parent)
+ spin_unlock(&dev->parent->power.lock);
+
+ if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume)
+ error = dev->bus->pm->runtime_resume(dev);
+
+ spin_lock(&dev->power.lock);
+
+ switch (error) {
+ case 0:
+ dev->power.runtime_status = RPM_ACTIVE;
+ break;
+ case -EAGAIN:
+ case -EBUSY:
+ dev->power.runtime_status = RPM_SUSPENDED;
+ break;
+ default:
+ dev->power.runtime_status = RPM_ERROR;
+ }
+ complete(&dev->power.work_done);
+
+ out:
+ spin_unlock(&dev->power.lock);
+
+ return error;
+}
+
+/**
+ * pm_cancel_autosuspend - Cancel a pending autosuspend request for given device
+ * @dev: Device to handle.
+ *
+ * This routine is only supposed to be called when the run-time PM workqueue is
+ * frozen (i.e. during system-wide suspend or hibernation) when it is guaranteed
+ * that no work items are being executed.
+ */
+void pm_cancel_autosuspend(struct device *dev)
+{
+ spin_lock(&dev->power.lock);
+
+ cancel_delayed_work(&dev->power.suspend_work);
+ pm_runtime_reset(dev);
+
+ spin_unlock(&dev->power.lock);
+}
+
+/**
+ * pm_cancel_autoresume - Cancel a pending autoresume request for given device
+ * @dev: Device to handle.
+ *
+ * This routine is only supposed to be called when the run-time PM workqueue is
+ * frozen (i.e. during system-wide suspend or hibernation) when it is guaranteed
+ * that no work items are being executed.
+ */
+void pm_cancel_autoresume(struct device *dev)
+{
+ spin_lock(&dev->power.lock);
+
+ work_clear_pending(&dev->power.resume_work);
+ pm_runtime_reset(dev);
+
+ spin_unlock(&dev->power.lock);
+}
+
+/**
+ * pm_runtime_init - Initialize run-time PM fields in given device object.
+ * @dev: Device object to handle.
+ */
+void pm_runtime_init(struct device *dev)
+{
+ pm_runtime_reset(dev);
+ spin_lock_init(&dev->power.lock);
+ INIT_DELAYED_WORK(&dev->power.suspend_work, pm_runtime_suspend);
+ INIT_WORK(&dev->power.resume_work, pm_runtime_resume);
+}
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/pm_runtime.h
@@ -0,0 +1,50 @@
+/*
+ * pm_runtime.h - Device run-time power management helper functions.
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef _LINUX_PM_RUNTIME_H
+#define _LINUX_PM_RUNTIME_H
+
+#include <linux/device.h>
+#include <linux/pm.h>
+
+#ifdef CONFIG_PM_RUNTIME
+extern struct workqueue_struct *pm_wq;
+
+extern void pm_runtime_init(struct device *dev);
+extern void pm_request_suspend(struct device *dev, unsigned long delay);
+extern void pm_request_resume(struct device *dev);
+extern int pm_resume_sync(struct device *dev);
+extern void pm_cancel_autosuspend(struct device *dev);
+extern void pm_cancel_autoresume(struct device *dev);
+
+static inline struct device *suspend_work_to_device(struct delayed_work *work)
+{
+ struct dev_pm_info *dpi;
+
+ dpi = container_of(work, struct dev_pm_info, suspend_work);
+ return container_of(dpi, struct device, power);
+}
+
+static inline struct device *resume_work_to_device(struct work_struct *work)
+{
+ struct dev_pm_info *dpi;
+
+ dpi = container_of(work, struct dev_pm_info, resume_work);
+ return container_of(dpi, struct device, power);
+}
+
+#else /* !CONFIG_PM_RUNTIME */
+static inline void pm_runtime_init(struct device *dev) {}
+static inline void pm_request_suspend(struct device *dev, unsigned long delay);
+static inline void pm_request_resume(struct device *dev) {}
+static inline int pm_resume_sync(struct device *dev) { return -ENOSYS; }
+static inline void pm_cancel_autosuspend(struct device *dev) {}
+static inline void pm_cancel_autoresume(struct device *dev) {}
+#endif /* !CONFIG_PM_RUNTIME */
+
+#endif
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -21,6 +21,7 @@
#include <linux/kallsyms.h>
#include <linux/mutex.h>
#include <linux/pm.h>
+#include <linux/pm_runtime.h>
#include <linux/resume-trace.h>
#include <linux/rwsem.h>
#include <linux/interrupt.h>
@@ -88,6 +89,7 @@ void device_pm_add(struct device *dev)
}
list_add_tail(&dev->power.entry, &dpm_list);
+ pm_runtime_init(dev);
mutex_unlock(&dpm_list_mtx);
}
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906101942280.29717-100000@netrider.rowland.org>
@ 2009-06-11 13:48 ` Rafael J. Wysocki
[not found] ` <200906111548.33751.rjw@sisk.pl>
1 sibling, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-11 13:48 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, linux-pm, LKML
On Thursday 11 June 2009, Alan Stern wrote:
> On Thu, 11 Jun 2009, Oliver Neukum wrote:
>
> > Am Donnerstag, 11. Juni 2009 00:01:20 schrieb Rafael J. Wysocki:
> > > We have queued up resume requests for the device's parent, its parent etc.,
> > > the topmost one goes first. The workqueue is singlethread, so
> > > pm_autoresume() is going to be run for all parents before the device
> > > itself, so if that were the only resume mechanism, it would be enough to
> > > check if the parent is RPM_ACTIVE.
> >
> > A (IDLE)
> > / \
> > B (SUSPENDED) C (SUSPENDED)
> >
> > Suppose C is to be resumed. This means first in case of A the request
> > to suspend would be cancelled. Here you drop the locks:
> >
> > + && (dev->parent->power.runtime_status == RPM_IDLE
> > + || dev->parent->power.runtime_status == RPM_SUSPENDING
> > + || dev->parent->power.runtime_status == RPM_SUSPENDED)) {
> > + spin_unlock_irqrestore(&dev->power.lock, flags);
> > + spin_unlock_irqrestore(&dev->parent->power.lock, parent_flags);
> > +
> > + /* We have to resume the parent first. */
> > + pm_request_resume(dev->parent);
> >
> > But after pm_request_resume() returns there's no means to make sure
> > nothing alters it back to RPM_SUSPENDED. The workqueue doesn't help
> > you because you've scheduled nothing by that time. The suspension will
> > work because C is still in RPM_SUSPENDED.
>
> This is an example where usage counters come in handy.
Do you mean we can count suspend/resume requests for a device?
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906111548.33751.rjw@sisk.pl>
@ 2009-06-11 13:57 ` Oliver Neukum
0 siblings, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-11 13:57 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, linux-pm, LKML
Am Donnerstag, 11. Juni 2009 15:48:33 schrieb Rafael J. Wysocki:
> > > But after pm_request_resume() returns there's no means to make sure
> > > nothing alters it back to RPM_SUSPENDED. The workqueue doesn't help
> > > you because you've scheduled nothing by that time. The suspension will
> > > work because C is still in RPM_SUSPENDED.
> >
> > This is an example where usage counters come in handy.
>
> Do you mean we can count suspend/resume requests for a device?
No, we count reasons a device cannot be suspended. Drivers are allowed to
add their own reasons. The core uses that mechanism to indicate that an
ongoing resumption lower down is also a reason.
The count going to zero is equivalent to a request to suspend.
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <200906111557.49490.oliver@neukum.org>
@ 2009-06-11 14:16 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-11 14:16 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, linux-pm, LKML
On Thu, 11 Jun 2009, Oliver Neukum wrote:
> Am Donnerstag, 11. Juni 2009 15:48:33 schrieb Rafael J. Wysocki:
> > > > But after pm_request_resume() returns there's no means to make sure
> > > > nothing alters it back to RPM_SUSPENDED. The workqueue doesn't help
> > > > you because you've scheduled nothing by that time. The suspension will
> > > > work because C is still in RPM_SUSPENDED.
> > >
> > > This is an example where usage counters come in handy.
> >
> > Do you mean we can count suspend/resume requests for a device?
>
> No, we count reasons a device cannot be suspended. Drivers are allowed to
> add their own reasons. The core uses that mechanism to indicate that an
> ongoing resumption lower down is also a reason.
> The count going to zero is equivalent to a request to suspend.
Right.
Here's a related thought. Change the resume routines as follows:
void pm_runtime_resume(struct device *dev)
{
// Do the actual resume ...
}
EXPORT_SYMBOL_GPL(pm_runtime_resume);
static void pm_runtime_resume_work(struct work_struct *work)
{
pm_runtime_resume(resume_work_to_device(work));
}
Then there's no need for a separate pm_resume_sync(); drivers can
simply call pm_runtime_resume() directly. The same trick works for
suspending.
Of course, this means you have to give up the notion that all suspends
and resumes are funnelled through the workqueue. IMO that notion isn't
worth keeping in any case.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906101933220.29717-100000@netrider.rowland.org>
@ 2009-06-11 14:17 ` Rafael J. Wysocki
2009-06-11 14:52 ` Alan Stern
0 siblings, 1 reply; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-11 14:17 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Thursday 11 June 2009, Alan Stern wrote:
> On Wed, 10 Jun 2009, Rafael J. Wysocki wrote:
>
> > > You know, it doesn't make any sense to have a suspend and a resume
> > > both pending at the same time.
> > >
> > > So you could add only a delayed_work structure and use its embedded
> > > work_struct for resume requests.
> >
> > I thought so too, but I was wrong. ;-)
> >
> > If resume is requested while the suspend hasn't completed yet, we should
> > queue it (it's totally valid to request a suspending device to resume IMO), but
> > the delayed work is still being used by the workqueue code, so we can't modify
> > it.
>
> Where is the delayed work still being used? There's even a comment in
> run_workqueue() that says a work_struct can be freed by the function it
> calls.
You are right, I overlooked the comment and it wasn't clear to me from looking
at the code.
> > > We might want to do a runtime suspend even if the device's children
> > > aren't already suspended. For example, you could suspend a link while
> > > leaving the device on the other end of the link at full power --
> > > especially if powering down the device is slow but changing the link's
> > > power level is fast.
> >
> > Well, this means that the dependencies between devices in the device tree are
> > pretty much useless for the run-time PM as far as the core is concerned. In
> > which case, why did you mention them at all?
>
> The dependencies aren't totally useless. It's still true that before
> you resume a device, you have to autoresume its parent.
Well, in fact if we don't have the requirement that the children of a device
have to be suspended for it to be able to suspend too, we have to check
all parents up the device tree up to the one that doesn't have a parent
and autoresume the ones that aren't active.
> And it's still true that when you suspend a device, the parent should be
> given a chance to autosuspend.
>
> I guess the real point is that the decision about whether all children
> must be suspended should be made by the driver, not the PM core.
The point here is what the core is supposed to do. Does it need to handle
this at all or leave it to the bus type and driver?
After reconsidering it for a while I think that we should define what
"suspended" is supposed to mean from the core point of view. And my opinion
is that it should mean "device doesn't communicate with the CPUs and RAM due
to power management". That need not be power management of the device itself,
but such that leads to the device not doing I/O.
Under this definition all devices behind an inactive link are suspended,
because they can't do any I/O. Which appears to makes sense, because their
drivers have to be notified before the link is suspended and the link has to be
turned on for the devices to be able to communicate with the CPU and RAM.
If this definition is adopted, then it's quite clear that the device can only
be suspended if all of its children are suspended and it's always necessary
to resume the parent of a device in order to resume the device itself.
> > > I haven't checked the details of the code yet. More later...
>
> One more thought... The autosuspend and autoresume callbacks need to
> be mutually exclusive with probe and remove. So somehow the driver
> core will need to block runtime PM calls.
That's correct and I'm going to take care of this.
> It might also be nice to make sure that the driver core autoresumes a
> device before probing it and autosuspends a device (after some
> reasonable delay) after unbinding its driver.
Agreed.
Best,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
2009-06-11 14:17 ` [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code) Rafael J. Wysocki
@ 2009-06-11 14:52 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-11 14:52 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Thu, 11 Jun 2009, Rafael J. Wysocki wrote:
> The point here is what the core is supposed to do. Does it need to handle
> this at all or leave it to the bus type and driver?
>
> After reconsidering it for a while I think that we should define what
> "suspended" is supposed to mean from the core point of view. And my opinion
> is that it should mean "device doesn't communicate with the CPUs and RAM due
> to power management". That need not be power management of the device itself,
> but such that leads to the device not doing I/O.
>
> Under this definition all devices behind an inactive link are suspended,
> because they can't do any I/O. Which appears to makes sense, because their
> drivers have to be notified before the link is suspended and the link has to be
> turned on for the devices to be able to communicate with the CPU and RAM.
>
> If this definition is adopted, then it's quite clear that the device can only
> be suspended if all of its children are suspended and it's always necessary
> to resume the parent of a device in order to resume the device itself.
Okay, I'll agree to that. It should be made clear that a device which
is "suspended" according to this definition is not necessarily in a
low-power state. For example, before powering down the link to a disk
drive you might want the drive's suspend method to flush the drive's
cache, but it wouldn't have to spin the drive down.
(But this example leaves open the question of how we would go about
spinning down the disk. Submitting a 15-minute (or whatever) delayed
autosuspend request wouldn't work; the request wouldn't be accepted
because the disk is already suspended as far as the PM core is
concerned. The disk's driver would have to implement its own
spin-down delayed_work.)
> > > > I haven't checked the details of the code yet. More later...
> >
> > One more thought... The autosuspend and autoresume callbacks need to
> > be mutually exclusive with probe and remove. So somehow the driver
> > core will need to block runtime PM calls.
>
> That's correct and I'm going to take care of this.
>
> > It might also be nice to make sure that the driver core autoresumes a
> > device before probing it and autosuspends a device (after some
> > reasonable delay) after unbinding its driver.
>
> Agreed.
This is another case where a usage counter comes in handy. The driver
core resumes the device and increments the counter -- thus preventing
any unwanted autosuspends -- before making the probe and remove calls.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906111041020.3040-100000@iolanthe.rowland.org>
@ 2009-06-11 15:06 ` Oliver Neukum
2009-06-11 19:43 ` Rafael J. Wysocki
1 sibling, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-11 15:06 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
Am Donnerstag, 11. Juni 2009 16:52:03 schrieb Alan Stern:
> > Under this definition all devices behind an inactive link are suspended,
> > because they can't do any I/O. Which appears to makes sense, because
> > their drivers have to be notified before the link is suspended and the
> > link has to be turned on for the devices to be able to communicate with
> > the CPU and RAM.
> >
> > If this definition is adopted, then it's quite clear that the device can
> > only be suspended if all of its children are suspended and it's always
> > necessary to resume the parent of a device in order to resume the device
> > itself.
>
> Okay, I'll agree to that. It should be made clear that a device which
> is "suspended" according to this definition is not necessarily in a
> low-power state. For example, before powering down the link to a disk
> drive you might want the drive's suspend method to flush the drive's
> cache, but it wouldn't have to spin the drive down.
This precludes handling busses that have low power states that are
left automatically. If such links are stacked the management of acceptable
latencies cannot be left to the busses.
An actual example are the link states of USB 3.0
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <200906111706.16848.oliver@neukum.org>
@ 2009-06-11 15:22 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-11 15:22 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Thu, 11 Jun 2009, Oliver Neukum wrote:
> Am Donnerstag, 11. Juni 2009 16:52:03 schrieb Alan Stern:
> > > Under this definition all devices behind an inactive link are suspended,
> > > because they can't do any I/O. Which appears to makes sense, because
> > > their drivers have to be notified before the link is suspended and the
> > > link has to be turned on for the devices to be able to communicate with
> > > the CPU and RAM.
> > >
> > > If this definition is adopted, then it's quite clear that the device can
> > > only be suspended if all of its children are suspended and it's always
> > > necessary to resume the parent of a device in order to resume the device
> > > itself.
> >
> > Okay, I'll agree to that. It should be made clear that a device which
> > is "suspended" according to this definition is not necessarily in a
> > low-power state. For example, before powering down the link to a disk
> > drive you might want the drive's suspend method to flush the drive's
> > cache, but it wouldn't have to spin the drive down.
>
> This precludes handling busses that have low power states that are
> left automatically. If such links are stacked the management of acceptable
> latencies cannot be left to the busses.
> An actual example are the link states of USB 3.0
I don't understand. Can you explain more fully?
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906111120260.3040-100000@iolanthe.rowland.org>
@ 2009-06-11 16:05 ` Oliver Neukum
0 siblings, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-11 16:05 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
Am Donnerstag, 11. Juni 2009 17:22:06 schrieb Alan Stern:
> > > Okay, I'll agree to that. It should be made clear that a device which
> > > is "suspended" according to this definition is not necessarily in a
> > > low-power state. For example, before powering down the link to a disk
> > > drive you might want the drive's suspend method to flush the drive's
> > > cache, but it wouldn't have to spin the drive down.
> >
> > This precludes handling busses that have low power states that are
> > left automatically. If such links are stacked the management of
> > acceptable latencies cannot be left to the busses.
> > An actual example are the link states of USB 3.0
>
> I don't understand. Can you explain more fully?
I am talking about the U1 and U2 feature of USB 3.0.
Or abstractly any power saving state that does autoresume in hardware.
In these cases you know that you can enter a powersaving state that
will add X latency.
In terms of user space API we'll probably add a way for user space
to specify how much latency may be added for power management's sake.
If busses are stacked the "latency budget" has to be handled at core level.
If furthermore states that allow IO but with additional latency are ignored,
the budget will be calculated wrongly.
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <200906111805.53388.oliver@neukum.org>
@ 2009-06-11 18:36 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-11 18:36 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Thu, 11 Jun 2009, Oliver Neukum wrote:
> Am Donnerstag, 11. Juni 2009 17:22:06 schrieb Alan Stern:
> > > > Okay, I'll agree to that. It should be made clear that a device which
> > > > is "suspended" according to this definition is not necessarily in a
> > > > low-power state. For example, before powering down the link to a disk
> > > > drive you might want the drive's suspend method to flush the drive's
> > > > cache, but it wouldn't have to spin the drive down.
> > >
> > > This precludes handling busses that have low power states that are
> > > left automatically. If such links are stacked the management of
> > > acceptable latencies cannot be left to the busses.
> > > An actual example are the link states of USB 3.0
> >
> > I don't understand. Can you explain more fully?
>
> I am talking about the U1 and U2 feature of USB 3.0.
>
> Or abstractly any power saving state that does autoresume in hardware.
> In these cases you know that you can enter a powersaving state that
> will add X latency.
>
> In terms of user space API we'll probably add a way for user space
> to specify how much latency may be added for power management's sake.
> If busses are stacked the "latency budget" has to be handled at core level.
> If furthermore states that allow IO but with additional latency are ignored,
> the budget will be calculated wrongly.
Okay, fine. What does this have to do with Rafael's work? Why does
setting the status to RPM_SUSPENDED even when a device is not in a
low-power state preclude handling buses that automatically change their
power state?
I don't see any connection between Rafael's work and managing
latencies, beyond the obvious fact that a device will have a higher
latency when it is suspended than when it isn't.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906111010360.2939-100000@iolanthe.rowland.org>
@ 2009-06-11 19:38 ` Rafael J. Wysocki
0 siblings, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-11 19:38 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, linux-pm, LKML
On Thursday 11 June 2009, Alan Stern wrote:
> On Thu, 11 Jun 2009, Oliver Neukum wrote:
>
> > Am Donnerstag, 11. Juni 2009 15:48:33 schrieb Rafael J. Wysocki:
> > > > > But after pm_request_resume() returns there's no means to make sure
> > > > > nothing alters it back to RPM_SUSPENDED. The workqueue doesn't help
> > > > > you because you've scheduled nothing by that time. The suspension will
> > > > > work because C is still in RPM_SUSPENDED.
> > > >
> > > > This is an example where usage counters come in handy.
> > >
> > > Do you mean we can count suspend/resume requests for a device?
> >
> > No, we count reasons a device cannot be suspended. Drivers are allowed to
> > add their own reasons. The core uses that mechanism to indicate that an
> > ongoing resumption lower down is also a reason.
> > The count going to zero is equivalent to a request to suspend.
>
> Right.
Ah. *That* is what you had in mind. Yes, we can do that.
> Here's a related thought. Change the resume routines as follows:
>
> void pm_runtime_resume(struct device *dev)
> {
> // Do the actual resume ...
> }
> EXPORT_SYMBOL_GPL(pm_runtime_resume);
>
> static void pm_runtime_resume_work(struct work_struct *work)
> {
> pm_runtime_resume(resume_work_to_device(work));
> }
>
> Then there's no need for a separate pm_resume_sync(); drivers can
> simply call pm_runtime_resume() directly. The same trick works for
> suspending.
>
> Of course, this means you have to give up the notion that all suspends
> and resumes are funnelled through the workqueue. IMO that notion isn't
> worth keeping in any case.
That's already not the case for resuming.
Well, ISTR a reason why I thought pm_resume_sync() was needed anyway, but the
idea is actually good.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906111041020.3040-100000@iolanthe.rowland.org>
2009-06-11 15:06 ` Oliver Neukum
@ 2009-06-11 19:43 ` Rafael J. Wysocki
1 sibling, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-11 19:43 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Thursday 11 June 2009, Alan Stern wrote:
> On Thu, 11 Jun 2009, Rafael J. Wysocki wrote:
>
> > The point here is what the core is supposed to do. Does it need to handle
> > this at all or leave it to the bus type and driver?
> >
> > After reconsidering it for a while I think that we should define what
> > "suspended" is supposed to mean from the core point of view. And my opinion
> > is that it should mean "device doesn't communicate with the CPUs and RAM due
> > to power management". That need not be power management of the device itself,
> > but such that leads to the device not doing I/O.
> >
> > Under this definition all devices behind an inactive link are suspended,
> > because they can't do any I/O. Which appears to makes sense, because their
> > drivers have to be notified before the link is suspended and the link has to be
> > turned on for the devices to be able to communicate with the CPU and RAM.
> >
> > If this definition is adopted, then it's quite clear that the device can only
> > be suspended if all of its children are suspended and it's always necessary
> > to resume the parent of a device in order to resume the device itself.
>
> Okay, I'll agree to that. It should be made clear that a device which
> is "suspended" according to this definition is not necessarily in a
> low-power state. For example, before powering down the link to a disk
> drive you might want the drive's suspend method to flush the drive's
> cache, but it wouldn't have to spin the drive down.
Exactly.
> (But this example leaves open the question of how we would go about
> spinning down the disk. Submitting a 15-minute (or whatever) delayed
> autosuspend request wouldn't work; the request wouldn't be accepted
> because the disk is already suspended as far as the PM core is
> concerned. The disk's driver would have to implement its own
> spin-down delayed_work.)
Yes, it would.
> > > > > I haven't checked the details of the code yet. More later...
> > >
> > > One more thought... The autosuspend and autoresume callbacks need to
> > > be mutually exclusive with probe and remove. So somehow the driver
> > > core will need to block runtime PM calls.
> >
> > That's correct and I'm going to take care of this.
> >
> > > It might also be nice to make sure that the driver core autoresumes a
> > > device before probing it and autosuspends a device (after some
> > > reasonable delay) after unbinding its driver.
> >
> > Agreed.
>
> This is another case where a usage counter comes in handy. The driver
> core resumes the device and increments the counter -- thus preventing
> any unwanted autosuspends -- before making the probe and remove calls.
I like this idea.
BTW, where exactly the counter should be increased in that case?
I thought of driver_probe_device(), but is it sufficient? Or is there a better
place?
Best,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906111433110.4875-100000@iolanthe.rowland.org>
@ 2009-06-11 21:05 ` Oliver Neukum
0 siblings, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-11 21:05 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
Am Donnerstag, 11. Juni 2009 20:36:30 schrieb Alan Stern:
> > Or abstractly any power saving state that does autoresume in hardware.
> > In these cases you know that you can enter a powersaving state that
> > will add X latency.
> >
> > In terms of user space API we'll probably add a way for user space
> > to specify how much latency may be added for power management's sake.
> > If busses are stacked the "latency budget" has to be handled at core
> > level. If furthermore states that allow IO but with additional latency
> > are ignored, the budget will be calculated wrongly.
>
> Okay, fine. What does this have to do with Rafael's work? Why does
> setting the status to RPM_SUSPENDED even when a device is not in a
> low-power state preclude handling buses that automatically change their
> power state?
For these cases the tree constraint does not apply.
I think there are devices who can be suspended while children are active
and devices which can not be. This is an attribute of the device and should
be evaluated by the core.
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <200906112305.49232.oliver@neukum.org>
@ 2009-06-12 2:16 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-12 2:16 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Thu, 11 Jun 2009, Oliver Neukum wrote:
> Am Donnerstag, 11. Juni 2009 20:36:30 schrieb Alan Stern:
> > > Or abstractly any power saving state that does autoresume in hardware.
> > > In these cases you know that you can enter a powersaving state that
> > > will add X latency.
> > >
> > > In terms of user space API we'll probably add a way for user space
> > > to specify how much latency may be added for power management's sake.
> > > If busses are stacked the "latency budget" has to be handled at core
> > > level. If furthermore states that allow IO but with additional latency
> > > are ignored, the budget will be calculated wrongly.
> >
> > Okay, fine. What does this have to do with Rafael's work? Why does
> > setting the status to RPM_SUSPENDED even when a device is not in a
> > low-power state preclude handling buses that automatically change their
> > power state?
>
> For these cases the tree constraint does not apply.
What tree constraint? You mean that the PM core shouldn't allow
devices to suspend unless all their children are suspended? Why
doesn't it still apply?
Remember, when Rafael and I say "suspend" here, we don't mean "go to a
low-power state". We mean "the PM core calls the runtime_suspend
method". No matter what actions the link hardware may decide to take
on its own, the PM core will still want to observe the
all-children-suspended restriction when calling runtime_suspend
methods.
> I think there are devices who can be suspended while children are active
> and devices which can not be. This is an attribute of the device and should
> be evaluated by the core.
Clearly it should be decided by the driver. Should there be a bit for
it in the dev_pm_info structure?
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906111108.40785.oliver@neukum.org>
@ 2009-06-12 3:13 ` Magnus Damm
2009-06-12 8:11 ` Oliver Neukum
[not found] ` <200906121011.13592.oliver@neukum.org>
0 siblings, 2 replies; 41+ messages in thread
From: Magnus Damm @ 2009-06-12 3:13 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, linux-pm, LKML
Hi Oliver,
On Thu, Jun 11, 2009 at 6:08 PM, Oliver Neukum<oliver@neukum.org> wrote:
> Am Donnerstag, 11. Juni 2009 07:18:46 schrieb Magnus Damm:
>> 3) When all devices in the power domain are suspended the bus code can
>> turn off the power. The reason why I'd like to only autosuspend when
>
> So you are saying that you have power dependencies independent
> of the device tree?
I can think of the following power dependencies:
- hardware bus topology
- clocks
- power domains
>> all devices are idle is simply that we don't get any power savings
>> from the per device autosuspend() callbacks, only from turning off
>> power to the entire per-domain. So bindly autosuspending and
>> autoresuming devices is just pure overhead unless we know we can do it
>> for all devices in the domain.
>
> Why can't you do this within the framework? You simply suspend when
> all a domain's devices have been autosuspended.
So you mean I should handle that in my arch/bus specific
dev->bus->pm->autosuspend() code? So instead of calling
dev->driver->pm->autosuspend() straight away I keep track of the use
count of the power domain and when the domain is unused I call
dev->driver->pm->autosuspend() for all devices in the power domain
before powering off?
I guess hooking in things in dev->bus->pm->autosuspend() is doable,
but then dev->power.runtime_status will be set to RPM_SUSPENDED even
though the actual device isn't suspended at all. And RPM_IDLE state
will be more or less unused since the drivers should pass a delay of
zero to make sure the bus code gets notified about the idle state
straight away.
Basically, for my use case it would make more sense to let the
bus_type directly decide when a device should be suspended instead of
using a timeout before calling the bus_type code. I rather let the
bus_type decide if a timeout should be used or not instead of using it
for all bus_types.
So I guess the plan is that drivers directly should invoke
pm_request_suspend() to notify the bus that they are idle? (I guess
similar to my platform_device_idle()?)
For my use case there is no point in having the delay in
pm_request_suspend(), we want to notify the bus about the per-device
idleness straight away. Using a delay in pm_request_suspend() before
calling the bus type autosuspend will just keep the current per-device
state away from the bus level and make sure we _cannot_ enter deep
sleep states. Which I believe will result in worse battery life
because we spend more time than necessary in not-so-deep sleep states.
So yes, I'd like to do things in dev->bus->pm->autosuspend(), and the
code is quite close. I can't figure out why anyone would want the
suspend delay at the current level though, but I guess other busses
want to use that?
Thanks for your comments,
/ magnus
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
2009-06-12 3:13 ` Magnus Damm
@ 2009-06-12 8:11 ` Oliver Neukum
[not found] ` <200906121011.13592.oliver@neukum.org>
1 sibling, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-12 8:11 UTC (permalink / raw)
To: Magnus Damm; +Cc: ACPI Devel Maling List, linux-pm, LKML
Am Freitag, 12. Juni 2009 05:13:12 schrieb Magnus Damm:
> Hi Oliver,
>
> On Thu, Jun 11, 2009 at 6:08 PM, Oliver Neukum<oliver@neukum.org> wrote:
> > Am Donnerstag, 11. Juni 2009 07:18:46 schrieb Magnus Damm:
> >> 3) When all devices in the power domain are suspended the bus code can
> >> turn off the power. The reason why I'd like to only autosuspend when
> >
> > So you are saying that you have power dependencies independent
> > of the device tree?
>
> I can think of the following power dependencies:
> - hardware bus topology
> - clocks
> - power domains
That means that some devices otherwise unrelated have a common power
switch, doesn't it?
> >> all devices are idle is simply that we don't get any power savings
> >> from the per device autosuspend() callbacks, only from turning off
> >> power to the entire per-domain. So bindly autosuspending and
> >> autoresuming devices is just pure overhead unless we know we can do it
> >> for all devices in the domain.
> >
> > Why can't you do this within the framework? You simply suspend when
> > all a domain's devices have been autosuspended.
>
> So you mean I should handle that in my arch/bus specific
> dev->bus->pm->autosuspend() code? So instead of calling
> dev->driver->pm->autosuspend() straight away I keep track of the use
> count of the power domain and when the domain is unused I call
> dev->driver->pm->autosuspend() for all devices in the power domain
> before powering off?
How much overhead do you have in autosuspend() if it actually powers
down the devices? If it is small, I suggest you really run the autosuspend
methods in the drivers but use a counter for the actual power switching
on a bus level.
> I guess hooking in things in dev->bus->pm->autosuspend() is doable,
> but then dev->power.runtime_status will be set to RPM_SUSPENDED even
> though the actual device isn't suspended at all. And RPM_IDLE state
Why do you care about a device being in RPM_SUSPENDED while
active. The inverse is a bug, but this seems harmless.
> will be more or less unused since the drivers should pass a delay of
> zero to make sure the bus code gets notified about the idle state
> straight away.
So? You are not getting a common code without a little overhead
for some cases.
> Basically, for my use case it would make more sense to let the
> bus_type directly decide when a device should be suspended instead of
> using a timeout before calling the bus_type code. I rather let the
> bus_type decide if a timeout should be used or not instead of using it
> for all bus_types.
So call with a delay of 0.
> So I guess the plan is that drivers directly should invoke
> pm_request_suspend() to notify the bus that they are idle? (I guess
> similar to my platform_device_idle()?)
Yes.
> So yes, I'd like to do things in dev->bus->pm->autosuspend(), and the
> code is quite close. I can't figure out why anyone would want the
> suspend delay at the current level though, but I guess other busses
> want to use that?
In your case resumption seems to cost almost no energy. In other
cases you must avoid short sleeps, as you conserve less energy
sleeping than it takes to resume.
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906112210340.21931-100000@netrider.rowland.org>
@ 2009-06-12 8:15 ` Oliver Neukum
0 siblings, 0 replies; 41+ messages in thread
From: Oliver Neukum @ 2009-06-12 8:15 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
Am Freitag, 12. Juni 2009 04:16:10 schrieb Alan Stern:
> What tree constraint? You mean that the PM core shouldn't allow
> devices to suspend unless all their children are suspended? Why
> doesn't it still apply?
Because the hardware doesn't need it.
> Remember, when Rafael and I say "suspend" here, we don't mean "go to a
> low-power state". We mean "the PM core calls the runtime_suspend
> method". No matter what actions the link hardware may decide to take
> on its own, the PM core will still want to observe the
> all-children-suspended restriction when calling runtime_suspend
> methods.
No. The core if it insists all children be suspended will not use
the hardware's full capabilities.
If it leaves such power saving measures to the drivers, latency
accounting will be wrong.
> > I think there are devices who can be suspended while children are active
> > and devices which can not be. This is an attribute of the device and
> > should be evaluated by the core.
>
> Clearly it should be decided by the driver. Should there be a bit for
> it in the dev_pm_info structure?
Yes.
Regards
Oliver
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] ` <200906121011.13592.oliver@neukum.org>
@ 2009-06-12 10:54 ` Magnus Damm
0 siblings, 0 replies; 41+ messages in thread
From: Magnus Damm @ 2009-06-12 10:54 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, linux-pm, LKML
On Fri, Jun 12, 2009 at 5:11 PM, Oliver Neukum<oliver@neukum.org> wrote:
> Am Freitag, 12. Juni 2009 05:13:12 schrieb Magnus Damm:
>> Hi Oliver,
>>
>> On Thu, Jun 11, 2009 at 6:08 PM, Oliver Neukum<oliver@neukum.org> wrote:
>> > Am Donnerstag, 11. Juni 2009 07:18:46 schrieb Magnus Damm:
>> >> 3) When all devices in the power domain are suspended the bus code can
>> >> turn off the power. The reason why I'd like to only autosuspend when
>> >
>> > So you are saying that you have power dependencies independent
>> > of the device tree?
>>
>> I can think of the following power dependencies:
>> - hardware bus topology
>> - clocks
>> - power domains
>
> That means that some devices otherwise unrelated have a common power
> switch, doesn't it?
Yes, various on chip devices share the same power switch.
On the SuperH Mobile SoCs that I've seen we basically have two power
domains. One big which contains almost everything except what's in the
second domain: a timer, watchdog and a rtc. I guess the devices are
related to each other because they are in the same SoC, but exactly
how they relate to each other depends on actual processor type. Same
for clocks.
This presentation may give you an overview of the SuperH Mobile
hardware and where we are today:
http://www.celinuxforum.org/CelfPubWiki/ELC2009Presentations?action=AttachFile&do=view&target=Runtime-Power-Management-on-SuperH-Mobile-20090407.pdf
>> >> all devices are idle is simply that we don't get any power savings
>> >> from the per device autosuspend() callbacks, only from turning off
>> >> power to the entire per-domain. So bindly autosuspending and
>> >> autoresuming devices is just pure overhead unless we know we can do it
>> >> for all devices in the domain.
>> >
>> > Why can't you do this within the framework? You simply suspend when
>> > all a domain's devices have been autosuspended.
>>
>> So you mean I should handle that in my arch/bus specific
>> dev->bus->pm->autosuspend() code? So instead of calling
>> dev->driver->pm->autosuspend() straight away I keep track of the use
>> count of the power domain and when the domain is unused I call
>> dev->driver->pm->autosuspend() for all devices in the power domain
>> before powering off?
>
> How much overhead do you have in autosuspend() if it actually powers
> down the devices? If it is small, I suggest you really run the autosuspend
> methods in the drivers but use a counter for the actual power switching
> on a bus level.
On a SoC scale, returning from the deepest sleep state takes a bit of
time. It depends on clock configuration, but worst case i think it
takes ~3ms to come back from the deepest sleep. Just to let the cpu
core start executing instructions. And on the way back we also have to
setup almost the entire system from scratch which probaly takes quite
a bit of time.
Why I don't want to call dev->driver->pm_autosuspend() directly is
basically our per-processor model hard coded clock dependencies. Some
open devices may block deep sleep, for instance if a serial port is
open we may not deep sleep unless we can live with stopping the clock
and potentially loosing incoming data. So if these open devices block
the entire power domain then there is no point in executing the
dev->driver->pm autosuspend()/autoresume() all the time since we
basically will never power off the domain.
The autosuspend()/autoresume() driver callbacks will save and restore
registers which is quite expensive since each memory access is
uncached. So I don't want to do that more than absolutely necessary.
But I can handle that in the bus type code, no problem. I'm sure ARM
can as well.
>> I guess hooking in things in dev->bus->pm->autosuspend() is doable,
>> but then dev->power.runtime_status will be set to RPM_SUSPENDED even
>> though the actual device isn't suspended at all. And RPM_IDLE state
>
> Why do you care about a device being in RPM_SUSPENDED while
> active. The inverse is a bug, but this seems harmless.
Ok, just wanted to check if you guys agreed with that as a valid combination.
>> will be more or less unused since the drivers should pass a delay of
>> zero to make sure the bus code gets notified about the idle state
>> straight away.
>
> So? You are not getting a common code without a little overhead
> for some cases.
Some overhead is of course acceptable. If it's too much then we can
always optimize later.
>> Basically, for my use case it would make more sense to let the
>> bus_type directly decide when a device should be suspended instead of
>> using a timeout before calling the bus_type code. I rather let the
>> bus_type decide if a timeout should be used or not instead of using it
>> for all bus_types.
>
> So call with a delay of 0.
Sure.
>> So I guess the plan is that drivers directly should invoke
>> pm_request_suspend() to notify the bus that they are idle? (I guess
>> similar to my platform_device_idle()?)
>
> Yes.
Ok, sounds good.
>> So yes, I'd like to do things in dev->bus->pm->autosuspend(), and the
>> code is quite close. I can't figure out why anyone would want the
>> suspend delay at the current level though, but I guess other busses
>> want to use that?
>
> In your case resumption seems to cost almost no energy. In other
> cases you must avoid short sleeps, as you conserve less energy
> sleeping than it takes to resume.
I want to avoid that as well, but I don't think a per-device timeout
is enough. I need to tie in latencies into this somehow. But I prefer
to do that after getting the basic stuff working. Step by step.
Thanks,
/ magnus
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <200906112143.57361.rjw@sisk.pl>
@ 2009-06-12 14:25 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-12 14:25 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Thu, 11 Jun 2009, Rafael J. Wysocki wrote:
> > > > It might also be nice to make sure that the driver core autoresumes a
> > > > device before probing it and autosuspends a device (after some
> > > > reasonable delay) after unbinding its driver.
> > >
> > > Agreed.
> >
> > This is another case where a usage counter comes in handy. The driver
> > core resumes the device and increments the counter -- thus preventing
> > any unwanted autosuspends -- before making the probe and remove calls.
>
> I like this idea.
>
> BTW, where exactly the counter should be increased in that case?
>
> I thought of driver_probe_device(), but is it sufficient? Or is there a better
> place?
That's okay. Or you could put it in really_probe(). Either one.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <200906121015.19504.oliver@neukum.org>
@ 2009-06-12 14:32 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-12 14:32 UTC (permalink / raw)
To: Oliver Neukum; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Fri, 12 Jun 2009, Oliver Neukum wrote:
> Am Freitag, 12. Juni 2009 04:16:10 schrieb Alan Stern:
> > What tree constraint? You mean that the PM core shouldn't allow
> > devices to suspend unless all their children are suspended? Why
> > doesn't it still apply?
>
> Because the hardware doesn't need it.
But maybe drivers need it.
> > Remember, when Rafael and I say "suspend" here, we don't mean "go to a
> > low-power state". We mean "the PM core calls the runtime_suspend
> > method". No matter what actions the link hardware may decide to take
> > on its own, the PM core will still want to observe the
> > all-children-suspended restriction when calling runtime_suspend
> > methods.
>
> No. The core if it insists all children be suspended will not use
> the hardware's full capabilities.
That isn't what I said. The core does not insist that all children be
suspended, i.e., be in a low-power state. It insists only that the
children's drivers' runtime_suspend methods have been called. Those
methods are not obligated to put the children in a low-power state.
> If it leaves such power saving measures to the drivers, latency
> accounting will be wrong.
>
> > > I think there are devices who can be suspended while children are active
> > > and devices which can not be. This is an attribute of the device and
> > > should be evaluated by the core.
> >
> > Clearly it should be decided by the driver. Should there be a bit for
> > it in the dev_pm_info structure?
>
> Yes.
That would resolve the issue.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906121027300.2915-100000@iolanthe.rowland.org>
@ 2009-06-12 19:09 ` Rafael J. Wysocki
0 siblings, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-12 19:09 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Friday 12 June 2009, Alan Stern wrote:
> On Fri, 12 Jun 2009, Oliver Neukum wrote:
>
> > Am Freitag, 12. Juni 2009 04:16:10 schrieb Alan Stern:
> > > What tree constraint? You mean that the PM core shouldn't allow
> > > devices to suspend unless all their children are suspended? Why
> > > doesn't it still apply?
> >
> > Because the hardware doesn't need it.
>
> But maybe drivers need it.
>
> > > Remember, when Rafael and I say "suspend" here, we don't mean "go to a
> > > low-power state". We mean "the PM core calls the runtime_suspend
> > > method". No matter what actions the link hardware may decide to take
> > > on its own, the PM core will still want to observe the
> > > all-children-suspended restriction when calling runtime_suspend
> > > methods.
> >
> > No. The core if it insists all children be suspended will not use
> > the hardware's full capabilities.
>
> That isn't what I said. The core does not insist that all children be
> suspended, i.e., be in a low-power state. It insists only that the
> children's drivers' runtime_suspend methods have been called. Those
> methods are not obligated to put the children in a low-power state.
>
> > If it leaves such power saving measures to the drivers, latency
> > accounting will be wrong.
> >
> > > > I think there are devices who can be suspended while children are active
> > > > and devices which can not be. This is an attribute of the device and
> > > > should be evaluated by the core.
> > >
> > > Clearly it should be decided by the driver. Should there be a bit for
> > > it in the dev_pm_info structure?
> >
> > Yes.
>
> That would resolve the issue.
So, are you suggesting that the core should only check the "all children
suspended" condition if special flag is set in dev_pm_info?
Best,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <200906122109.35589.rjw@sisk.pl>
@ 2009-06-12 19:48 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-12 19:48 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Fri, 12 Jun 2009, Rafael J. Wysocki wrote:
> So, are you suggesting that the core should only check the "all children
> suspended" condition if special flag is set in dev_pm_info?
Or rather, check it only if the special flag _isn't_ set.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906121547490.2862-100000@iolanthe.rowland.org>
@ 2009-06-12 19:56 ` Rafael J. Wysocki
0 siblings, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-12 19:56 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Friday 12 June 2009, Alan Stern wrote:
> On Fri, 12 Jun 2009, Rafael J. Wysocki wrote:
>
> > So, are you suggesting that the core should only check the "all children
> > suspended" condition if special flag is set in dev_pm_info?
>
> Or rather, check it only if the special flag _isn't_ set.
Where the default is unset, I guess?
But then, what about the resuming of the parents before the device is resumed?
Should the parents be resumed regardless of the flag state? And if so, what's
the condition for breaking the recurrence? Surely it's not sufficient to check
if the parent is active, because its parent need not be active if it has this
special flag set.
Best,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <200906122156.14873.rjw@sisk.pl>
@ 2009-06-12 21:23 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-12 21:23 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Fri, 12 Jun 2009, Rafael J. Wysocki wrote:
> On Friday 12 June 2009, Alan Stern wrote:
> > On Fri, 12 Jun 2009, Rafael J. Wysocki wrote:
> >
> > > So, are you suggesting that the core should only check the "all children
> > > suspended" condition if special flag is set in dev_pm_info?
> >
> > Or rather, check it only if the special flag _isn't_ set.
>
> Where the default is unset, I guess?
Yep.
> But then, what about the resuming of the parents before the device is resumed?
> Should the parents be resumed regardless of the flag state?
Yes. In general you should assume a device's parent (and the device
itself!) needs to be resumed whenever the kernel wants to do something
with the device. The special flag arises because sometimes it's safe
to suspend the parent without suspending the device _if_ the kernel
isn't using the device.
Imagine an idle disk at the end of a link. We might want to
autosuspend the link without spinning down the disk. When we have to
communicate with the disk again, we autoresume the link. (Including
the case where the communication is a "spin-down" command.)
> And if so, what's
> the condition for breaking the recurrence? Surely it's not sufficient to check
> if the parent is active, because its parent need not be active if it has this
> special flag set.
That's a good question. Let's assume that situations like this will be
handled by the drivers.
For example, suppose A is the parent of B is the parent of C, and A is
suspended but B isn't and C is. What happens when somebody wants to
use C?
An autoresume request is generated for C. Since C's parent is already
resumed, the runtime_resume method in C's driver is called. The driver
has to do some I/O in order to resume C, so it passes an I/O request up
to B's driver. The request then gets passed up to A's driver. This
driver knows that A is suspended, so it starts an autoresume of A and
waits for the autoresume to complete before carrying out the request.
Then the I/O can go through, so C gets resumed and everything works
out.
I don't know how often this sort of pattern will arise. It certainly
could be used in usb-storage; there would be no difficulty starting an
autoresume when an I/O request arrives from the SCSI layer below. In
fact, that is exactly how some early runtime-PM patches for usb-storage
worked.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906121706180.5018-100000@iolanthe.rowland.org>
@ 2009-06-12 23:06 ` Rafael J. Wysocki
0 siblings, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-12 23:06 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Friday 12 June 2009, Alan Stern wrote:
> On Fri, 12 Jun 2009, Rafael J. Wysocki wrote:
>
> > On Friday 12 June 2009, Alan Stern wrote:
> > > On Fri, 12 Jun 2009, Rafael J. Wysocki wrote:
> > >
> > > > So, are you suggesting that the core should only check the "all children
> > > > suspended" condition if special flag is set in dev_pm_info?
> > >
> > > Or rather, check it only if the special flag _isn't_ set.
> >
> > Where the default is unset, I guess?
>
> Yep.
>
> > But then, what about the resuming of the parents before the device is resumed?
> > Should the parents be resumed regardless of the flag state?
>
> Yes. In general you should assume a device's parent (and the device
> itself!) needs to be resumed whenever the kernel wants to do something
> with the device. The special flag arises because sometimes it's safe
> to suspend the parent without suspending the device _if_ the kernel
> isn't using the device.
>
> Imagine an idle disk at the end of a link. We might want to
> autosuspend the link without spinning down the disk. When we have to
> communicate with the disk again, we autoresume the link. (Including
> the case where the communication is a "spin-down" command.)
>
> > And if so, what's
> > the condition for breaking the recurrence? Surely it's not sufficient to check
> > if the parent is active, because its parent need not be active if it has this
> > special flag set.
>
> That's a good question. Let's assume that situations like this will be
> handled by the drivers.
>
> For example, suppose A is the parent of B is the parent of C, and A is
> suspended but B isn't and C is. What happens when somebody wants to
> use C?
>
> An autoresume request is generated for C. Since C's parent is already
> resumed, the runtime_resume method in C's driver is called. The driver
> has to do some I/O in order to resume C, so it passes an I/O request up
> to B's driver. The request then gets passed up to A's driver. This
> driver knows that A is suspended, so it starts an autoresume of A and
> waits for the autoresume to complete before carrying out the request.
>
> Then the I/O can go through, so C gets resumed and everything works
> out.
>
> I don't know how often this sort of pattern will arise. It certainly
> could be used in usb-storage; there would be no difficulty starting an
> autoresume when an I/O request arrives from the SCSI layer below. In
> fact, that is exactly how some early runtime-PM patches for usb-storage
> worked.
So, the conclusion seems to be that we should break the recurrence
at the point we find an already active device or a device with no parent and
let the driver(s) handle the more complicated cases. Is this correct?
BTW, is __device_release_driver() the right place for blocking the run-time PM
temporarily during remove?
Best,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <200906130106.58464.rjw@sisk.pl>
@ 2009-06-13 18:08 ` Alan Stern
0 siblings, 0 replies; 41+ messages in thread
From: Alan Stern @ 2009-06-13 18:08 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Sat, 13 Jun 2009, Rafael J. Wysocki wrote:
> So, the conclusion seems to be that we should break the recurrence
> at the point we find an already active device or a device with no parent and
> let the driver(s) handle the more complicated cases. Is this correct?
That's right.
> BTW, is __device_release_driver() the right place for blocking the run-time PM
> temporarily during remove?
It is. And for submitting a delayed autosuspend request afterward; we
may as well try to suspend devices that don't have drivers.
Alan Stern
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code)
[not found] <Pine.LNX.4.44L0.0906131404160.25619-100000@netrider.rowland.org>
@ 2009-06-13 22:04 ` Rafael J. Wysocki
0 siblings, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2009-06-13 22:04 UTC (permalink / raw)
To: Alan Stern; +Cc: ACPI Devel Maling List, Linux-pm mailing list, LKML
On Saturday 13 June 2009, Alan Stern wrote:
> On Sat, 13 Jun 2009, Rafael J. Wysocki wrote:
>
> > So, the conclusion seems to be that we should break the recurrence
> > at the point we find an already active device or a device with no parent and
> > let the driver(s) handle the more complicated cases. Is this correct?
>
> That's right.
OK
> > BTW, is __device_release_driver() the right place for blocking the run-time PM
> > temporarily during remove?
>
> It is.
OK
> And for submitting a delayed autosuspend request afterward; we
> may as well try to suspend devices that don't have drivers.
OK, but I'd like to add this functionality if future, when at least one bus
type starts using the framework.
I think I have all of the ducks in a row now, so I'm going to post a cleaned-up
patch in a new thread in a while.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2009-06-13 22:04 UTC | newest]
Thread overview: 41+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <Pine.LNX.4.44L0.0906101933220.29717-100000@netrider.rowland.org>
2009-06-11 14:17 ` [patch update] Re: Run-time PM idea (was: Re: [RFC][PATCH 0/2] PM: Rearrange core suspend code) Rafael J. Wysocki
2009-06-11 14:52 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0906131404160.25619-100000@netrider.rowland.org>
2009-06-13 22:04 ` Rafael J. Wysocki
[not found] <200906130106.58464.rjw@sisk.pl>
2009-06-13 18:08 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0906121706180.5018-100000@iolanthe.rowland.org>
2009-06-12 23:06 ` Rafael J. Wysocki
[not found] <200906122156.14873.rjw@sisk.pl>
2009-06-12 21:23 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0906121547490.2862-100000@iolanthe.rowland.org>
2009-06-12 19:56 ` Rafael J. Wysocki
[not found] <200906122109.35589.rjw@sisk.pl>
2009-06-12 19:48 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0906121027300.2915-100000@iolanthe.rowland.org>
2009-06-12 19:09 ` Rafael J. Wysocki
[not found] <200906121015.19504.oliver@neukum.org>
2009-06-12 14:32 ` Alan Stern
[not found] <200906112143.57361.rjw@sisk.pl>
2009-06-12 14:25 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0906112210340.21931-100000@netrider.rowland.org>
2009-06-12 8:15 ` Oliver Neukum
[not found] <200906112305.49232.oliver@neukum.org>
2009-06-12 2:16 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0906111433110.4875-100000@iolanthe.rowland.org>
2009-06-11 21:05 ` Oliver Neukum
[not found] <Pine.LNX.4.44L0.0906111010360.2939-100000@iolanthe.rowland.org>
2009-06-11 19:38 ` Rafael J. Wysocki
[not found] <200906111805.53388.oliver@neukum.org>
2009-06-11 18:36 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0906111120260.3040-100000@iolanthe.rowland.org>
2009-06-11 16:05 ` Oliver Neukum
[not found] <200906111706.16848.oliver@neukum.org>
2009-06-11 15:22 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0906111041020.3040-100000@iolanthe.rowland.org>
2009-06-11 15:06 ` Oliver Neukum
2009-06-11 19:43 ` Rafael J. Wysocki
[not found] <200906111557.49490.oliver@neukum.org>
2009-06-11 14:16 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0906101942280.29717-100000@netrider.rowland.org>
2009-06-11 13:48 ` Rafael J. Wysocki
[not found] ` <200906111548.33751.rjw@sisk.pl>
2009-06-11 13:57 ` Oliver Neukum
[not found] <Pine.LNX.4.44L0.0906101656100.2589-100000@iolanthe.rowland.org>
2009-06-10 21:31 ` Rafael J. Wysocki
[not found] ` <200906102331.14267.rjw@sisk.pl>
2009-06-10 23:15 ` Oliver Neukum
2009-06-10 23:42 ` Alan Stern
[not found] ` <200906110115.30729.oliver@neukum.org>
2009-06-11 5:27 ` Magnus Damm
[not found] <Pine.LNX.4.44L0.0906082233490.19072-100000@netrider.rowland.org>
[not found] ` <200906100057.04473.rjw@sisk.pl>
2009-06-10 8:29 ` Rafael J. Wysocki
[not found] ` <200906101029.27529.rjw@sisk.pl>
2009-06-10 14:20 ` Oliver Neukum
[not found] ` <200906101620.26559.oliver@neukum.org>
2009-06-10 19:27 ` Rafael J. Wysocki
[not found] ` <200906102127.57136.rjw@sisk.pl>
2009-06-10 21:38 ` Oliver Neukum
[not found] ` <200906102338.35183.oliver@neukum.org>
2009-06-10 22:01 ` Rafael J. Wysocki
[not found] ` <200906110001.20718.rjw@sisk.pl>
2009-06-10 23:07 ` Oliver Neukum
[not found] ` <200906110107.23023.oliver@neukum.org>
2009-06-10 23:42 ` Alan Stern
2009-06-11 13:46 ` Rafael J. Wysocki
2009-06-10 21:14 ` Alan Stern
2009-06-11 5:18 ` Magnus Damm
[not found] ` <aec7e5c30906102218j2f8f38dbt5d05a3928ca46c15@mail.gmail.com>
2009-06-11 9:08 ` Oliver Neukum
[not found] ` <200906111108.40785.oliver@neukum.org>
2009-06-12 3:13 ` Magnus Damm
2009-06-12 8:11 ` Oliver Neukum
[not found] ` <200906121011.13592.oliver@neukum.org>
2009-06-12 10:54 ` Magnus Damm
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox