Linux Power Management development
* [PATCH 5/7] ACPI / PM: Provide device PM functions operating on struct acpi_device
From: Rafael J. Wysocki @ 2012-10-29  9:11 UTC (permalink / raw)
  To: Linux PM list
  Cc: ACPI Devel Maling List, Aaron Lu, Huang Ying, LKML, Len Brown,
	Lv Zheng, Adrian Hunter
In-Reply-To: <1766582.8gdQKXoi0K@vostro.rjw.lan>

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If the caller of acpi_bus_set_power() already has a pointer to the
struct acpi_device object corresponding to the device in question, it
doesn't make sense for it to go through acpi_bus_get_device(), which
may be costly, because it involves acquiring the global ACPI
namespace mutex.

For this reason, export the function operating on struct acpi_device
objects used internally by acpi_bus_set_power(), so that it may be
called instead of acpi_bus_set_power() in the above case, and change
its name to acpi_device_set_power().

Additionally, introduce two inline wrappers for checking ACPI PM
capabilities of devices represented by struct acpi_device objects.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/bus.c      |   15 ++++++++++++---
 include/acpi/acpi_bus.h |   11 +++++++++++
 2 files changed, 23 insertions(+), 3 deletions(-)
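
A rough user-space sketch of the calling pattern this enables, for
illustration only.  The struct layout below is a cut-down stand-in for
the kernel's struct acpi_device, acpi_device_set_power() is mocked so
that it only records the requested state, and example_power_down() is a
hypothetical caller that is not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel type; only the fields used here. */
struct acpi_device {
	struct { bool power_manageable; } flags;
	struct { struct { bool valid; } flags; } wakeup;
	int state;			/* mock only: last requested D-state */
};

static inline bool acpi_device_power_manageable(struct acpi_device *adev)
{
	return adev->flags.power_manageable;
}

static inline bool acpi_device_can_wakeup(struct acpi_device *adev)
{
	return adev->wakeup.flags.valid;
}

/* Mock: the real function evaluates ACPI methods; this one records the state. */
static int acpi_device_set_power(struct acpi_device *device, int state)
{
	assert(device != NULL);
	device->state = state;
	return 0;
}

/*
 * Hypothetical caller that already holds the struct acpi_device pointer,
 * so it needs no handle lookup.  It checks the capability first, as the
 * kerneldoc comment of acpi_device_set_power() requires.
 */
static int example_power_down(struct acpi_device *adev)
{
	if (!acpi_device_power_manageable(adev))
		return 0;		/* nothing to do for this device */

	return acpi_device_set_power(adev, 3 /* stands for D3 in this mock */);
}
```
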

Index: linux/drivers/acpi/bus.c
===================================================================
--- linux.orig/drivers/acpi/bus.c
+++ linux/drivers/acpi/bus.c
@@ -257,7 +257,15 @@ static int __acpi_bus_get_power(struct a
 }
 
 
-static int __acpi_bus_set_power(struct acpi_device *device, int state)
+/**
+ * acpi_device_set_power - Set power state of an ACPI device.
+ * @device: Device to set the power state of.
+ * @state: New power state to set.
+ *
+ * Callers must ensure that the device is power manageable before using this
+ * function.
+ */
+int acpi_device_set_power(struct acpi_device *device, int state)
 {
 	int result = 0;
 	acpi_status status = AE_OK;
@@ -341,6 +349,7 @@ static int __acpi_bus_set_power(struct a
 
 	return result;
 }
+EXPORT_SYMBOL(acpi_device_set_power);
 
 
 int acpi_bus_set_power(acpi_handle handle, int state)
@@ -359,7 +368,7 @@ int acpi_bus_set_power(acpi_handle handl
 		return -ENODEV;
 	}
 
-	return __acpi_bus_set_power(device, state);
+	return acpi_device_set_power(device, state);
 }
 EXPORT_SYMBOL(acpi_bus_set_power);
 
@@ -402,7 +411,7 @@ int acpi_bus_update_power(acpi_handle ha
 	if (result)
 		return result;
 
-	result = __acpi_bus_set_power(device, state);
+	result = acpi_device_set_power(device, state);
 	if (!result && state_p)
 		*state_p = state;
 
Index: linux/include/acpi/acpi_bus.h
===================================================================
--- linux.orig/include/acpi/acpi_bus.h
+++ linux/include/acpi/acpi_bus.h
@@ -338,6 +338,7 @@ acpi_status acpi_bus_get_status_handle(a
 				       unsigned long long *sta);
 int acpi_bus_get_status(struct acpi_device *device);
 int acpi_bus_set_power(acpi_handle handle, int state);
+int acpi_device_set_power(struct acpi_device *device, int state);
 int acpi_bus_update_power(acpi_handle handle, int *state_p);
 bool acpi_bus_power_manageable(acpi_handle handle);
 bool acpi_bus_can_wakeup(acpi_handle handle);
@@ -482,6 +483,16 @@ static inline int acpi_pm_device_sleep_w
 }
 #endif
 
+static inline bool acpi_device_power_manageable(struct acpi_device *adev)
+{
+	return adev->flags.power_manageable;
+}
+
+static inline bool acpi_device_can_wakeup(struct acpi_device *adev)
+{
+	return adev->wakeup.flags.valid;
+}
+
 #else	/* CONFIG_ACPI */
 
 static inline int register_acpi_bus_type(void *bus) { return 0; }

* [PATCH 6/7] ACPI / PM: Move device PM functions related to sleep states
From: Rafael J. Wysocki @ 2012-10-29  9:12 UTC (permalink / raw)
  To: Linux PM list
  Cc: ACPI Devel Maling List, Aaron Lu, Huang Ying, LKML, Len Brown,
	Lv Zheng, Adrian Hunter
In-Reply-To: <1766582.8gdQKXoi0K@vostro.rjw.lan>

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Introduce helper function returning the target sleep state of the
system and use it to move the remaining device power management
functions from sleep.c to device_pm.c.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/device_pm.c |   54 ++++++++++++++++++++++++++++++++++++++++
 drivers/acpi/sleep.c     |   63 ++++-------------------------------------------
 include/acpi/acpi_bus.h  |    1 
 3 files changed, 61 insertions(+), 57 deletions(-)
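
The helper is a plain read-only accessor for a file-local variable.  A
minimal user-space model of that pattern follows; the state values and
mock_suspend_begin() are assumptions made up for the illustration, only
the accessor itself mirrors the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values; the kernel takes these from the ACPICA headers. */
#define ACPI_STATE_S0	0u
#define ACPI_STATE_S3	3u
#define ACPI_STATE_S5	5u

/* Stays private to its translation unit, as it does in sleep.c. */
static uint32_t acpi_target_sleep_state = ACPI_STATE_S0;

/* Read-only accessor: all that code in other files needs to see. */
uint32_t acpi_target_system_state(void)
{
	return acpi_target_sleep_state;
}

/* Hypothetical stand-in for the code path that picks the target state. */
static void mock_suspend_begin(uint32_t state)
{
	assert(state <= ACPI_STATE_S5);
	acpi_target_sleep_state = state;
}
```

Code moved out of sleep.c can then call the accessor instead of
touching the variable, which is what lets the device PM functions live
in device_pm.c.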

Index: linux/drivers/acpi/sleep.c
===================================================================
--- linux.orig/drivers/acpi/sleep.c
+++ linux/drivers/acpi/sleep.c
@@ -80,6 +80,12 @@ static int acpi_sleep_prepare(u32 acpi_s
 
 #ifdef CONFIG_ACPI_SLEEP
 static u32 acpi_target_sleep_state = ACPI_STATE_S0;
+
+u32 acpi_target_system_state(void)
+{
+	return acpi_target_sleep_state;
+}
+
 static bool pwr_btn_event_pending;
 
 /*
@@ -695,63 +701,6 @@ int acpi_suspend(u32 acpi_state)
 	return -EINVAL;
 }
 
-#ifdef CONFIG_PM
-/**
- * acpi_pm_device_sleep_state - Get preferred power state of ACPI device.
- * @dev: Device whose preferred target power state to return.
- * @d_min_p: Location to store the upper limit of the allowed states range.
- * @d_max_in: Deepest low-power state to take into consideration.
- * Return value: Preferred power state of the device on success, -ENODEV
- * (if there's no 'struct acpi_device' for @dev) or -EINVAL on failure
- *
- * The caller must ensure that @dev is valid before using this function.
- */
-int acpi_pm_device_sleep_state(struct device *dev, int *d_min_p, int d_max_in)
-{
-	acpi_handle handle = DEVICE_ACPI_HANDLE(dev);
-	struct acpi_device *adev;
-
-	if (!handle || ACPI_FAILURE(acpi_bus_get_device(handle, &adev))) {
-		dev_dbg(dev, "ACPI handle without context in %s!\n", __func__);
-		return -ENODEV;
-	}
-
-	return acpi_device_power_state(dev, adev, acpi_target_sleep_state,
-				       d_max_in, d_min_p);
-}
-EXPORT_SYMBOL(acpi_pm_device_sleep_state);
-#endif /* CONFIG_PM */
-
-#ifdef CONFIG_PM_SLEEP
-/**
- * acpi_pm_device_sleep_wake - Enable or disable device to wake up the system.
- * @dev: Device to enable/desible to wake up the system from sleep states.
- * @enable: Whether to enable or disable @dev to wake up the system.
- */
-int acpi_pm_device_sleep_wake(struct device *dev, bool enable)
-{
-	acpi_handle handle;
-	struct acpi_device *adev;
-	int error;
-
-	if (!device_can_wakeup(dev))
-		return -EINVAL;
-
-	handle = DEVICE_ACPI_HANDLE(dev);
-	if (!handle || ACPI_FAILURE(acpi_bus_get_device(handle, &adev))) {
-		dev_dbg(dev, "ACPI handle without context in %s!\n", __func__);
-		return -ENODEV;
-	}
-
-	error = __acpi_device_sleep_wake(adev, acpi_target_sleep_state, enable);
-	if (!error)
-		dev_info(dev, "System wakeup %s by ACPI\n",
-				enable ? "enabled" : "disabled");
-
-	return error;
-}
-#endif  /* CONFIG_PM_SLEEP */
-
 static void acpi_power_off_prepare(void)
 {
 	/* Prepare to power off the system */
Index: linux/include/acpi/acpi_bus.h
===================================================================
--- linux.orig/include/acpi/acpi_bus.h
+++ linux/include/acpi/acpi_bus.h
@@ -416,6 +416,7 @@ int acpi_enable_wakeup_device_power(stru
 int acpi_disable_wakeup_device_power(struct acpi_device *dev);
 
 #ifdef CONFIG_PM
+u32 acpi_target_system_state(void);
 acpi_status acpi_add_pm_notifier(struct acpi_device *adev,
 				 acpi_notify_handler handler, void *context);
 acpi_status acpi_remove_pm_notifier(struct acpi_device *adev,
Index: linux/drivers/acpi/device_pm.c
===================================================================
--- linux.orig/drivers/acpi/device_pm.c
+++ linux/drivers/acpi/device_pm.c
@@ -198,6 +198,31 @@ int acpi_device_power_state(struct devic
 }
 EXPORT_SYMBOL_GPL(acpi_device_power_state);
 
+/**
+ * acpi_pm_device_sleep_state - Get preferred power state of ACPI device.
+ * @dev: Device whose preferred target power state to return.
+ * @d_min_p: Location to store the upper limit of the allowed states range.
+ * @d_max_in: Deepest low-power state to take into consideration.
+ * Return value: Preferred power state of the device on success, -ENODEV
+ * (if there's no 'struct acpi_device' for @dev) or -EINVAL on failure
+ *
+ * The caller must ensure that @dev is valid before using this function.
+ */
+int acpi_pm_device_sleep_state(struct device *dev, int *d_min_p, int d_max_in)
+{
+	acpi_handle handle = DEVICE_ACPI_HANDLE(dev);
+	struct acpi_device *adev;
+
+	if (!handle || ACPI_FAILURE(acpi_bus_get_device(handle, &adev))) {
+		dev_dbg(dev, "ACPI handle without context in %s!\n", __func__);
+		return -ENODEV;
+	}
+
+	return acpi_device_power_state(dev, adev, acpi_target_system_state(),
+				       d_max_in, d_min_p);
+}
+EXPORT_SYMBOL(acpi_pm_device_sleep_state);
+
 #ifdef CONFIG_PM_RUNTIME
 /**
  * __acpi_device_run_wake - Enable/disable runtime remote wakeup for device.
@@ -274,4 +299,33 @@ int __acpi_device_sleep_wake(struct acpi
 		acpi_enable_wakeup_device_power(adev, target_state) :
 		acpi_disable_wakeup_device_power(adev);
 }
+
+/**
+ * acpi_pm_device_sleep_wake - Enable or disable device to wake up the system.
+ * @dev: Device to enable/disable to wake up the system from sleep states.
+ * @enable: Whether to enable or disable @dev to wake up the system.
+ */
+int acpi_pm_device_sleep_wake(struct device *dev, bool enable)
+{
+	acpi_handle handle;
+	struct acpi_device *adev;
+	int error;
+
+	if (!device_can_wakeup(dev))
+		return -EINVAL;
+
+	handle = DEVICE_ACPI_HANDLE(dev);
+	if (!handle || ACPI_FAILURE(acpi_bus_get_device(handle, &adev))) {
+		dev_dbg(dev, "ACPI handle without context in %s!\n", __func__);
+		return -ENODEV;
+	}
+
+	error = __acpi_device_sleep_wake(adev, acpi_target_system_state(),
+					 enable);
+	if (!error)
+		dev_info(dev, "System wakeup %s by ACPI\n",
+				enable ? "enabled" : "disabled");
+
+	return error;
+}
 #endif /* CONFIG_PM_SLEEP */

* [PATCH 7/7] ACPI / PM: Provide ACPI PM callback routines for subsystems
From: Rafael J. Wysocki @ 2012-10-29  9:13 UTC (permalink / raw)
  To: Linux PM list
  Cc: ACPI Devel Maling List, Aaron Lu, Huang Ying, LKML, Len Brown,
	Lv Zheng, Adrian Hunter
In-Reply-To: <1766582.8gdQKXoi0K@vostro.rjw.lan>

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Some bus types don't support power management natively, but generally
there may be device nodes in ACPI tables corresponding to the devices
whose bus types they are (under ACPI 5 those bus types may be SPI,
I2C and platform).  If that is the case, standard ACPI power
management may be applied to those devices, although currently the
kernel has no means for that.

For this reason, provide a set of routines that may be used as power
management callbacks for such devices.  This may be done in three
different ways.

 (1) Device drivers handling the devices in question may run
     acpi_dev_pm_attach() in their .probe() routines, which (on
     success) will cause the devices to be added to the general ACPI
     PM domain and ACPI power management will be used for them going
     forward.  Then, acpi_dev_pm_detach() may be used to remove the
     devices from the general ACPI PM domain if ACPI power management
     is not necessary for them any more.

 (2) The devices' subsystems may use acpi_subsys_runtime_suspend(),
     acpi_subsys_runtime_resume(), acpi_subsys_prepare(),
     acpi_subsys_suspend_late(), acpi_subsys_resume_early() as their
     power management callbacks in the same way as the general ACPI
     PM domain does that.

 (3) The devices' drivers may execute acpi_dev_suspend_late(),
     acpi_dev_resume_early(), acpi_dev_runtime_suspend(),
     acpi_dev_runtime_resume() from their power management callbacks
     as appropriate, if that's absolutely necessary, but it is not
     recommended to do that, because such drivers may not work
     without ACPI support as a result.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/device_pm.c |  314 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/acpi.h     |   34 +++++
 2 files changed, 348 insertions(+)
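
For option (1), a driver would pair the two calls in its .probe() and
.remove() routines.  The following is a user-space model of that
pairing, not kernel code: the structs are reduced to what the example
needs, has_acpi_node stands in for the ACPI handle lookup, the notifier
handling is left out, and foo_probe()/foo_remove() are hypothetical
driver callbacks.  Treating -ENODEV as non-fatal is a choice made for
this sketch, so that the same driver still probes without an ACPI node:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct dev_pm_domain { int unused; };

struct device {
	struct dev_pm_domain *pm_domain;
	bool has_acpi_node;	/* mock: stands in for the ACPI handle lookup */
};

static struct dev_pm_domain acpi_general_pm_domain;

/* Model of the attach logic: same checks and return codes as the patch. */
static int acpi_dev_pm_attach(struct device *dev)
{
	if (!dev->has_acpi_node)
		return -ENODEV;
	if (dev->pm_domain)
		return -EEXIST;

	dev->pm_domain = &acpi_general_pm_domain;
	return 0;
}

/* Model of the detach logic: only undo what attach did. */
static void acpi_dev_pm_detach(struct device *dev)
{
	if (dev->has_acpi_node && dev->pm_domain == &acpi_general_pm_domain)
		dev->pm_domain = NULL;
}

/* Hypothetical driver callbacks. */
static int foo_probe(struct device *dev)
{
	int ret;

	assert(dev != NULL);
	ret = acpi_dev_pm_attach(dev);
	if (ret && ret != -ENODEV)
		return ret;	/* e.g. -EEXIST: another PM domain is in place */

	return 0;
}

static void foo_remove(struct device *dev)
{
	acpi_dev_pm_detach(dev);
}
```
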

Index: linux/drivers/acpi/device_pm.c
===================================================================
--- linux.orig/drivers/acpi/device_pm.c
+++ linux/drivers/acpi/device_pm.c
@@ -225,6 +225,22 @@ EXPORT_SYMBOL(acpi_pm_device_sleep_state
 
 #ifdef CONFIG_PM_RUNTIME
 /**
+ * acpi_wakeup_device - Wakeup notification handler for ACPI devices.
+ * @handle: ACPI handle of the device the notification is for.
+ * @event: Type of the signaled event.
+ * @context: Device corresponding to @handle.
+ */
+static void acpi_wakeup_device(acpi_handle handle, u32 event, void *context)
+{
+	struct device *dev = context;
+
+	if (event == ACPI_NOTIFY_DEVICE_WAKE && dev) {
+		pm_wakeup_event(dev, 0);
+		pm_runtime_resume(dev);
+	}
+}
+
+/**
  * __acpi_device_run_wake - Enable/disable runtime remote wakeup for device.
  * @adev: ACPI device to enable/disable the remote wakeup for.
  * @enable: Whether to enable or disable the wakeup functionality.
@@ -329,3 +345,301 @@ int acpi_pm_device_sleep_wake(struct dev
 	return error;
 }
 #endif /* CONFIG_PM_SLEEP */
+
+/**
+ * acpi_dev_pm_get_node - Get ACPI device node for the given physical device.
+ * @dev: Device to get the ACPI node for.
+ */
+static struct acpi_device *acpi_dev_pm_get_node(struct device *dev)
+{
+	acpi_handle handle = DEVICE_ACPI_HANDLE(dev);
+	struct acpi_device *adev;
+
+	return handle && ACPI_SUCCESS(acpi_bus_get_device(handle, &adev)) ?
+		adev : NULL;
+}
+
+/**
+ * acpi_dev_pm_low_power - Put ACPI device into a low-power state.
+ * @dev: Device to put into a low-power state.
+ * @adev: ACPI device node corresponding to @dev.
+ * @system_state: System state to choose the device state for.
+ */
+static int acpi_dev_pm_low_power(struct device *dev, struct acpi_device *adev,
+				 u32 system_state)
+{
+	int power_state;
+
+	if (!acpi_device_power_manageable(adev))
+		return 0;
+
+	power_state = acpi_device_power_state(dev, adev, system_state,
+					      ACPI_STATE_D3, NULL);
+	if (power_state < ACPI_STATE_D0 || power_state > ACPI_STATE_D3)
+		return -EIO;
+
+	return acpi_device_set_power(adev, power_state);
+}
+
+/**
+ * acpi_dev_pm_full_power - Put ACPI device into the full-power state.
+ * @adev: ACPI device node to put into the full-power state.
+ */
+static int acpi_dev_pm_full_power(struct acpi_device *adev)
+{
+	return acpi_device_power_manageable(adev) ?
+		acpi_device_set_power(adev, ACPI_STATE_D0) : 0;
+}
+
+#ifdef CONFIG_PM_RUNTIME
+/**
+ * acpi_dev_runtime_suspend - Put device into a low-power state using ACPI.
+ * @dev: Device to put into a low-power state.
+ *
+ * Put the given device into a runtime low-power state using the standard ACPI
+ * mechanism.  Set up remote wakeup if desired, choose the state to put the
+ * device into (this checks if remote wakeup is expected to work too), and set
+ * the power state of the device.
+ */
+int acpi_dev_runtime_suspend(struct device *dev)
+{
+	struct acpi_device *adev = acpi_dev_pm_get_node(dev);
+	bool remote_wakeup;
+	int error;
+
+	if (!adev)
+		return 0;
+
+	remote_wakeup = dev_pm_qos_flags(dev, PM_QOS_FLAG_REMOTE_WAKEUP) >
+				PM_QOS_FLAGS_NONE;
+	error = __acpi_device_run_wake(adev, remote_wakeup);
+	if (remote_wakeup && error)
+		return -EAGAIN;
+
+	error = acpi_dev_pm_low_power(dev, adev, ACPI_STATE_S0);
+	if (error)
+		__acpi_device_run_wake(adev, false);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(acpi_dev_runtime_suspend);
+
+/**
+ * acpi_dev_runtime_resume - Put device into the full-power state using ACPI.
+ * @dev: Device to put into the full-power state.
+ *
+ * Put the given device into the full-power state using the standard ACPI
+ * mechanism at run time.  Set the power state of the device to ACPI D0 and
+ * disable remote wakeup.
+ */
+int acpi_dev_runtime_resume(struct device *dev)
+{
+	struct acpi_device *adev = acpi_dev_pm_get_node(dev);
+	int error;
+
+	if (!adev)
+		return 0;
+
+	error = acpi_dev_pm_full_power(adev);
+	__acpi_device_run_wake(adev, false);
+	return error;
+}
+EXPORT_SYMBOL_GPL(acpi_dev_runtime_resume);
+
+/**
+ * acpi_subsys_runtime_suspend - Suspend device using ACPI.
+ * @dev: Device to suspend.
+ *
+ * Carry out the generic runtime suspend procedure for @dev and use ACPI to put
+ * it into a runtime low-power state.
+ */
+int acpi_subsys_runtime_suspend(struct device *dev)
+{
+	int ret = pm_generic_runtime_suspend(dev);
+	return ret ? ret : acpi_dev_runtime_suspend(dev);
+}
+EXPORT_SYMBOL_GPL(acpi_subsys_runtime_suspend);
+
+/**
+ * acpi_subsys_runtime_resume - Resume device using ACPI.
+ * @dev: Device to Resume.
+ *
+ * Use ACPI to put the given device into the full-power state and carry out the
+ * generic runtime resume procedure for it.
+ */
+int acpi_subsys_runtime_resume(struct device *dev)
+{
+	int ret = acpi_dev_runtime_resume(dev);
+	return ret ? ret : pm_generic_runtime_resume(dev);
+}
+EXPORT_SYMBOL_GPL(acpi_subsys_runtime_resume);
+#endif /* CONFIG_PM_RUNTIME */
+
+#ifdef CONFIG_PM_SLEEP
+/**
+ * acpi_dev_suspend_late - Put device into a low-power state using ACPI.
+ * @dev: Device to put into a low-power state.
+ *
+ * Put the given device into a low-power state during system transition to a
+ * sleep state using the standard ACPI mechanism.  Set up system wakeup if
+ * desired, choose the state to put the device into (this checks if system
+ * wakeup is expected to work too), and set the power state of the device.
+ */
+int acpi_dev_suspend_late(struct device *dev)
+{
+	struct acpi_device *adev = acpi_dev_pm_get_node(dev);
+	u32 target_state;
+	bool wakeup;
+	int error;
+
+	if (!adev)
+		return 0;
+
+	target_state = acpi_target_system_state();
+	wakeup = device_may_wakeup(dev);
+	error = __acpi_device_sleep_wake(adev, target_state, wakeup);
+	if (wakeup && error)
+		return error;
+
+	error = acpi_dev_pm_low_power(dev, adev, target_state);
+	if (error)
+		__acpi_device_sleep_wake(adev, ACPI_STATE_UNKNOWN, false);
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(acpi_dev_suspend_late);
+
+/**
+ * acpi_dev_resume_early - Put device into the full-power state using ACPI.
+ * @dev: Device to put into the full-power state.
+ *
+ * Put the given device into the full-power state using the standard ACPI
+ * mechanism during system transition to the working state.  Set the power
+ * state of the device to ACPI D0 and disable remote wakeup.
+ */
+int acpi_dev_resume_early(struct device *dev)
+{
+	struct acpi_device *adev = acpi_dev_pm_get_node(dev);
+	int error;
+
+	if (!adev)
+		return 0;
+
+	error = acpi_dev_pm_full_power(adev);
+	__acpi_device_sleep_wake(adev, ACPI_STATE_UNKNOWN, false);
+	return error;
+}
+EXPORT_SYMBOL_GPL(acpi_dev_resume_early);
+
+/**
+ * acpi_subsys_prepare - Prepare device for system transition to a sleep state.
+ * @dev: Device to prepare.
+ */
+int acpi_subsys_prepare(struct device *dev)
+{
+	/*
+	 * Follow PCI and resume devices suspended at run time before running
+	 * their system suspend callbacks.
+	 */
+	pm_runtime_resume(dev);
+	return pm_generic_prepare(dev);
+}
+EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
+
+/**
+ * acpi_subsys_suspend_late - Suspend device using ACPI.
+ * @dev: Device to suspend.
+ *
+ * Carry out the generic late suspend procedure for @dev and use ACPI to put
+ * it into a low-power state during system transition into a sleep state.
+ */
+int acpi_subsys_suspend_late(struct device *dev)
+{
+	int ret = pm_generic_suspend_late(dev);
+	return ret ? ret : acpi_dev_suspend_late(dev);
+}
+EXPORT_SYMBOL_GPL(acpi_subsys_suspend_late);
+
+/**
+ * acpi_subsys_resume_early - Resume device using ACPI.
+ * @dev: Device to Resume.
+ *
+ * Use ACPI to put the given device into the full-power state and carry out the
+ * generic early resume procedure for it during system transition into the
+ * working state.
+ */
+int acpi_subsys_resume_early(struct device *dev)
+{
+	int ret = acpi_dev_resume_early(dev);
+	return ret ? ret : pm_generic_resume_early(dev);
+}
+EXPORT_SYMBOL_GPL(acpi_subsys_resume_early);
+#endif /* CONFIG_PM_SLEEP */
+
+static struct dev_pm_domain acpi_general_pm_domain = {
+	.ops = {
+#ifdef CONFIG_PM_RUNTIME
+		.runtime_suspend = acpi_subsys_runtime_suspend,
+		.runtime_resume = acpi_subsys_runtime_resume,
+		.runtime_idle = pm_generic_runtime_idle,
+#endif
+#ifdef CONFIG_PM_SLEEP
+		.prepare = acpi_subsys_prepare,
+		.suspend_late = acpi_subsys_suspend_late,
+		.resume_early = acpi_subsys_resume_early,
+		.poweroff_late = acpi_subsys_suspend_late,
+		.restore_early = acpi_subsys_resume_early,
+#endif
+	},
+};
+
+/**
+ * acpi_dev_pm_attach - Prepare device for ACPI power management.
+ * @dev: Device to prepare.
+ *
+ * If @dev has a valid ACPI handle that has a valid struct acpi_device object
+ * attached to it, install a wakeup notification handler for the device and
+ * add it to the general ACPI PM domain.
+ *
+ * This assumes that the @dev's bus type uses generic power management callbacks
+ * (or doesn't use any power management callbacks at all).
+ *
+ * Callers must ensure proper synchronization of this function with power
+ * management callbacks.
+ */
+int acpi_dev_pm_attach(struct device *dev)
+{
+	struct acpi_device *adev = acpi_dev_pm_get_node(dev);
+
+	if (!adev)
+		return -ENODEV;
+
+	if (dev->pm_domain)
+		return -EEXIST;
+
+	acpi_add_pm_notifier(adev, acpi_wakeup_device, dev);
+	dev->pm_domain = &acpi_general_pm_domain;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(acpi_dev_pm_attach);
+
+/**
+ * acpi_dev_pm_detach - Remove ACPI power management from the device.
+ * @dev: Device to take care of.
+ *
+ * Remove the device from the general ACPI PM domain and remove its wakeup
+ * notifier.
+ *
+ * Callers must ensure proper synchronization of this function with power
+ * management callbacks.
+ */
+void acpi_dev_pm_detach(struct device *dev)
+{
+	struct acpi_device *adev = acpi_dev_pm_get_node(dev);
+
+	if (adev && dev->pm_domain == &acpi_general_pm_domain) {
+		dev->pm_domain = NULL;
+		acpi_remove_pm_notifier(adev, acpi_wakeup_device);
+	}
+}
+EXPORT_SYMBOL_GPL(acpi_dev_pm_detach);
Index: linux/include/linux/acpi.h
===================================================================
--- linux.orig/include/linux/acpi.h
+++ linux/include/linux/acpi.h
@@ -434,4 +434,38 @@ acpi_status acpi_os_prepare_sleep(u8 sle
 #define acpi_os_set_prepare_sleep(func, pm1a_ctrl, pm1b_ctrl) do { } while (0)
 #endif
 
+#if defined(CONFIG_ACPI) && defined(CONFIG_PM_RUNTIME)
+int acpi_dev_runtime_suspend(struct device *dev);
+int acpi_dev_runtime_resume(struct device *dev);
+int acpi_subsys_runtime_suspend(struct device *dev);
+int acpi_subsys_runtime_resume(struct device *dev);
+#else
+static inline int acpi_dev_runtime_suspend(struct device *dev) { return 0; }
+static inline int acpi_dev_runtime_resume(struct device *dev) { return 0; }
+static inline int acpi_subsys_runtime_suspend(struct device *dev) { return 0; }
+static inline int acpi_subsys_runtime_resume(struct device *dev) { return 0; }
+#endif
+
+#ifdef CONFIG_ACPI_SLEEP
+int acpi_dev_suspend_late(struct device *dev);
+int acpi_dev_resume_early(struct device *dev);
+int acpi_subsys_prepare(struct device *dev);
+int acpi_subsys_suspend_late(struct device *dev);
+int acpi_subsys_resume_early(struct device *dev);
+#else
+static inline int acpi_dev_suspend_late(struct device *dev) { return 0; }
+static inline int acpi_dev_resume_early(struct device *dev) { return 0; }
+static inline int acpi_subsys_prepare(struct device *dev) { return 0; }
+static inline int acpi_subsys_suspend_late(struct device *dev) { return 0; }
+static inline int acpi_subsys_resume_early(struct device *dev) { return 0; }
+#endif
+
+#if defined(CONFIG_ACPI) && defined(CONFIG_PM)
+int acpi_dev_pm_attach(struct device *dev);
+void acpi_dev_pm_detach(struct device *dev);
+#else
+static inline int acpi_dev_pm_attach(struct device *dev) { return -ENODEV; }
+static inline void acpi_dev_pm_detach(struct device *dev) {}
+#endif
+
 #endif	/*_LINUX_ACPI_H*/


* Re: Question about hibernation on sparc64
From: marxdenl @ 2012-10-29  9:44 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: davem, sparclinux, linux-pm
In-Reply-To: <1862521351464447@web28d.yandex.ru>

On Mon, 2012-10-29 at 02:47 +0400, Kirill Tkhai wrote:
> 
> 28.10.2012, 19:21, "marxdenl" <marxdenl@gmail.com>:
> > On Thu, 2012-04-12 at 15:48 +0400, Kirill Tkhai wrote:
> >
> >>  Hello!
> >>
> >>  I bumped on the fact that there is no hibernation support
> >>  on sparc64. It's possible that the process of its porting
> >>  will become interesting for me in the future, but I'm not
> >>  exactly sure at the moment.
> >
> > Hi, I'm also interested in doing that. How is it going?
> >
> > Waiting for reply:)
> 
> Hi, I did nothing. You may dive into this if you want.

I enabled CONFIG_HIBERNATION and implemented 'swsusp_arch_suspend',
which saves some general registers (%g, %i, %o, %l) and calls
'swsusp_save', and 'swsusp_arch_resume', which loads 'pblist' and
restores those registers.

Now my T2 can create hibernation images, but it crashes after
restoring the image. Is there something I missed?

Thanks!

--
Linwen Deng 


* Re: [PATCH] cpuidle: add missing header include
From: Daniel Lezcano @ 2012-10-29  9:45 UTC (permalink / raw)
  To: Jingoo Han; +Cc: 'Rafael J. Wysocki', linux-pm, linux-kernel
In-Reply-To: <002101cdb572$fcc4d670$f64e8350$%han@samsung.com>

On 10/29/2012 02:16 AM, Jingoo Han wrote:
> On Monday, October 29, 2012 6:49 AM Marek Vasut wrote
>>
>> On 10/26/2012 06:30 AM, Jingoo Han wrote:
>>> This patch adds missing device.h header to fix build warnings as below:
>>>
>>> drivers/cpuidle/cpuidle.h:26:41: warning: 'struct device' declared inside parameter list [enabled by default]
>>> drivers/cpuidle/cpuidle.h:26:41: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
>>> drivers/cpuidle/cpuidle.h:27:45: warning: 'struct device' declared inside parameter list [enabled by default]
>>> In file included from drivers/cpuidle/driver.c:15:0:
>>> drivers/cpuidle/cpuidle.h:26:41: warning: 'struct device' declared inside parameter list [enabled by default]
>>> drivers/cpuidle/cpuidle.h:26:41: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
>>> drivers/cpuidle/cpuidle.h:27:45: warning: 'struct device' declared inside parameter list [enabled by default]
>>>
>>> This build warning is introduced by commit efeca1b
>>> "cpuidle / sysfs: change function parameter".
>>>
>>> Signed-off-by: Jingoo Han <jg1.han@samsung.com>
>>> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
>>> ---
>>
>> Jingoo, could you copy-pastebin your config file. I don't have this
>> warning and I would like to understand why.
> 
> Hi Daniel Lezcano,
> 
> Could you build the code by using GCC 4.6.x?
> In my opinion, it would be better.
> 
> Also, my config option is as below:
>     make exynos4_defconfig + CONFIG_CPU_IDLE
> 

Ok, I got it.

On x86, the header ioport.h gets included indirectly through other
headers, and that file contains a forward declaration of struct device.
On ARM, this is not the case.

This is why the warning appears on ARM but not on x86.
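
To illustrate the mechanism, here is a small stand-alone C example.
The names demo_add_interface() and the one-field struct are made up
for the illustration and are not the cpuidle.h declarations:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declaration.  If this line is missing and no earlier header
 * has declared the type, the prototype below declares a brand new
 * 'struct device' whose scope ends with the parameter list, which is
 * exactly what GCC warns about.
 */
struct device;

/* Prototype that only needs the incomplete type. */
int demo_add_interface(struct device *dev);

/* The full definition can come later, e.g. from another header. */
struct device {
	int id;
};

int demo_add_interface(struct device *dev)
{
	assert(dev != NULL);
	return dev->id;
}
```

The patch takes the other route and includes <linux/device.h>, which
gives cpuidle.h the declaration on every architecture.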

Thanks
  -- Daniel

>> Thanks
>>   -- Daniel
>>
>>>  drivers/cpuidle/cpuidle.h |    2 ++
>>>  1 files changed, 2 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/cpuidle/cpuidle.h b/drivers/cpuidle/cpuidle.h
>>> index a5bbd1c..2120d9e 100644
>>> --- a/drivers/cpuidle/cpuidle.h
>>> +++ b/drivers/cpuidle/cpuidle.h
>>> @@ -5,6 +5,8 @@
>>>  #ifndef __DRIVER_CPUIDLE_H
>>>  #define __DRIVER_CPUIDLE_H
>>>
>>> +#include <linux/device.h>
>>> +
>>>  /* For internal use only */
>>>  extern struct cpuidle_governor *cpuidle_curr_governor;
>>>  extern struct list_head cpuidle_governors;
>>
>>
>> --
>>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>>
>> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
>> <http://twitter.com/#!/linaroorg> Twitter |
>> <http://www.linaro.org/linaro-blog/> Blog
> 




* Re: [PATCH V2 2/6] Thermal: make sure cpufreq cooling register after cpufreq driver
From: Amit Kachhap @ 2012-10-29 11:42 UTC (permalink / raw)
  To: hongbo.zhang
  Cc: linaro-dev, linux-kernel, linux-pm, STEricsson_nomadik_linux,
	kernel, linaro-kernel, hongbo.zhang, patches
In-Reply-To: <1351079900-32236-3-git-send-email-hongbo.zhang@linaro.com>

On 24 October 2012 17:28, hongbo.zhang <hongbo.zhang@linaro.org> wrote:
> From: "hongbo.zhang" <hongbo.zhang@linaro.com>
>
> The cpufreq works as a cooling device, so the cooling layer should check if the
> cpufreq driver is initialized or not.
>
> Signed-off-by: hongbo.zhang <hongbo.zhang@linaro.com>
> ---
>  drivers/thermal/cpu_cooling.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index b6b4c2a..7519a0b 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -354,6 +354,10 @@ struct thermal_cooling_device *cpufreq_cooling_register(
>         int ret = 0, i;
>         struct cpufreq_policy policy;
>
> +       /* make sure cpufreq driver has been initialized */
> +       if (!cpufreq_frequency_get_table(cpumask_any(clip_cpus)))
> +               return ERR_PTR(-EPROBE_DEFER);
> +
Hi Hongbo,

I am not against this change, but it might cause an unnecessary delay in
the probe thread. I also thought about it but did not add this
restriction. Actually, you can put a check for this condition in
platform_bind and defer the binding until actual throttling starts.
So basically the cpufreq table is only needed once throttling starts.
(See my implementation in exynos_thermal.c.)

Thanks,
Amit Daniel
>         list_for_each_entry(cpufreq_dev, &cooling_cpufreq_list, node)
>                 cpufreq_dev_count++;
>
> --
> 1.7.11.3
>
>
> _______________________________________________
> linaro-dev mailing list
> linaro-dev@lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-dev

* Re: [PATCH V2 3/6] Thermal: fix bug of counting cpu frequencies.
From: Amit Kachhap @ 2012-10-29 11:54 UTC (permalink / raw)
  To: Hongbo Zhang, Zhang Rui
  Cc: linaro-kernel, linaro-dev, linux-pm, patches, linux-kernel,
	STEricsson_nomadik_linux, kernel, hongbo.zhang, Viresh Kumar
In-Reply-To: <CAKohpomygrbhNMJ5+P6DnppBTOSXB8GFeQsbjkUZ61Ou6bzaPQ@mail.gmail.com>

On 24 October 2012 19:04, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 24 October 2012 17:28, hongbo.zhang <hongbo.zhang@linaro.org> wrote:
>> From: "hongbo.zhang" <hongbo.zhang@linaro.com>
>>
>> In the while loop for counting cpu frequencies, if table[i].frequency equals
>> CPUFREQ_ENTRY_INVALID, index i won't be increased, so this leads to an endless
>> loop, what's more the index i cannot be referred as cpu frequencies number if
>> there is CPUFREQ_ENTRY_INVALID case.
>>
>> Signed-off-by: hongbo.zhang <hongbo.zhang@linaro.com>
>
> Good one.
>
> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Changes look fine. Adding the thermal maintainer (Rui Zhang) to the mail list.
Reviewed-by: Amit Daniel Kachhap <amit.kachhap@linaro.org>

Thanks,
Amit Daniel
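
For reference, the counting pattern described in the commit log can be
sketched in stand-alone C as below.  This is an illustration and not
the code from the patch: the struct is a cut-down copy, the two marker
values are assumed to match the kernel's definitions, and
count_valid_freqs() is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Marker values assumed for the sketch. */
#define CPUFREQ_ENTRY_INVALID	(~0u)
#define CPUFREQ_TABLE_END	(~1u)

struct cpufreq_frequency_table {
	unsigned int index;
	unsigned int frequency;
};

/*
 * Walk the table with one index and count valid entries with another.
 * The for loop advances i even when an entry is skipped, so an invalid
 * entry can neither stall the loop nor be counted as a frequency.
 */
static unsigned int count_valid_freqs(const struct cpufreq_frequency_table *table)
{
	unsigned int i, count = 0;

	assert(table != NULL);
	for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
		if (table[i].frequency == CPUFREQ_ENTRY_INVALID)
			continue;
		count++;
	}

	return count;
}
```
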

* [PATCH v3 0/6] solve deadlock caused by memory allocation with I/O
From: Ming Lei @ 2012-10-29 12:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alan Stern, Oliver Neukum, Minchan Kim, Greg Kroah-Hartman,
	Rafael J. Wysocki, Jens Axboe, David S. Miller, Andrew Morton,
	netdev, linux-usb, linux-pm, linux-mm

This patchset tries to solve a deadlock problem which might be caused
by memory allocation with block I/O during runtime resume and in the block
device error handling path. Traditionally, the problem is addressed by
passing GFP_NOIO statically to mm, but that is not an effective solution;
see the detailed description in patch 1's commit log.

This patch set introduces one process flag and tries to fix the deadlock
problem on block devices/network devices during runtime resume or usb bus
reset.

The 1st patch is the change on include/linux/sched.h and mm.

The 2nd patch introduces the flag of memalloc_noio_resume on 'dev_pm_info',
and pm_runtime_set_memalloc_noio(), so that PM Core can teach mm to not
allocate memory with GFP_IOFS during the runtime_resume callback only on
devices with the flag set.

The following 2 patches apply the introduced pm_runtime_set_memalloc_noio()
to mark all devices as memalloc_noio_resume in the path from the block or
network device to the root device in device tree.

The last 2 patches are applied against the PM and USB subsystems to
demonstrate how to use the introduced mechanism to fix the deadlock problem.

V3:
	- patch 2/6 and 5/6 changed, see their commit log
	- remove RFC from title since several guys have expressed that
	it is a reasonable solution
V2:
        - remove changes on 'may_writepage' and 'may_swap'(1/6)
        - unset GFP_IOFS in try_to_free_pages() path(1/6)
        - introduce pm_runtime_set_memalloc_noio()
        - only apply the mechanism on block/network device and its ancestors
        for runtime resume context
V1:
        - take Minchan's change to avoid the check in alloc_page hot path
        - change the helpers' style into save/restore as suggested by Alan
        - memory allocation with no io in usb bus reset path for all devices
        as suggested by Greg and Oliver

 block/genhd.c                |    8 ++++
 drivers/base/power/runtime.c |   88 +++++++++++++++++++++++++++++++++++++++++-
 drivers/usb/core/hub.c       |   15 +++++++
 include/linux/pm.h           |    1 +
 include/linux/pm_runtime.h   |    5 +++
 include/linux/sched.h        |   10 +++++
 mm/page_alloc.c              |   10 ++++-
 mm/vmscan.c                  |   12 ++++++
 net/core/net-sysfs.c         |    5 +++
 9 files changed, 152 insertions(+), 2 deletions(-)


Thanks,
--
Ming Lei

^ permalink raw reply

* [PATCH v3 1/6] mm: teach mm by current context info to not do I/O during memory allocation
From: Ming Lei @ 2012-10-29 12:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alan Stern, Oliver Neukum, Minchan Kim, Greg Kroah-Hartman,
	Rafael J. Wysocki, Jens Axboe, David S. Miller, Andrew Morton,
	netdev, linux-usb, linux-pm, linux-mm, Ming Lei, Jiri Kosina,
	Mel Gorman, KAMEZAWA Hiroyuki, Michal Hocko, Ingo Molnar,
	Peter Zijlstra
In-Reply-To: <1351513440-9286-1-git-send-email-ming.lei@canonical.com>

This patch introduces PF_MEMALLOC_NOIO as a process flag (in the 'flags'
field of 'struct task_struct'), so that the flag can be set by a task
to avoid doing I/O inside memory allocation in the task's context.

The patch tries to solve a deadlock problem caused by block devices,
and the problem may happen at least in the situations below:

- during block device runtime resume, if memory allocation with
GFP_KERNEL is called inside the runtime resume callback of any one
of its ancestors (or the block device itself), the deadlock may be
triggered inside the memory allocation since it might not complete
until the block device becomes active and the involved page I/O finishes.
The situation was first pointed out by Alan Stern. It is not a good
approach to convert all GFP_KERNEL[1] in the path into GFP_NOIO because
several subsystems may be involved (for example, PCI, USB and SCSI may
be involved for a usb mass storage device, and network devices are
involved too in the iSCSI case)

- during error handling of a usb mass storage device, a USB bus reset
will be put on the device, so there shouldn't be any
memory allocation with GFP_KERNEL during USB bus reset, otherwise
a deadlock similar to the above may be triggered. Unfortunately, any
usb device may include one mass storage interface in theory, so it
requires all usb interface drivers to handle the situation. In fact,
most usb drivers don't know how to handle bus reset on the device
and don't provide .pre_reset() and .post_reset() callbacks at all, so
USB core has to unbind and bind the driver for these devices. So it
is still not practical to resort to GFP_NOIO for solving the problem.

Also the introduced solution can be used by block subsystem or block
drivers too, for example, set the PF_MEMALLOC_NOIO flag before doing
actual I/O transfer.

It is not a good idea to convert all these GFP_KERNEL in the
affected path into GFP_NOIO because the functions doing those allocations
may be implemented as library code and will be called in many other contexts.

In fact, with memalloc_noio() some of the current static GFP_NOIO
allocations can be converted back into GFP_KERNEL in other non-affected
contexts; at least almost all GFP_NOIO in the USB subsystem can be converted
into GFP_KERNEL after applying the approach, so that allocation without I/O
generally only happens in runtime resume/bus reset/block I/O transfer
contexts.
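
As a rough illustration of the save/restore style described here, the following is a userspace sketch only: task_flags stands in for current->flags, and inner_section() and nested_keeps_outer_state() are hypothetical helpers. It shows why the helpers save the previous state instead of simply setting and clearing the bit: a nested section must not drop the flag that an outer section still relies on.

```c
#include <assert.h>

#define PF_MEMALLOC_NOIO 0x00080000u

/* Stand-in for current->flags; the kernel keeps this per task. */
static unsigned int task_flags;

#define memalloc_noio() (task_flags & PF_MEMALLOC_NOIO)
#define memalloc_noio_save(flag) do { \
	(flag) = task_flags & PF_MEMALLOC_NOIO; \
	task_flags |= PF_MEMALLOC_NOIO; \
} while (0)
#define memalloc_noio_restore(flag) do { \
	task_flags = (task_flags & ~PF_MEMALLOC_NOIO) | (flag); \
} while (0)

/* Inner section: saves whatever state the caller left behind. */
static int inner_section(void)
{
	unsigned int saved;
	int seen;

	memalloc_noio_save(saved);
	seen = memalloc_noio() != 0;
	memalloc_noio_restore(saved);
	return seen;
}

/*
 * Outer section nesting inner_section(). Returns 1 if the flag is
 * still set after the inner restore, i.e. the inner section did not
 * clear the outer section's state.
 */
static int nested_keeps_outer_state(void)
{
	unsigned int saved;
	int still_set;

	memalloc_noio_save(saved);
	inner_section();
	still_set = memalloc_noio() != 0;
	memalloc_noio_restore(saved);
	return still_set;
}
```

After either function returns, task_flags is back to the value it had on entry.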

[1], several GFP_KERNEL allocation examples in runtime resume path

- pci subsystem
acpi_os_allocate
	<-acpi_ut_allocate
		<-ACPI_ALLOCATE_ZEROED
			<-acpi_evaluate_object
				<-__acpi_bus_set_power
					<-acpi_bus_set_power
						<-acpi_pci_set_power_state
							<-platform_pci_set_power_state
								<-pci_platform_power_transition
									<-__pci_complete_power_transition
										<-pci_set_power_state
											<-pci_restore_standard_config
												<-pci_pm_runtime_resume
- usb subsystem
usb_get_status
	<-finish_port_resume
		<-usb_port_resume
			<-generic_resume
				<-usb_resume_device
					<-usb_resume_both
						<-usb_runtime_resume

- some individual usb drivers
usblp, uvc, gspca, most of dvb-usb-v2 media drivers, cpia2, az6007, ....

That is just what I have found.  Unfortunately, such allocations can
only be found by manual inspection for now, and there are likely many
more not yet found, since any function in the resume path (call tree)
may allocate memory with GFP_KERNEL.

Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Jiri Kosina <jiri.kosina@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>

---
v3:
	- no change
v2:
        - remove changes on 'may_writepage' and 'may_swap' because that
          isn't related to the patchset, and can't introduce I/O in the
          allocation path if GFP_IOFS is unset, so handling 'may_swap'
          and 'may_writepage' on GFP_NOIO or GFP_NOFS should be a
          mm internal thing, and let mm guys deal with that, :-).

          It looks like clearing the two may_XXX flags only excludes dirty
          pages and anon pages from reclaiming, and the behaviour should be
          decided by GFP FLAG, IMO.

        - unset GFP_IOFS in try_to_free_pages() path since
          alloc_page_buffers()
          and dma_alloc_from_contiguous may drop into the path, as
          pointed by KAMEZAWA Hiroyuki
v1:
        - take Minchan's change to avoid the check in alloc_page hot
          path

        - change the helpers' style into save/restore as suggested by
          Alan Stern
---
 include/linux/sched.h |   10 ++++++++++
 mm/page_alloc.c       |   10 +++++++++-
 mm/vmscan.c           |   12 ++++++++++++
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index fb27acd..283fe86 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1805,6 +1805,7 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define PF_FROZEN	0x00010000	/* frozen for system suspend */
 #define PF_FSTRANS	0x00020000	/* inside a filesystem transaction */
 #define PF_KSWAPD	0x00040000	/* I am kswapd */
+#define PF_MEMALLOC_NOIO 0x00080000	/* Allocating memory without IO involved */
 #define PF_LESS_THROTTLE 0x00100000	/* Throttle me less: I clean memory */
 #define PF_KTHREAD	0x00200000	/* I am a kernel thread */
 #define PF_RANDOMIZE	0x00400000	/* randomize virtual address space */
@@ -1842,6 +1843,15 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define tsk_used_math(p) ((p)->flags & PF_USED_MATH)
 #define used_math() tsk_used_math(current)
 
+#define memalloc_noio() (current->flags & PF_MEMALLOC_NOIO)
+#define memalloc_noio_save(flag) do { \
+	(flag) = current->flags & PF_MEMALLOC_NOIO; \
+	current->flags |= PF_MEMALLOC_NOIO; \
+} while (0)
+#define memalloc_noio_restore(flag) do { \
+	current->flags = (current->flags & ~PF_MEMALLOC_NOIO) | (flag); \
+} while (0)
+
 /*
  * task->jobctl flags
  */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 45c916b..548d41c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2634,10 +2634,18 @@ retry_cpuset:
 	page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
 			zonelist, high_zoneidx, alloc_flags,
 			preferred_zone, migratetype);
-	if (unlikely(!page))
+	if (unlikely(!page)) {
+		/*
+		 * Resume, block IO and its error handling path
+		 * can deadlock because I/O on the device might not
+		 * complete.
+		 */
+		if (unlikely(memalloc_noio()))
+			gfp_mask &= ~GFP_IOFS;
 		page = __alloc_pages_slowpath(gfp_mask, order,
 				zonelist, high_zoneidx, nodemask,
 				preferred_zone, migratetype);
+	}
 
 	trace_mm_page_alloc(page, order, gfp_mask, migratetype);
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 10090c8..035088a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2304,6 +2304,12 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		.gfp_mask = sc.gfp_mask,
 	};
 
+	if (unlikely(memalloc_noio())) {
+		gfp_mask &= ~GFP_IOFS;
+		sc.gfp_mask = gfp_mask;
+		shrink.gfp_mask = sc.gfp_mask;
+	}
+
 	throttle_direct_reclaim(gfp_mask, zonelist, nodemask);
 
 	/*
@@ -3304,6 +3310,12 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 	};
 	unsigned long nr_slab_pages0, nr_slab_pages1;
 
+	if (unlikely(memalloc_noio())) {
+		gfp_mask &= ~GFP_IOFS;
+		sc.gfp_mask = gfp_mask;
+		shrink.gfp_mask = sc.gfp_mask;
+	}
+
 	cond_resched();
 	/*
 	 * We need to be able to allocate from the reserves for RECLAIM_SWAP
-- 
1.7.9.5
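
To make the effect of the 'gfp_mask &= ~GFP_IOFS' lines above concrete, here is a small userspace sketch. The MODEL_* bit values are illustrative assumptions (the real definitions live in <linux/gfp.h>), and effective_gfp() is a hypothetical helper rather than anything in the patch. The point it shows is that once the I/O and FS bits are cleared, a GFP_KERNEL request behaves like a GFP_NOIO one, while a request that was already GFP_NOIO is unchanged.

```c
#include <assert.h>

/* Illustrative bit values only; see <linux/gfp.h> for the real ones. */
#define MODEL_GFP_WAIT   0x10u
#define MODEL_GFP_IO     0x40u
#define MODEL_GFP_FS     0x80u

#define MODEL_GFP_IOFS   (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOIO   (MODEL_GFP_WAIT)

/*
 * Return the mask the allocator would act on: if the calling task is
 * in a no-I/O section, strip the bits that permit I/O and filesystem
 * activity; otherwise pass the caller's mask through untouched.
 */
static unsigned int effective_gfp(unsigned int gfp_mask, int task_noio)
{
	if (task_noio)
		gfp_mask &= ~MODEL_GFP_IOFS;
	return gfp_mask;
}
```

This is what lets callers deep in the call tree keep passing GFP_KERNEL: the restriction comes from the task context, not from each call site.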

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related

* [PATCH v3 2/6] PM / Runtime: introduce pm_runtime_set[get]_memalloc_noio()
From: Ming Lei @ 2012-10-29 12:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alan Stern, Oliver Neukum, Minchan Kim, Greg Kroah-Hartman,
	Rafael J. Wysocki, Jens Axboe, David S. Miller, Andrew Morton,
	netdev, linux-usb, linux-pm, linux-mm, Ming Lei
In-Reply-To: <1351513440-9286-1-git-send-email-ming.lei@canonical.com>

This patch introduces the memalloc_noio_resume flag in
'struct dev_pm_info' to help the PM core teach mm not to allocate
memory with the GFP_KERNEL flag, in order to avoid a probable deadlock
problem.

As explained in the comment, any GFP_KERNEL allocation inside
runtime_resume on any one of the devices in the path from one block
or network device to the root device in the device tree may cause
deadlock, so the introduced pm_runtime_set_memalloc_noio() sets or
clears the flag on the devices in that path, walking up from the device
toward the root.

This patch also introduces pm_runtime_get_memalloc_noio() because
the flag may be accessed in block device's error handling path
(for example, usb device reset)

Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
v3:
	- introduce pm_runtime_get_memalloc_noio()
	- hold one global lock on pm_runtime_set_memalloc_noio
	- hold device power lock when accessing memalloc_noio_resume
	  flag suggested by Alan Stern
	- implement pm_runtime_set_memalloc_noio without recursion
	  suggested by Alan Stern
v2:
	- introduce pm_runtime_set_memalloc_noio()
---
 drivers/base/power/runtime.c |   72 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm.h           |    1 +
 include/linux/pm_runtime.h   |    5 +++
 3 files changed, 78 insertions(+)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 3148b10..9fa6ea7 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -124,6 +124,78 @@ unsigned long pm_runtime_autosuspend_expiration(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(pm_runtime_autosuspend_expiration);
 
+/*
+ * pm_runtime_get_memalloc_noio - Get a device's memalloc_noio flag.
+ * @dev: Device to handle.
+ *
+ * Return the device's memalloc_noio flag.
+ *
+ * The device power lock is held because bitfield is not SMP-safe.
+ */
+bool pm_runtime_get_memalloc_noio(struct device *dev)
+{
+	bool ret;
+	spin_lock_irq(&dev->power.lock);
+	ret = dev->power.memalloc_noio_resume;
+	spin_unlock_irq(&dev->power.lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_get_memalloc_noio);
+
+static int dev_memalloc_noio(struct device *dev, void *data)
+{
+	return pm_runtime_get_memalloc_noio(dev);
+}
+
+/*
+ * pm_runtime_set_memalloc_noio - Set a device's memalloc_noio flag.
+ * @dev: Device to handle.
+ * @enable: True for setting the flag and False for clearing the flag.
+ *
+ * Set the flag for all devices in the path from the device to the
+ * root device in the device tree if @enable is true, otherwise clear
+ * the flag for devices in the path whose siblings don't set the flag.
+ *
+ * The function should only be called by block device, or network
+ * device driver for solving the deadlock problem during runtime
+ * resume:
+ * 	if memory allocation with GFP_KERNEL is called inside runtime
+ * 	resume callback of any one of its ancestors(or the block device
+ * 	itself), the deadlock may be triggered inside the memory
+ * 	allocation since it might not complete until the block device
+ * 	becomes active and the involved page I/O finishes. The situation
+ * 	is pointed out first by Alan Stern. Network devices are involved
+ * 	in iSCSI kind of situation.
+ *
+ * The lock of dev_hotplug_mutex is held in the function for handling
+ * hotplug race because pm_runtime_set_memalloc_noio() may be called
+ * in async probe().
+ */
+void pm_runtime_set_memalloc_noio(struct device *dev, bool enable)
+{
+	static DEFINE_MUTEX(dev_hotplug_mutex);
+
+	mutex_lock(&dev_hotplug_mutex);
+	while (dev) {
+		/* hold power lock since bitfield is not SMP-safe. */
+		spin_lock_irq(&dev->power.lock);
+		dev->power.memalloc_noio_resume = enable;
+		spin_unlock_irq(&dev->power.lock);
+
+		dev = dev->parent;
+
+		/* only clear the flag for one device if all
+		 * children of the device don't set the flag.
+		 */
+		if (!dev || (!enable &&
+			     device_for_each_child(dev, NULL,
+						   dev_memalloc_noio)))
+			break;
+	}
+	mutex_unlock(&dev_hotplug_mutex);
+}
+EXPORT_SYMBOL_GPL(pm_runtime_set_memalloc_noio);
+
 /**
  * rpm_check_suspend_allowed - Test whether a device may be suspended.
  * @dev: Device to test.
diff --git a/include/linux/pm.h b/include/linux/pm.h
index 03d7bb1..d104579 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -538,6 +538,7 @@ struct dev_pm_info {
 	unsigned int		irq_safe:1;
 	unsigned int		use_autosuspend:1;
 	unsigned int		timer_autosuspends:1;
+	unsigned int		memalloc_noio_resume:1;
 	enum rpm_request	request;
 	enum rpm_status		runtime_status;
 	int			runtime_error;
diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
index f271860..b522b09 100644
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -47,6 +47,8 @@ extern void pm_runtime_set_autosuspend_delay(struct device *dev, int delay);
 extern unsigned long pm_runtime_autosuspend_expiration(struct device *dev);
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
+extern bool pm_runtime_get_memalloc_noio(struct device *dev);
+extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
 
 static inline bool pm_children_suspended(struct device *dev)
 {
@@ -149,6 +151,9 @@ static inline void pm_runtime_set_autosuspend_delay(struct device *dev,
 						int delay) {}
 static inline unsigned long pm_runtime_autosuspend_expiration(
 				struct device *dev) { return 0; }
+static inline bool pm_runtime_get_memalloc_noio(struct device *dev) { return false; }
+static inline void pm_runtime_set_memalloc_noio(struct device *dev,
+						bool enable){}
 
 #endif /* !CONFIG_PM_RUNTIME */
 
-- 
1.7.9.5
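
The set/clear walk in pm_runtime_set_memalloc_noio() can be tried out with a small userspace model. This is a sketch under stated assumptions: struct model_dev, the four-device topology, any_child_noio() and set_noio() are all hypothetical stand-ins, the locking is omitted, and the child lookup scans a fixed array where the patch uses device_for_each_child(). The loop has the same shape as the patch: set or clear the flag, step to the parent, and when clearing, stop at the first ancestor that still has a flagged child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for struct device: a parent link and the flag. */
struct model_dev {
	struct model_dev *parent;
	bool noio;
};

/* Hypothetical topology: root <- hub <- { disk_a, disk_b }. */
static struct model_dev root   = { NULL,  false };
static struct model_dev hub    = { &root, false };
static struct model_dev disk_a = { &hub,  false };
static struct model_dev disk_b = { &hub,  false };

static struct model_dev *all_devs[] = { &root, &hub, &disk_a, &disk_b };

/* Stand-in for the child iteration: does any child carry the flag? */
static bool any_child_noio(const struct model_dev *dev)
{
	size_t i;

	for (i = 0; i < sizeof(all_devs) / sizeof(all_devs[0]); i++)
		if (all_devs[i]->parent == dev && all_devs[i]->noio)
			return true;
	return false;
}

/* Same loop shape as the patch, without the mutex and spinlock. */
static void set_noio(struct model_dev *dev, bool enable)
{
	while (dev) {
		dev->noio = enable;
		dev = dev->parent;
		/* When clearing, keep ancestors another child still needs. */
		if (!dev || (!enable && any_child_noio(dev)))
			break;
	}
}
```

With both disks flagged, clearing disk_a leaves hub and root flagged because disk_b still depends on them; clearing disk_b afterwards clears the whole path.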


^ permalink raw reply related

* [PATCH v3 3/6] block/genhd.c: apply pm_runtime_set_memalloc_noio on block devices
From: Ming Lei @ 2012-10-29 12:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alan Stern, Oliver Neukum, Minchan Kim, Greg Kroah-Hartman,
	Rafael J. Wysocki, Jens Axboe, David S. Miller, Andrew Morton,
	netdev, linux-usb, linux-pm, linux-mm, Ming Lei
In-Reply-To: <1351513440-9286-1-git-send-email-ming.lei@canonical.com>

This patch applies the introduced pm_runtime_set_memalloc_noio() to
block devices so that the PM core will teach mm to not allocate memory with
GFP_IOFS when calling the runtime_resume callback for block devices.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block/genhd.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/genhd.c b/block/genhd.c
index 9e02cd6..c5f10ea 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -18,6 +18,7 @@
 #include <linux/mutex.h>
 #include <linux/idr.h>
 #include <linux/log2.h>
+#include <linux/pm_runtime.h>
 
 #include "blk.h"
 
@@ -519,6 +520,12 @@ static void register_disk(struct gendisk *disk)
 
 	dev_set_name(ddev, disk->disk_name);
 
+	/* avoid probable deadlock caused by allocating memory with
+	 * GFP_KERNEL in runtime_resume callback of all its ancestor
+	 * devices
+	 */
+	pm_runtime_set_memalloc_noio(ddev, true);
+
 	/* delay uevents, until we scanned partition table */
 	dev_set_uevent_suppress(ddev, 1);
 
@@ -661,6 +668,7 @@ void del_gendisk(struct gendisk *disk)
 	disk->driverfs_dev = NULL;
 	if (!sysfs_deprecated)
 		sysfs_remove_link(block_depr, dev_name(disk_to_dev(disk)));
+	pm_runtime_set_memalloc_noio(disk_to_dev(disk), false);
 	device_del(disk_to_dev(disk));
 }
 EXPORT_SYMBOL(del_gendisk);
-- 
1.7.9.5


^ permalink raw reply related

* [PATCH v3 5/6] PM / Runtime: force memory allocation with no I/O during runtime_resume callback
From: Ming Lei @ 2012-10-29 12:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alan Stern, Oliver Neukum, Minchan Kim, Greg Kroah-Hartman,
	Rafael J. Wysocki, Jens Axboe, David S. Miller, Andrew Morton,
	netdev, linux-usb, linux-pm, linux-mm, Ming Lei
In-Reply-To: <1351513440-9286-1-git-send-email-ming.lei@canonical.com>

This patch applies the introduced memalloc_noio_save() and
memalloc_noio_restore() to force memory allocation with no I/O
during the runtime_resume callback on devices which are marked as
memalloc_noio_resume.

Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 drivers/base/power/runtime.c |   16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 9fa6ea7..c9e26b9 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -575,6 +575,7 @@ static int rpm_resume(struct device *dev, int rpmflags)
 	int (*callback)(struct device *);
 	struct device *parent = NULL;
 	int retval = 0;
+	unsigned int noio_flag;
 
 	trace_rpm_resume(dev, rpmflags);
 
@@ -724,7 +725,20 @@ static int rpm_resume(struct device *dev, int rpmflags)
 	if (!callback && dev->driver && dev->driver->pm)
 		callback = dev->driver->pm->runtime_resume;
 
-	retval = rpm_callback(callback, dev);
+	/*
+	 * Deadlock might be caused if memory allocation with GFP_KERNEL
+	 * happens inside runtime_resume callback of one block device's
+	 * ancestor or the block device itself. A network device might be
+	 * thought of as part of an iSCSI block device, so network devices
+	 * and their ancestors should be marked as memalloc_noio_resume.
+	 */
+	if (dev->power.memalloc_noio_resume) {
+		memalloc_noio_save(noio_flag);
+		retval = rpm_callback(callback, dev);
+		memalloc_noio_restore(noio_flag);
+	} else {
+		retval = rpm_callback(callback, dev);
+	}
 	if (retval) {
 		__update_runtime_status(dev, RPM_SUSPENDED);
 		pm_runtime_cancel_pending(dev);
-- 
1.7.9.5
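
A userspace sketch of the wrapping done in rpm_resume() above, assuming a simplified device and a callback that merely records what it observes. All names here (struct model_dev, model_resume_cb(), model_rpm_resume(), task_flags) are hypothetical stand-ins, not kernel API. It shows that only a device marked memalloc_noio_resume runs its callback with the task flag set, and that the flag is back to its previous value once the callback returns.

```c
#include <assert.h>
#include <stdbool.h>

#define PF_MEMALLOC_NOIO 0x00080000u

/* Stand-in for current->flags. */
static unsigned int task_flags;

struct model_dev {
	bool memalloc_noio_resume;
	bool callback_saw_noio;	/* recorded by the callback below */
};

/* Hypothetical runtime_resume callback: records what it observes. */
static int model_resume_cb(struct model_dev *dev)
{
	dev->callback_saw_noio = (task_flags & PF_MEMALLOC_NOIO) != 0;
	return 0;
}

/* Wrap the callback in save/restore only for marked devices. */
static int model_rpm_resume(struct model_dev *dev)
{
	unsigned int saved;
	int ret;

	if (!dev->memalloc_noio_resume)
		return model_resume_cb(dev);

	saved = task_flags & PF_MEMALLOC_NOIO;
	task_flags |= PF_MEMALLOC_NOIO;
	ret = model_resume_cb(dev);
	task_flags = (task_flags & ~PF_MEMALLOC_NOIO) | saved;
	return ret;
}
```

Devices outside any block or network device's ancestry keep their normal GFP_KERNEL behaviour, which is the reason for making the wrap conditional.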


^ permalink raw reply related

* [PATCH v3 6/6] USB: forbid memory allocation with I/O during bus reset
From: Ming Lei @ 2012-10-29 12:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alan Stern, Oliver Neukum, Minchan Kim, Greg Kroah-Hartman,
	Rafael J. Wysocki, Jens Axboe, David S. Miller, Andrew Morton,
	netdev, linux-usb, linux-pm, linux-mm, Ming Lei
In-Reply-To: <1351513440-9286-1-git-send-email-ming.lei@canonical.com>

If a storage interface or usb network interface (iSCSI case)
exists in the current configuration, memory allocation with
GFP_KERNEL during usb_reset_device() might trigger I/O transfer
on the storage interface itself and cause deadlock, because
'us->dev_mutex' is held in .pre_reset() and the storage
interface can't do I/O transfer when the reset is triggered
by another interface, or the error handling can't be completed
if the reset is triggered by the storage itself (error handling path).

Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
v3:
	- check usbnet device or usb mass storage device by
	'dev->power.memalloc_noio_resume' as suggested by Alan Stern
---
 drivers/usb/core/hub.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 5b131b6..5aea807 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -5044,6 +5044,8 @@ int usb_reset_device(struct usb_device *udev)
 {
 	int ret;
 	int i;
+	unsigned int uninitialized_var(noio_flag);
+	bool noio_set = false;
 	struct usb_host_config *config = udev->actconfig;
 
 	if (udev->state == USB_STATE_NOTATTACHED ||
@@ -5053,6 +5055,17 @@ int usb_reset_device(struct usb_device *udev)
 		return -EINVAL;
 	}
 
+	/*
+	 * Don't allocate memory with GFP_KERNEL in current
+	 * context to avoid possible deadlock if usb mass
+	 * storage interface or usbnet interface(iSCSI case)
+	 * is included in current configuration.
+	 */
+	if (pm_runtime_get_memalloc_noio(&udev->dev)) {
+		memalloc_noio_save(noio_flag);
+		noio_set = true;
+	}
+
 	/* Prevent autosuspend during the reset */
 	usb_autoresume_device(udev);
 
@@ -5097,6 +5110,8 @@ int usb_reset_device(struct usb_device *udev)
 	}
 
 	usb_autosuspend_device(udev);
+	if (noio_set)
+		memalloc_noio_restore(noio_flag);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(usb_reset_device);
-- 
1.7.9.5


^ permalink raw reply related

* [PATCH v3 4/6] net/core: apply pm_runtime_set_memalloc_noio on network devices
From: Ming Lei @ 2012-10-29 12:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alan Stern, Oliver Neukum, Minchan Kim, Greg Kroah-Hartman,
	Rafael J. Wysocki, Jens Axboe, David S. Miller, Andrew Morton,
	netdev, linux-usb, linux-pm, linux-mm, Ming Lei, Eric Dumazet,
	David Decotigny, Tom Herbert, Ingo Molnar
In-Reply-To: <1351513440-9286-1-git-send-email-ming.lei@canonical.com>

Deadlock might be caused by allocating memory with GFP_KERNEL in the
runtime_resume callback of network devices in the iSCSI situation, so
mark network devices and their ancestors as 'memalloc_noio_resume'
with the introduced pm_runtime_set_memalloc_noio().

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Decotigny <david.decotigny@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 net/core/net-sysfs.c |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index bcf02f6..9aba5be 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -22,6 +22,7 @@
 #include <linux/vmalloc.h>
 #include <linux/export.h>
 #include <linux/jiffies.h>
+#include <linux/pm_runtime.h>
 #include <net/wext.h>
 
 #include "net-sysfs.h"
@@ -1386,6 +1387,8 @@ void netdev_unregister_kobject(struct net_device * net)
 
 	remove_queue_kobjects(net);
 
+	pm_runtime_set_memalloc_noio(dev, false);
+
 	device_del(dev);
 }
 
@@ -1411,6 +1414,8 @@ int netdev_register_kobject(struct net_device *net)
 	*groups++ = &netstat_group;
 #endif /* CONFIG_SYSFS */
 
+	pm_runtime_set_memalloc_noio(dev, true);
+
 	error = device_add(dev);
 	if (error)
 		return error;
-- 
1.7.9.5


^ permalink raw reply related

* [linux-next PATCH] PM / devfreq: documentation cleanups for devfreq header
From: Nishanth Menon @ 2012-10-29 13:02 UTC (permalink / raw)
  To: linux-pm
  Cc: Nishanth Menon, Rajagopal Venkat, MyungJoo Ham, Kyungmin Park,
	Rafael J. Wysocki, Kevin Hilman, linux-kernel

struct parameters need to have ':' in documentation for
scripts/kernel-doc to parse appropriately.

Fix the errors reported by:
./scripts/kernel-doc include/linux/devfreq.h >/dev/null

Cc: Rajagopal Venkat <rajagopal.venkat@linaro.org>
Cc: MyungJoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Kevin Hilman <khilman@ti.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Signed-off-by: Nishanth Menon <nm@ti.com>
---
Applies on
linux-next                e083feb Merge branch 'acpi-next' into linux-next
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git

 include/linux/devfreq.h |   54 +++++++++++++++++++++++------------------------
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 7e2e2ea..1461fb2 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -25,12 +25,12 @@ struct devfreq;
  * struct devfreq_dev_status - Data given from devfreq user device to
  *			     governors. Represents the performance
  *			     statistics.
- * @total_time		The total time represented by this instance of
+ * @total_time:		The total time represented by this instance of
  *			devfreq_dev_status
- * @busy_time		The time that the device was working among the
+ * @busy_time:		The time that the device was working among the
  *			total_time.
- * @current_frequency	The operating frequency.
- * @private_data	An entry not specified by the devfreq framework.
+ * @current_frequency:	The operating frequency.
+ * @private_data:	An entry not specified by the devfreq framework.
  *			A device and a specific governor may have their
  *			own protocol with private_data. However, because
  *			this is governor-specific, a governor using this
@@ -54,21 +54,21 @@ struct devfreq_dev_status {
 
 /**
  * struct devfreq_dev_profile - Devfreq's user device profile
- * @initial_freq	The operating frequency when devfreq_add_device() is
+ * @initial_freq:	The operating frequency when devfreq_add_device() is
  *			called.
- * @polling_ms		The polling interval in ms. 0 disables polling.
- * @target		The device should set its operating frequency at
+ * @polling_ms:		The polling interval in ms. 0 disables polling.
+ * @target:		The device should set its operating frequency at
  *			freq or lowest-upper-than-freq value. If freq is
  *			higher than any operable frequency, set maximum.
  *			Before returning, target function should set
  *			freq at the current frequency.
  *			The "flags" parameter's possible values are
  *			explained above with "DEVFREQ_FLAG_*" macros.
- * @get_dev_status	The device should provide the current performance
+ * @get_dev_status:	The device should provide the current performance
  *			status to devfreq, which is used by governors.
- * @get_cur_freq	The device should provide the current frequency
+ * @get_cur_freq:	The device should provide the current frequency
  *			at which it is operating.
- * @exit		An optional callback that is called when devfreq
+ * @exit:		An optional callback that is called when devfreq
  *			is removing the devfreq object due to error or
  *			from devfreq_remove_device() call. If the user
  *			has registered devfreq->nb at a notifier-head,
@@ -87,14 +87,14 @@ struct devfreq_dev_profile {
 
 /**
  * struct devfreq_governor - Devfreq policy governor
- * @name		Governor's name
- * @get_target_freq	Returns desired operating frequency for the device.
+ * @name:		Governor's name
+ * @get_target_freq:	Returns desired operating frequency for the device.
  *			Basically, get_target_freq will run
  *			devfreq_dev_profile.get_dev_status() to get the
  *			status of the device (load = busy_time / total_time).
  *			If no_central_polling is set, this callback is called
  *			only with update_devfreq() notified by OPP.
- * @event_handler       Callback for devfreq core framework to notify events
+ * @event_handler:      Callback for devfreq core framework to notify events
  *                      to governors. Events include per device governor
  *                      init and exit, opp changes out of devfreq, suspend
  *                      and resume of per device devfreq during device idle.
@@ -110,23 +110,23 @@ struct devfreq_governor {
 
 /**
  * struct devfreq - Device devfreq structure
- * @node	list node - contains the devices with devfreq that have been
+ * @node:	list node - contains the devices with devfreq that have been
  *		registered.
- * @lock	a mutex to protect accessing devfreq.
- * @dev		device registered by devfreq class. dev.parent is the device
+ * @lock:	a mutex to protect accessing devfreq.
+ * @dev:	device registered by devfreq class. dev.parent is the device
  *		using devfreq.
- * @profile	device-specific devfreq profile
- * @governor	method how to choose frequency based on the usage.
- * @nb		notifier block used to notify devfreq object that it should
+ * @profile:	device-specific devfreq profile
+ * @governor:	method how to choose frequency based on the usage.
+ * @nb:		notifier block used to notify devfreq object that it should
  *		reevaluate operable frequencies. Devfreq users may use
  *		devfreq.nb to the corresponding register notifier call chain.
- * @work	delayed work for load monitoring.
- * @previous_freq	previously configured frequency value.
- * @data	Private data of the governor. The devfreq framework does not
+ * @work:	delayed work for load monitoring.
+ * @previous_freq:	previously configured frequency value.
+ * @data:	Private data of the governor. The devfreq framework does not
  *		touch this.
- * @min_freq	Limit minimum frequency requested by user (0: none)
- * @max_freq	Limit maximum frequency requested by user (0: none)
- * @stop_polling	 devfreq polling status of a device.
+ * @min_freq:	Limit minimum frequency requested by user (0: none)
+ * @max_freq:	Limit maximum frequency requested by user (0: none)
+ * @stop_polling:	 devfreq polling status of a device.
  *
  * This structure stores the devfreq information for a give device.
  *
@@ -186,9 +186,9 @@ extern const struct devfreq_governor devfreq_simple_ondemand;
 /**
  * struct devfreq_simple_ondemand_data - void *data fed to struct devfreq
  *	and devfreq_add_device
- * @ upthreshold	If the load is over this value, the frequency jumps.
+ * @upthreshold:	If the load is over this value, the frequency jumps.
  *			Specify 0 to use the default. Valid value = 0 to 100.
- * @ downdifferential	If the load is under upthreshold - downdifferential,
+ * @downdifferential:	If the load is under upthreshold - downdifferential,
  *			the governor may consider slowing the frequency down.
  *			Specify 0 to use the default. Valid value = 0 to 100.
  *			downdifferential < upthreshold must hold.
-- 
1.7.9.5


^ permalink raw reply related

* Question about runtime pm status issue
From: Lan Tianyu @ 2012-10-29 13:46 UTC (permalink / raw)
  To: Alan Stern, Rafael J. Wysocki, Linux-pm mailing list

Hi Rafael, Alan:
	I recently ran into a problem: I enabled runtime autosuspend for a USB disk,
set power/autosuspend_delay_ms to 100 and brought its usage_count down to 0. But even
after a long time, the disk's runtime status is still "active". As I understand it,
the disk's status should become "suspended" once the usage_count reaches 0 and the
delay expires. If my understanding is wrong, please correct me. Thanks.
-- 
Best regards
Tianyu Lan

^ permalink raw reply

* Re: Question about runtime pm status issue
From: Ming Lei @ 2012-10-29 14:20 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: Alan Stern, Rafael J. Wysocki, Linux-pm mailing list
In-Reply-To: <508E889C.6070009@gmail.com>

On Mon, Oct 29, 2012 at 9:46 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
> Hi Rafael, Alan:
>         I recently ran into a problem: I enabled runtime autosuspend for a USB disk,
> set power/autosuspend_delay_ms to 100 and brought its usage_count down to 0. But even
> after a long time, the disk's runtime status is still "active". As I understand it,
> the disk's status should become "suspended" once the usage_count reaches 0 and the
> delay expires. If my understanding is wrong, please correct me. Thanks.

Please try to umount the disk to see if it can be auto suspended.


Thanks,
-- 
Ming Lei

^ permalink raw reply

* Re: Question about runtime pm status issue
From: Lan Tianyu @ 2012-10-29 14:43 UTC (permalink / raw)
  To: Ming Lei; +Cc: Alan Stern, Rafael J. Wysocki, Linux-pm mailing list
In-Reply-To: <CACVXFVMZi7_uSsL0_0Sc=Nmkc4OMwogtiRB3v=uJY2LF1UYS3A@mail.gmail.com>

On 2012/10/29 22:20, Ming Lei wrote:
> On Mon, Oct 29, 2012 at 9:46 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
>> Hi Rafael, Alan:
>>          I recently ran into a problem: I enabled runtime autosuspend for a USB disk,
>> set power/autosuspend_delay_ms to 100 and brought its usage_count down to 0. But even
>> after a long time, the disk's runtime status is still "active". As I understand it,
>> the disk's status should become "suspended" once the usage_count reaches 0 and the
>> delay expires. If my understanding is wrong, please correct me. Thanks.
>
> Please try to umount the disk to see if it can be auto suspended.
>
I have unmounted the disk, and there is no user of the device since its
usage_count has become zero.
>
> Thanks,
>

-- 
Best regards
Tianyu Lan

^ permalink raw reply

* [PATCH] cpuidle : fixup device.h header in cpuidle.h
From: Daniel Lezcano @ 2012-10-29 14:48 UTC (permalink / raw)
  To: rjw; +Cc: linux-pm, linaro-dev, patches

The "struct device" is only used in sysfs.c.

The other .c files including the private header "cpuidle.h"
do not need to pull the entire headers tree from there as they
don't manipulate the "struct device".

This patch fixes this by moving the header inclusion to sysfs.c
and adding a forward declaration for the struct device.

The number of lines generated by the preprocessor:
Without this patch : 17269 loc
With this patch : 16446 loc
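
As a minimal sketch of the technique (an illustration, not part of the
patch): a header that only passes "struct device" around by pointer can
get by with a forward declaration, and only the file that dereferences
the structure needs the full definition. The function name
demo_add_interface() and the "id" field are hypothetical, and a single
file stands in for the header/source split.

```c
#include <assert.h>
#include <stddef.h>

/* --- what the private header would contain --- */
struct device;                               /* forward declaration: incomplete type */
int demo_add_interface(struct device *dev);  /* pointers to it are still allowed */

/* --- what the one .c file needing the layout would contain --- */
struct device {                              /* stands in for <linux/device.h> */
	int id;                              /* hypothetical field */
};

int demo_add_interface(struct device *dev)
{
	assert(dev != NULL);
	return dev->id;                      /* dereferencing needs the full definition */
}
```

Files that never look inside the structure compile against the
forward declaration alone, which is where the preprocessor savings
come from.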

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/cpuidle/cpuidle.h |    5 +++--
 drivers/cpuidle/sysfs.c   |    1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/cpuidle/cpuidle.h b/drivers/cpuidle/cpuidle.h
index 2120d9e..f6b0923 100644
--- a/drivers/cpuidle/cpuidle.h
+++ b/drivers/cpuidle/cpuidle.h
@@ -5,8 +5,6 @@
 #ifndef __DRIVER_CPUIDLE_H
 #define __DRIVER_CPUIDLE_H
 
-#include <linux/device.h>
-
 /* For internal use only */
 extern struct cpuidle_governor *cpuidle_curr_governor;
 extern struct list_head cpuidle_governors;
@@ -25,6 +23,9 @@ extern void cpuidle_uninstall_idle_handler(void);
 extern int cpuidle_switch_governor(struct cpuidle_governor *gov);
 
 /* sysfs */
+
+struct device;
+
 extern int cpuidle_add_interface(struct device *dev);
 extern void cpuidle_remove_interface(struct device *dev);
 extern int cpuidle_add_state_sysfs(struct cpuidle_device *device);
diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c
index ed87399..860a686 100644
--- a/drivers/cpuidle/sysfs.c
+++ b/drivers/cpuidle/sysfs.c
@@ -12,6 +12,7 @@
 #include <linux/slab.h>
 #include <linux/cpu.h>
 #include <linux/capability.h>
+#include <linux/device.h>
 
 #include "cpuidle.h"
 
-- 
1.7.5.4


^ permalink raw reply related

* Re: Question about runtime pm status issue
From: Ming Lei @ 2012-10-29 14:50 UTC (permalink / raw)
  To: Lan Tianyu; +Cc: Alan Stern, Rafael J. Wysocki, Linux-pm mailing list
In-Reply-To: <508E95FA.5070906@gmail.com>

On Mon, Oct 29, 2012 at 10:43 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:

> I have unmounted the disk, and there is no user of the device since its
> usage_count has become zero.

Runtime PM of USB mass storage seems to work well on my box with
3.7.0-rc3-next-20121029. Maybe you can enable the runtime PM trace
events to see what happened.


Thanks
-- 
Ming Lei

^ permalink raw reply

* Re: Question about runtime pm status issue
From: Lan Tianyu @ 2012-10-29 14:52 UTC (permalink / raw)
  To: Ming Lei; +Cc: Alan Stern, Rafael J. Wysocki, Linux-pm mailing list
In-Reply-To: <CACVXFVOV2JBvPj-AnOYo6HELCYDthOfM1Qb54=BezNAikfL9OA@mail.gmail.com>

On 2012-10-29 22:50:37, Ming Lei wrote:
> On Mon, Oct 29, 2012 at 10:43 PM, Lan Tianyu <lantianyu1986@gmail.com> wrote:
>
>> I have unmounted the disk, and there is no user of the device since its
>> usage_count has become zero.
>
> Runtime PM of USB mass storage seems to work well on my box with
> 3.7.0-rc3-next-20121029. Maybe you can enable the runtime PM trace
> events to see what happened.
>
Ok. Thanks.
>
> Thanks



--
Best regards
Tianyu Lan

^ permalink raw reply

* Re: [PATCH v8 05/11] libata-eh: allow defer in ata_exec_internal
From: Tejun Heo @ 2012-10-29 15:20 UTC (permalink / raw)
  To: Aaron Lu
  Cc: Jeff Garzik, Rafael J. Wysocki, James Bottomley, Alan Stern,
	Oliver Neukum, Jeff Wu, Aaron Lu, Shane Huang, linux-ide,
	linux-pm, linux-scsi, linux-acpi
In-Reply-To: <1351501298-3716-6-git-send-email-aaron.lu@intel.com>

On Mon, Oct 29, 2012 at 05:01:32PM +0800, Aaron Lu wrote:
> ata_exec_internal will preempt the ata link's active_tag and ata port's
> qc_active flags, this is OK for error recovery, but if normal code path
> wants to use ata_exec_internal, there is a problem: we need to check if
> it is OK to issue a new command with the help of port_ops->defer.
> 
> In ZPODD, I'll need to find out the loading mechanism of the ODD by
> issuing a GET_CONFIGURATION command. And this command may very well
> race with commands issued from SCSI layer. So instead of preempt the
> current command, defer the new command if it's not OK to issue it, as
> it is always wrong to issue a non-NCQ command when there is command(s)
> in processing.

Why not do the discovery from EH?

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [linux-next PATCH] PM / devfreq: documentation cleanups for devfreq header
From: Randy Dunlap @ 2012-10-29 15:32 UTC (permalink / raw)
  To: Nishanth Menon
  Cc: linux-pm, Rajagopal Venkat, MyungJoo Ham, Kyungmin Park,
	Rafael J. Wysocki, Kevin Hilman, linux-kernel
In-Reply-To: <1351515743-23411-1-git-send-email-nm@ti.com>

On 10/29/2012 06:02 AM, Nishanth Menon wrote:

> struct parameters need to have ':' in documentation for
> scripts/kernel-doc to parse appropriately.
> 
> Fix the errors reported by:
> ./scripts/kernel-doc include/linux/devfreq.h >/dev/null
> 
> Cc: Rajagopal Venkat <rajagopal.venkat@linaro.org>
> Cc: MyungJoo Ham <myungjoo.ham@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Cc: Kevin Hilman <khilman@ti.com>
> Cc: linux-pm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> 
> Signed-off-by: Nishanth Menon <nm@ti.com>


Acked-by: Randy Dunlap <rdunlap@xenotime.net>

Thanks.

> ---
> Applies on
> linux-next                e083feb Merge branch 'acpi-next' into linux-next
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
> 
>  include/linux/devfreq.h |   54 +++++++++++++++++++++++------------------------
>  1 file changed, 27 insertions(+), 27 deletions(-)
> 
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index 7e2e2ea..1461fb2 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -25,12 +25,12 @@ struct devfreq;
>   * struct devfreq_dev_status - Data given from devfreq user device to
>   *			     governors. Represents the performance
>   *			     statistics.
> - * @total_time		The total time represented by this instance of
> + * @total_time:		The total time represented by this instance of
>   *			devfreq_dev_status
> - * @busy_time		The time that the device was working among the
> + * @busy_time:		The time that the device was working among the
>   *			total_time.
> - * @current_frequency	The operating frequency.
> - * @private_data	An entry not specified by the devfreq framework.
> + * @current_frequency:	The operating frequency.
> + * @private_data:	An entry not specified by the devfreq framework.
>   *			A device and a specific governor may have their
>   *			own protocol with private_data. However, because
>   *			this is governor-specific, a governor using this
> @@ -54,21 +54,21 @@ struct devfreq_dev_status {
>  
>  /**
>   * struct devfreq_dev_profile - Devfreq's user device profile
> - * @initial_freq	The operating frequency when devfreq_add_device() is
> + * @initial_freq:	The operating frequency when devfreq_add_device() is
>   *			called.
> - * @polling_ms		The polling interval in ms. 0 disables polling.
> - * @target		The device should set its operating frequency at
> + * @polling_ms:		The polling interval in ms. 0 disables polling.
> + * @target:		The device should set its operating frequency at
>   *			freq or lowest-upper-than-freq value. If freq is
>   *			higher than any operable frequency, set maximum.
>   *			Before returning, target function should set
>   *			freq at the current frequency.
>   *			The "flags" parameter's possible values are
>   *			explained above with "DEVFREQ_FLAG_*" macros.
> - * @get_dev_status	The device should provide the current performance
> + * @get_dev_status:	The device should provide the current performance
>   *			status to devfreq, which is used by governors.
> - * @get_cur_freq	The device should provide the current frequency
> + * @get_cur_freq:	The device should provide the current frequency
>   *			at which it is operating.
> - * @exit		An optional callback that is called when devfreq
> + * @exit:		An optional callback that is called when devfreq
>   *			is removing the devfreq object due to error or
>   *			from devfreq_remove_device() call. If the user
>   *			has registered devfreq->nb at a notifier-head,
> @@ -87,14 +87,14 @@ struct devfreq_dev_profile {
>  
>  /**
>   * struct devfreq_governor - Devfreq policy governor
> - * @name		Governor's name
> - * @get_target_freq	Returns desired operating frequency for the device.
> + * @name:		Governor's name
> + * @get_target_freq:	Returns desired operating frequency for the device.
>   *			Basically, get_target_freq will run
>   *			devfreq_dev_profile.get_dev_status() to get the
>   *			status of the device (load = busy_time / total_time).
>   *			If no_central_polling is set, this callback is called
>   *			only with update_devfreq() notified by OPP.
> - * @event_handler       Callback for devfreq core framework to notify events
> + * @event_handler:      Callback for devfreq core framework to notify events
>   *                      to governors. Events include per device governor
>   *                      init and exit, opp changes out of devfreq, suspend
>   *                      and resume of per device devfreq during device idle.
> @@ -110,23 +110,23 @@ struct devfreq_governor {
>  
>  /**
>   * struct devfreq - Device devfreq structure
> - * @node	list node - contains the devices with devfreq that have been
> + * @node:	list node - contains the devices with devfreq that have been
>   *		registered.
> - * @lock	a mutex to protect accessing devfreq.
> - * @dev		device registered by devfreq class. dev.parent is the device
> + * @lock:	a mutex to protect accessing devfreq.
> + * @dev:	device registered by devfreq class. dev.parent is the device
>   *		using devfreq.
> - * @profile	device-specific devfreq profile
> - * @governor	method how to choose frequency based on the usage.
> - * @nb		notifier block used to notify devfreq object that it should
> + * @profile:	device-specific devfreq profile
> + * @governor:	method how to choose frequency based on the usage.
> + * @nb:		notifier block used to notify devfreq object that it should
>   *		reevaluate operable frequencies. Devfreq users may use
>   *		devfreq.nb to the corresponding register notifier call chain.
> - * @work	delayed work for load monitoring.
> - * @previous_freq	previously configured frequency value.
> - * @data	Private data of the governor. The devfreq framework does not
> + * @work:	delayed work for load monitoring.
> + * @previous_freq:	previously configured frequency value.
> + * @data:	Private data of the governor. The devfreq framework does not
>   *		touch this.
> - * @min_freq	Limit minimum frequency requested by user (0: none)
> - * @max_freq	Limit maximum frequency requested by user (0: none)
> - * @stop_polling	 devfreq polling status of a device.
> + * @min_freq:	Limit minimum frequency requested by user (0: none)
> + * @max_freq:	Limit maximum frequency requested by user (0: none)
> + * @stop_polling:	 devfreq polling status of a device.
>   *
>   * This structure stores the devfreq information for a give device.
>   *
> @@ -186,9 +186,9 @@ extern const struct devfreq_governor devfreq_simple_ondemand;
>  /**
>   * struct devfreq_simple_ondemand_data - void *data fed to struct devfreq
>   *	and devfreq_add_device
> - * @ upthreshold	If the load is over this value, the frequency jumps.
> + * @upthreshold:	If the load is over this value, the frequency jumps.
>   *			Specify 0 to use the default. Valid value = 0 to 100.
> - * @ downdifferential	If the load is under upthreshold - downdifferential,
> + * @downdifferential:	If the load is under upthreshold - downdifferential,
>   *			the governor may consider slowing the frequency down.
>   *			Specify 0 to use the default. Valid value = 0 to 100.
>   *			downdifferential < upthreshold must hold.



-- 
~Randy

^ permalink raw reply

* Re: [PATCH v8 09/11] block: add a new interface to block events
From: Tejun Heo @ 2012-10-29 15:35 UTC (permalink / raw)
  To: Aaron Lu
  Cc: Jeff Garzik, Rafael J. Wysocki, James Bottomley, Alan Stern,
	Oliver Neukum, Jeff Wu, Aaron Lu, Shane Huang, linux-ide,
	linux-pm, linux-scsi, linux-acpi
In-Reply-To: <1351501298-3716-10-git-send-email-aaron.lu@intel.com>

Hello,

On Mon, Oct 29, 2012 at 05:01:36PM +0800, Aaron Lu wrote:
> ODD_suspend                        disk_events_workfn
>   ata_port_suspend                   check_events
>     disk_block_events                  resume ODD
>       cancel_delayed_work_sync           resume parent
>       (waiting for disk_events_workfn)   (waiting for suspend callback)

I don't understand why solving it needs to be this elaborate.
check_event() can retry.  Just add a per-sr mutex which is try-locked
by sr_block_check_events() and grab it when entering zero power.
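
A rough user-space sketch of that idea, under stated assumptions: a C11
atomic_flag stands in for the per-sr mutex, all demo_* names are
hypothetical, and the lock is assumed to be held for the duration of the
power-down transition. It is not the sr driver's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_sr {
	atomic_flag zp_busy;   /* stands in for the per-sr mutex */
	int polls;             /* number of event polls actually performed */
};

static bool demo_trylock(struct demo_sr *sr)
{
	/* test_and_set returns the previous value: false means we got the lock */
	return !atomic_flag_test_and_set(&sr->zp_busy);
}

static void demo_unlock(struct demo_sr *sr)
{
	atomic_flag_clear(&sr->zp_busy);
}

/* Event poll: give up immediately if a zero-power transition holds the lock. */
bool demo_check_events(struct demo_sr *sr)
{
	assert(sr != NULL);
	if (!demo_trylock(sr))
		return false;          /* skipped; the poller simply retries later */
	sr->polls++;                   /* the device would be polled here */
	demo_unlock(sr);
	return true;
}

/* Suspend path: take the lock before powering down, release it afterwards. */
void demo_zero_power_begin(struct demo_sr *sr)
{
	while (!demo_trylock(sr))
		;                      /* a real mutex would sleep instead of spinning */
}

void demo_zero_power_end(struct demo_sr *sr)
{
	demo_unlock(sr);
}
```

Because the poll side never blocks, the suspend path does not have to
wait on a work item that is itself waiting for the suspend to finish.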

> +/*
> + * Under some circumstances, there is a race between the calling thread
> + * of disk_block_events and the events checking function. To avoid such a race,
> + * this function will check if the delayed work is pending. If not, it means
> + * the work is either not queued or is already running, false is returned.
> + * And if yes, try to cancel the delayed work. If succedded, disk_block_events
> + * will be called and there is no worry that cancel_delayed_work_sync will
> + * deadlock the events checking function. And if failed, false is returned.
> + */
> +bool disk_try_block_events(struct gendisk *disk)
> +{
> +	struct disk_events *ev = disk->ev;
> +
> +	if (!ev)
> +		return false;
> +
> +	if (delayed_work_pending(&ev->dwork)) {

And please don't use delayed_work_pending() like this.  It doesn't add
anything.  cancel_delayed_work() already needs to perform all the
necessary tests.

> +		if (cancel_delayed_work(&disk->ev->dwork)) {
> +			disk_block_events(disk);
> +			return true;
> +		}
> +	}
> +
> +	return false;
> +}

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCH v3 2/6] PM / Runtime: introduce pm_runtime_set[get]_memalloc_noio()
From: Alan Stern @ 2012-10-29 15:41 UTC (permalink / raw)
  To: Ming Lei
  Cc: linux-kernel, Oliver Neukum, Minchan Kim, Greg Kroah-Hartman,
	Rafael J. Wysocki, Jens Axboe, David S. Miller, Andrew Morton,
	netdev, linux-usb, linux-pm, linux-mm
In-Reply-To: <1351513440-9286-3-git-send-email-ming.lei@canonical.com>

On Mon, 29 Oct 2012, Ming Lei wrote:

> The patch introduces the flag of memalloc_noio_resume in
> 'struct dev_pm_info' to help PM core to teach mm not allocating
> memory with GFP_KERNEL flag for avoiding probable deadlock
> problem.
> 
> As explained in the comment, any GFP_KERNEL allocation inside
> runtime_resume on any one of device in the path from one block
> or network device to the root device in the device tree may cause
> deadlock, the introduced pm_runtime_set_memalloc_noio() sets or
> clears the flag on device of the path recursively.
> 
> This patch also introduces pm_runtime_get_memalloc_noio() because
> the flag may be accessed in block device's error handling path
> (for example, usb device reset)

> +/*
> + * pm_runtime_get_memalloc_noio - Get a device's memalloc_noio flag.
> + * @dev: Device to handle.
> + *
> + * Return the device's memalloc_noio flag.
> + *
> + * The device power lock is held because bitfield is not SMP-safe.
> + */
> +bool pm_runtime_get_memalloc_noio(struct device *dev)
> +{
> +	bool ret;
> +	spin_lock_irq(&dev->power.lock);
> +	ret = dev->power.memalloc_noio_resume;
> +	spin_unlock_irq(&dev->power.lock);
> +	return ret;
> +}

You don't need to acquire and release a spinlock just to read the
value.  Reading bitfields _is_ SMP-safe; writing them is not.

> +/*
> + * pm_runtime_set_memalloc_noio - Set a device's memalloc_noio flag.
> + * @dev: Device to handle.
> + * @enable: True for setting the flag and False for clearing the flag.
> + *
> + * Set the flag for all devices in the path from the device to the
> + * root device in the device tree if @enable is true, otherwise clear
> + * the flag for devices in the path which sibliings don't set the flag.

s/which/whose/
s/ii/i/

> + *
> + * The function should only be called by block device, or network
> + * device driver for solving the deadlock problem during runtime
> + * resume:
> + * 	if memory allocation with GFP_KERNEL is called inside runtime
> + * 	resume callback of any one of its ancestors(or the block device
> + * 	itself), the deadlock may be triggered inside the memory
> + * 	allocation since it might not complete until the block device
> + * 	becomes active and the involed page I/O finishes. The situation
> + * 	is pointed out first by Alan Stern. Network device are involved
> + * 	in iSCSI kind of situation.
> + *
> + * The lock of dev_hotplug_mutex is held in the function for handling
> + * hotplug race because pm_runtime_set_memalloc_noio() may be called
> + * in async probe().
> + */
> +void pm_runtime_set_memalloc_noio(struct device *dev, bool enable)
> +{
> +	static DEFINE_MUTEX(dev_hotplug_mutex);
> +
> +	mutex_lock(&dev_hotplug_mutex);
> +	while (dev) {

Unless you think somebody is likely to call this function with dev 
equal to NULL, this can simply be

	for (;;) {

> +		/* hold power lock since bitfield is not SMP-safe. */
> +		spin_lock_irq(&dev->power.lock);
> +		dev->power.memalloc_noio_resume = enable;
> +		spin_unlock_irq(&dev->power.lock);
> +
> +		dev = dev->parent;
> +
> +		/* only clear the flag for one device if all
> +		 * children of the device don't set the flag.
> +		 */
> +		if (!dev || (!enable &&

... thanks to this test.

> +			     device_for_each_child(dev, NULL,
> +						   dev_memalloc_noio)))
> +			break;
> +	}
> +	mutex_unlock(&dev_hotplug_mutex);
> +}
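
For illustration only, here is a user-space model of the loop with the
for (;;) form; it assumes the caller never passes NULL. The struct
demo_dev type, the fixed-size child array and demo_child_has_noio() are
hypothetical stand-ins for struct device, device_for_each_child() and
dev_memalloc_noio(), and all locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_CHILDREN 4

struct demo_dev {
	struct demo_dev *parent;
	struct demo_dev *child[DEMO_MAX_CHILDREN];
	bool noio;                     /* models power.memalloc_noio_resume */
};

/* True if any direct child of dev still has the flag set. */
static bool demo_child_has_noio(const struct demo_dev *dev)
{
	for (size_t i = 0; i < DEMO_MAX_CHILDREN; i++)
		if (dev->child[i] && dev->child[i]->noio)
			return true;
	return false;
}

void demo_set_noio(struct demo_dev *dev, bool enable)
{
	assert(dev != NULL);           /* caller contract replaces while (dev) */
	for (;;) {
		dev->noio = enable;
		dev = dev->parent;
		/* Stop at the root, or when clearing and another child of
		 * this ancestor still needs the flag. */
		if (!dev || (!enable && demo_child_has_noio(dev)))
			break;
	}
}
```

The test at the bottom of the loop is the only place where dev can
become NULL, so the loop condition itself has nothing left to check.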

This might not work if somebody calls pm_runtime_set_memalloc_noio(dev,
true) and then afterwards registers dev at the same time as someone
else calls pm_runtime_set_memalloc_noio(dev2, false), if dev and dev2
have the same parent.

Perhaps the kerneldoc should mention that this function must not be 
called until after dev is registered.

Alan Stern


^ permalink raw reply

