[PATCH v2 0/3] PM: sleep: Improvements of async suspend and resume of devices

public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v2 0/3] PM: sleep: Improvements of async suspend and resume of devices
@ 2025-03-13 20:17 Rafael J. Wysocki
  2025-03-13 20:26 ` [PATCH v2 1/3] PM: sleep: Resume children after resuming the parent Rafael J. Wysocki
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-03-13 20:17 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Alan Stern, Ulf Hansson, Johan Hovold,
	Manivannan Sadhasivam, Jon Hunter, Saravana Kannan

Hi Everyone,

This is a new iteration of the async suspend/resume improvements work:

https://lore.kernel.org/linux-pm/13709135.uLZWGnKmhe@rjwysocki.net/

which includes some rework and fixes of the last 3 patches in the series
linked above.  The first 2 patches in that series have been applied
already and are waiting for the 6.15 merge window.

This new iteration is based on linux-pm.git/linux-next and on the recent
fix related to direct-complete:

https://lore.kernel.org/linux-pm/12627587.O9o76ZdvQC@rjwysocki.net/

The following part of the original cover letter is still applicable:

"Overall, the idea is that instead of starting an async work item for every
async device upfront, which is not very efficient because the majority of
those devices will not be able to make progress due to dependencies anyway,
the async handling is only started upfront for the devices that are likely
to be able to make progress.  That is, devices without parents in the resume
path and leaf devices (ie. devices without children or consumers) in the
suspend path (the underlying observation here is that devices without parents
are likely to have no suppliers too whereas devices without children that
have consumers are not unheard of).  This allows to reduce the amount of
processing that needs to be done to start with.

Then, after processing every device ("async" or "sync"), "async" processing
is started for some devices that have been "unblocked" by it, which are its
children in the resume path or its parent and its suppliers in the suspend
path.  This allows asynchronous handling to start as soon as it makes sense
without delaying the "async" devices unnecessarily.

Fortunately, the additional plumbing needed to implement this is not
particularly complicated."

Patch [1/3] deals with the resume path for all device resume phases.  Since
v1 it's been modified to avoid using dpm_list_mtx for protecting the
power.work_in_progress flag and a new lock is used specifically for that
now which should reduce lock contention.

Patch [2/3] optimizes the "suspend" phase which has the most visible effect (on
the systems in my office the speedup is in the 100 ms range which is around 20%
of the total device resume time).  Since v1 it's been adjusted to the changes
in patch [1/3] and two issues in it have been fixed: dpm_async_suspend_superior()
actually walks the device's suppliers and all devices are moved to the target
list in dpm_suspend() even if there is an error so that they can be resumed
properly in that case.

Patch [3/3] extend this to the "suspend late" and "suspend noirq" phases.

Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/3] PM: sleep: Resume children after resuming the parent
  2025-03-13 20:17 [PATCH v2 0/3] PM: sleep: Improvements of async suspend and resume of devices Rafael J. Wysocki
@ 2025-03-13 20:26 ` Rafael J. Wysocki
  2025-03-13 21:16   ` Saravana Kannan
  2025-03-13 20:34 ` [PATCH v2 2/3] PM: sleep: Suspend parents and suppliers after suspending subordinates Rafael J. Wysocki
  2025-03-13 20:35 ` [PATCH v2 3/3] PM: sleep: Make suspend of devices more asynchronous Rafael J. Wysocki
  2 siblings, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-03-13 20:26 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Alan Stern, Ulf Hansson, Johan Hovold,
	Manivannan Sadhasivam, Jon Hunter, Saravana Kannan

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

According to [1], the handling of device suspend and resume, and
particularly the latter, involves unnecessary overhead related to
starting new async work items for devices that cannot make progress
right away because they have to wait for other devices.

To reduce this problem in the resume path, use the observation that
starting the async resume of the children of a device after resuming
the parent is likely to produce less scheduling and memory management
noise than starting it upfront while at the same time it should not
increase the resume duration substantially.

Accordingly, modify the code to start the async resume of the device's
children when the processing of the parent has been completed in each
stage of device resume and only start async resume upfront for devices
without parents.

Also make it check if a given device can be resumed asynchronously
before starting the synchronous resume of it in case it will have to
wait for another that is already resuming asynchronously.

In addition to making the async resume of devices more friendly to
systems with relatively less computing resources, this change is also
preliminary for analogous changes in the suspend path.

On the systems where it has been tested, this change by itself does
not affect the overall system resume duration in a significant way.

Link: https://lore.kernel.org/linux-pm/20241114220921.2529905-1-saravanak@google.com/ [1]
Suggested-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v1 -> v2:
   Use a separate lock for power.work_in_progress protection which should
   reduce lock contention on dpm_list_mtx.

---
 drivers/base/power/main.c |   80 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 61 insertions(+), 19 deletions(-)

--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -63,6 +63,7 @@
 static DEFINE_MUTEX(dpm_list_mtx);
 static pm_message_t pm_transition;
 
+static DEFINE_MUTEX(async_wip_mtx);
 static int async_error;
 
 static const char *pm_verb(int event)
@@ -597,8 +598,11 @@
 		&& !pm_trace_is_enabled();
 }
 
-static bool dpm_async_fn(struct device *dev, async_func_t func)
+static bool __dpm_async(struct device *dev, async_func_t func)
 {
+	if (dev->power.work_in_progress)
+		return true;
+
 	if (!is_async(dev))
 		return false;
 
@@ -611,14 +615,37 @@
 
 	put_device(dev);
 
+	return false;
+}
+
+static bool dpm_async_fn(struct device *dev, async_func_t func)
+{
+	guard(mutex)(&async_wip_mtx);
+
+	return __dpm_async(dev, func);
+}
+
+static int dpm_async_with_cleanup(struct device *dev, void *fn)
+{
+	guard(mutex)(&async_wip_mtx);
+
+	if (!__dpm_async(dev, fn))
+		dev->power.work_in_progress = false;
+
+	return 0;
+}
+
+static void dpm_async_resume_children(struct device *dev, async_func_t func)
+{
 	/*
-	 * async_schedule_dev_nocall() above has returned false, so func() is
-	 * not running and it is safe to update power.work_in_progress without
-	 * extra synchronization.
+	 * Start processing "async" children of the device unless it's been
+	 * started already for them.
+	 *
+	 * This could have been done for the device's "async" consumers too, but
+	 * they either need to wait for their parents or the processing has
+	 * already started for them after their parents were processed.
 	 */
-	dev->power.work_in_progress = false;
-
-	return false;
+	device_for_each_child(dev, func, dpm_async_with_cleanup);
 }
 
 static void dpm_clear_async_state(struct device *dev)
@@ -627,6 +654,8 @@
 	dev->power.work_in_progress = false;
 }
 
+static void async_resume_noirq(void *data, async_cookie_t cookie);
+
 /**
  * device_resume_noirq - Execute a "noirq resume" callback for given device.
  * @dev: Device to handle.
@@ -710,6 +739,8 @@
 		dpm_save_failed_dev(dev_name(dev));
 		pm_dev_err(dev, state, async ? " async noirq" : " noirq", error);
 	}
+
+	dpm_async_resume_children(dev, async_resume_noirq);
 }
 
 static void async_resume_noirq(void *data, async_cookie_t cookie)
@@ -733,19 +764,20 @@
 	mutex_lock(&dpm_list_mtx);
 
 	/*
-	 * Trigger the resume of "async" devices upfront so they don't have to
-	 * wait for the "non-async" ones they don't depend on.
+	 * Start processing "async" devices without parents upfront so they
+	 * don't wait for the "sync" devices they don't depend on.
 	 */
 	list_for_each_entry(dev, &dpm_noirq_list, power.entry) {
 		dpm_clear_async_state(dev);
-		dpm_async_fn(dev, async_resume_noirq);
+		if (!dev->parent)
+			dpm_async_with_cleanup(dev, async_resume_noirq);
 	}
 
 	while (!list_empty(&dpm_noirq_list)) {
 		dev = to_device(dpm_noirq_list.next);
 		list_move_tail(&dev->power.entry, &dpm_late_early_list);
 
-		if (!dev->power.work_in_progress) {
+		if (!dpm_async_fn(dev, async_resume_noirq)) {
 			get_device(dev);
 
 			mutex_unlock(&dpm_list_mtx);
@@ -781,6 +813,8 @@
 	device_wakeup_disarm_wake_irqs();
 }
 
+static void async_resume_early(void *data, async_cookie_t cookie);
+
 /**
  * device_resume_early - Execute an "early resume" callback for given device.
  * @dev: Device to handle.
@@ -848,6 +882,8 @@
 		dpm_save_failed_dev(dev_name(dev));
 		pm_dev_err(dev, state, async ? " async early" : " early", error);
 	}
+
+	dpm_async_resume_children(dev, async_resume_early);
 }
 
 static void async_resume_early(void *data, async_cookie_t cookie)
@@ -875,19 +911,20 @@
 	mutex_lock(&dpm_list_mtx);
 
 	/*
-	 * Trigger the resume of "async" devices upfront so they don't have to
-	 * wait for the "non-async" ones they don't depend on.
+	 * Start processing "async" devices without parents upfront so they
+	 * don't wait for the "sync" devices they don't depend on.
 	 */
 	list_for_each_entry(dev, &dpm_late_early_list, power.entry) {
 		dpm_clear_async_state(dev);
-		dpm_async_fn(dev, async_resume_early);
+		if (!dev->parent)
+			dpm_async_with_cleanup(dev, async_resume_early);
 	}
 
 	while (!list_empty(&dpm_late_early_list)) {
 		dev = to_device(dpm_late_early_list.next);
 		list_move_tail(&dev->power.entry, &dpm_suspended_list);
 
-		if (!dev->power.work_in_progress) {
+		if (!dpm_async_fn(dev, async_resume_early)) {
 			get_device(dev);
 
 			mutex_unlock(&dpm_list_mtx);
@@ -919,6 +956,8 @@
 }
 EXPORT_SYMBOL_GPL(dpm_resume_start);
 
+static void async_resume(void *data, async_cookie_t cookie);
+
 /**
  * device_resume - Execute "resume" callbacks for given device.
  * @dev: Device to handle.
@@ -1018,6 +1057,8 @@
 		dpm_save_failed_dev(dev_name(dev));
 		pm_dev_err(dev, state, async ? " async" : "", error);
 	}
+
+	dpm_async_resume_children(dev, async_resume);
 }
 
 static void async_resume(void *data, async_cookie_t cookie)
@@ -1049,19 +1090,20 @@
 	mutex_lock(&dpm_list_mtx);
 
 	/*
-	 * Trigger the resume of "async" devices upfront so they don't have to
-	 * wait for the "non-async" ones they don't depend on.
+	 * Start processing "async" devices without parents upfront so they
+	 * don't wait for the "sync" devices they don't depend on.
 	 */
 	list_for_each_entry(dev, &dpm_suspended_list, power.entry) {
 		dpm_clear_async_state(dev);
-		dpm_async_fn(dev, async_resume);
+		if (!dev->parent)
+			dpm_async_with_cleanup(dev, async_resume);
 	}
 
 	while (!list_empty(&dpm_suspended_list)) {
 		dev = to_device(dpm_suspended_list.next);
 		list_move_tail(&dev->power.entry, &dpm_prepared_list);
 
-		if (!dev->power.work_in_progress) {
+		if (!dpm_async_fn(dev, async_resume)) {
 			get_device(dev);
 
 			mutex_unlock(&dpm_list_mtx);




^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 2/3] PM: sleep: Suspend parents and suppliers after suspending subordinates
  2025-03-13 20:17 [PATCH v2 0/3] PM: sleep: Improvements of async suspend and resume of devices Rafael J. Wysocki
  2025-03-13 20:26 ` [PATCH v2 1/3] PM: sleep: Resume children after resuming the parent Rafael J. Wysocki
@ 2025-03-13 20:34 ` Rafael J. Wysocki
  2025-03-13 21:16   ` Saravana Kannan
  2025-03-13 20:35 ` [PATCH v2 3/3] PM: sleep: Make suspend of devices more asynchronous Rafael J. Wysocki
  2 siblings, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-03-13 20:34 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Alan Stern, Ulf Hansson, Johan Hovold,
	Manivannan Sadhasivam, Jon Hunter, Saravana Kannan

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In analogy with the previous change affecting the resume path,
make device_suspend() start the async suspend of the device's parent
and suppliers after the device itself has been processed and make
dpm_suspend() start processing "async" leaf devices (that is, devices
without children or consumers) upfront because they don't need to wait
for any other devices.

On the Dell XPS13 9360 in my office, this change reduces the total
duration of device suspend by approximately 100 ms (over 20%).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Suggested-by: Saravana Kannan <saravanak@google.com>
---

v1 -> v2:
   * Adjust for the changes in patch [1/3].
   * Fix walking suppliers in dpm_async_suspend_superior().
   * Use device links read locking in dpm_async_suspend_superior() (Saravana).
   * Move all devices to the target list even if there are errors in
     dpm_suspend() so they are properly resumed during rollback (Saravana).

---
 drivers/base/power/main.c |   78 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 72 insertions(+), 6 deletions(-)

--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1231,6 +1231,50 @@
 
 /*------------------------- Suspend routines -------------------------*/
 
+static bool dpm_leaf_device(struct device *dev)
+{
+	struct device *child;
+
+	lockdep_assert_held(&dpm_list_mtx);
+
+	child = device_find_any_child(dev);
+	if (child) {
+		put_device(child);
+
+		return false;
+	}
+
+	/*
+	 * Since this function is required to run under dpm_list_mtx, the
+	 * list_empty() below will only return true if the device's list of
+	 * consumers is actually empty before calling it.
+	 */
+	return list_empty(&dev->links.consumers);
+}
+
+static void dpm_async_suspend_superior(struct device *dev, async_func_t func)
+{
+	struct device_link *link;
+	int idx;
+
+	mutex_lock(&dpm_list_mtx);
+
+	/* Start processing the device's parent if it is "async". */
+	if (dev->parent)
+		dpm_async_with_cleanup(dev->parent, func);
+
+	mutex_unlock(&dpm_list_mtx);
+
+	idx = device_links_read_lock();
+
+	/* Start processing the device's "async" suppliers. */
+	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+		if (READ_ONCE(link->status) != DL_STATE_DORMANT)
+			dpm_async_with_cleanup(link->supplier, func);
+
+	device_links_read_unlock(idx);
+}
+
 /**
  * resume_event - Return a "resume" message for given "suspend" sleep state.
  * @sleep_state: PM message representing a sleep state.
@@ -1656,6 +1700,8 @@
 	device_links_read_unlock(idx);
 }
 
+static void async_suspend(void *data, async_cookie_t cookie);
+
 /**
  * device_suspend - Execute "suspend" callbacks for given device.
  * @dev: Device to handle.
@@ -1785,7 +1831,13 @@
 
 	complete_all(&dev->power.completion);
 	TRACE_SUSPEND(error);
-	return error;
+
+	if (error || async_error)
+		return error;
+
+	dpm_async_suspend_superior(dev, async_suspend);
+
+	return 0;
 }
 
 static void async_suspend(void *data, async_cookie_t cookie)
@@ -1803,6 +1855,7 @@
 int dpm_suspend(pm_message_t state)
 {
 	ktime_t starttime = ktime_get();
+	struct device *dev;
 	int error = 0;
 
 	trace_suspend_resume(TPS("dpm_suspend"), state.event, true);
@@ -1816,12 +1869,28 @@
 
 	mutex_lock(&dpm_list_mtx);
 
+	/*
+	 * Start processing "async" leaf devices upfront because they don't need
+	 * to wait.
+	 */
+	list_for_each_entry_reverse(dev, &dpm_prepared_list, power.entry) {
+		dpm_clear_async_state(dev);
+		if (dpm_leaf_device(dev))
+			dpm_async_with_cleanup(dev, async_suspend);
+	}
+
 	while (!list_empty(&dpm_prepared_list)) {
-		struct device *dev = to_device(dpm_prepared_list.prev);
+		dev = to_device(dpm_prepared_list.prev);
 
 		list_move(&dev->power.entry, &dpm_suspended_list);
 
-		dpm_clear_async_state(dev);
+		/*
+		 * Move all devices to the target list to resume them properly
+		 * on errors.
+		 */
+		if (error || async_error)
+			continue;
+
 		if (dpm_async_fn(dev, async_suspend))
 			continue;
 
@@ -1834,9 +1903,6 @@
 		put_device(dev);
 
 		mutex_lock(&dpm_list_mtx);
-
-		if (error || async_error)
-			break;
 	}
 
 	mutex_unlock(&dpm_list_mtx);




^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 3/3] PM: sleep: Make suspend of devices more asynchronous
  2025-03-13 20:17 [PATCH v2 0/3] PM: sleep: Improvements of async suspend and resume of devices Rafael J. Wysocki
  2025-03-13 20:26 ` [PATCH v2 1/3] PM: sleep: Resume children after resuming the parent Rafael J. Wysocki
  2025-03-13 20:34 ` [PATCH v2 2/3] PM: sleep: Suspend parents and suppliers after suspending subordinates Rafael J. Wysocki
@ 2025-03-13 20:35 ` Rafael J. Wysocki
  2 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-03-13 20:35 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Alan Stern, Ulf Hansson, Johan Hovold,
	Manivannan Sadhasivam, Jon Hunter, Saravana Kannan

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In analogy with previous changes, make device_suspend_late() and
device_suspend_noirq() start the async suspend of the device's parent
and suppliers after the device itself has been processed and make
dpm_suspend_late() and dpm_noirq_suspend_devices() start processing
"async" leaf devices (that is, devices without children or consumers)
upfront because they don't need to wait for any other devices.

This change reduces the total duration of device suspend on some systems
where it has been tested measurably, but not significantly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Suggested-by: Saravana Kannan <saravanak@google.com>
---

v1 -> v2:
   * Adjust for the changes in patches [1-2/3].
   * Move all devices to the target lists even if there are errors in
     dpm_suspend_late() and dpm_noirq_suspend_devices() so they are
     properly resumed during rollback (Saravana).

---
 drivers/base/power/main.c |   68 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 56 insertions(+), 12 deletions(-)

--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1312,6 +1312,8 @@
 	device_links_read_unlock(idx);
 }
 
+static void async_suspend_noirq(void *data, async_cookie_t cookie);
+
 /**
  * device_suspend_noirq - Execute a "noirq suspend" callback for given device.
  * @dev: Device to handle.
@@ -1390,7 +1392,13 @@
 Complete:
 	complete_all(&dev->power.completion);
 	TRACE_SUSPEND(error);
-	return error;
+
+	if (error || async_error)
+		return error;
+
+	dpm_async_suspend_superior(dev, async_suspend_noirq);
+
+	return 0;
 }
 
 static void async_suspend_noirq(void *data, async_cookie_t cookie)
@@ -1404,6 +1412,7 @@
 static int dpm_noirq_suspend_devices(pm_message_t state)
 {
 	ktime_t starttime = ktime_get();
+	struct device *dev;
 	int error = 0;
 
 	trace_suspend_resume(TPS("dpm_suspend_noirq"), state.event, true);
@@ -1413,12 +1422,28 @@
 
 	mutex_lock(&dpm_list_mtx);
 
+	/*
+	 * Start processing "async" leaf devices upfront because they don't need
+	 * to wait.
+	 */
+	list_for_each_entry_reverse(dev, &dpm_late_early_list, power.entry) {
+		dpm_clear_async_state(dev);
+		if (dpm_leaf_device(dev))
+			dpm_async_with_cleanup(dev, async_suspend_noirq);
+	}
+
 	while (!list_empty(&dpm_late_early_list)) {
-		struct device *dev = to_device(dpm_late_early_list.prev);
+		dev = to_device(dpm_late_early_list.prev);
 
 		list_move(&dev->power.entry, &dpm_noirq_list);
 
-		dpm_clear_async_state(dev);
+		/*
+		 * Move all devices to the target list to resume them properly
+		 * on errors.
+		 */
+		if (error || async_error)
+			break;
+
 		if (dpm_async_fn(dev, async_suspend_noirq))
 			continue;
 
@@ -1431,9 +1456,6 @@
 		put_device(dev);
 
 		mutex_lock(&dpm_list_mtx);
-
-		if (error || async_error)
-			break;
 	}
 
 	mutex_unlock(&dpm_list_mtx);
@@ -1486,6 +1508,8 @@
 	spin_unlock_irq(&parent->power.lock);
 }
 
+static void async_suspend_late(void *data, async_cookie_t cookie);
+
 /**
  * device_suspend_late - Execute a "late suspend" callback for given device.
  * @dev: Device to handle.
@@ -1562,7 +1586,13 @@
 Complete:
 	TRACE_SUSPEND(error);
 	complete_all(&dev->power.completion);
-	return error;
+
+	if (error || async_error)
+		return error;
+
+	dpm_async_suspend_superior(dev, async_suspend_late);
+
+	return 0;
 }
 
 static void async_suspend_late(void *data, async_cookie_t cookie)
@@ -1580,6 +1610,7 @@
 int dpm_suspend_late(pm_message_t state)
 {
 	ktime_t starttime = ktime_get();
+	struct device *dev;
 	int error = 0;
 
 	trace_suspend_resume(TPS("dpm_suspend_late"), state.event, true);
@@ -1591,12 +1622,28 @@
 
 	mutex_lock(&dpm_list_mtx);
 
+	/*
+	 * Start processing "async" leaf devices upfront because they don't need
+	 * to wait.
+	 */
+	list_for_each_entry_reverse(dev, &dpm_suspended_list, power.entry) {
+		dpm_clear_async_state(dev);
+		if (dpm_leaf_device(dev))
+			dpm_async_with_cleanup(dev, async_suspend_late);
+	}
+
 	while (!list_empty(&dpm_suspended_list)) {
-		struct device *dev = to_device(dpm_suspended_list.prev);
+		dev = to_device(dpm_suspended_list.prev);
 
 		list_move(&dev->power.entry, &dpm_late_early_list);
 
-		dpm_clear_async_state(dev);
+		/*
+		 * Move all devices to the target list to resume them properly
+		 * on errors.
+		 */
+		if (error || async_error)
+			continue;
+
 		if (dpm_async_fn(dev, async_suspend_late))
 			continue;
 
@@ -1609,9 +1656,6 @@
 		put_device(dev);
 
 		mutex_lock(&dpm_list_mtx);
-
-		if (error || async_error)
-			break;
 	}
 
 	mutex_unlock(&dpm_list_mtx);




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] PM: sleep: Suspend parents and suppliers after suspending subordinates
  2025-03-13 20:34 ` [PATCH v2 2/3] PM: sleep: Suspend parents and suppliers after suspending subordinates Rafael J. Wysocki
@ 2025-03-13 21:16   ` Saravana Kannan
  2025-03-13 22:36     ` Rafael J. Wysocki
  0 siblings, 1 reply; 9+ messages in thread
From: Saravana Kannan @ 2025-03-13 21:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, LKML, Alan Stern, Ulf Hansson, Johan Hovold,
	Manivannan Sadhasivam, Jon Hunter

On Thu, Mar 13, 2025 at 1:35 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> In analogy with the previous change affecting the resume path,
> make device_suspend() start the async suspend of the device's parent
> and suppliers after the device itself has been processed and make
> dpm_suspend() start processing "async" leaf devices (that is, devices
> without children or consumers) upfront because they don't need to wait
> for any other devices.
>
> On the Dell XPS13 9360 in my office, this change reduces the total
> duration of device suspend by approximately 100 ms (over 20%).
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Suggested-by: Saravana Kannan <saravanak@google.com>
> ---
>
> v1 -> v2:
>    * Adjust for the changes in patch [1/3].
>    * Fix walking suppliers in dpm_async_suspend_superior().
>    * Use device links read locking in dpm_async_suspend_superior() (Saravana).
>    * Move all devices to the target list even if there are errors in
>      dpm_suspend() so they are properly resumed during rollback (Saravana).
>
> ---
>  drivers/base/power/main.c |   78 ++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 72 insertions(+), 6 deletions(-)
>
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -1231,6 +1231,50 @@
>
>  /*------------------------- Suspend routines -------------------------*/
>
> +static bool dpm_leaf_device(struct device *dev)
> +{
> +       struct device *child;
> +
> +       lockdep_assert_held(&dpm_list_mtx);
> +
> +       child = device_find_any_child(dev);
> +       if (child) {
> +               put_device(child);
> +
> +               return false;
> +       }
> +
> +       /*
> +        * Since this function is required to run under dpm_list_mtx, the
> +        * list_empty() below will only return true if the device's list of
> +        * consumers is actually empty before calling it.
> +        */
> +       return list_empty(&dev->links.consumers);
> +}

We need the equivalent of this for resume.

> +
> +static void dpm_async_suspend_superior(struct device *dev, async_func_t func)
> +{
> +       struct device_link *link;
> +       int idx;
> +
> +       mutex_lock(&dpm_list_mtx);
> +
> +       /* Start processing the device's parent if it is "async". */
> +       if (dev->parent)
> +               dpm_async_with_cleanup(dev->parent, func);
> +
> +       mutex_unlock(&dpm_list_mtx);
> +
> +       idx = device_links_read_lock();
> +
> +       /* Start processing the device's "async" suppliers. */
> +       list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
> +               if (READ_ONCE(link->status) != DL_STATE_DORMANT)
> +                       dpm_async_with_cleanup(link->supplier, func);

We should check that the rest of the consumers of the supplier are
"done" before we queue the supplier. With 386 device links (and the
number only increases as we add support for more properties), there's
no doubt that we'll hit this often.

> +
> +       device_links_read_unlock(idx);

Is passing idx to unlock a new (within the past 6 months) thing? I
don't remember having to do this in the past.

> +}
> +
>  /**
>   * resume_event - Return a "resume" message for given "suspend" sleep state.
>   * @sleep_state: PM message representing a sleep state.
> @@ -1656,6 +1700,8 @@
>         device_links_read_unlock(idx);
>  }
>
> +static void async_suspend(void *data, async_cookie_t cookie);
> +
>  /**
>   * device_suspend - Execute "suspend" callbacks for given device.
>   * @dev: Device to handle.
> @@ -1785,7 +1831,13 @@
>
>         complete_all(&dev->power.completion);
>         TRACE_SUSPEND(error);
> -       return error;
> +
> +       if (error || async_error)
> +               return error;
> +
> +       dpm_async_suspend_superior(dev, async_suspend);
> +
> +       return 0;
>  }
>
>  static void async_suspend(void *data, async_cookie_t cookie)
> @@ -1803,6 +1855,7 @@
>  int dpm_suspend(pm_message_t state)
>  {
>         ktime_t starttime = ktime_get();
> +       struct device *dev;
>         int error = 0;
>
>         trace_suspend_resume(TPS("dpm_suspend"), state.event, true);
> @@ -1816,12 +1869,28 @@
>
>         mutex_lock(&dpm_list_mtx);
>
> +       /*
> +        * Start processing "async" leaf devices upfront because they don't need
> +        * to wait.
> +        */
> +       list_for_each_entry_reverse(dev, &dpm_prepared_list, power.entry) {
> +               dpm_clear_async_state(dev);
> +               if (dpm_leaf_device(dev))
> +                       dpm_async_with_cleanup(dev, async_suspend);
> +       }
> +
>         while (!list_empty(&dpm_prepared_list)) {
> -               struct device *dev = to_device(dpm_prepared_list.prev);
> +               dev = to_device(dpm_prepared_list.prev);
>
>                 list_move(&dev->power.entry, &dpm_suspended_list);
>
> -               dpm_clear_async_state(dev);
> +               /*
> +                * Move all devices to the target list to resume them properly
> +                * on errors.
> +                */

I did this initially on my end, but we have so many devices that
looping through them had a measurable impact. It's better to just
splice the lists on error.

-Saravana

> +               if (error || async_error)
> +                       continue;
> +
>                 if (dpm_async_fn(dev, async_suspend))
>                         continue;
>
> @@ -1834,9 +1903,6 @@
>                 put_device(dev);
>
>                 mutex_lock(&dpm_list_mtx);
> -
> -               if (error || async_error)
> -                       break;
>         }
>
>         mutex_unlock(&dpm_list_mtx);
>
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/3] PM: sleep: Resume children after resuming the parent
  2025-03-13 20:26 ` [PATCH v2 1/3] PM: sleep: Resume children after resuming the parent Rafael J. Wysocki
@ 2025-03-13 21:16   ` Saravana Kannan
  2025-03-13 22:29     ` Rafael J. Wysocki
  0 siblings, 1 reply; 9+ messages in thread
From: Saravana Kannan @ 2025-03-13 21:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, LKML, Alan Stern, Ulf Hansson, Johan Hovold,
	Manivannan Sadhasivam, Jon Hunter

On Thu, Mar 13, 2025 at 1:35 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> According to [1], the handling of device suspend and resume, and
> particularly the latter, involves unnecessary overhead related to
> starting new async work items for devices that cannot make progress
> right away because they have to wait for other devices.
>
> To reduce this problem in the resume path, use the observation that
> starting the async resume of the children of a device after resuming
> the parent is likely to produce less scheduling and memory management
> noise than starting it upfront while at the same time it should not
> increase the resume duration substantially.
>
> Accordingly, modify the code to start the async resume of the device's
> children when the processing of the parent has been completed in each
> stage of device resume and only start async resume upfront for devices
> without parents.
>
> Also make it check if a given device can be resumed asynchronously
> before starting the synchronous resume of it in case it will have to
> wait for another that is already resuming asynchronously.
>
> In addition to making the async resume of devices more friendly to
> systems with relatively less computing resources, this change is also
> preliminary for analogous changes in the suspend path.
>
> On the systems where it has been tested, this change by itself does
> not affect the overall system resume duration in a significant way.
>
> Link: https://lore.kernel.org/linux-pm/20241114220921.2529905-1-saravanak@google.com/ [1]
> Suggested-by: Saravana Kannan <saravanak@google.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>
> v1 -> v2:
>    Use a separate lock for power.work_in_progress protection which should
>    reduce lock contention on dpm_list_mtx.
>
> ---
>  drivers/base/power/main.c |   80 +++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 61 insertions(+), 19 deletions(-)
>
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -63,6 +63,7 @@
>  static DEFINE_MUTEX(dpm_list_mtx);
>  static pm_message_t pm_transition;
>
> +static DEFINE_MUTEX(async_wip_mtx);

I think (not sure) this can be a spinlock.

>  static int async_error;
>
>  static const char *pm_verb(int event)
> @@ -597,8 +598,11 @@
>                 && !pm_trace_is_enabled();
>  }
>
> -static bool dpm_async_fn(struct device *dev, async_func_t func)
> +static bool __dpm_async(struct device *dev, async_func_t func)
>  {
> +       if (dev->power.work_in_progress)
> +               return true;
> +
>         if (!is_async(dev))
>                 return false;
>
> @@ -611,14 +615,37 @@
>
>         put_device(dev);
>
> +       return false;
> +}
> +
> +static bool dpm_async_fn(struct device *dev, async_func_t func)
> +{
> +       guard(mutex)(&async_wip_mtx);
> +
> +       return __dpm_async(dev, func);
> +}
> +
> +static int dpm_async_with_cleanup(struct device *dev, void *fn)
> +{
> +       guard(mutex)(&async_wip_mtx);
> +
> +       if (!__dpm_async(dev, fn))
> +               dev->power.work_in_progress = false;
> +
> +       return 0;
> +}
> +
> +static void dpm_async_resume_children(struct device *dev, async_func_t func)
> +{
>         /*
> -        * async_schedule_dev_nocall() above has returned false, so func() is
> -        * not running and it is safe to update power.work_in_progress without
> -        * extra synchronization.
> +        * Start processing "async" children of the device unless it's been
> +        * started already for them.
> +        *
> +        * This could have been done for the device's "async" consumers too, but
> +        * they either need to wait for their parents or the processing has
> +        * already started for them after their parents were processed.
>          */
> -       dev->power.work_in_progress = false;
> -
> -       return false;
> +       device_for_each_child(dev, func, dpm_async_with_cleanup);

Continuing my comments from v1 here, it's not a good assumption to
make that the child can start resuming just because the parent has
finished resuming. In my example, I have 386 device links and ~600
devices that have some sort of suspend ops (I think the total device
node count is ~1700).

I'm even more confused by why you think resume needs to be asymmetric
with suspend. In suspend, you kick off all the suppliers too when a
device is done suspending, but in resume you don't kick off all the
consumers.

>  }
>
>  static void dpm_clear_async_state(struct device *dev)
> @@ -627,6 +654,8 @@
>         dev->power.work_in_progress = false;
>  }
>
> +static void async_resume_noirq(void *data, async_cookie_t cookie);
> +
>  /**
>   * device_resume_noirq - Execute a "noirq resume" callback for given device.
>   * @dev: Device to handle.
> @@ -710,6 +739,8 @@
>                 dpm_save_failed_dev(dev_name(dev));
>                 pm_dev_err(dev, state, async ? " async noirq" : " noirq", error);
>         }
> +
> +       dpm_async_resume_children(dev, async_resume_noirq);
>  }
>
>  static void async_resume_noirq(void *data, async_cookie_t cookie)
> @@ -733,19 +764,20 @@
>         mutex_lock(&dpm_list_mtx);
>
>         /*
> -        * Trigger the resume of "async" devices upfront so they don't have to
> -        * wait for the "non-async" ones they don't depend on.
> +        * Start processing "async" devices without parents upfront so they
> +        * don't wait for the "sync" devices they don't depend on.
>          */
>         list_for_each_entry(dev, &dpm_noirq_list, power.entry) {
>                 dpm_clear_async_state(dev);
> -               dpm_async_fn(dev, async_resume_noirq);
> +               if (!dev->parent)

This check isn't sufficient. There are plenty of devices where they
have no parent, but they have many suppliers. That's the reality in
the DT world. And when you do deeper down the tree
(dpm_async_resume_children), the children typically have more
suppliers.

You can also check for "no suppliers" to find the true leaf nodes and
start with them, but that means you also have to kick off the
consumers when you finish your resume. We definitely need to check
device links for this patchset to be useful for me. With my patch
series, it's effectively just NCPU kworkers running continuously on
each CPU. Won't be the case if we don't check the device links. And as
I said before, the overhead isn't just about context switches, but
also forking more kworker threads.


-Saravana

> +                       dpm_async_with_cleanup(dev, async_resume_noirq);
>         }
>
>         while (!list_empty(&dpm_noirq_list)) {
>                 dev = to_device(dpm_noirq_list.next);
>                 list_move_tail(&dev->power.entry, &dpm_late_early_list);
>
> -               if (!dev->power.work_in_progress) {
> +               if (!dpm_async_fn(dev, async_resume_noirq)) {
>                         get_device(dev);
>
>                         mutex_unlock(&dpm_list_mtx);
> @@ -781,6 +813,8 @@
>         device_wakeup_disarm_wake_irqs();
>  }
>
> +static void async_resume_early(void *data, async_cookie_t cookie);
> +
>  /**
>   * device_resume_early - Execute an "early resume" callback for given device.
>   * @dev: Device to handle.
> @@ -848,6 +882,8 @@
>                 dpm_save_failed_dev(dev_name(dev));
>                 pm_dev_err(dev, state, async ? " async early" : " early", error);
>         }
> +
> +       dpm_async_resume_children(dev, async_resume_early);
>  }
>
>  static void async_resume_early(void *data, async_cookie_t cookie)
> @@ -875,19 +911,20 @@
>         mutex_lock(&dpm_list_mtx);
>
>         /*
> -        * Trigger the resume of "async" devices upfront so they don't have to
> -        * wait for the "non-async" ones they don't depend on.
> +        * Start processing "async" devices without parents upfront so they
> +        * don't wait for the "sync" devices they don't depend on.
>          */
>         list_for_each_entry(dev, &dpm_late_early_list, power.entry) {
>                 dpm_clear_async_state(dev);
> -               dpm_async_fn(dev, async_resume_early);
> +               if (!dev->parent)
> +                       dpm_async_with_cleanup(dev, async_resume_early);
>         }
>
>         while (!list_empty(&dpm_late_early_list)) {
>                 dev = to_device(dpm_late_early_list.next);
>                 list_move_tail(&dev->power.entry, &dpm_suspended_list);
>
> -               if (!dev->power.work_in_progress) {
> +               if (!dpm_async_fn(dev, async_resume_early)) {
>                         get_device(dev);
>
>                         mutex_unlock(&dpm_list_mtx);
> @@ -919,6 +956,8 @@
>  }
>  EXPORT_SYMBOL_GPL(dpm_resume_start);
>
> +static void async_resume(void *data, async_cookie_t cookie);
> +
>  /**
>   * device_resume - Execute "resume" callbacks for given device.
>   * @dev: Device to handle.
> @@ -1018,6 +1057,8 @@
>                 dpm_save_failed_dev(dev_name(dev));
>                 pm_dev_err(dev, state, async ? " async" : "", error);
>         }
> +
> +       dpm_async_resume_children(dev, async_resume);
>  }
>
>  static void async_resume(void *data, async_cookie_t cookie)
> @@ -1049,19 +1090,20 @@
>         mutex_lock(&dpm_list_mtx);
>
>         /*
> -        * Trigger the resume of "async" devices upfront so they don't have to
> -        * wait for the "non-async" ones they don't depend on.
> +        * Start processing "async" devices without parents upfront so they
> +        * don't wait for the "sync" devices they don't depend on.
>          */
>         list_for_each_entry(dev, &dpm_suspended_list, power.entry) {
>                 dpm_clear_async_state(dev);
> -               dpm_async_fn(dev, async_resume);
> +               if (!dev->parent)
> +                       dpm_async_with_cleanup(dev, async_resume);
>         }
>
>         while (!list_empty(&dpm_suspended_list)) {
>                 dev = to_device(dpm_suspended_list.next);
>                 list_move_tail(&dev->power.entry, &dpm_prepared_list);
>
> -               if (!dev->power.work_in_progress) {
> +               if (!dpm_async_fn(dev, async_resume)) {
>                         get_device(dev);
>
>                         mutex_unlock(&dpm_list_mtx);
>
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/3] PM: sleep: Resume children after resuming the parent
  2025-03-13 21:16   ` Saravana Kannan
@ 2025-03-13 22:29     ` Rafael J. Wysocki
  0 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-03-13 22:29 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Linux PM, LKML, Alan Stern, Ulf Hansson,
	Johan Hovold, Manivannan Sadhasivam, Jon Hunter

On Thu, Mar 13, 2025 at 10:17 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Thu, Mar 13, 2025 at 1:35 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > According to [1], the handling of device suspend and resume, and
> > particularly the latter, involves unnecessary overhead related to
> > starting new async work items for devices that cannot make progress
> > right away because they have to wait for other devices.
> >
> > To reduce this problem in the resume path, use the observation that
> > starting the async resume of the children of a device after resuming
> > the parent is likely to produce less scheduling and memory management
> > noise than starting it upfront while at the same time it should not
> > increase the resume duration substantially.
> >
> > Accordingly, modify the code to start the async resume of the device's
> > children when the processing of the parent has been completed in each
> > stage of device resume and only start async resume upfront for devices
> > without parents.
> >
> > Also make it check if a given device can be resumed asynchronously
> > before starting the synchronous resume of it in case it will have to
> > wait for another that is already resuming asynchronously.
> >
> > In addition to making the async resume of devices more friendly to
> > systems with relatively less computing resources, this change is also
> > preliminary for analogous changes in the suspend path.
> >
> > On the systems where it has been tested, this change by itself does
> > not affect the overall system resume duration in a significant way.
> >
> > Link: https://lore.kernel.org/linux-pm/20241114220921.2529905-1-saravanak@google.com/ [1]
> > Suggested-by: Saravana Kannan <saravanak@google.com>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > v1 -> v2:
> >    Use a separate lock for power.work_in_progress protection which should
> >    reduce lock contention on dpm_list_mtx.
> >
> > ---
> >  drivers/base/power/main.c |   80 +++++++++++++++++++++++++++++++++++-----------
> >  1 file changed, 61 insertions(+), 19 deletions(-)
> >
> > --- a/drivers/base/power/main.c
> > +++ b/drivers/base/power/main.c
> > @@ -63,6 +63,7 @@
> >  static DEFINE_MUTEX(dpm_list_mtx);
> >  static pm_message_t pm_transition;
> >
> > +static DEFINE_MUTEX(async_wip_mtx);
>
> I think (not sure) this can be a spinlock.

No, it can't, because async_schedule_dev_nocall() allocates memory
with GFP_KERNEL.

> >  static int async_error;
> >
> >  static const char *pm_verb(int event)
> > @@ -597,8 +598,11 @@
> >                 && !pm_trace_is_enabled();
> >  }
> >
> > -static bool dpm_async_fn(struct device *dev, async_func_t func)
> > +static bool __dpm_async(struct device *dev, async_func_t func)
> >  {
> > +       if (dev->power.work_in_progress)
> > +               return true;
> > +
> >         if (!is_async(dev))
> >                 return false;
> >
> > @@ -611,14 +615,37 @@
> >
> >         put_device(dev);
> >
> > +       return false;
> > +}
> > +
> > +static bool dpm_async_fn(struct device *dev, async_func_t func)
> > +{
> > +       guard(mutex)(&async_wip_mtx);
> > +
> > +       return __dpm_async(dev, func);
> > +}
> > +
> > +static int dpm_async_with_cleanup(struct device *dev, void *fn)
> > +{
> > +       guard(mutex)(&async_wip_mtx);
> > +
> > +       if (!__dpm_async(dev, fn))
> > +               dev->power.work_in_progress = false;
> > +
> > +       return 0;
> > +}
> > +
> > +static void dpm_async_resume_children(struct device *dev, async_func_t func)
> > +{
> >         /*
> > -        * async_schedule_dev_nocall() above has returned false, so func() is
> > -        * not running and it is safe to update power.work_in_progress without
> > -        * extra synchronization.
> > +        * Start processing "async" children of the device unless it's been
> > +        * started already for them.
> > +        *
> > +        * This could have been done for the device's "async" consumers too, but
> > +        * they either need to wait for their parents or the processing has
> > +        * already started for them after their parents were processed.
> >          */
> > -       dev->power.work_in_progress = false;
> > -
> > -       return false;
> > +       device_for_each_child(dev, func, dpm_async_with_cleanup);
>
> Continuing my comments from v1 here, it's not a good assumption to
> make that the child can start resuming just because the parent has
> finished resuming.

Well, there is one dependency less for it, so it makes sense to start
async processing for it because it may be ready.  It doesn't have to
be ready for sure, but if it's not ready, it'll wait.

I know that you want to avoid the waiting at all, but I'm not
convinced that this is necessary and even that this is the right
approach and I'm afraid the only way to convince me would be to
demonstrate it.

Explicit dependency tracking here would be extra overhead and
complexity and I'm not going to do it until I know that it is actually
worth the hassle.

> In my example, I have 386 device links and ~600
> devices that have some sort of suspend ops (I think the total device
> node count is ~1700).

However, some of those device links are between parents and children
IIRC.  If so, then how many?

> I'm even more confused by why you think resume needs to be asymmetric
> with suspend. In suspend, you kick off all the suppliers too when a
> device is done suspending, but in resume you don't kick off all the
> consumers.

This is based on the assumption that the majority of devices will have
a parent and since the diameter of the graph is less than the number
of devices, there are devices with multiple children whereas each
device can only have one parent.

Quite likely there are also devices with multiple consumers, but how
many devices with multiple suppliers are there?

> >  }
> >
> >  static void dpm_clear_async_state(struct device *dev)
> > @@ -627,6 +654,8 @@
> >         dev->power.work_in_progress = false;
> >  }
> >
> > +static void async_resume_noirq(void *data, async_cookie_t cookie);
> > +
> >  /**
> >   * device_resume_noirq - Execute a "noirq resume" callback for given device.
> >   * @dev: Device to handle.
> > @@ -710,6 +739,8 @@
> >                 dpm_save_failed_dev(dev_name(dev));
> >                 pm_dev_err(dev, state, async ? " async noirq" : " noirq", error);
> >         }
> > +
> > +       dpm_async_resume_children(dev, async_resume_noirq);
> >  }
> >
> >  static void async_resume_noirq(void *data, async_cookie_t cookie)
> > @@ -733,19 +764,20 @@
> >         mutex_lock(&dpm_list_mtx);
> >
> >         /*
> > -        * Trigger the resume of "async" devices upfront so they don't have to
> > -        * wait for the "non-async" ones they don't depend on.
> > +        * Start processing "async" devices without parents upfront so they
> > +        * don't wait for the "sync" devices they don't depend on.
> >          */
> >         list_for_each_entry(dev, &dpm_noirq_list, power.entry) {
> >                 dpm_clear_async_state(dev);
> > -               dpm_async_fn(dev, async_resume_noirq);
> > +               if (!dev->parent)
>
> This check isn't sufficient. There are plenty of devices where they
> have no parent, but they have many suppliers.

Again, how many?

> That's the reality in the DT world. And when you do deeper down the tree
> (dpm_async_resume_children), the children typically have more
> suppliers.

The numbers you gave above don't seem to support this, though, and
even so, by the time the parent has resumed, it is quite likely that
the suppliers have resumed either.

> You can also check for "no suppliers" to find the true leaf nodes and
> start with them, but that means you also have to kick off the
> consumers when you finish your resume.

That can be done, but how much of a difference will it really make?
You are saying "a lot", but this is not really quantitative.

> We definitely need to check
> device links for this patchset to be useful for me. With my patch
> series, it's effectively just NCPU kworkers running continuously on
> each CPU. Won't be the case if we don't check the device links. And as
> I said before, the overhead isn't just about context switches, but
> also forking more kworker threads.

What about comparing this series with your series on the system in
question and sharing the numbers?  You could easily get a stake in the
ground this way.

Cheers, Rafael

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] PM: sleep: Suspend parents and suppliers after suspending subordinates
  2025-03-13 21:16   ` Saravana Kannan
@ 2025-03-13 22:36     ` Rafael J. Wysocki
  2025-03-14 10:23       ` Rafael J. Wysocki
  0 siblings, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-03-13 22:36 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Linux PM, LKML, Alan Stern, Ulf Hansson,
	Johan Hovold, Manivannan Sadhasivam, Jon Hunter

On Thu, Mar 13, 2025 at 10:17 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Thu, Mar 13, 2025 at 1:35 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > In analogy with the previous change affecting the resume path,
> > make device_suspend() start the async suspend of the device's parent
> > and suppliers after the device itself has been processed and make
> > dpm_suspend() start processing "async" leaf devices (that is, devices
> > without children or consumers) upfront because they don't need to wait
> > for any other devices.
> >
> > On the Dell XPS13 9360 in my office, this change reduces the total
> > duration of device suspend by approximately 100 ms (over 20%).
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Suggested-by: Saravana Kannan <saravanak@google.com>
> > ---
> >
> > v1 -> v2:
> >    * Adjust for the changes in patch [1/3].
> >    * Fix walking suppliers in dpm_async_suspend_superior().
> >    * Use device links read locking in dpm_async_suspend_superior() (Saravana).
> >    * Move all devices to the target list even if there are errors in
> >      dpm_suspend() so they are properly resumed during rollback (Saravana).
> >
> > ---
> >  drivers/base/power/main.c |   78 ++++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 72 insertions(+), 6 deletions(-)
> >
> > --- a/drivers/base/power/main.c
> > +++ b/drivers/base/power/main.c
> > @@ -1231,6 +1231,50 @@
> >
> >  /*------------------------- Suspend routines -------------------------*/
> >
> > +static bool dpm_leaf_device(struct device *dev)
> > +{
> > +       struct device *child;
> > +
> > +       lockdep_assert_held(&dpm_list_mtx);
> > +
> > +       child = device_find_any_child(dev);
> > +       if (child) {
> > +               put_device(child);
> > +
> > +               return false;
> > +       }
> > +
> > +       /*
> > +        * Since this function is required to run under dpm_list_mtx, the
> > +        * list_empty() below will only return true if the device's list of
> > +        * consumers is actually empty before calling it.
> > +        */
> > +       return list_empty(&dev->links.consumers);
> > +}
>
> We need the equivalent of this for resume.

Maybe.

> > +
> > +static void dpm_async_suspend_superior(struct device *dev, async_func_t func)
> > +{
> > +       struct device_link *link;
> > +       int idx;
> > +
> > +       mutex_lock(&dpm_list_mtx);
> > +
> > +       /* Start processing the device's parent if it is "async". */
> > +       if (dev->parent)
> > +               dpm_async_with_cleanup(dev->parent, func);
> > +
> > +       mutex_unlock(&dpm_list_mtx);
> > +
> > +       idx = device_links_read_lock();
> > +
> > +       /* Start processing the device's "async" suppliers. */
> > +       list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
> > +               if (READ_ONCE(link->status) != DL_STATE_DORMANT)
> > +                       dpm_async_with_cleanup(link->supplier, func);
>
> We should check that the rest of the consumers of the supplier are
> "done" before we queue the supplier. With 386 device links (and the
> number only increases as we add support for more properties), there's
> no doubt that we'll hit this often.

And I'm not doing this until I see any data confirming that it makes a
difference in the order of 10% or more.

> > +
> > +       device_links_read_unlock(idx);
>
> Is passing idx to unlock a new (within the past 6 months) thing? I
> don't remember having to do this in the past.

It's SRCU and it's been there forever.

> > +}
> > +
> >  /**
> >   * resume_event - Return a "resume" message for given "suspend" sleep state.
> >   * @sleep_state: PM message representing a sleep state.
> > @@ -1656,6 +1700,8 @@
> >         device_links_read_unlock(idx);
> >  }
> >
> > +static void async_suspend(void *data, async_cookie_t cookie);
> > +
> >  /**
> >   * device_suspend - Execute "suspend" callbacks for given device.
> >   * @dev: Device to handle.
> > @@ -1785,7 +1831,13 @@
> >
> >         complete_all(&dev->power.completion);
> >         TRACE_SUSPEND(error);
> > -       return error;
> > +
> > +       if (error || async_error)
> > +               return error;
> > +
> > +       dpm_async_suspend_superior(dev, async_suspend);
> > +
> > +       return 0;
> >  }
> >
> >  static void async_suspend(void *data, async_cookie_t cookie)
> > @@ -1803,6 +1855,7 @@
> >  int dpm_suspend(pm_message_t state)
> >  {
> >         ktime_t starttime = ktime_get();
> > +       struct device *dev;
> >         int error = 0;
> >
> >         trace_suspend_resume(TPS("dpm_suspend"), state.event, true);
> > @@ -1816,12 +1869,28 @@
> >
> >         mutex_lock(&dpm_list_mtx);
> >
> > +       /*
> > +        * Start processing "async" leaf devices upfront because they don't need
> > +        * to wait.
> > +        */
> > +       list_for_each_entry_reverse(dev, &dpm_prepared_list, power.entry) {
> > +               dpm_clear_async_state(dev);
> > +               if (dpm_leaf_device(dev))
> > +                       dpm_async_with_cleanup(dev, async_suspend);
> > +       }
> > +
> >         while (!list_empty(&dpm_prepared_list)) {
> > -               struct device *dev = to_device(dpm_prepared_list.prev);
> > +               dev = to_device(dpm_prepared_list.prev);
> >
> >                 list_move(&dev->power.entry, &dpm_suspended_list);
> >
> > -               dpm_clear_async_state(dev);
> > +               /*
> > +                * Move all devices to the target list to resume them properly
> > +                * on errors.
> > +                */
>
> I did this initially on my end, but we have so many devices that
> looping through them had a measurable impact.

Which I guess is super-important for error handling.  Come on.

> It's better to just splice the lists on error.

On top of this change, yes.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] PM: sleep: Suspend parents and suppliers after suspending subordinates
  2025-03-13 22:36     ` Rafael J. Wysocki
@ 2025-03-14 10:23       ` Rafael J. Wysocki
  0 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-03-14 10:23 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Linux PM, LKML, Alan Stern, Ulf Hansson,
	Johan Hovold, Manivannan Sadhasivam, Jon Hunter

On Thu, Mar 13, 2025 at 11:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Mar 13, 2025 at 10:17 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Thu, Mar 13, 2025 at 1:35 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > >
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > In analogy with the previous change affecting the resume path,
> > > make device_suspend() start the async suspend of the device's parent
> > > and suppliers after the device itself has been processed and make
> > > dpm_suspend() start processing "async" leaf devices (that is, devices
> > > without children or consumers) upfront because they don't need to wait
> > > for any other devices.
> > >
> > > On the Dell XPS13 9360 in my office, this change reduces the total
> > > duration of device suspend by approximately 100 ms (over 20%).
> > >
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > Suggested-by: Saravana Kannan <saravanak@google.com>
> > > ---
> > >
> > > v1 -> v2:
> > >    * Adjust for the changes in patch [1/3].
> > >    * Fix walking suppliers in dpm_async_suspend_superior().
> > >    * Use device links read locking in dpm_async_suspend_superior() (Saravana).
> > >    * Move all devices to the target list even if there are errors in
> > >      dpm_suspend() so they are properly resumed during rollback (Saravana).
> > >
> > > ---
> > >  drivers/base/power/main.c |   78 ++++++++++++++++++++++++++++++++++++++++++----
> > >  1 file changed, 72 insertions(+), 6 deletions(-)
> > >
> > > --- a/drivers/base/power/main.c
> > > +++ b/drivers/base/power/main.c
> > > @@ -1231,6 +1231,50 @@
> > >
> > >  /*------------------------- Suspend routines -------------------------*/
> > >
> > > +static bool dpm_leaf_device(struct device *dev)
> > > +{
> > > +       struct device *child;
> > > +
> > > +       lockdep_assert_held(&dpm_list_mtx);
> > > +
> > > +       child = device_find_any_child(dev);
> > > +       if (child) {
> > > +               put_device(child);
> > > +
> > > +               return false;
> > > +       }
> > > +
> > > +       /*
> > > +        * Since this function is required to run under dpm_list_mtx, the
> > > +        * list_empty() below will only return true if the device's list of
> > > +        * consumers is actually empty before calling it.
> > > +        */
> > > +       return list_empty(&dev->links.consumers);
> > > +}
> >
> > We need the equivalent of this for resume.
>
> Maybe.
>
> > > +
> > > +static void dpm_async_suspend_superior(struct device *dev, async_func_t func)
> > > +{
> > > +       struct device_link *link;
> > > +       int idx;
> > > +
> > > +       mutex_lock(&dpm_list_mtx);
> > > +
> > > +       /* Start processing the device's parent if it is "async". */
> > > +       if (dev->parent)
> > > +               dpm_async_with_cleanup(dev->parent, func);
> > > +
> > > +       mutex_unlock(&dpm_list_mtx);
> > > +
> > > +       idx = device_links_read_lock();
> > > +
> > > +       /* Start processing the device's "async" suppliers. */
> > > +       list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
> > > +               if (READ_ONCE(link->status) != DL_STATE_DORMANT)
> > > +                       dpm_async_with_cleanup(link->supplier, func);
> >
> > We should check that the rest of the consumers of the supplier are
> > "done" before we queue the supplier. With 386 device links (and the
> > number only increases as we add support for more properties), there's
> > no doubt that we'll hit this often.
>
> And I'm not doing this until I see any data confirming that it makes a
> difference in the order of 10% or more.
>
> > > +
> > > +       device_links_read_unlock(idx);
> >
> > Is passing idx to unlock a new (within the past 6 months) thing? I
> > don't remember having to do this in the past.
>
> It's SRCU and it's been there forever.
>
> > > +}
> > > +
> > >  /**
> > >   * resume_event - Return a "resume" message for given "suspend" sleep state.
> > >   * @sleep_state: PM message representing a sleep state.
> > > @@ -1656,6 +1700,8 @@
> > >         device_links_read_unlock(idx);
> > >  }
> > >
> > > +static void async_suspend(void *data, async_cookie_t cookie);
> > > +
> > >  /**
> > >   * device_suspend - Execute "suspend" callbacks for given device.
> > >   * @dev: Device to handle.
> > > @@ -1785,7 +1831,13 @@
> > >
> > >         complete_all(&dev->power.completion);
> > >         TRACE_SUSPEND(error);
> > > -       return error;
> > > +
> > > +       if (error || async_error)
> > > +               return error;
> > > +
> > > +       dpm_async_suspend_superior(dev, async_suspend);
> > > +
> > > +       return 0;
> > >  }
> > >
> > >  static void async_suspend(void *data, async_cookie_t cookie)
> > > @@ -1803,6 +1855,7 @@
> > >  int dpm_suspend(pm_message_t state)
> > >  {
> > >         ktime_t starttime = ktime_get();
> > > +       struct device *dev;
> > >         int error = 0;
> > >
> > >         trace_suspend_resume(TPS("dpm_suspend"), state.event, true);
> > > @@ -1816,12 +1869,28 @@
> > >
> > >         mutex_lock(&dpm_list_mtx);
> > >
> > > +       /*
> > > +        * Start processing "async" leaf devices upfront because they don't need
> > > +        * to wait.
> > > +        */
> > > +       list_for_each_entry_reverse(dev, &dpm_prepared_list, power.entry) {
> > > +               dpm_clear_async_state(dev);
> > > +               if (dpm_leaf_device(dev))
> > > +                       dpm_async_with_cleanup(dev, async_suspend);
> > > +       }
> > > +
> > >         while (!list_empty(&dpm_prepared_list)) {
> > > -               struct device *dev = to_device(dpm_prepared_list.prev);
> > > +               dev = to_device(dpm_prepared_list.prev);
> > >
> > >                 list_move(&dev->power.entry, &dpm_suspended_list);
> > >
> > > -               dpm_clear_async_state(dev);
> > > +               /*
> > > +                * Move all devices to the target list to resume them properly
> > > +                * on errors.
> > > +                */
> >
> > I did this initially on my end, but we have so many devices that
> > looping through them had a measurable impact.
>
> Which I guess is super-important for error handling.  Come on.
>
> > It's better to just splice the lists on error.
>
> On top of this change, yes.

Actually, adding a list_splice() after the error check would be a
simpler change than moving the check, so I'm doing this in v3.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-03-14 10:23 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-13 20:17 [PATCH v2 0/3] PM: sleep: Improvements of async suspend and resume of devices Rafael J. Wysocki
2025-03-13 20:26 ` [PATCH v2 1/3] PM: sleep: Resume children after resuming the parent Rafael J. Wysocki
2025-03-13 21:16   ` Saravana Kannan
2025-03-13 22:29     ` Rafael J. Wysocki
2025-03-13 20:34 ` [PATCH v2 2/3] PM: sleep: Suspend parents and suppliers after suspending subordinates Rafael J. Wysocki
2025-03-13 21:16   ` Saravana Kannan
2025-03-13 22:36     ` Rafael J. Wysocki
2025-03-14 10:23       ` Rafael J. Wysocki
2025-03-13 20:35 ` [PATCH v2 3/3] PM: sleep: Make suspend of devices more asynchronous Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox