public inbox for linux-pm@vger.kernel.org
* [PATCH v3 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation
@ 2026-04-06 16:04 Rafael J. Wysocki
  2026-04-06 16:07 ` [PATCH v3 1/6] thermal: core: Fix thermal zone governor cleanup issues Rafael J. Wysocki
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2026-04-06 16:04 UTC
  To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf

Hi All,

This is an update of

https://lore.kernel.org/linux-pm/6277980.lOV4Wx5bFT@rafael.j.wysocki/

that adds two fixes to it and updates the last two patches of the
original series.

This series is intended for 7.1 (it applies on top of linux-next).

It fixes the thermal zone removal and registration rollback path by
addressing possible race conditions and a memory leak in that
code (patches [1-2/6]), removes a redundant check (patch [3/6]), changes
the thermal workqueue to an unbound and non-freezable one (patch [4/6]),
changes the allocation of thermal_class to static (patch [5/6]), and
relocates the suspend and resume of thermal zones closer to the suspend
and resume of devices, respectively (patch [6/6]).

Thanks!





* [PATCH v3 1/6] thermal: core: Fix thermal zone governor cleanup issues
  2026-04-06 16:04 [PATCH v3 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
@ 2026-04-06 16:07 ` Rafael J. Wysocki
  2026-04-06 16:09 ` [PATCH v3 2/6] thermal: core: Free thermal zone ID later during removal Rafael J. Wysocki
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2026-04-06 16:07 UTC
  To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If thermal_zone_device_register_with_trips() fails after adding
a thermal governor to the thermal zone being registered, the
governor is not removed from it as appropriate which may lead to
a memory leak, so address this by adding the governor cleanup to
the rollback path.

In turn, thermal_zone_device_unregister() calls thermal_set_governor()
without acquiring the thermal zone lock beforehand which may race with
a governor update via sysfs and may lead to a use-after-free in that
case, so address it by placing the cleanup thermal_set_governor()
call after the wait_for_completion() one, which reflects the
registration error path ordering.

Fixes: e33df1d2f3a0 ("thermal: let governors have private data for each thermal zone")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v2 -> v3: New patch
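
Note (illustration only, not part of the patch): a minimal user-space
sketch of the goto-based unwinding that the registration rollback path
relies on.  The names fake_zone, fake_register() and the fail_at
parameter are hypothetical stand-ins, not kernel APIs.  It is meant to
show why a failure that happens after the governor has been attached
needs to jump to a label that detaches it, rather than to a label
further down the ladder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a thermal zone; not a kernel structure. */
struct fake_zone {
	bool id_allocated;
	bool governor_attached;
	bool groups_created;
};

/*
 * Model of a registration routine with three setup steps.  fail_at picks
 * the step that fails (0 means no failure).  Each label undoes exactly the
 * steps completed before the jump, in reverse order of setup.
 */
static int fake_register(struct fake_zone *z, int fail_at)
{
	assert(z != NULL);

	if (fail_at == 1)
		return -1;
	z->id_allocated = true;

	if (fail_at == 2)
		goto remove_id;
	z->governor_attached = true;

	/* The governor is attached now, so "goto remove_id" would leak it. */
	if (fail_at == 3)
		goto remove_governor;
	z->groups_created = true;

	return 0;

remove_governor:
	z->governor_attached = false;
remove_id:
	z->id_allocated = false;
	return -1;
}
```

Calling fake_register() with fail_at set to 3 leaves neither the
governor nor the ID behind, which mirrors the intent of the new
remove_governor label in the diff below.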

---
 drivers/thermal/thermal_core.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1618,7 +1618,7 @@ thermal_zone_device_register_with_trips(
 	/* Add nodes that are always present via .groups */
 	result = thermal_zone_create_device_groups(tz);
 	if (result)
-		goto remove_id;
+		goto remove_governor;
 
 	result = device_register(&tz->device);
 	if (result)
@@ -1649,6 +1649,8 @@ unregister:
 release_device:
 	put_device(&tz->device);
 	wait_for_completion(&tz->removal);
+remove_governor:
+	thermal_set_governor(tz, NULL);
 remove_id:
 	ida_free(&thermal_tz_ida, id);
 free_tzp:
@@ -1731,8 +1733,6 @@ void thermal_zone_device_unregister(stru
 
 	cancel_delayed_work_sync(&tz->poll_queue);
 
-	thermal_set_governor(tz, NULL);
-
 	thermal_thresholds_exit(tz);
 	thermal_remove_hwmon_sysfs(tz);
 	ida_free(&thermal_tz_ida, tz->id);
@@ -1744,6 +1744,9 @@ void thermal_zone_device_unregister(stru
 	thermal_notify_tz_delete(tz);
 
 	wait_for_completion(&tz->removal);
+
+	thermal_set_governor(tz, NULL);
+
 	kfree(tz->tzp);
 	kfree(tz);
 }





* [PATCH v3 2/6] thermal: core: Free thermal zone ID later during removal
  2026-04-06 16:04 [PATCH v3 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
  2026-04-06 16:07 ` [PATCH v3 1/6] thermal: core: Fix thermal zone governor cleanup issues Rafael J. Wysocki
@ 2026-04-06 16:09 ` Rafael J. Wysocki
  2026-04-06 16:10 ` [PATCH v3 3/6] thermal: core: Drop redundant check from thermal_zone_device_update() Rafael J. Wysocki
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2026-04-06 16:09 UTC
  To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The thermal zone removal ordering is different from the thermal zone
registration rollback path ordering and the former is arguably
problematic because freeing a thermal zone ID prematurely may cause
it to be used during the registration of another thermal zone which
may fail as a result.

Prevent that from occurring by changing the thermal zone removal
ordering to reflect the thermal zone registration rollback path
ordering.

Fixes: b31ef8285b19 ("thermal core: convert ID allocation to IDA")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v2 -> v3: New patch
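
Note (illustration only, not part of the patch): a small user-space
model of why releasing an ID before the object using it is fully gone
can make a concurrent registration fail.  Everything here is
hypothetical: the lowest-free-ID allocator only loosely imitates an
IDA, and the "name still visible" collision is an assumed failure mode
chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 8

/* Hypothetical lowest-free-ID allocator; not the kernel IDA. */
static bool id_used[MAX_IDS];
/* Whether an object named after a given ID is still visible. */
static bool name_visible[MAX_IDS];

static int id_alloc(void)
{
	for (int i = 0; i < MAX_IDS; i++) {
		if (!id_used[i]) {
			id_used[i] = true;
			return i;
		}
	}
	return -1;
}

static void id_free(int id)
{
	assert(id >= 0 && id < MAX_IDS);
	id_used[id] = false;
}

/* Registration fails if an object with the same ID is still visible. */
static int zone_register(void)
{
	int id = id_alloc();

	if (id < 0)
		return -1;
	if (name_visible[id]) {
		id_free(id);
		return -1;
	}
	name_visible[id] = true;
	return id;
}
```

In this model, calling id_free() while name_visible[] is still set for
that ID makes the next zone_register() fail, whereas hiding the name
first and freeing the ID last does not.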

---
 drivers/thermal/thermal_core.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1735,8 +1735,6 @@ void thermal_zone_device_unregister(stru
 
 	thermal_thresholds_exit(tz);
 	thermal_remove_hwmon_sysfs(tz);
-	ida_free(&thermal_tz_ida, tz->id);
-	ida_destroy(&tz->ida);
 
 	device_del(&tz->device);
 	put_device(&tz->device);
@@ -1747,6 +1745,9 @@ void thermal_zone_device_unregister(stru
 
 	thermal_set_governor(tz, NULL);
 
+	ida_free(&thermal_tz_ida, tz->id);
+	ida_destroy(&tz->ida);
+
 	kfree(tz->tzp);
 	kfree(tz);
 }





* [PATCH v3 3/6] thermal: core: Drop redundant check from thermal_zone_device_update()
  2026-04-06 16:04 [PATCH v3 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
  2026-04-06 16:07 ` [PATCH v3 1/6] thermal: core: Fix thermal zone governor cleanup issues Rafael J. Wysocki
  2026-04-06 16:09 ` [PATCH v3 2/6] thermal: core: Free thermal zone ID later during removal Rafael J. Wysocki
@ 2026-04-06 16:10 ` Rafael J. Wysocki
  2026-04-06 16:11 ` [PATCH v3 4/6] thermal: core: Change thermal_wq to be unbound and not freezable Rafael J. Wysocki
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2026-04-06 16:10 UTC
  To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Since __thermal_zone_device_update() checks if tz->state is
TZ_STATE_READY and bails out immediately otherwise, it is not
necessary to check the thermal_zone_is_present() return value in
thermal_zone_device_update().  Namely, tz->state is equal to
TZ_STATE_FLAG_INIT initially and that flag is only cleared in
thermal_zone_init_complete() after adding tz to the list of thermal
zones, and thermal_zone_exit() sets TZ_STATE_FLAG_EXIT in tz->state
while removing tz from that list.  Thus tz->state is not TZ_STATE_READY
when tz is not in the list and the check mentioned above is redundant.

Accordingly, drop the redundant thermal_zone_is_present() check from
thermal_zone_device_update() and drop the former altogether because it
has no more users.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v1 -> v3: No changes
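
Note (illustration only, not part of the patch): a user-space sketch of
the invariant described in the changelog above.  The flag names come
from the changelog, but the bit values, the fake_tz structure and the
fake_tz_*() helpers are assumptions made for the sketch.  It shows that
the state is "ready" only while the zone is on the list, so a separate
list-membership check adds nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag names follow the changelog; the bit values are assumed. */
#define TZ_STATE_FLAG_INIT	(1u << 0)
#define TZ_STATE_FLAG_EXIT	(1u << 1)
#define TZ_STATE_READY		0u

/* Hypothetical stand-in for a thermal zone. */
struct fake_tz {
	unsigned int state;
	bool on_list;
	int updates;
};

static void fake_tz_alloc(struct fake_tz *tz)
{
	assert(tz != NULL);
	tz->state = TZ_STATE_FLAG_INIT;
	tz->on_list = false;
	tz->updates = 0;
}

/* The INIT flag is cleared only after the zone has been added to the list. */
static void fake_tz_init_complete(struct fake_tz *tz)
{
	tz->on_list = true;
	tz->state &= ~TZ_STATE_FLAG_INIT;
}

/* The EXIT flag is set while the zone is being removed from the list. */
static void fake_tz_exit(struct fake_tz *tz)
{
	tz->state |= TZ_STATE_FLAG_EXIT;
	tz->on_list = false;
}

/* Bails out unless the state is READY; no list check is needed. */
static void fake_tz_update(struct fake_tz *tz)
{
	if (tz->state != TZ_STATE_READY)
		return;
	tz->updates++;
}
```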

---
 drivers/thermal/thermal_core.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -702,18 +702,12 @@ int thermal_zone_device_disable(struct t
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_disable);
 
-static bool thermal_zone_is_present(struct thermal_zone_device *tz)
-{
-	return !list_empty(&tz->node);
-}
-
 void thermal_zone_device_update(struct thermal_zone_device *tz,
 				enum thermal_notify_event event)
 {
 	guard(thermal_zone)(tz);
 
-	if (thermal_zone_is_present(tz))
-		__thermal_zone_device_update(tz, event);
+	__thermal_zone_device_update(tz, event);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
 





* [PATCH v3 4/6] thermal: core: Change thermal_wq to be unbound and not freezable
  2026-04-06 16:04 [PATCH v3 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2026-04-06 16:10 ` [PATCH v3 3/6] thermal: core: Drop redundant check from thermal_zone_device_update() Rafael J. Wysocki
@ 2026-04-06 16:11 ` Rafael J. Wysocki
  2026-04-06 16:14 ` [PATCH v3 5/6] thermal: core: Allocate thermal_class statically Rafael J. Wysocki
  2026-04-06 16:16 ` [PATCH v3 6/6] thermal: core: Suspend thermal zones later and resume them earlier Rafael J. Wysocki
  5 siblings, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2026-04-06 16:11 UTC
  To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The thermal workqueue doesn't need to be freezable or per-CPU, so drop
WQ_FREEZABLE and WQ_PERCPU from the flags when allocating it.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v1 -> v3: No changes

---
 drivers/thermal/thermal_core.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1919,8 +1919,7 @@ static int __init thermal_init(void)
 	if (result)
 		goto error;
 
-	thermal_wq = alloc_workqueue("thermal_events",
-				      WQ_FREEZABLE | WQ_POWER_EFFICIENT | WQ_PERCPU, 0);
+	thermal_wq = alloc_workqueue("thermal_events", WQ_POWER_EFFICIENT, 0);
 	if (!thermal_wq) {
 		result = -ENOMEM;
 		goto unregister_netlink;





* [PATCH v3 5/6] thermal: core: Allocate thermal_class statically
  2026-04-06 16:04 [PATCH v3 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2026-04-06 16:11 ` [PATCH v3 4/6] thermal: core: Change thermal_wq to be unbound and not freezable Rafael J. Wysocki
@ 2026-04-06 16:14 ` Rafael J. Wysocki
  2026-04-06 16:16 ` [PATCH v3 6/6] thermal: core: Suspend thermal zones later and resume them earlier Rafael J. Wysocki
  5 siblings, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2026-04-06 16:14 UTC
  To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Define thermal_class as a static structure to simplify thermal_init()
and to simplify thermal class availability checks that will need to
be carried out during the suspend and resume of thermal zones after
subsequent changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v2 -> v3:
   * Use static variable thermal_class_unavailable (instead of a function)
     for checking if thermal_class is available.

v1 -> v2:
   * Reorder with respect to the next patch to allow the latter to be simpler
   * Add thermal_class_unavailable() (the next patch uses it too)

---
 drivers/thermal/thermal_core.c |   30 ++++++++++++------------------
 1 file changed, 12 insertions(+), 18 deletions(-)

--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -977,7 +977,11 @@ static void thermal_release(struct devic
 	}
 }
 
-static struct class *thermal_class;
+static const struct class thermal_class = {
+	.name = "thermal",
+	.dev_release = thermal_release,
+};
+static bool thermal_class_unavailable __ro_after_init = true;
 
 static inline
 void print_bind_err_msg(struct thermal_zone_device *tz,
@@ -1070,7 +1074,7 @@ __thermal_cooling_device_register(struct
 	    !ops->set_cur_state)
 		return ERR_PTR(-EINVAL);
 
-	if (!thermal_class)
+	if (thermal_class_unavailable)
 		return ERR_PTR(-ENODEV);
 
 	cdev = kzalloc_obj(*cdev);
@@ -1093,7 +1097,7 @@ __thermal_cooling_device_register(struct
 	cdev->np = np;
 	cdev->ops = ops;
 	cdev->updated = false;
-	cdev->device.class = thermal_class;
+	cdev->device.class = &thermal_class;
 	cdev->devdata = devdata;
 
 	ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
@@ -1541,7 +1545,7 @@ thermal_zone_device_register_with_trips(
 	if (polling_delay && passive_delay > polling_delay)
 		return ERR_PTR(-EINVAL);
 
-	if (!thermal_class)
+	if (thermal_class_unavailable)
 		return ERR_PTR(-ENODEV);
 
 	tz = kzalloc_flex(*tz, trips, num_trips);
@@ -1577,7 +1581,7 @@ thermal_zone_device_register_with_trips(
 	if (!tz->ops.critical)
 		tz->ops.critical = thermal_zone_device_critical;
 
-	tz->device.class = thermal_class;
+	tz->device.class = &thermal_class;
 	tz->devdata = devdata;
 	tz->num_trips = num_trips;
 	for_each_trip_desc(tz, td) {
@@ -1929,21 +1933,11 @@ static int __init thermal_init(void)
 	if (result)
 		goto destroy_workqueue;
 
-	thermal_class = kzalloc_obj(*thermal_class);
-	if (!thermal_class) {
-		result = -ENOMEM;
+	result = class_register(&thermal_class);
+	if (result)
 		goto unregister_governors;
-	}
 
-	thermal_class->name = "thermal";
-	thermal_class->dev_release = thermal_release;
-
-	result = class_register(thermal_class);
-	if (result) {
-		kfree(thermal_class);
-		thermal_class = NULL;
-		goto unregister_governors;
-	}
+	thermal_class_unavailable = false;
 
 	result = register_pm_notifier(&thermal_pm_nb);
 	if (result)





* [PATCH v3 6/6] thermal: core: Suspend thermal zones later and resume them earlier
  2026-04-06 16:04 [PATCH v3 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2026-04-06 16:14 ` [PATCH v3 5/6] thermal: core: Allocate thermal_class statically Rafael J. Wysocki
@ 2026-04-06 16:16 ` Rafael J. Wysocki
  2026-04-06 21:40   ` Armin Wolf
  5 siblings, 1 reply; 8+ messages in thread
From: Rafael J. Wysocki @ 2026-04-06 16:16 UTC
  To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

To avoid some undesirable interactions between thermal zone suspend
and resume with user space that is running when those operations are
carried out, move them closer to the suspend and resume of devices,
respectively, by updating dpm_prepare() to carry out thermal zone
suspend and dpm_complete() to start thermal zone resume (that will
continue asynchronously).

This also makes the code easier to follow by removing one, arguably
redundant, level of indirection represented by the thermal PM notifier.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v2 -> v3:
   * Rebase on top of the v3 of the previous patch

v1 -> v2:
   * Reorder with respect to the previous patch
   * Use thermal_class_unavailable() to avoid running code that should
     not run without the thermal class
   * Suspend thermal zones after disabling device probing and resume
     them before enabling device probing for better synchronization
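
Note (illustration only, not part of the patch): a user-space sketch of
the call ordering described in the last v1 -> v2 item above.  The
fake_*() functions and the trace buffer are hypothetical; they only
record the order of steps and do not model what the real functions do.
The early return on fake_thermal_unavailable imitates the
thermal_class_unavailable checks in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Records the order of steps as "step;step;...". */
static char trace[160];
/* Hypothetical counterpart of thermal_class_unavailable. */
static bool fake_thermal_unavailable;

static void log_step(const char *step)
{
	/* The step and its separator must fit in the buffer. */
	assert(strlen(trace) + strlen(step) + 2 <= sizeof(trace));
	strcat(trace, step);
	strcat(trace, ";");
}

static void fake_thermal_prepare(void)
{
	if (fake_thermal_unavailable)
		return;
	log_step("thermal_prepare");
}

static void fake_thermal_complete(void)
{
	if (fake_thermal_unavailable)
		return;
	log_step("thermal_complete");
}

/* Suspend side: probing is blocked before thermal control is suspended. */
static void fake_dpm_prepare(void)
{
	log_step("block_probing");
	fake_thermal_prepare();
	log_step("device_prepare");
}

/* Resume side: thermal resume starts before probing is unblocked. */
static void fake_dpm_complete(void)
{
	log_step("device_complete");
	fake_thermal_complete();
	log_step("unblock_probing");
}
```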

---
 drivers/base/power/main.c      |    5 +++
 drivers/thermal/thermal_core.c |   60 ++++++++++++-----------------------------
 include/linux/thermal.h        |    6 ++++
 3 files changed, 29 insertions(+), 42 deletions(-)

--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -33,6 +33,7 @@
 #include <trace/events/power.h>
 #include <linux/cpufreq.h>
 #include <linux/devfreq.h>
+#include <linux/thermal.h>
 #include <linux/timer.h>
 #include <linux/nmi.h>
 
@@ -1282,6 +1283,8 @@ void dpm_complete(pm_message_t state)
 	list_splice(&list, &dpm_list);
 	mutex_unlock(&dpm_list_mtx);
 
+	/* Start resuming thermal control */
+	thermal_pm_complete();
 	/* Allow device probing and trigger re-probing of deferred devices */
 	device_unblock_probing();
 	trace_suspend_resume(TPS("dpm_complete"), state.event, false);
@@ -2225,6 +2228,8 @@ int dpm_prepare(pm_message_t state)
 	 * instead. The normal behavior will be restored in dpm_complete().
 	 */
 	device_block_probing();
+	/* Suspend thermal control. */
+	thermal_pm_prepare();
 
 	mutex_lock(&dpm_list_mtx);
 	while (!list_empty(&dpm_list) && !error) {
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1838,7 +1838,7 @@ static void thermal_zone_pm_prepare(stru
 	cancel_delayed_work(&tz->poll_queue);
 }
 
-static void thermal_pm_notify_prepare(void)
+static void __thermal_pm_prepare(void)
 {
 	struct thermal_zone_device *tz;
 
@@ -1850,6 +1850,19 @@ static void thermal_pm_notify_prepare(vo
 		thermal_zone_pm_prepare(tz);
 }
 
+void thermal_pm_prepare(void)
+{
+	if (thermal_class_unavailable)
+		return;
+
+	__thermal_pm_prepare();
+	/*
+	 * Allow any leftover thermal work items already on the workqueue to
+	 * complete so they don't get in the way later.
+	 */
+	flush_workqueue(thermal_wq);
+}
+
 static void thermal_zone_pm_complete(struct thermal_zone_device *tz)
 {
 	guard(thermal_zone)(tz);
@@ -1866,10 +1879,13 @@ static void thermal_zone_pm_complete(str
 	mod_delayed_work(thermal_wq, &tz->poll_queue, 0);
 }
 
-static void thermal_pm_notify_complete(void)
+void thermal_pm_complete(void)
 {
 	struct thermal_zone_device *tz;
 
+	if (thermal_class_unavailable)
+		return;
+
 	guard(mutex)(&thermal_list_lock);
 
 	thermal_pm_suspended = false;
@@ -1878,41 +1894,6 @@ static void thermal_pm_notify_complete(v
 		thermal_zone_pm_complete(tz);
 }
 
-static int thermal_pm_notify(struct notifier_block *nb,
-			     unsigned long mode, void *_unused)
-{
-	switch (mode) {
-	case PM_HIBERNATION_PREPARE:
-	case PM_RESTORE_PREPARE:
-	case PM_SUSPEND_PREPARE:
-		thermal_pm_notify_prepare();
-		/*
-		 * Allow any leftover thermal work items already on the
-		 * worqueue to complete so they don't get in the way later.
-		 */
-		flush_workqueue(thermal_wq);
-		break;
-	case PM_POST_HIBERNATION:
-	case PM_POST_RESTORE:
-	case PM_POST_SUSPEND:
-		thermal_pm_notify_complete();
-		break;
-	default:
-		break;
-	}
-	return 0;
-}
-
-static struct notifier_block thermal_pm_nb = {
-	.notifier_call = thermal_pm_notify,
-	/*
-	 * Run at the lowest priority to avoid interference between the thermal
-	 * zone resume work items spawned by thermal_pm_notify() and the other
-	 * PM notifiers.
-	 */
-	.priority = INT_MIN,
-};
-
 static int __init thermal_init(void)
 {
 	int result;
@@ -1939,11 +1920,6 @@ static int __init thermal_init(void)
 
 	thermal_class_unavailable = false;
 
-	result = register_pm_notifier(&thermal_pm_nb);
-	if (result)
-		pr_warn("Thermal: Can not register suspend notifier, return %d\n",
-			result);
-
 	return 0;
 
 unregister_governors:
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -273,6 +273,9 @@ bool thermal_trip_is_bound_to_cdev(struc
 int thermal_zone_device_enable(struct thermal_zone_device *tz);
 int thermal_zone_device_disable(struct thermal_zone_device *tz);
 void thermal_zone_device_critical(struct thermal_zone_device *tz);
+
+void thermal_pm_prepare(void);
+void thermal_pm_complete(void);
 #else
 static inline struct thermal_zone_device *thermal_zone_device_register_with_trips(
 					const char *type,
@@ -350,6 +353,9 @@ static inline int thermal_zone_device_en
 
 static inline int thermal_zone_device_disable(struct thermal_zone_device *tz)
 { return -ENODEV; }
+
+static inline void thermal_pm_prepare(void) {}
+static inline void thermal_pm_complete(void) {}
 #endif /* CONFIG_THERMAL */
 
 #endif /* __THERMAL_H__ */





* Re: [PATCH v3 6/6] thermal: core: Suspend thermal zones later and resume them earlier
  2026-04-06 16:16 ` [PATCH v3 6/6] thermal: core: Suspend thermal zones later and resume them earlier Rafael J. Wysocki
@ 2026-04-06 21:40   ` Armin Wolf
  0 siblings, 0 replies; 8+ messages in thread
From: Armin Wolf @ 2026-04-06 21:40 UTC
  To: Rafael J. Wysocki, Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba

On 06.04.26 at 18:16, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> To avoid some undesirable interactions between thermal zone suspend
> and resume with user space that is running when those operations are
> carried out, move them closer to the suspend and resume of devices,
> respectively, by updating dpm_prepare() to carry out thermal zone
> suspend and dpm_complete() to start thermal zone resume (that will
> continue asynchronously).
>
> This also makes the code easier to follow by removing one, arguably
> redundant, level of indirection represented by the thermal PM notifier.

The patch looks good to me. In the future maybe we could track PM dependencies
between thermal zones and cooling devices using device links instead, but for
now keeping the already existing logic looks fine to me.

Reviewed-by: Armin Wolf <W_Armin@gmx.de>

> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>
> v2 -> v3:
>     * Rebase on top of the v3 of the previous patch
>
> v1 -> v2:
>     * Reorder with respect to the previous patch
>     * Use thermal_class_unavailable() to avoid running code that should
>       not run without the thermal class
>     * Suspend thermal zones after disabling device probing and resume
>       them before enabling device probing for better synchronization
>
> ---
>   drivers/base/power/main.c      |    5 +++
>   drivers/thermal/thermal_core.c |   60 ++++++++++++-----------------------------
>   include/linux/thermal.h        |    6 ++++
>   3 files changed, 29 insertions(+), 42 deletions(-)
>
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -33,6 +33,7 @@
>   #include <trace/events/power.h>
>   #include <linux/cpufreq.h>
>   #include <linux/devfreq.h>
> +#include <linux/thermal.h>
>   #include <linux/timer.h>
>   #include <linux/nmi.h>
>   
> @@ -1282,6 +1283,8 @@ void dpm_complete(pm_message_t state)
>   	list_splice(&list, &dpm_list);
>   	mutex_unlock(&dpm_list_mtx);
>   
> +	/* Start resuming thermal control */
> +	thermal_pm_complete();
>   	/* Allow device probing and trigger re-probing of deferred devices */
>   	device_unblock_probing();
>   	trace_suspend_resume(TPS("dpm_complete"), state.event, false);
> @@ -2225,6 +2228,8 @@ int dpm_prepare(pm_message_t state)
>   	 * instead. The normal behavior will be restored in dpm_complete().
>   	 */
>   	device_block_probing();
> +	/* Suspend thermal control. */
> +	thermal_pm_prepare();
>   
>   	mutex_lock(&dpm_list_mtx);
>   	while (!list_empty(&dpm_list) && !error) {
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1838,7 +1838,7 @@ static void thermal_zone_pm_prepare(stru
>   	cancel_delayed_work(&tz->poll_queue);
>   }
>   
> -static void thermal_pm_notify_prepare(void)
> +static void __thermal_pm_prepare(void)
>   {
>   	struct thermal_zone_device *tz;
>   
> @@ -1850,6 +1850,19 @@ static void thermal_pm_notify_prepare(vo
>   		thermal_zone_pm_prepare(tz);
>   }
>   
> +void thermal_pm_prepare(void)
> +{
> +	if (thermal_class_unavailable)
> +		return;
> +
> +	__thermal_pm_prepare();
> +	/*
> +	 * Allow any leftover thermal work items already on the workqueue to
> +	 * complete so they don't get in the way later.
> +	 */
> +	flush_workqueue(thermal_wq);
> +}
> +
>   static void thermal_zone_pm_complete(struct thermal_zone_device *tz)
>   {
>   	guard(thermal_zone)(tz);
> @@ -1866,10 +1879,13 @@ static void thermal_zone_pm_complete(str
>   	mod_delayed_work(thermal_wq, &tz->poll_queue, 0);
>   }
>   
> -static void thermal_pm_notify_complete(void)
> +void thermal_pm_complete(void)
>   {
>   	struct thermal_zone_device *tz;
>   
> +	if (thermal_class_unavailable)
> +		return;
> +
>   	guard(mutex)(&thermal_list_lock);
>   
>   	thermal_pm_suspended = false;
> @@ -1878,41 +1894,6 @@ static void thermal_pm_notify_complete(v
>   		thermal_zone_pm_complete(tz);
>   }
>   
> -static int thermal_pm_notify(struct notifier_block *nb,
> -			     unsigned long mode, void *_unused)
> -{
> -	switch (mode) {
> -	case PM_HIBERNATION_PREPARE:
> -	case PM_RESTORE_PREPARE:
> -	case PM_SUSPEND_PREPARE:
> -		thermal_pm_notify_prepare();
> -		/*
> -		 * Allow any leftover thermal work items already on the
> -		 * worqueue to complete so they don't get in the way later.
> -		 */
> -		flush_workqueue(thermal_wq);
> -		break;
> -	case PM_POST_HIBERNATION:
> -	case PM_POST_RESTORE:
> -	case PM_POST_SUSPEND:
> -		thermal_pm_notify_complete();
> -		break;
> -	default:
> -		break;
> -	}
> -	return 0;
> -}
> -
> -static struct notifier_block thermal_pm_nb = {
> -	.notifier_call = thermal_pm_notify,
> -	/*
> -	 * Run at the lowest priority to avoid interference between the thermal
> -	 * zone resume work items spawned by thermal_pm_notify() and the other
> -	 * PM notifiers.
> -	 */
> -	.priority = INT_MIN,
> -};
> -
>   static int __init thermal_init(void)
>   {
>   	int result;
> @@ -1939,11 +1920,6 @@ static int __init thermal_init(void)
>   
>   	thermal_class_unavailable = false;
>   
> -	result = register_pm_notifier(&thermal_pm_nb);
> -	if (result)
> -		pr_warn("Thermal: Can not register suspend notifier, return %d\n",
> -			result);
> -
>   	return 0;
>   
>   unregister_governors:
> --- a/include/linux/thermal.h
> +++ b/include/linux/thermal.h
> @@ -273,6 +273,9 @@ bool thermal_trip_is_bound_to_cdev(struc
>   int thermal_zone_device_enable(struct thermal_zone_device *tz);
>   int thermal_zone_device_disable(struct thermal_zone_device *tz);
>   void thermal_zone_device_critical(struct thermal_zone_device *tz);
> +
> +void thermal_pm_prepare(void);
> +void thermal_pm_complete(void);
>   #else
>   static inline struct thermal_zone_device *thermal_zone_device_register_with_trips(
>   					const char *type,
> @@ -350,6 +353,9 @@ static inline int thermal_zone_device_en
>   
>   static inline int thermal_zone_device_disable(struct thermal_zone_device *tz)
>   { return -ENODEV; }
> +
> +static inline void thermal_pm_prepare(void) {}
> +static inline void thermal_pm_complete(void) {}
>   #endif /* CONFIG_THERMAL */
>   
>   #endif /* __THERMAL_H__ */
>
>
>
