* [PATCH v4 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation
@ 2026-04-07 13:51 Rafael J. Wysocki
2026-04-07 13:55 ` [PATCH v4 1/6] thermal: core: Fix thermal zone governor cleanup issues Rafael J. Wysocki
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2026-04-07 13:51 UTC (permalink / raw)
To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf
Hi All,
This is an update of
https://lore.kernel.org/linux-pm/5119690.31r3eYUQgx@rafael.j.wysocki/
changing the first two patches to address Sashiko feedback.
This series is intended for 7.1 (it applies on top of linux-next).
It fixes the thermal zone removal and registration rollback paths by
addressing possible race conditions and a memory leak in that code (patches
[1-2/6]), removes a redundant check (patch [3/6]), changes the thermal
workqueue to an unbound and non-freezable one (patch [4/6]), changes the
allocation of thermal_class to static (patch [5/6]), and relocates the
suspend and resume of thermal zones closer to the suspend and resume of
devices, respectively (patch [6/6]).
Thanks!
* [PATCH v4 1/6] thermal: core: Fix thermal zone governor cleanup issues
2026-04-07 13:51 [PATCH v4 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
@ 2026-04-07 13:55 ` Rafael J. Wysocki
2026-04-07 13:58 ` [PATCH v4 2/6] thermal: core: Free thermal zone ID later during removal Rafael J. Wysocki
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2026-04-07 13:55 UTC (permalink / raw)
To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If thermal_zone_device_register_with_trips() fails after adding
a thermal governor to the thermal zone being registered, the
governor is not removed from it as appropriate, which may lead to
a memory leak.
In turn, thermal_zone_device_unregister() calls thermal_set_governor()
without acquiring the thermal zone lock beforehand which may race with
a governor update via sysfs and may lead to a use-after-free in that
case.
Address these issues by adding two thermal_set_governor() calls, one to
thermal_release() to remove the governor from the given thermal zone,
and one to the thermal zone registration error path to cover failures
preceding the thermal zone device registration.
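For illustration only, the rollback half of this fix can be modeled in user
space as below.  This is a minimal sketch, not the kernel code: struct zone,
zone_set_governor(), zone_register() and the live_governor_allocs counter
are hypothetical stand-ins, and the single fail_groups switch stands for any
failure between attaching the governor and registering the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real thermal API. */
struct zone {
	void *governor_data;	/* per-zone governor private data */
};

static int live_governor_allocs;	/* outstanding governor allocations */

/* Drop any attached governor data, then optionally attach fresh data. */
static int zone_set_governor(struct zone *z, bool attach)
{
	if (z->governor_data) {
		free(z->governor_data);
		z->governor_data = NULL;
		live_governor_allocs--;
	}
	if (attach) {
		z->governor_data = malloc(16);
		if (!z->governor_data)
			return -1;
		live_governor_allocs++;
	}
	return 0;
}

/*
 * Register a zone.  fail_groups simulates a failure of the step that
 * follows governor attachment but precedes device registration.
 */
static struct zone *zone_register(bool fail_groups)
{
	struct zone *z = calloc(1, sizeof(*z));

	if (!z)
		return NULL;

	if (zone_set_governor(z, true))
		goto free_zone;

	if (fail_groups) {
		/* Without this call the governor data would leak. */
		zone_set_governor(z, false);
		goto free_zone;
	}

	return z;

free_zone:
	free(z);
	return NULL;
}

/* Normal teardown: detach the governor, then free the zone. */
static void zone_unregister(struct zone *z)
{
	zone_set_governor(z, false);
	free(z);
}
```

In this model a failed registration leaves live_governor_allocs at zero,
which is the property the added rollback call is meant to restore.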
Fixes: e33df1d2f3a0 ("thermal: let governors have private data for each thermal zone")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
v3 -> v4:
* Call thermal_set_governor() from thermal_release() to avoid use-after-free
of the device name (Sashiko)
* Call thermal_set_governor() in thermal zone device registration rollback
path if it fails before device registration
v2 -> v3: New patch
---
drivers/thermal/thermal_core.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -971,6 +971,7 @@ static void thermal_release(struct devic
sizeof("thermal_zone") - 1)) {
tz = to_thermal_zone(dev);
thermal_zone_destroy_device_groups(tz);
+ thermal_set_governor(tz, NULL);
mutex_destroy(&tz->lock);
complete(&tz->removal);
} else if (!strncmp(dev_name(dev), "cooling_device",
@@ -1617,8 +1618,10 @@ thermal_zone_device_register_with_trips(
/* sys I/F */
/* Add nodes that are always present via .groups */
result = thermal_zone_create_device_groups(tz);
- if (result)
+ if (result) {
+ thermal_set_governor(tz, NULL);
goto remove_id;
+ }
result = device_register(&tz->device);
if (result)
@@ -1731,8 +1734,6 @@ void thermal_zone_device_unregister(stru
cancel_delayed_work_sync(&tz->poll_queue);
- thermal_set_governor(tz, NULL);
-
thermal_thresholds_exit(tz);
thermal_remove_hwmon_sysfs(tz);
ida_free(&thermal_tz_ida, tz->id);
* [PATCH v4 2/6] thermal: core: Free thermal zone ID later during removal
2026-04-07 13:51 [PATCH v4 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
2026-04-07 13:55 ` [PATCH v4 1/6] thermal: core: Fix thermal zone governor cleanup issues Rafael J. Wysocki
@ 2026-04-07 13:58 ` Rafael J. Wysocki
2026-04-07 14:06 ` [PATCH v4 3/6] thermal: core: Drop redundant check from thermal_zone_device_update() Rafael J. Wysocki
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2026-04-07 13:58 UTC (permalink / raw)
To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The thermal zone removal ordering is different from the thermal zone
registration rollback path ordering and the former is arguably
problematic because freeing a thermal zone ID prematurely may cause
it to be used during the registration of another thermal zone which
may fail as a result.
Prevent that from occurring by changing the thermal zone removal
ordering to reflect the thermal zone registration rollback path
ordering.
Also move the ida_destroy() call from thermal_zone_device_unregister()
to thermal_release() for consistency.
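The ordering problem can be illustrated with a toy user-space model.  This
is a sketch under stated assumptions: the ID allocator, the name_registered[]
table and the "concurrent" hook are hypothetical, and the model assumes that
the second registration fails because a name derived from the reused ID is
still registered, which is one plausible way for the failure to occur.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 8

static bool id_in_use[MAX_IDS];		/* toy ID allocator state */
static bool name_registered[MAX_IDS];	/* names derived from IDs still visible */

/* Return the lowest free ID, or -1 if none is left. */
static int id_alloc(void)
{
	for (int i = 0; i < MAX_IDS; i++) {
		if (!id_in_use[i]) {
			id_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

static void id_free(int id)
{
	id_in_use[id] = false;
}

/* Register a zone; fails if the name for the allocated ID is still taken. */
static int zone_add(void)
{
	int id = id_alloc();

	if (id < 0)
		return -1;

	if (name_registered[id]) {
		id_free(id);	/* roll back the ID allocation */
		return -1;
	}

	name_registered[id] = true;
	return id;
}

typedef int (*hook_fn)(void);

/*
 * Remove a zone.  The hook runs in the middle of the removal and stands
 * for a registration racing with it; its result is returned.
 */
static int zone_remove(int id, bool free_id_first, hook_fn concurrent)
{
	int ret;

	if (free_id_first)
		id_free(id);		/* old ordering: ID reusable too early */

	ret = concurrent();

	name_registered[id] = false;	/* the device name goes away here */

	if (!free_id_first)
		id_free(id);		/* new ordering: ID released last */

	return ret;
}
```

With free_id_first set, the racing zone_add() picks up the ID that is still
tied to a registered name and fails.  With the ID released last, it gets a
different ID and succeeds.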
Fixes: b31ef8285b19 ("thermal core: convert ID allocation to IDA")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
v3 -> v4:
* Call ida_destroy() in thermal_release() in analogy with the mutex
cleanup
v2 -> v3: New patch
---
drivers/thermal/thermal_core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -972,6 +972,7 @@ static void thermal_release(struct devic
tz = to_thermal_zone(dev);
thermal_zone_destroy_device_groups(tz);
thermal_set_governor(tz, NULL);
+ ida_destroy(&tz->ida);
mutex_destroy(&tz->lock);
complete(&tz->removal);
} else if (!strncmp(dev_name(dev), "cooling_device",
@@ -1736,8 +1737,6 @@ void thermal_zone_device_unregister(stru
thermal_thresholds_exit(tz);
thermal_remove_hwmon_sysfs(tz);
- ida_free(&thermal_tz_ida, tz->id);
- ida_destroy(&tz->ida);
device_del(&tz->device);
put_device(&tz->device);
@@ -1745,6 +1744,9 @@ void thermal_zone_device_unregister(stru
thermal_notify_tz_delete(tz);
wait_for_completion(&tz->removal);
+
+ ida_free(&thermal_tz_ida, tz->id);
+
kfree(tz->tzp);
kfree(tz);
}
* [PATCH v4 3/6] thermal: core: Drop redundant check from thermal_zone_device_update()
2026-04-07 13:51 [PATCH v4 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
2026-04-07 13:55 ` [PATCH v4 1/6] thermal: core: Fix thermal zone governor cleanup issues Rafael J. Wysocki
2026-04-07 13:58 ` [PATCH v4 2/6] thermal: core: Free thermal zone ID later during removal Rafael J. Wysocki
@ 2026-04-07 14:06 ` Rafael J. Wysocki
2026-04-07 14:06 ` [PATCH v4 4/6] thermal: core: Change thermal_wq to be unbound and not freezable Rafael J. Wysocki
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2026-04-07 14:06 UTC (permalink / raw)
To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since __thermal_zone_device_update() checks if tz->state is
TZ_STATE_READY and bails out immediately otherwise, it is not
necessary to check the thermal_zone_is_present() return value in
thermal_zone_device_update(). Namely, tz->state is equal to
TZ_STATE_FLAG_INIT initially and that flag is only cleared in
thermal_zone_init_complete() after adding tz to the list of thermal
zones, and thermal_zone_exit() sets TZ_STATE_FLAG_EXIT in tz->state
while removing tz from that list. Thus tz->state is not TZ_STATE_READY
when tz is not in the list and the check mentioned above is redundant.
Accordingly, drop the redundant thermal_zone_is_present() check from
thermal_zone_device_update() and drop the former altogether because it
has no more users.
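The invariant argued above can be checked in a small user-space model.  This
is a sketch only: the flag values, struct zone and the helper names are
hypothetical, and only the ordering of the state and list updates is taken
from the description.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values are assumptions made for this sketch. */
#define STATE_FLAG_INIT	(1u << 0)
#define STATE_FLAG_EXIT	(1u << 1)
#define STATE_READY	0u

struct zone {
	unsigned int state;
	bool in_list;	/* models membership in the list of thermal zones */
	int updates;	/* number of updates that were actually carried out */
};

static void zone_create(struct zone *z)
{
	z->state = STATE_FLAG_INIT;
	z->in_list = false;
	z->updates = 0;
}

/* The zone is added to the list before the INIT flag is cleared. */
static void zone_init_complete(struct zone *z)
{
	z->in_list = true;
	z->state &= ~STATE_FLAG_INIT;
}

/* The EXIT flag is set while the zone is being removed from the list. */
static void zone_exit(struct zone *z)
{
	z->state |= STATE_FLAG_EXIT;
	z->in_list = false;
}

/* Update path with the state check only, no separate list check. */
static void zone_update(struct zone *z)
{
	if (z->state != STATE_READY)
		return;

	/* The invariant: a ready zone is always in the list. */
	assert(z->in_list);
	z->updates++;
}
```

In this model the assertion in zone_update() never fires for any sequence of
the three lifecycle calls, which is why a separate list membership check adds
nothing.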
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
v1 -> v4: No changes
---
drivers/thermal/thermal_core.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -702,18 +702,12 @@ int thermal_zone_device_disable(struct t
}
EXPORT_SYMBOL_GPL(thermal_zone_device_disable);
-static bool thermal_zone_is_present(struct thermal_zone_device *tz)
-{
- return !list_empty(&tz->node);
-}
-
void thermal_zone_device_update(struct thermal_zone_device *tz,
enum thermal_notify_event event)
{
guard(thermal_zone)(tz);
- if (thermal_zone_is_present(tz))
- __thermal_zone_device_update(tz, event);
+ __thermal_zone_device_update(tz, event);
}
EXPORT_SYMBOL_GPL(thermal_zone_device_update);
* [PATCH v4 4/6] thermal: core: Change thermal_wq to be unbound and not freezable
2026-04-07 13:51 [PATCH v4 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
` (2 preceding siblings ...)
2026-04-07 14:06 ` [PATCH v4 3/6] thermal: core: Drop redundant check from thermal_zone_device_update() Rafael J. Wysocki
@ 2026-04-07 14:06 ` Rafael J. Wysocki
2026-04-07 14:07 ` [PATCH v4 5/6] thermal: core: Allocate thermal_class statically Rafael J. Wysocki
2026-04-07 14:09 ` [PATCH v4 6/6] thermal: core: Suspend thermal zones later and resume them earlier Rafael J. Wysocki
5 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2026-04-07 14:06 UTC (permalink / raw)
To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The thermal workqueue doesn't need to be freezable or per-CPU, so drop
WQ_FREEZABLE and WQ_PERCPU from the flags when allocating it.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
v1 -> v4: No changes
---
drivers/thermal/thermal_core.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1918,8 +1918,7 @@ static int __init thermal_init(void)
if (result)
goto error;
- thermal_wq = alloc_workqueue("thermal_events",
- WQ_FREEZABLE | WQ_POWER_EFFICIENT | WQ_PERCPU, 0);
+ thermal_wq = alloc_workqueue("thermal_events", WQ_POWER_EFFICIENT, 0);
if (!thermal_wq) {
result = -ENOMEM;
goto unregister_netlink;
* [PATCH v4 5/6] thermal: core: Allocate thermal_class statically
2026-04-07 13:51 [PATCH v4 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
` (3 preceding siblings ...)
2026-04-07 14:06 ` [PATCH v4 4/6] thermal: core: Change thermal_wq to be unbound and not freezable Rafael J. Wysocki
@ 2026-04-07 14:07 ` Rafael J. Wysocki
2026-04-07 14:09 ` [PATCH v4 6/6] thermal: core: Suspend thermal zones later and resume them earlier Rafael J. Wysocki
5 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2026-04-07 14:07 UTC (permalink / raw)
To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Define thermal_class as a static structure to simplify thermal_init()
and to simplify thermal class availability checks that will need to
be carried out during the suspend and resume of thermal zones after
subsequent changes.
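The pattern can be sketched in user space as follows.  This is an
illustration only: struct klass, demo_class_register() and the other demo_*
names are hypothetical, and __ro_after_init has no user-space equivalent, so
the flag is a plain static variable here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct class. */
struct klass {
	const char *name;
};

/* Statically allocated object: nothing to allocate or free at init time. */
static const struct klass demo_class = {
	.name = "thermal",
};

/* Stays true until registration succeeds, replacing a NULL pointer check. */
static bool demo_class_unavailable = true;

/* Toy registration step; it only fails for an unnamed class. */
static int demo_class_register(const struct klass *c)
{
	return c->name ? 0 : -1;
}

static int demo_init(void)
{
	int ret = demo_class_register(&demo_class);

	if (ret)
		return ret;	/* no memory to release on failure */

	demo_class_unavailable = false;
	return 0;
}

/* Users check the flag instead of testing a class pointer for NULL. */
static const struct klass *demo_device_class(void)
{
	if (demo_class_unavailable)
		return NULL;

	return &demo_class;
}
```

Compared with a dynamically allocated object, the error path in the init
function shrinks to a single return and the availability check becomes a
read of one boolean.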
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
v3 -> v4: No changes
v2 -> v3:
* Use static variable thermal_class_unavailable (instead of a function)
for checking if thermal_class is available.
v1 -> v2:
* Reorder with respect to the next patch to allow the latter to be simpler
* Add thermal_class_unavailable() (the next patch uses it too)
---
drivers/thermal/thermal_core.c | 30 ++++++++++++------------------
1 file changed, 12 insertions(+), 18 deletions(-)
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -979,7 +979,11 @@ static void thermal_release(struct devic
}
}
-static struct class *thermal_class;
+static const struct class thermal_class = {
+ .name = "thermal",
+ .dev_release = thermal_release,
+};
+static bool thermal_class_unavailable __ro_after_init = true;
static inline
void print_bind_err_msg(struct thermal_zone_device *tz,
@@ -1072,7 +1076,7 @@ __thermal_cooling_device_register(struct
!ops->set_cur_state)
return ERR_PTR(-EINVAL);
- if (!thermal_class)
+ if (thermal_class_unavailable)
return ERR_PTR(-ENODEV);
cdev = kzalloc_obj(*cdev);
@@ -1095,7 +1099,7 @@ __thermal_cooling_device_register(struct
cdev->np = np;
cdev->ops = ops;
cdev->updated = false;
- cdev->device.class = thermal_class;
+ cdev->device.class = &thermal_class;
cdev->devdata = devdata;
ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
@@ -1543,7 +1547,7 @@ thermal_zone_device_register_with_trips(
if (polling_delay && passive_delay > polling_delay)
return ERR_PTR(-EINVAL);
- if (!thermal_class)
+ if (thermal_class_unavailable)
return ERR_PTR(-ENODEV);
tz = kzalloc_flex(*tz, trips, num_trips);
@@ -1579,7 +1583,7 @@ thermal_zone_device_register_with_trips(
if (!tz->ops.critical)
tz->ops.critical = thermal_zone_device_critical;
- tz->device.class = thermal_class;
+ tz->device.class = &thermal_class;
tz->devdata = devdata;
tz->num_trips = num_trips;
for_each_trip_desc(tz, td) {
@@ -1928,21 +1932,11 @@ static int __init thermal_init(void)
if (result)
goto destroy_workqueue;
- thermal_class = kzalloc_obj(*thermal_class);
- if (!thermal_class) {
- result = -ENOMEM;
+ result = class_register(&thermal_class);
+ if (result)
goto unregister_governors;
- }
- thermal_class->name = "thermal";
- thermal_class->dev_release = thermal_release;
-
- result = class_register(thermal_class);
- if (result) {
- kfree(thermal_class);
- thermal_class = NULL;
- goto unregister_governors;
- }
+ thermal_class_unavailable = false;
result = register_pm_notifier(&thermal_pm_nb);
if (result)
* [PATCH v4 6/6] thermal: core: Suspend thermal zones later and resume them earlier
2026-04-07 13:51 [PATCH v4 0/6] thermal: core: Fixes, simplifications and suspend/resume relocation Rafael J. Wysocki
` (4 preceding siblings ...)
2026-04-07 14:07 ` [PATCH v4 5/6] thermal: core: Allocate thermal_class statically Rafael J. Wysocki
@ 2026-04-07 14:09 ` Rafael J. Wysocki
5 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2026-04-07 14:09 UTC (permalink / raw)
To: Linux PM; +Cc: Daniel Lezcano, LKML, Lukasz Luba, Armin Wolf
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
To avoid some undesirable interactions between thermal zone suspend
and resume with user space that is running when those operations are
carried out, move them closer to the suspend and resume of devices,
respectively, by updating dpm_prepare() to carry out thermal zone
suspend and dpm_complete() to start thermal zone resume (that will
continue asynchronously).
This also makes the code easier to follow by removing one, arguably
redundant, level of indirection represented by the thermal PM notifier.
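The resulting call ordering can be modeled in user space as below.  This is
a sketch, not the PM core: the *_sketch functions and the trace buffer are
hypothetical, and the step names only mirror the order in which the diff
places the thermal calls relative to device probing and device callbacks.

```c
#include <assert.h>
#include <string.h>

#define MAX_TRACE 8

static const char *trace[MAX_TRACE];	/* steps in the order they ran */
static int ntrace;

static void record(const char *step)
{
	if (ntrace < MAX_TRACE)
		trace[ntrace++] = step;
}

/* Stand-ins for the two thermal entry points called directly by PM code. */
static void thermal_prepare_sketch(void)
{
	record("thermal suspend");
}

static void thermal_complete_sketch(void)
{
	record("thermal resume start");
}

/* Suspend side: thermal control stops after probing has been blocked. */
static void pm_prepare_sketch(void)
{
	record("block probing");
	thermal_prepare_sketch();
	record("prepare devices");
}

/* Resume side: thermal resume starts before probing is allowed again. */
static void pm_complete_sketch(void)
{
	record("complete devices");
	thermal_complete_sketch();
	record("unblock probing");
}
```

Running pm_prepare_sketch() followed by pm_complete_sketch() records six
steps, with the thermal ones sitting inside the window in which device
probing is blocked.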
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
---
v3 -> v4:
* Add R-by from Armin
v2 -> v3:
* Rebase on top of the v3 of the previous patch
v1 -> v2:
* Reorder with respect to the previous patch
* Use thermal_class_unavailable() to avoid running code that should
not run without the thermal class
* Suspend thermal zones after disabling device probing and resume
them before enabling device probing for better synchronization
---
drivers/base/power/main.c | 5 +++
drivers/thermal/thermal_core.c | 60 ++++++++++++-----------------------------
include/linux/thermal.h | 6 ++++
3 files changed, 29 insertions(+), 42 deletions(-)
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -33,6 +33,7 @@
#include <trace/events/power.h>
#include <linux/cpufreq.h>
#include <linux/devfreq.h>
+#include <linux/thermal.h>
#include <linux/timer.h>
#include <linux/nmi.h>
@@ -1282,6 +1283,8 @@ void dpm_complete(pm_message_t state)
list_splice(&list, &dpm_list);
mutex_unlock(&dpm_list_mtx);
+ /* Start resuming thermal control */
+ thermal_pm_complete();
/* Allow device probing and trigger re-probing of deferred devices */
device_unblock_probing();
trace_suspend_resume(TPS("dpm_complete"), state.event, false);
@@ -2225,6 +2228,8 @@ int dpm_prepare(pm_message_t state)
* instead. The normal behavior will be restored in dpm_complete().
*/
device_block_probing();
+ /* Suspend thermal control. */
+ thermal_pm_prepare();
mutex_lock(&dpm_list_mtx);
while (!list_empty(&dpm_list) && !error) {
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1837,7 +1837,7 @@ static void thermal_zone_pm_prepare(stru
cancel_delayed_work(&tz->poll_queue);
}
-static void thermal_pm_notify_prepare(void)
+static void __thermal_pm_prepare(void)
{
struct thermal_zone_device *tz;
@@ -1849,6 +1849,19 @@ static void thermal_pm_notify_prepare(vo
thermal_zone_pm_prepare(tz);
}
+void thermal_pm_prepare(void)
+{
+ if (thermal_class_unavailable)
+ return;
+
+ __thermal_pm_prepare();
+ /*
+ * Allow any leftover thermal work items already on the workqueue to
+ * complete so they don't get in the way later.
+ */
+ flush_workqueue(thermal_wq);
+}
+
static void thermal_zone_pm_complete(struct thermal_zone_device *tz)
{
guard(thermal_zone)(tz);
@@ -1865,10 +1878,13 @@ static void thermal_zone_pm_complete(str
mod_delayed_work(thermal_wq, &tz->poll_queue, 0);
}
-static void thermal_pm_notify_complete(void)
+void thermal_pm_complete(void)
{
struct thermal_zone_device *tz;
+ if (thermal_class_unavailable)
+ return;
+
guard(mutex)(&thermal_list_lock);
thermal_pm_suspended = false;
@@ -1877,41 +1893,6 @@ static void thermal_pm_notify_complete(v
thermal_zone_pm_complete(tz);
}
-static int thermal_pm_notify(struct notifier_block *nb,
- unsigned long mode, void *_unused)
-{
- switch (mode) {
- case PM_HIBERNATION_PREPARE:
- case PM_RESTORE_PREPARE:
- case PM_SUSPEND_PREPARE:
- thermal_pm_notify_prepare();
- /*
- * Allow any leftover thermal work items already on the
- * worqueue to complete so they don't get in the way later.
- */
- flush_workqueue(thermal_wq);
- break;
- case PM_POST_HIBERNATION:
- case PM_POST_RESTORE:
- case PM_POST_SUSPEND:
- thermal_pm_notify_complete();
- break;
- default:
- break;
- }
- return 0;
-}
-
-static struct notifier_block thermal_pm_nb = {
- .notifier_call = thermal_pm_notify,
- /*
- * Run at the lowest priority to avoid interference between the thermal
- * zone resume work items spawned by thermal_pm_notify() and the other
- * PM notifiers.
- */
- .priority = INT_MIN,
-};
-
static int __init thermal_init(void)
{
int result;
@@ -1938,11 +1919,6 @@ static int __init thermal_init(void)
thermal_class_unavailable = false;
- result = register_pm_notifier(&thermal_pm_nb);
- if (result)
- pr_warn("Thermal: Can not register suspend notifier, return %d\n",
- result);
-
return 0;
unregister_governors:
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -273,6 +273,9 @@ bool thermal_trip_is_bound_to_cdev(struc
int thermal_zone_device_enable(struct thermal_zone_device *tz);
int thermal_zone_device_disable(struct thermal_zone_device *tz);
void thermal_zone_device_critical(struct thermal_zone_device *tz);
+
+void thermal_pm_prepare(void);
+void thermal_pm_complete(void);
#else
static inline struct thermal_zone_device *thermal_zone_device_register_with_trips(
const char *type,
@@ -350,6 +353,9 @@ static inline int thermal_zone_device_en
static inline int thermal_zone_device_disable(struct thermal_zone_device *tz)
{ return -ENODEV; }
+
+static inline void thermal_pm_prepare(void) {}
+static inline void thermal_pm_complete(void) {}
#endif /* CONFIG_THERMAL */
#endif /* __THERMAL_H__ */