* [PATCH v1] thermal: core: Fix the handling of invalid trip points
@ 2024-05-16 17:20 Rafael J. Wysocki
2024-05-17 10:32 ` Wendy Wang
0 siblings, 1 reply; 2+ messages in thread
From: Rafael J. Wysocki @ 2024-05-16 17:20 UTC (permalink / raw)
To: Linux PM
Cc: LKML, Rafael J. Wysocki, Lukasz Luba, Daniel Lezcano,
Srinivas Pandruvada, Zhang Rui
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 9ad18043fb35 ("thermal: core: Send trip crossing notifications
at init time if needed") overlooked the case when a trip point that
has started as invalid is set to a valid temperature later. Namely,
the initial threshold value for all trips is zero, so if a previously
invalid trip becomes valid and its (new) low temperature is above the
zone temperature, a spurious trip crossing notification will occur and
it may trigger the WARN_ON() in handle_thermal_trip().
To address this, set the initial threshold for all trips to INT_MAX.
There is also the case when a valid writable trip becomes invalid that
requires special handling. First, in accordance with the change
mentioned above, the trip's threshold needs to be set to INT_MAX to
avoid the same issue. Second, if the trip in question is passive and
it has been crossed by the thermal zone temperature on the way up, the
zone's passive count has been incremented and it is in the passive
polling mode, so its passive count needs to be adjusted to allow the
passive polling to be turned off eventually.
Fixes: 9ad18043fb35 ("thermal: core: Send trip crossing notifications at init time if needed")
Fixes: 042a3d80f118 ("thermal: core: Move passive polling management to the core")
Reported-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/thermal/thermal_core.c | 9 ++++++++-
drivers/thermal/thermal_trip.c | 18 ++++++++++++++++++
2 files changed, 26 insertions(+), 1 deletion(-)
Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -1401,8 +1401,15 @@ thermal_zone_device_register_with_trips(
tz->device.class = thermal_class;
tz->devdata = devdata;
tz->num_trips = num_trips;
- for_each_trip_desc(tz, td)
+ for_each_trip_desc(tz, td) {
td->trip = *trip++;
+ /*
+ * Mark all thresholds as invalid to start with even though
+ * this only matters for the trips that start as invalid and
+ * become valid later.
+ */
+ td->threshold = INT_MAX;
+ }
thermal_set_delay_jiffies(&tz->passive_delay_jiffies, passive_delay);
thermal_set_delay_jiffies(&tz->polling_delay_jiffies, polling_delay);
Index: linux-pm/drivers/thermal/thermal_trip.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_trip.c
+++ linux-pm/drivers/thermal/thermal_trip.c
@@ -137,6 +137,24 @@ void thermal_zone_set_trip_temp(struct t
if (trip->temperature == temp)
return;
+ if (temp == THERMAL_TEMP_INVALID) {
+ struct thermal_trip_desc *td = trip_to_trip_desc(trip);
+
+ if (trip->type == THERMAL_TRIP_PASSIVE &&
+ tz->temperature >= td->threshold) {
+ /*
+ * The trip has been crossed, so the thermal zone's
+ * passive count needs to be adjusted.
+ */
+ tz->passive--;
+ WARN_ON_ONCE(tz->passive < 0);
+ }
+ /*
+ * Invalidate the threshold to avoid triggering a spurious
+ * trip crossing notification when the trip becomes valid.
+ */
+ td->threshold = INT_MAX;
+ }
WRITE_ONCE(trip->temperature, temp);
thermal_notify_tz_trip_change(tz, trip);
}
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: [PATCH v1] thermal: core: Fix the handling of invalid trip points
2024-05-16 17:20 [PATCH v1] thermal: core: Fix the handling of invalid trip points Rafael J. Wysocki
@ 2024-05-17 10:32 ` Wendy Wang
0 siblings, 0 replies; 2+ messages in thread
From: Wendy Wang @ 2024-05-17 10:32 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Linux PM, LKML, Rafael J. Wysocki, Lukasz Luba, Daniel Lezcano,
Srinivas Pandruvada, Zhang Rui
Hi Rafael,
I have tested your patch '[PATCH v1] thermal: core: Fix the handling of invalid trip points'
on my server, which is running the v6.9 kernel, the thermal throttling test case has passed.
Tested-by: Wendy Wang <wendy.wang@intel.com>
On 2024-05-16 at 19:20:19 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Commit 9ad18043fb35 ("thermal: core: Send trip crossing notifications
> at init time if needed") overlooked the case when a trip point that
> has started as invalid is set to a valid temperature later. Namely,
> the initial threshold value for all trips is zero, so if a previously
> invalid trip becomes valid and its (new) low temperature is above the
> zone temperature, a spurious trip crossing notification will occur and
> it may trigger the WARN_ON() in handle_thermal_trip().
>
> To address this, set the initial threshold for all trips to INT_MAX.
>
> There is also the case when a valid writable trip becomes invalid that
> requires special handling. First, in accordance with the change
> mentioned above, the trip's threshold needs to be set to INT_MAX to
> avoid the same issue. Second, if the trip in question is passive and
> it has been crossed by the thermal zone temperature on the way up, the
> zone's passive count has been incremented and it is in the passive
> polling mode, so its passive count needs to be adjusted to allow the
> passive polling to be turned off eventually.
>
> Fixes: 9ad18043fb35 ("thermal: core: Send trip crossing notifications at init time if needed")
> Fixes: 042a3d80f118 ("thermal: core: Move passive polling management to the core")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> drivers/thermal/thermal_core.c | 9 ++++++++-
> drivers/thermal/thermal_trip.c | 18 ++++++++++++++++++
> 2 files changed, 26 insertions(+), 1 deletion(-)
>
> Index: linux-pm/drivers/thermal/thermal_core.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_core.c
> +++ linux-pm/drivers/thermal/thermal_core.c
> @@ -1401,8 +1401,15 @@ thermal_zone_device_register_with_trips(
> tz->device.class = thermal_class;
> tz->devdata = devdata;
> tz->num_trips = num_trips;
> - for_each_trip_desc(tz, td)
> + for_each_trip_desc(tz, td) {
> td->trip = *trip++;
> + /*
> + * Mark all thresholds as invalid to start with even though
> + * this only matters for the trips that start as invalid and
> + * become valid later.
> + */
> + td->threshold = INT_MAX;
> + }
>
> thermal_set_delay_jiffies(&tz->passive_delay_jiffies, passive_delay);
> thermal_set_delay_jiffies(&tz->polling_delay_jiffies, polling_delay);
> Index: linux-pm/drivers/thermal/thermal_trip.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_trip.c
> +++ linux-pm/drivers/thermal/thermal_trip.c
> @@ -137,6 +137,24 @@ void thermal_zone_set_trip_temp(struct t
> if (trip->temperature == temp)
> return;
>
> + if (temp == THERMAL_TEMP_INVALID) {
> + struct thermal_trip_desc *td = trip_to_trip_desc(trip);
> +
> + if (trip->type == THERMAL_TRIP_PASSIVE &&
> + tz->temperature >= td->threshold) {
> + /*
> + * The trip has been crossed, so the thermal zone's
> + * passive count needs to be adjusted.
> + */
> + tz->passive--;
> + WARN_ON_ONCE(tz->passive < 0);
> + }
> + /*
> + * Invalidate the threshold to avoid triggering a spurious
> + * trip crossing notification when the trip becomes valid.
> + */
> + td->threshold = INT_MAX;
> + }
> WRITE_ONCE(trip->temperature, temp);
> thermal_notify_tz_trip_change(tz, trip);
> }
>
>
>
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-05-17 10:32 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-16 17:20 [PATCH v1] thermal: core: Fix the handling of invalid trip points Rafael J. Wysocki
2024-05-17 10:32 ` Wendy Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox