* [PATCH AUTOSEL 5.15 02/44] cpufreq: CPPC: Mark driver with NEED_UPDATE_LIMITS flag
[not found] <20250804003849.3627024-1-sashal@kernel.org>
@ 2025-08-04 0:38 ` Sasha Levin
2025-08-04 0:38 ` [PATCH AUTOSEL 5.15 10/44] cpufreq: Exit governor when failed to start old governor Sasha Levin
` (5 subsequent siblings)
6 siblings, 0 replies; 7+ messages in thread
From: Sasha Levin @ 2025-08-04 0:38 UTC (permalink / raw)
To: patches, stable
Cc: Prashant Malani, Rafael J . Wysocki, Viresh Kumar, Sasha Levin,
rafael, linux-pm
From: Prashant Malani <pmalani@google.com>
[ Upstream commit 0a1416a49e63c320f6e6c1c8d07e1b58c0d4a3f3 ]
AMU counters on certain CPPC-based platforms tend to yield inaccurate
delivered performance measurements on systems that are idle/mostly idle.
This results in an inaccurate frequency being stored by cpufreq in its
policy structure when the CPU is brought online. [1]
Consequently, if the userspace governor tries to set the frequency to a
new value, there is a possibility that it would be the erroneous value
stored earlier. In such a scenario, cpufreq would assume that the
requested frequency has already been set and return early, resulting in
the correct/new frequency request never making it to the hardware.
Since the operating frequency is liable to this sort of inconsistency,
mark the CPPC driver with CPUFREQ_NEED_UPDATE_LIMITS so that it is always
invoked when a target frequency update is requested.
Link: https://lore.kernel.org/linux-pm/20250619000925.415528-3-pmalani@google.com/ [1]
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Prashant Malani <pmalani@google.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250722055611.130574-2-pmalani@google.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my
extensive analysis:
## Code Change Analysis
The change is minimal and focused - it only adds the
`CPUFREQ_NEED_UPDATE_LIMITS` flag to the cppc_cpufreq_driver struct:
```c
- .flags = CPUFREQ_CONST_LOOPS,
+ .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
```
This single-line change modifies the driver flags in
drivers/cpufreq/cppc_cpufreq.c (cited here as line 928, although the
hunk for this 5.15 tree starts at line 682).
## Bug Being Fixed
The commit addresses a real bug where:
1. **AMU counters on CPPC platforms provide inaccurate measurements**
when systems are idle/mostly idle
2. **This causes cpufreq to store incorrect frequency values** in its
policy structure during CPU online
3. **Userspace frequency changes fail silently** because cpufreq's early
return optimization (`if (target_freq == policy->cur)`) prevents the
hardware from receiving the correct frequency request
The bug manifests as userspace being unable to set CPU frequencies
correctly on affected CPPC platforms, which is a functional bug that
impacts power management and performance.
## Why This Qualifies for Stable
1. **Fixes a real bug**: The issue prevents proper CPU frequency scaling
on CPPC-based platforms, affecting power management functionality
2. **Minimal risk**: The change is extremely small (one flag addition)
and well-contained to the CPPC driver
3. **Clear fix semantics**: The `CPUFREQ_NEED_UPDATE_LIMITS` flag is an
established mechanism (introduced in commit 1c534352f47f)
specifically designed for this scenario where drivers need frequency
updates even when target_freq == policy->cur
4. **Similar precedent**: Other cpufreq drivers (intel_pstate,
amd-pstate) already use this flag for similar reasons, showing it's a
proven solution
5. **No architectural changes**: This doesn't introduce new features or
change kernel architecture - it simply ensures the driver behaves
correctly
6. **Recent related fixes**: The git history shows ongoing work to fix
CPPC feedback counter issues (e.g., commit c47195631960), indicating
this is part of stabilizing CPPC functionality
## Impact Assessment
- **Affected systems**: CPPC-based platforms with AMU counters that
exhibit the idle measurement issue
- **Risk**: Very low - the flag only bypasses an optimization when
needed, forcing frequency updates to reach hardware
- **Review**: The change has been acked by the cpufreq maintainer
(Viresh Kumar) and suggested by the PM subsystem maintainer (Rafael J.
Wysocki)
The commit follows stable tree rules by fixing an important bug with
minimal changes and low regression risk.
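The early-return behaviour described above can be modelled in user space. This is only a sketch under stated assumptions: `model_policy`, `model_driver_target()` and `MODEL_NEED_UPDATE_LIMITS` are hypothetical names for illustration, not kernel symbols, and the flag value is arbitrary.

```c
#include <stdbool.h>

/* Hypothetical flag bit for this model; not the kernel's actual value. */
#define MODEL_NEED_UPDATE_LIMITS (1u << 0)

struct model_policy {
	unsigned int cur;     /* frequency the core believes is set (kHz) */
	unsigned int hw_freq; /* frequency the modelled hardware runs at (kHz) */
};

/* Returns true if the request reached the modelled hardware. */
bool model_driver_target(struct model_policy *p, unsigned int flags,
			 unsigned int target_freq)
{
	/*
	 * Skip the driver when the cached value already matches, unless
	 * the driver asked to be invoked on every request.
	 */
	if (target_freq == p->cur && !(flags & MODEL_NEED_UPDATE_LIMITS))
		return false;

	p->hw_freq = target_freq;
	p->cur = target_freq;
	return true;
}
```

With a stale `cur` equal to the requested value, the call without the flag returns early and `hw_freq` stays wrong; with the flag set, the same request is forwarded.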
drivers/cpufreq/cppc_cpufreq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index c5a4aa0c2c9a..b7294531816b 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -682,7 +682,7 @@ static struct freq_attr *cppc_cpufreq_attr[] = {
};
static struct cpufreq_driver cppc_cpufreq_driver = {
- .flags = CPUFREQ_CONST_LOOPS,
+ .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
.verify = cppc_verify_policy,
.target = cppc_cpufreq_set_target,
.get = cppc_cpufreq_get_rate,
--
2.39.5
* [PATCH AUTOSEL 5.15 10/44] cpufreq: Exit governor when failed to start old governor
From: Sasha Levin @ 2025-08-04 0:38 UTC (permalink / raw)
To: patches, stable
Cc: Lifeng Zheng, Rafael J . Wysocki, Sasha Levin, rafael,
viresh.kumar, linux-pm
From: Lifeng Zheng <zhenglifeng1@huawei.com>
[ Upstream commit 0ae204405095abfbc2d694ee0fbb49bcbbe55c57 ]
Detect the result of starting old governor in cpufreq_set_policy(). If it
fails, exit the governor and clear policy->governor.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250709104145.2348017-5-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my
extensive analysis:
## Bug Analysis
The commit fixes a **resource leak and inconsistent state bug** in the
cpufreq governor switching code. Looking at the code changes in
`cpufreq_set_policy()`:
### The Bug:
In the original code (cited here as lines 2716-2721; the hunk for this
5.15 tree starts at line 2605), when switching governors fails and we
need to restore the old governor:
```c
if (old_gov) {
policy->governor = old_gov;
if (cpufreq_init_governor(policy))
policy->governor = NULL;
else
cpufreq_start_governor(policy); // Bug: no error handling here
}
```
If `cpufreq_start_governor()` fails, the code doesn't handle the error.
This leaves the system in an **inconsistent state** where:
1. The governor is initialized (`cpufreq_init_governor` succeeded)
2. But the governor failed to start (`cpufreq_start_governor` failed)
3. The policy still points to a non-functional governor
4. Resources allocated during `cpufreq_init_governor` are **leaked**
(module reference count, governor's init() allocations)
### The Fix:
```c
if (cpufreq_init_governor(policy)) {
policy->governor = NULL;
} else if (cpufreq_start_governor(policy)) {
cpufreq_exit_governor(policy); // NEW: Clean up on failure
policy->governor = NULL; // NEW: Clear the governor pointer
}
```
## Why This Should Be Backported:
1. **Fixes a Real Bug**: This addresses a resource leak where
`cpufreq_init_governor()` acquires resources (notably
`try_module_get()` at line 2442 and potential governor->init()
allocations) that aren't cleaned up if `cpufreq_start_governor()`
fails.
2. **Small and Contained Fix**: The change is minimal - 5 insertions and
3 deletions according to the diffstat, all adding proper error
handling. It doesn't change any APIs or introduce new functionality.
3. **Prevents System Instability**: Leaving the cpufreq subsystem in an
inconsistent state (initialized but not started governor) could lead
to:
- Module reference count leaks
- Memory leaks from governor init allocations
- Potential crashes if the partially-initialized governor is accessed
later
4. **Error Path Fix**: This is clearly an error handling path that was
incorrectly implemented. The pattern of calling
`cpufreq_exit_governor()` after a failed `cpufreq_start_governor()`
is already used elsewhere in the code (line 2711).
5. **No Architectural Changes**: The fix simply adds missing cleanup
code in an error path. It doesn't change the normal operation flow or
introduce new features.
6. **Critical Subsystem**: The cpufreq subsystem is critical for power
management and system stability. Bugs here can affect system
reliability.
The commit follows the stable kernel rules perfectly - it's a clear bug
fix that prevents resource leaks and system instability, with minimal
risk of regression since it only affects an error path that was already
broken.
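The corrected error path can be exercised with a small user-space model. This is a sketch, not the kernel code: `model_gov`, `model_gov_policy` and the `model_*_governor()` helpers are hypothetical stand-ins, and the reference count is assumed to represent whatever `cpufreq_init_governor()` acquires.

```c
#include <stddef.h>

struct model_gov {
	int refcount;    /* stands in for resources taken at init time */
	int start_fails; /* nonzero makes the modelled start step fail */
};

struct model_gov_policy {
	struct model_gov *governor;
};

int model_init_governor(struct model_gov_policy *p)
{
	p->governor->refcount++;
	return 0;
}

int model_start_governor(struct model_gov_policy *p)
{
	return p->governor->start_fails ? -1 : 0;
}

void model_exit_governor(struct model_gov_policy *p)
{
	p->governor->refcount--;
}

/* Mirrors the shape of the fixed restore path. */
void model_restore_old_governor(struct model_gov_policy *p,
				struct model_gov *old_gov)
{
	if (!old_gov)
		return;

	p->governor = old_gov;
	if (model_init_governor(p)) {
		p->governor = NULL;
	} else if (model_start_governor(p)) {
		/* Undo init so nothing stays held by a dead governor. */
		model_exit_governor(p);
		p->governor = NULL;
	}
}
```

When the start step fails, the model ends with `governor == NULL` and a reference count of zero, which is the state the patch aims for.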
drivers/cpufreq/cpufreq.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index bbb0cbb2eb8c..4f68e34cde0b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2605,10 +2605,12 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
pr_debug("starting governor %s failed\n", policy->governor->name);
if (old_gov) {
policy->governor = old_gov;
- if (cpufreq_init_governor(policy))
+ if (cpufreq_init_governor(policy)) {
policy->governor = NULL;
- else
- cpufreq_start_governor(policy);
+ } else if (cpufreq_start_governor(policy)) {
+ cpufreq_exit_governor(policy);
+ policy->governor = NULL;
+ }
}
return ret;
--
2.39.5
* [PATCH AUTOSEL 5.15 12/44] PM / devfreq: governor: Replace sscanf() with kstrtoul() in set_freq_store()
From: Sasha Levin @ 2025-08-04 0:38 UTC (permalink / raw)
To: patches, stable
Cc: Lifeng Zheng, Chanwoo Choi, Sasha Levin, myungjoo.ham,
kyungmin.park, linux-pm
From: Lifeng Zheng <zhenglifeng1@huawei.com>
[ Upstream commit 914cc799b28f17d369d5b4db3b941957d18157e8 ]
Replace sscanf() with kstrtoul() in set_freq_store() and check the result
to avoid invalid input.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://lore.kernel.org/lkml/20250421030020.3108405-2-zhenglifeng1@huawei.com/
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
## Backport Status: YES
This commit should be backported to stable kernel trees for the
following reasons:
1. **Security Issue - Input Validation Bug**: The original code uses
`sscanf(buf, "%lu", &wanted)` at line 45 without checking the return
value. This is a security vulnerability because:
- `sscanf()` returns the number of successfully parsed items
- If the input is malformed or contains invalid characters,
`sscanf()` may return 0 but leave `wanted` uninitialized
- The code then proceeds to use this potentially uninitialized value
in `data->user_frequency = wanted`
- This could lead to setting arbitrary frequency values from
uninitialized stack data
2. **Real User Impact**: The devfreq userspace governor allows users to
manually set device frequencies through sysfs. Invalid input handling
could cause:
- Setting incorrect frequencies based on uninitialized memory
- Potential system instability if invalid frequencies are applied to
hardware
- Security implications as uninitialized stack data could be used
3. **Small and Contained Fix**: The change is minimal and low-risk:
- Replaces `sscanf()` with `kstrtoul()` which has proper error
checking
- Adds explicit error handling that returns early on invalid input
- The change is confined to a single function (`set_freq_store()`)
- No architectural changes or new features added
4. **Follows Kernel Best Practices**: The kernel has been systematically
replacing `sscanf()` with `kstrto*()` functions for better input
validation. This is evident from similar commits found in the git log
(e.g., commit a5556fa1107d for asus-wmi).
5. **Critical Subsystem**: While devfreq might not be as critical as
core memory management, it controls device frequency scaling which
can affect:
- Power management
- System performance
- Hardware stability
6. **No Negative Side Effects**: The change only adds proper validation
and doesn't modify the core functionality. Valid inputs will continue
to work exactly as before.
The commit fixes a clear bug (missing input validation) that could lead
to undefined behavior and potential security issues, making it an
excellent candidate for stable backporting according to stable tree
rules.
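The difference between unchecked and checked parsing can be illustrated in user space. `kstrtoul()` is kernel-only, so the sketch below uses `strtoul()` with explicit checks as a rough analog; `model_parse_ulong()` is a hypothetical helper and is not claimed to match `kstrtoul()` in every corner case.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Strict parse: the whole string must be one unsigned number,
 * optionally followed by a single newline. On failure *out is
 * left untouched and a negative errno value is returned.
 */
int model_parse_ulong(const char *s, unsigned long *out)
{
	char *end;
	unsigned long v;

	/* strtoul() would skip whitespace and accept a sign; reject both. */
	if (!isdigit((unsigned char)s[0]))
		return -EINVAL;

	errno = 0;
	v = strtoul(s, &end, 0); /* base 0: decimal, 0x hex, leading-0 octal */
	if (errno == ERANGE)
		return -ERANGE;

	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*out = v;
	return 0;
}
```

A caller that returns early on a nonzero result never stores a value that was not actually parsed, which is the property the patch adds to `set_freq_store()`.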
drivers/devfreq/governor_userspace.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/devfreq/governor_userspace.c b/drivers/devfreq/governor_userspace.c
index d69672ccacc4..8d057cea09d5 100644
--- a/drivers/devfreq/governor_userspace.c
+++ b/drivers/devfreq/governor_userspace.c
@@ -9,6 +9,7 @@
#include <linux/slab.h>
#include <linux/device.h>
#include <linux/devfreq.h>
+#include <linux/kstrtox.h>
#include <linux/pm.h>
#include <linux/mutex.h>
#include <linux/module.h>
@@ -39,10 +40,13 @@ static ssize_t set_freq_store(struct device *dev, struct device_attribute *attr,
unsigned long wanted;
int err = 0;
+ err = kstrtoul(buf, 0, &wanted);
+ if (err)
+ return err;
+
mutex_lock(&devfreq->lock);
data = devfreq->governor_data;
- sscanf(buf, "%lu", &wanted);
data->user_frequency = wanted;
data->valid = true;
err = update_devfreq(devfreq);
--
2.39.5
* [PATCH AUTOSEL 5.15 15/44] thermal/drivers/qcom-spmi-temp-alarm: Enable stage 2 shutdown when required
From: Sasha Levin @ 2025-08-04 0:38 UTC (permalink / raw)
To: patches, stable
Cc: David Collins, Anjelique Melendez, Daniel Lezcano, Sasha Levin,
amitk, thara.gopinath, linux-pm, linux-arm-msm
From: David Collins <david.collins@oss.qualcomm.com>
[ Upstream commit f8e157ff2df46ddabd930815d196895976227831 ]
Certain TEMP_ALARM GEN2 PMIC peripherals need over-temperature stage 2
automatic PMIC partial shutdown. This will ensure that in the event of
reaching the hotter stage 3 over-temperature threshold, repeated faults
will be avoided during the automatic PMIC hardware full shutdown.
Modify the stage 2 shutdown control logic to ensure that stage 2
shutdown is enabled on all affected PMICs. Read the digital major
and minor revision registers to identify these PMICs.
Signed-off-by: David Collins <david.collins@oss.qualcomm.com>
Signed-off-by: Anjelique Melendez <anjelique.melendez@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250710224555.3047790-2-anjelique.melendez@oss.qualcomm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my
detailed analysis:
## Key Reasons for Backporting:
### 1. **Fixes a Hardware Safety Issue**
The commit addresses a critical hardware protection mechanism where
certain PMIC peripherals require stage 2 automatic partial shutdown to
prevent "repeated faults" during stage 3 over-temperature events.
Without this fix, affected hardware could experience repeated thermal
faults during critical temperature conditions, potentially leading to
system instability or hardware damage.
### 2. **Targeted Hardware-Specific Fix**
The code changes identify specific PMIC revisions that require this
protection:
```c
switch (dig_revision) {
case 0x0001:
case 0x0002:
case 0x0100:
case 0x0101:
chip->require_stage2_shutdown = true;
break;
}
```
This shows it's a targeted fix for known hardware issues, not a general
enhancement.
### 3. **Small and Contained Change**
The fix is minimal and self-contained:
- Adds reading of DIG_MINOR register
- Adds a `require_stage2_shutdown` flag to the chip structure
- Modifies the logic in `qpnp_tm_update_critical_trip_temp()` to respect
this flag
- Total change is about 30 lines of code with clear boundaries
### 4. **Low Risk of Regression**
- The change only affects specific PMIC revisions (0x0001, 0x0002,
0x0100, 0x0101)
- For other hardware, behavior remains unchanged
- The modification is purely additive - it doesn't alter existing
functionality for unaffected devices
### 5. **Prevents Hardware Malfunction**
The code comment added by the patch states that stage 2 shutdown must
remain enabled to "avoid potential repeated faults upon reaching
over-temperature stage 3." This indicates a real hardware malfunction
scenario that users could encounter without this patch.
### 6. **Follows Stable Tree Rules**
According to stable kernel rules, this qualifies because it:
- Fixes a real bug (hardware protection failure)
- Is not a new feature
- Has minimal changes
- Addresses a specific hardware issue that affects users
### Technical Analysis:
The key change is in the `qpnp_tm_update_critical_trip_temp()` function
where the condition is modified from:
```c
if (disable_s2_shutdown)
reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
```
to:
```c
if (disable_stage2_shutdown && !chip->require_stage2_shutdown)
reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;
```
This ensures that for the affected PMIC revisions, stage 2 shutdown is
never disabled, providing the necessary hardware protection against
thermal events.
Given that this fixes a hardware-specific thermal protection issue that
could lead to system instability or damage, and the fix is minimal with
low regression risk, this commit is an excellent candidate for stable
backporting.
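The revision check and the register gating can be sketched as two pure functions. This is an illustrative user-space model: the function names and `MODEL_OVERRIDE_STAGE2` are hypothetical, the bit position follows the patch's `BIT(6)` definition, and the caller is assumed to have already confirmed a GEN2 subtype.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 6, matching SHUTDOWN_CTRL1_OVERRIDE_STAGE2 in the patch. */
#define MODEL_OVERRIDE_STAGE2 (1u << 6)

/* Combine major/minor as the patch does and match the listed revisions. */
bool model_requires_stage2_shutdown(uint8_t dig_major, uint8_t dig_minor)
{
	uint32_t dig_revision = ((uint32_t)dig_major << 8) | dig_minor;

	switch (dig_revision) {
	case 0x0001:
	case 0x0002:
	case 0x0100:
	case 0x0101:
		return true;
	default:
		return false;
	}
}

/* Set the override bit only when stage 2 shutdown may be disabled. */
uint8_t model_apply_override(uint8_t reg, bool disable_stage2,
			     bool require_stage2)
{
	if (disable_stage2 && !require_stage2)
		reg |= MODEL_OVERRIDE_STAGE2;
	return reg;
}
```

For an affected revision the register value comes back unchanged even when the caller asks to disable stage 2 shutdown.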
drivers/thermal/qcom/qcom-spmi-temp-alarm.c | 43 ++++++++++++++++-----
1 file changed, 34 insertions(+), 9 deletions(-)
diff --git a/drivers/thermal/qcom/qcom-spmi-temp-alarm.c b/drivers/thermal/qcom/qcom-spmi-temp-alarm.c
index 1037de19873a..f466bbfa128d 100644
--- a/drivers/thermal/qcom/qcom-spmi-temp-alarm.c
+++ b/drivers/thermal/qcom/qcom-spmi-temp-alarm.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright (c) 2011-2015, 2017, 2020, The Linux Foundation. All rights reserved.
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
*/
#include <linux/bitops.h>
@@ -17,6 +18,7 @@
#include "../thermal_core.h"
+#define QPNP_TM_REG_DIG_MINOR 0x00
#define QPNP_TM_REG_DIG_MAJOR 0x01
#define QPNP_TM_REG_TYPE 0x04
#define QPNP_TM_REG_SUBTYPE 0x05
@@ -32,7 +34,7 @@
#define STATUS_GEN2_STATE_MASK GENMASK(6, 4)
#define STATUS_GEN2_STATE_SHIFT 4
-#define SHUTDOWN_CTRL1_OVERRIDE_S2 BIT(6)
+#define SHUTDOWN_CTRL1_OVERRIDE_STAGE2 BIT(6)
#define SHUTDOWN_CTRL1_THRESHOLD_MASK GENMASK(1, 0)
#define SHUTDOWN_CTRL1_RATE_25HZ BIT(3)
@@ -80,6 +82,7 @@ struct qpnp_tm_chip {
/* protects .thresh, .stage and chip registers */
struct mutex lock;
bool initialized;
+ bool require_stage2_shutdown;
struct iio_channel *adc;
const long (*temp_map)[THRESH_COUNT][STAGE_COUNT];
@@ -222,13 +225,13 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
{
long stage2_threshold_min = (*chip->temp_map)[THRESH_MIN][1];
long stage2_threshold_max = (*chip->temp_map)[THRESH_MAX][1];
- bool disable_s2_shutdown = false;
+ bool disable_stage2_shutdown = false;
u8 reg;
WARN_ON(!mutex_is_locked(&chip->lock));
/*
- * Default: S2 and S3 shutdown enabled, thresholds at
+ * Default: Stage 2 and Stage 3 shutdown enabled, thresholds at
* lowest threshold set, monitoring at 25Hz
*/
reg = SHUTDOWN_CTRL1_RATE_25HZ;
@@ -243,12 +246,12 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
chip->thresh = THRESH_MAX -
((stage2_threshold_max - temp) /
TEMP_THRESH_STEP);
- disable_s2_shutdown = true;
+ disable_stage2_shutdown = true;
} else {
chip->thresh = THRESH_MAX;
if (chip->adc)
- disable_s2_shutdown = true;
+ disable_stage2_shutdown = true;
else
dev_warn(chip->dev,
"No ADC is configured and critical temperature %d mC is above the maximum stage 2 threshold of %ld mC! Configuring stage 2 shutdown at %ld mC.\n",
@@ -257,8 +260,8 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
skip:
reg |= chip->thresh;
- if (disable_s2_shutdown)
- reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
+ if (disable_stage2_shutdown && !chip->require_stage2_shutdown)
+ reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;
return qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg);
}
@@ -372,8 +375,8 @@ static int qpnp_tm_probe(struct platform_device *pdev)
{
struct qpnp_tm_chip *chip;
struct device_node *node;
- u8 type, subtype, dig_major;
- u32 res;
+ u8 type, subtype, dig_major, dig_minor;
+ u32 res, dig_revision;
int ret, irq;
node = pdev->dev.of_node;
@@ -428,6 +431,11 @@ static int qpnp_tm_probe(struct platform_device *pdev)
return ret;
}
+ ret = qpnp_tm_read(chip, QPNP_TM_REG_DIG_MINOR, &dig_minor);
+ if (ret < 0)
+ return dev_err_probe(&pdev->dev, ret,
+ "could not read dig_minor\n");
+
if (type != QPNP_TM_TYPE || (subtype != QPNP_TM_SUBTYPE_GEN1
&& subtype != QPNP_TM_SUBTYPE_GEN2)) {
dev_err(&pdev->dev, "invalid type 0x%02x or subtype 0x%02x\n",
@@ -441,6 +449,23 @@ static int qpnp_tm_probe(struct platform_device *pdev)
else
chip->temp_map = &temp_map_gen1;
+ if (chip->subtype == QPNP_TM_SUBTYPE_GEN2) {
+ dig_revision = (dig_major << 8) | dig_minor;
+ /*
+ * Check if stage 2 automatic partial shutdown must remain
+ * enabled to avoid potential repeated faults upon reaching
+ * over-temperature stage 3.
+ */
+ switch (dig_revision) {
+ case 0x0001:
+ case 0x0002:
+ case 0x0100:
+ case 0x0101:
+ chip->require_stage2_shutdown = true;
+ break;
+ }
+ }
+
/*
* Register the sensor before initializing the hardware to be able to
* read the trip points. get_temp() returns the default temperature
--
2.39.5
* [PATCH AUTOSEL 5.15 21/44] PM: runtime: Clear power.needs_force_resume in pm_runtime_reinit()
From: Sasha Levin @ 2025-08-04 0:38 UTC (permalink / raw)
To: patches, stable
Cc: Rafael J. Wysocki, Ulf Hansson, Sasha Levin, rafael, linux-pm
From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
[ Upstream commit 89d9cec3b1e9c49bae9375a2db6dc49bc7468af0 ]
Clear power.needs_force_resume in pm_runtime_reinit() in case it has
been set by pm_runtime_force_suspend() invoked from a driver remove
callback.
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/9495163.CDJkKcVGEf@rjwysocki.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is a clear bugfix that should be backported to stable kernel
trees for the following reasons:
1. **It fixes a real bug**: The commit addresses a specific issue where
`power.needs_force_resume` flag is not cleared in
`pm_runtime_reinit()` when a device is removed. If a driver calls
`pm_runtime_force_suspend()` from its remove callback (which sets
`needs_force_resume = 1`), this flag remains set even after the
device is removed and potentially re-probed.
2. **The fix is minimal and contained**: The change adds just 5 lines of
code (including comments) to clear a single flag. The modification
is:
```c
dev->power.needs_force_resume = false;
```
This is a very low-risk change that only affects the specific
condition being fixed.
3. **It prevents state leakage**: Looking at the code flow:
- `pm_runtime_force_suspend()` sets `dev->power.needs_force_resume =
1` (in drivers/base/power/runtime.c)
- When a driver is removed, `pm_runtime_remove()` calls
`pm_runtime_reinit()`
- Without this fix, if the device is re-probed, it would still have
`needs_force_resume = 1` from the previous instance
- This could lead to incorrect PM runtime behavior where
`pm_runtime_force_resume()` would incorrectly think it needs to
resume a device that was never suspended in the current probe cycle
4. **Related to previous stable fixes**: The git history shows a
previous commit `c745253e2a69` ("PM: runtime: Fix unpaired parent
child_count for force_resume") was already marked for stable (4.16+),
indicating that issues with the `needs_force_resume` flag have been
problematic enough to warrant stable backports.
5. **Clear bug scenario**: The commit message describes a specific
scenario where this happens - when `pm_runtime_force_suspend()` is
called from a driver remove callback. This is a legitimate use case
where drivers want to ensure devices are suspended before removal.
6. **No architectural changes**: This is purely a bugfix that clears a
flag that should have been cleared during reinitialization. It
doesn't introduce new features or change any APIs.
The fix ensures proper PM runtime state management across device removal
and re-probing cycles, which is important for system stability and
correct power management behavior.
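The state leakage across a remove and re-probe cycle can be modelled with a single flag. This is a simplified sketch: `model_dev` and the `model_*` functions are hypothetical, and the `with_fix` parameter exists only to compare the behaviour before and after the patch.

```c
#include <stdbool.h>

struct model_dev {
	bool needs_force_resume;
};

/* Modelled force-suspend: records that a later force-resume is needed. */
void model_force_suspend(struct model_dev *d)
{
	d->needs_force_resume = true;
}

/* Modelled reinit on driver removal; the fix clears the stale flag. */
void model_reinit(struct model_dev *d, bool with_fix)
{
	if (with_fix)
		d->needs_force_resume = false;
}

/* Returns true if a resume would be attempted for this device. */
bool model_force_resume(struct model_dev *d)
{
	if (!d->needs_force_resume)
		return false;

	d->needs_force_resume = false;
	return true;
}
```

Without the fix, a force-resume after re-probe acts on a flag set by the previous driver instance; with the fix, it does nothing.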
drivers/base/power/runtime.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 35e1a090ef90..26ea7f5c8d42 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1714,6 +1714,11 @@ void pm_runtime_reinit(struct device *dev)
pm_runtime_put(dev->parent);
}
}
+ /*
+ * Clear power.needs_force_resume in case it has been set by
+ * pm_runtime_force_suspend() invoked from a driver remove callback.
+ */
+ dev->power.needs_force_resume = false;
}
/**
--
2.39.5
* [PATCH AUTOSEL 5.15 22/44] thermal: sysfs: Return ENODATA instead of EAGAIN for reads
From: Sasha Levin @ 2025-08-04 0:38 UTC (permalink / raw)
To: patches, stable
Cc: Hsin-Te Yuan, Rafael J . Wysocki, Sasha Levin, rafael,
daniel.lezcano, linux-pm
From: Hsin-Te Yuan <yuanhsinte@chromium.org>
[ Upstream commit 1a4aabc27e95674837f2e25f4ef340c0469e6203 ]
According to POSIX spec, EAGAIN returned by read with O_NONBLOCK set
means the read would block. Hence, the common implementation in
nonblocking model will poll the file when the nonblocking read returns
EAGAIN. However, when the target file is thermal zone, this mechanism
will totally malfunction because thermal zone doesn't implement sysfs
notification and thus the poll will never return.
For example, the read in Golang implements such method and sometimes
hangs at reading some thermal zones via sysfs.
Change to return -ENODATA instead of -EAGAIN to userspace.
Signed-off-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Link: https://patch.msgid.link/20250620-temp-v3-1-6becc6aeb66c@chromium.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit, here is my determination:
**Backport Status: YES**
## Extensive Explanation:
This commit should be backported to stable kernel trees for the
following reasons:
### 1. **Bug Fix for Real User-Facing Issue**
The commit fixes a bug where userspace applications can hang when
reading thermal zone temperature files via sysfs with O_NONBLOCK set.
The commit message clearly states that "the read in Golang implements
such method and sometimes hangs at reading some thermal zones via
sysfs." This is a real-world problem affecting actual applications.
### 2. **POSIX Compliance Issue**
The current behavior violates POSIX semantics. According to POSIX
specifications, EAGAIN from a non-blocking read means "the read would
block" - implying that the caller should poll/select and wait for data
to become available. However, thermal zone sysfs files don't implement
poll notifications, so applications following standard POSIX patterns
will poll forever, causing hangs.
### 3. **Small and Contained Change**
The code change is minimal and localized:
```c
- if (ret)
- return ret;
+ if (!ret)
+ return sprintf(buf, "%d\n", temperature);
- return sprintf(buf, "%d\n", temperature);
+ if (ret == -EAGAIN)
+ return -ENODATA;
+
+ return ret;
```
It only affects the `temp_show()` function in `thermal_sysfs.c` by
translating -EAGAIN to -ENODATA specifically for sysfs reads.
### 4. **Low Risk of Regression**
- The change only affects error handling paths
- It doesn't modify the normal success case
- It only translates one specific error code (-EAGAIN) to another
(-ENODATA)
- ENODATA is a more appropriate error for "no data available" in a sysfs
context
### 5. **Multiple Drivers Return -EAGAIN**
My search found at least 13 thermal drivers that can return -EAGAIN from
their get_temp operations, including:
- imx8mm_thermal.c
- imx_thermal.c
- tegra-bpmp-thermal.c
- qoriq_thermal.c
- lvts_thermal.c
- rockchip_thermal.c
- exynos_tmu.c
- sun8i_thermal.c
- stm_thermal.c
- intel_powerclamp.c
This indicates the issue affects multiple platforms and thermal drivers.
### 6. **Follows Stable Kernel Rules**
- Fixes a real bug that bothers people (application hangs)
- Fix is already upstream (it carries an upstream commit ID and a
Signed-off-by from Rafael J. Wysocki)
- Small change (< 100 lines)
- Obviously correct and tested
- Doesn't change APIs or break existing functionality
### 7. **No Architectural Changes**
The commit doesn't introduce new features or change the thermal
subsystem architecture. It's purely a bug fix that makes the sysfs
interface behave correctly with non-blocking reads.
The fix is appropriate because sysfs files are not meant to be pollable
in the traditional sense - they provide instantaneous data snapshots.
Returning ENODATA instead of EAGAIN properly communicates "no data
currently available" without implying that polling would help.
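The error translation can be sketched as a standalone function. This is a user-space model with a hypothetical name and signature: `model_temp_show()` receives the result of the temperature read as an argument, and it uses `snprintf()` with an explicit buffer length where the patch uses `sprintf()`.

```c
#include <errno.h>
#include <stdio.h>

/*
 * ret:         result of the modelled temperature read (0 or -errno)
 * temperature: value to print when ret is 0
 * Returns the number of bytes written, or a negative errno value.
 */
int model_temp_show(int ret, int temperature, char *buf, size_t len)
{
	if (!ret)
		return snprintf(buf, len, "%d\n", temperature);

	/* Do not tell nonblocking readers to wait for a wakeup. */
	if (ret == -EAGAIN)
		return -ENODATA;

	return ret;
}
```

Only `-EAGAIN` is rewritten; the success path and all other error codes pass through unchanged.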
drivers/thermal/thermal_sysfs.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index de7cdec3db90..a21af02f6347 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -39,10 +39,13 @@ temp_show(struct device *dev, struct device_attribute *attr, char *buf)

 	ret = thermal_zone_get_temp(tz, &temperature);

-	if (ret)
-		return ret;
+	if (!ret)
+		return sprintf(buf, "%d\n", temperature);

-	return sprintf(buf, "%d\n", temperature);
+	if (ret == -EAGAIN)
+		return -ENODATA;
+
+	return ret;
 }

 static ssize_t
--
2.39.5
* [PATCH AUTOSEL 5.15 23/44] PM: sleep: console: Fix the black screen issue
From: Sasha Levin @ 2025-08-04 0:38 UTC (permalink / raw)
To: patches, stable
Cc: tuhaowen, Rafael J . Wysocki, Sasha Levin, rafael, linux-pm
From: tuhaowen <tuhaowen@uniontech.com>
[ Upstream commit 4266e8fa56d3d982bf451d382a410b9db432015c ]
When the computer enters a sleep state without a monitor
connected, the system switches the console to the virtual
terminal tty63 (SUSPEND_CONSOLE).
If a monitor is subsequently connected before waking up,
the system skips the required VT restoration process
during wake-up, leaving the console on tty63 instead of
switching back to tty1.
To fix this issue, a global flag vt_switch_done is introduced
to record whether the system has successfully switched to
the suspend console via vt_move_to_console() during suspend.
If the switch was completed, vt_switch_done is set to 1.
Later during resume, this flag is checked to ensure that
the original console is restored properly by calling
vt_move_to_console(orig_fgconsole, 0).
This prevents scenarios where the resume logic skips console
restoration due to incorrect detection of the console state,
especially when a monitor is reconnected before waking up.
Signed-off-by: tuhaowen <tuhaowen@uniontech.com>
Link: https://patch.msgid.link/20250611032345.29962-1-tuhaowen@uniontech.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my
extensive analysis:
**1. Bug Fix Nature:**
The commit fixes a clear user-visible bug - a black screen issue that
occurs when:
- A computer enters sleep without a monitor connected
- A monitor is connected while the system is asleep
- Upon wake, the console remains on tty63 (SUSPEND_CONSOLE) instead of
returning to tty1
This is a functional bug that impacts user experience, making the
system appear unresponsive or broken after resume.
**2. Small and Contained Fix:**
The fix is minimal and well-contained:
- Adds a single `static bool vt_switch_done` flag
- Modifies only two functions: `pm_prepare_console()` and
`pm_restore_console()`
- Total change is just 6 insertions and 1 deletion (per the diffstat)
- The logic is straightforward: track whether VT switch happened during
suspend, and ensure restoration happens during resume
**3. Low Risk of Side Effects:**
The change has minimal risk:
- The new flag (`vt_switch_done`) is only set when
`vt_move_to_console()` succeeds
- The restoration logic preserves existing behavior when
`pm_vt_switch()` returns true
- The fix only adds an additional condition `!vt_switch_done` to ensure
restoration happens even when `pm_vt_switch()` returns false during
resume
- No architectural changes or new features are introduced
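The effect of the extra `!vt_switch_done` condition can be traced with a small userspace model. This is only a sketch under stated assumptions: `struct vt_model`, `model_prepare()`, and `model_restore()` are hypothetical names, `switch_required` stands in for the result of `pm_vt_switch()`, the `use_flag` parameter toggles between the old and the patched condition, and console index 62 is an illustrative value for tty63.

```c
#include <stdbool.h>

#define MODEL_SUSPEND_CONSOLE 62	/* illustrative index for tty63 */

/* Hypothetical model of the state kept in kernel/power/console.c. */
struct vt_model {
	bool switch_required;	/* stands in for pm_vt_switch() */
	bool vt_switch_done;	/* flag introduced by the patch */
	int fg_console;		/* currently active console */
	int orig_fgconsole;	/* console saved at suspend, -1 if none */
};

/* Models pm_prepare_console(): switch away and record that we did. */
void model_prepare(struct vt_model *m)
{
	if (!m->switch_required)
		return;

	m->orig_fgconsole = m->fg_console;
	m->fg_console = MODEL_SUSPEND_CONSOLE;
	m->vt_switch_done = true;
}

/* Models pm_restore_console(); use_flag=false mimics the old check. */
void model_restore(struct vt_model *m, bool use_flag)
{
	bool done = use_flag && m->vt_switch_done;

	if (!m->switch_required && !done)
		return;

	if (m->orig_fgconsole >= 0)
		m->fg_console = m->orig_fgconsole;

	m->vt_switch_done = false;
}
```

If `switch_required` flips from true to false between `model_prepare()` and `model_restore()`, the old condition returns early and leaves `fg_console` on the suspend console, whereas the patched condition restores the saved console and clears the flag.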
**4. Critical Subsystem but Safe Change:**
While this touches the PM/console subsystem (critical for
suspend/resume), the change is safe:
- It doesn't modify the core VT switching mechanism
- It only ensures the existing restoration path is executed when needed
- The file has been stable since 2013 (last major change was
f43f627d2f17e)
**5. Clear Problem Description:**
The commit message clearly describes:
- The specific scenario that triggers the bug
- The root cause (skipped VT restoration)
- The solution (tracking VT switch state)
**6. Follows Stable Rules:**
This aligns perfectly with stable tree criteria:
- Fixes a real bug affecting users
- Minimal change (6 insertions, 1 deletion)
- No new features
- Clear fix for a specific problem
- Low regression risk
The commit addresses a real-world scenario where users connecting
monitors to sleeping systems experience black screens upon wake - a
frustrating issue that makes systems appear broken. The fix is surgical,
adding only the minimum tracking needed to ensure proper console
restoration.
kernel/power/console.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/kernel/power/console.c b/kernel/power/console.c
index fcdf0e14a47d..19c48aa5355d 100644
--- a/kernel/power/console.c
+++ b/kernel/power/console.c
@@ -16,6 +16,7 @@
 #define SUSPEND_CONSOLE (MAX_NR_CONSOLES-1)

 static int orig_fgconsole, orig_kmsg;
+static bool vt_switch_done;

 static DEFINE_MUTEX(vt_switch_mutex);

@@ -136,17 +137,21 @@ void pm_prepare_console(void)
 	if (orig_fgconsole < 0)
 		return;

+	vt_switch_done = true;
+
 	orig_kmsg = vt_kmsg_redirect(SUSPEND_CONSOLE);
 	return;
 }

 void pm_restore_console(void)
 {
-	if (!pm_vt_switch())
+	if (!pm_vt_switch() && !vt_switch_done)
 		return;

 	if (orig_fgconsole >= 0) {
 		vt_move_to_console(orig_fgconsole, 0);
 		vt_kmsg_redirect(orig_kmsg);
 	}
+
+	vt_switch_done = false;
 }
--
2.39.5