* [PATCH AUTOSEL 5.15 15/44] thermal/drivers/qcom-spmi-temp-alarm: Enable stage 2 shutdown when required
[not found] <20250804003849.3627024-1-sashal@kernel.org>
@ 2025-08-04 0:38 ` Sasha Levin
2025-08-04 0:38 ` [PATCH AUTOSEL 5.15 25/44] mmc: sdhci-msm: Ensure SD card power isn't ON when card removed Sasha Levin
1 sibling, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2025-08-04 0:38 UTC
To: patches, stable
Cc: David Collins, Anjelique Melendez, Daniel Lezcano, Sasha Levin,
amitk, thara.gopinath, linux-pm, linux-arm-msm
From: David Collins <david.collins@oss.qualcomm.com>
[ Upstream commit f8e157ff2df46ddabd930815d196895976227831 ]
Certain TEMP_ALARM GEN2 PMIC peripherals need over-temperature stage 2
automatic PMIC partial shutdown. This will ensure that in the event of
reaching the hotter stage 3 over-temperature threshold, repeated faults
will be avoided during the automatic PMIC hardware full shutdown.
Modify the stage 2 shutdown control logic to ensure that stage 2
shutdown is enabled on all affected PMICs. Read the digital major
and minor revision registers to identify these PMICs.
Signed-off-by: David Collins <david.collins@oss.qualcomm.com>
Signed-off-by: Anjelique Melendez <anjelique.melendez@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250710224555.3047790-2-anjelique.melendez@oss.qualcomm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my
detailed analysis:
## Key Reasons for Backporting:
### 1. **Fixes a Hardware Safety Issue**
The commit addresses a hardware protection mechanism: certain TEMP_ALARM
GEN2 PMIC peripherals need stage 2 automatic partial shutdown so that
repeated faults are avoided during the automatic hardware full shutdown
at the stage 3 threshold. Without this fix, the driver can disable
stage 2 shutdown on affected hardware. The commit message does not
describe the consequences of the repeated faults; system instability or
hardware damage is a plausible but unconfirmed outcome.
### 2. **Targeted Hardware-Specific Fix**
The code changes identify specific PMIC revisions that require this
protection:
```c
switch (dig_revision) {
case 0x0001:
case 0x0002:
case 0x0100:
case 0x0101:
	chip->require_stage2_shutdown = true;
	break;
}
```
This shows it's a targeted fix for known hardware issues, not a general
enhancement.
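As a rough illustration of how the revision check works, the sketch
below packs the two 8-bit revision registers into one value the same
way the patch does, `(dig_major << 8) | dig_minor`, and tests it against
the four listed values. The helper names `pack_dig_revision()` and
`requires_stage2_shutdown()` are hypothetical and exist only for this
example; the driver performs the check inline in `qpnp_tm_probe()`, and
only for the GEN2 subtype.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: combine major and minor revision bytes as the patch does. */
static uint32_t pack_dig_revision(uint8_t dig_major, uint8_t dig_minor)
{
	return ((uint32_t)dig_major << 8) | dig_minor;
}

/* Hypothetical helper: true only for the four revisions listed in the patch. */
static bool requires_stage2_shutdown(uint8_t dig_major, uint8_t dig_minor)
{
	switch (pack_dig_revision(dig_major, dig_minor)) {
	case 0x0001:
	case 0x0002:
	case 0x0100:
	case 0x0101:
		return true;
	default:
		return false;
	}
}
```

Under this packing, 0x0100 corresponds to major revision 1 and minor
revision 0, so the four cases cover revisions 0.1, 0.2, 1.0 and 1.1.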
### 3. **Small and Contained Change**
The fix is minimal and self-contained:
- Adds reading of DIG_MINOR register
- Adds a `require_stage2_shutdown` flag to the chip structure
- Modifies the logic in `qpnp_tm_update_critical_trip_temp()` to respect
this flag
- Total change is 34 insertions and 9 deletions in one file, several of
which only rename `s2` identifiers to `stage2`
### 4. **Low Risk of Regression**
- The new flag is set only for GEN2 peripherals with digital revision
0x0001, 0x0002, 0x0100, or 0x0101
- For other hardware, the shutdown control behavior remains unchanged
- The one change that reaches every device is an extra read of the
DIG_MINOR register in probe; if that read fails, probe now returns an
error
### 5. **Prevents Hardware Malfunction**
A code comment added by the patch states that stage 2 shutdown must
remain enabled "to avoid potential repeated faults upon reaching
over-temperature stage 3," and the commit message makes the same point
in different words. This indicates a real hardware malfunction scenario
that users could encounter without this patch.
### 6. **Follows Stable Tree Rules**
According to stable kernel rules, this qualifies because it:
- Fixes a real bug (hardware protection failure)
- Is not a new feature
- Has minimal changes
- Addresses a specific hardware issue that affects users
### Technical Analysis:
The key change is in the `qpnp_tm_update_critical_trip_temp()` function
where the condition is modified from:
```c
if (disable_s2_shutdown)
	reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
```
to:
```c
if (disable_stage2_shutdown && !chip->require_stage2_shutdown)
	reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;
```
This ensures that for the affected PMIC revisions, stage 2 shutdown is
never disabled, providing the necessary hardware protection against
thermal events.
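To make the effect on the register value concrete, here is a minimal
sketch of how the SHUTDOWN_CTRL1 byte is assembled at the end of
`qpnp_tm_update_critical_trip_temp()`. The bit positions are taken from
the defines visible in the diff. The function name
`compose_shutdown_ctrl1()` is hypothetical, and masking the threshold
with `SHUTDOWN_CTRL1_THRESHOLD_MASK` is an assumption added for safety
in this standalone example; the driver ORs in `chip->thresh` directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the defines shown in the patch. */
#define SHUTDOWN_CTRL1_OVERRIDE_STAGE2	(1u << 6)	/* BIT(6) */
#define SHUTDOWN_CTRL1_RATE_25HZ	(1u << 3)	/* BIT(3) */
#define SHUTDOWN_CTRL1_THRESHOLD_MASK	0x03u		/* GENMASK(1, 0) */

/* Hypothetical helper mirroring the register composition in the patch. */
static uint8_t compose_shutdown_ctrl1(uint8_t thresh,
				      bool disable_stage2_shutdown,
				      bool require_stage2_shutdown)
{
	/* Default: monitoring at 25Hz, stage 2 and stage 3 shutdown enabled. */
	uint8_t reg = SHUTDOWN_CTRL1_RATE_25HZ;

	/* Assumption: mask the threshold so it cannot spill into other bits. */
	reg |= thresh & SHUTDOWN_CTRL1_THRESHOLD_MASK;

	/* The override bit is set only if the PMIC does not require stage 2. */
	if (disable_stage2_shutdown && !require_stage2_shutdown)
		reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;

	return reg;
}
```

With the threshold set to 3 and a request to disable stage 2 shutdown,
an unaffected PMIC gets 0x4b while an affected one gets 0x0b, that is,
the same value with the override bit left clear.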
Given that this fixes a hardware-specific thermal protection issue, and
the fix is small with low regression risk, this commit is a good
candidate for stable backporting.
drivers/thermal/qcom/qcom-spmi-temp-alarm.c | 43 ++++++++++++++++-----
1 file changed, 34 insertions(+), 9 deletions(-)
diff --git a/drivers/thermal/qcom/qcom-spmi-temp-alarm.c b/drivers/thermal/qcom/qcom-spmi-temp-alarm.c
index 1037de19873a..f466bbfa128d 100644
--- a/drivers/thermal/qcom/qcom-spmi-temp-alarm.c
+++ b/drivers/thermal/qcom/qcom-spmi-temp-alarm.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright (c) 2011-2015, 2017, 2020, The Linux Foundation. All rights reserved.
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
*/
#include <linux/bitops.h>
@@ -17,6 +18,7 @@
#include "../thermal_core.h"
+#define QPNP_TM_REG_DIG_MINOR 0x00
#define QPNP_TM_REG_DIG_MAJOR 0x01
#define QPNP_TM_REG_TYPE 0x04
#define QPNP_TM_REG_SUBTYPE 0x05
@@ -32,7 +34,7 @@
#define STATUS_GEN2_STATE_MASK GENMASK(6, 4)
#define STATUS_GEN2_STATE_SHIFT 4
-#define SHUTDOWN_CTRL1_OVERRIDE_S2 BIT(6)
+#define SHUTDOWN_CTRL1_OVERRIDE_STAGE2 BIT(6)
#define SHUTDOWN_CTRL1_THRESHOLD_MASK GENMASK(1, 0)
#define SHUTDOWN_CTRL1_RATE_25HZ BIT(3)
@@ -80,6 +82,7 @@ struct qpnp_tm_chip {
/* protects .thresh, .stage and chip registers */
struct mutex lock;
bool initialized;
+ bool require_stage2_shutdown;
struct iio_channel *adc;
const long (*temp_map)[THRESH_COUNT][STAGE_COUNT];
@@ -222,13 +225,13 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
{
long stage2_threshold_min = (*chip->temp_map)[THRESH_MIN][1];
long stage2_threshold_max = (*chip->temp_map)[THRESH_MAX][1];
- bool disable_s2_shutdown = false;
+ bool disable_stage2_shutdown = false;
u8 reg;
WARN_ON(!mutex_is_locked(&chip->lock));
/*
- * Default: S2 and S3 shutdown enabled, thresholds at
+ * Default: Stage 2 and Stage 3 shutdown enabled, thresholds at
* lowest threshold set, monitoring at 25Hz
*/
reg = SHUTDOWN_CTRL1_RATE_25HZ;
@@ -243,12 +246,12 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
chip->thresh = THRESH_MAX -
((stage2_threshold_max - temp) /
TEMP_THRESH_STEP);
- disable_s2_shutdown = true;
+ disable_stage2_shutdown = true;
} else {
chip->thresh = THRESH_MAX;
if (chip->adc)
- disable_s2_shutdown = true;
+ disable_stage2_shutdown = true;
else
dev_warn(chip->dev,
"No ADC is configured and critical temperature %d mC is above the maximum stage 2 threshold of %ld mC! Configuring stage 2 shutdown at %ld mC.\n",
@@ -257,8 +260,8 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
skip:
reg |= chip->thresh;
- if (disable_s2_shutdown)
- reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
+ if (disable_stage2_shutdown && !chip->require_stage2_shutdown)
+ reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;
return qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg);
}
@@ -372,8 +375,8 @@ static int qpnp_tm_probe(struct platform_device *pdev)
{
struct qpnp_tm_chip *chip;
struct device_node *node;
- u8 type, subtype, dig_major;
- u32 res;
+ u8 type, subtype, dig_major, dig_minor;
+ u32 res, dig_revision;
int ret, irq;
node = pdev->dev.of_node;
@@ -428,6 +431,11 @@ static int qpnp_tm_probe(struct platform_device *pdev)
return ret;
}
+ ret = qpnp_tm_read(chip, QPNP_TM_REG_DIG_MINOR, &dig_minor);
+ if (ret < 0)
+ return dev_err_probe(&pdev->dev, ret,
+ "could not read dig_minor\n");
+
if (type != QPNP_TM_TYPE || (subtype != QPNP_TM_SUBTYPE_GEN1
&& subtype != QPNP_TM_SUBTYPE_GEN2)) {
dev_err(&pdev->dev, "invalid type 0x%02x or subtype 0x%02x\n",
@@ -441,6 +449,23 @@ static int qpnp_tm_probe(struct platform_device *pdev)
else
chip->temp_map = &temp_map_gen1;
+ if (chip->subtype == QPNP_TM_SUBTYPE_GEN2) {
+ dig_revision = (dig_major << 8) | dig_minor;
+ /*
+ * Check if stage 2 automatic partial shutdown must remain
+ * enabled to avoid potential repeated faults upon reaching
+ * over-temperature stage 3.
+ */
+ switch (dig_revision) {
+ case 0x0001:
+ case 0x0002:
+ case 0x0100:
+ case 0x0101:
+ chip->require_stage2_shutdown = true;
+ break;
+ }
+ }
+
/*
* Register the sensor before initializing the hardware to be able to
* read the trip points. get_temp() returns the default temperature
--
2.39.5
* [PATCH AUTOSEL 5.15 25/44] mmc: sdhci-msm: Ensure SD card power isn't ON when card removed
From: Sasha Levin @ 2025-08-04 0:38 UTC
To: patches, stable
Cc: Sarthak Garg, Adrian Hunter, Ulf Hansson, Sasha Levin,
linux-arm-msm, linux-mmc
From: Sarthak Garg <quic_sartgarg@quicinc.com>
[ Upstream commit db58532188ebf51d52b1d7693d9e94c76b926e9f ]
Many mobile phones feature multi-card tray designs, where the same
tray is used for both SD and SIM cards. If the SD card is placed
at the outermost location in the tray, the SIM card may come in
contact with SD card power-supply while removing the tray, possibly
resulting in SIM damage.
To prevent that, make sure the SD card is really inserted by reading
the Card Detect pin state. If it's not, turn off the power in
sdhci_msm_check_power_status() and also set the BUS_FAIL power state
on the controller as part of pwr_irq handling for BUS_ON request.
Signed-off-by: Sarthak Garg <quic_sartgarg@quicinc.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250701100659.3310386-1-quic_sartgarg@quicinc.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Hardware Damage Prevention**: This is a critical safety fix that
prevents potential hardware damage to SIM cards. The commit message
clearly describes a real-world scenario where SD card power can
damage SIM cards in multi-card tray designs when the tray is removed.
This is a serious hardware safety issue that affects end users'
devices.
2. **Bug Fix Nature**: The commit fixes a specific bug where SD card
power remains ON even when the card is physically removed. The code
changes show two key additions:
- In `sdhci_msm_check_power_status()`: for a `REQ_BUS_ON` request with
a registered card (`mmc->card`) that is no longer detected
(`!mmc->ops->get_cd(mmc)`), power is turned off by writing 0 to
`SDHCI_POWER_CONTROL` and clearing `host->pwr`
- In `sdhci_msm_handle_pwr_irq()`: under the same card conditions, a
`CORE_PWRCTL_BUS_ON` interrupt is answered with
`CORE_PWRCTL_BUS_FAIL` and the handler returns early
3. **Small and Contained Fix**: The changes are minimal and well-
contained:
- Only 14 inserted lines, per the diffstat
- Changes are localized to the sdhci-msm driver
- No architectural changes or new features
- Simple logic additions that check card presence before power
operations
4. **Low Risk of Regression**: The fix adds defensive checks that only
activate when all of the following hold:
- A bus power-on is being requested (`REQ_BUS_ON` or
`CORE_PWRCTL_BUS_ON`)
- A card was previously registered (`mmc->card` is set)
- The card is no longer detected via `get_cd`
Normal operation with a properly inserted card is not affected.
5. **Platform-Specific Critical Fix**: This affects Qualcomm MSM-based
devices which are widely used in mobile phones. The multi-card tray
design mentioned is common in many smartphones, making this a
widespread potential issue.
6. **Clear Problem and Solution**: The commit has a clear problem
statement (SIM damage from SD power) and a straightforward solution
(turn off power when card is removed). This makes it easy to verify
the fix is correct.
The commit follows stable tree rules: it is an important bug fix that
prevents hardware damage, has minimal code changes, doesn't introduce
new features, and has very low regression risk. This is the type of
safety-relevant fix that stable kernels should include.
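The guard that both hunks share can be sketched as a small predicate.
This is an illustration only: the function name
`should_cut_card_power()` is hypothetical, the two boolean parameters
stand in for `mmc->card` being non-NULL and for the result of
`mmc->ops->get_cd(mmc)`, and the value of `REQ_BUS_ON` is assumed here
because the diff does not show its definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration; the diff does not show this define. */
#define REQ_BUS_ON	(1u << 1)

/*
 * Hypothetical predicate: power should be cut only when a bus power-on
 * was requested, a card had been registered, and card detect now
 * reports that no card is present.
 */
static bool should_cut_card_power(uint32_t req_type, bool card_registered,
				  bool card_detected)
{
	return (req_type & REQ_BUS_ON) && card_registered && !card_detected;
}
```

All three conditions must hold, so a request other than bus power-on,
a slot that never had a card registered, or a card that is still
detected leaves the power state untouched.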
drivers/mmc/host/sdhci-msm.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index 4b727754d8e3..8fb2ba20e221 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -1560,6 +1560,7 @@ static void sdhci_msm_check_power_status(struct sdhci_host *host, u32 req_type)
{
struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
struct sdhci_msm_host *msm_host = sdhci_pltfm_priv(pltfm_host);
+ struct mmc_host *mmc = host->mmc;
bool done = false;
u32 val = SWITCHABLE_SIGNALING_VOLTAGE;
const struct sdhci_msm_offset *msm_offset =
@@ -1617,6 +1618,12 @@ static void sdhci_msm_check_power_status(struct sdhci_host *host, u32 req_type)
"%s: pwr_irq for req: (%d) timed out\n",
mmc_hostname(host->mmc), req_type);
}
+
+ if ((req_type & REQ_BUS_ON) && mmc->card && !mmc->ops->get_cd(mmc)) {
+ sdhci_writeb(host, 0, SDHCI_POWER_CONTROL);
+ host->pwr = 0;
+ }
+
pr_debug("%s: %s: request %d done\n", mmc_hostname(host->mmc),
__func__, req_type);
}
@@ -1675,6 +1682,13 @@ static void sdhci_msm_handle_pwr_irq(struct sdhci_host *host, int irq)
udelay(10);
}
+ if ((irq_status & CORE_PWRCTL_BUS_ON) && mmc->card &&
+ !mmc->ops->get_cd(mmc)) {
+ msm_host_writel(msm_host, CORE_PWRCTL_BUS_FAIL, host,
+ msm_offset->core_pwrctl_ctl);
+ return;
+ }
+
/* Handle BUS ON/OFF*/
if (irq_status & CORE_PWRCTL_BUS_ON) {
pwr_state = REQ_BUS_ON;
--
2.39.5