From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: David Collins <david.collins@oss.qualcomm.com>,
	Anjelique Melendez <anjelique.melendez@oss.qualcomm.com>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Sasha Levin <sashal@kernel.org>,
	amitk@kernel.org, thara.gopinath@gmail.com,
	linux-arm-msm@vger.kernel.org, linux-pm@vger.kernel.org
Subject: [PATCH AUTOSEL 6.12 23/69] thermal/drivers/qcom-spmi-temp-alarm: Enable stage 2 shutdown when required
Date: Sun,  3 Aug 2025 20:30:33 -0400
Message-ID: <20250804003119.3620476-23-sashal@kernel.org>
In-Reply-To: <20250804003119.3620476-1-sashal@kernel.org>

From: David Collins <david.collins@oss.qualcomm.com>

[ Upstream commit f8e157ff2df46ddabd930815d196895976227831 ]

Certain TEMP_ALARM GEN2 PMIC peripherals need over-temperature stage 2
automatic PMIC partial shutdown. This will ensure that in the event of
reaching the hotter stage 3 over-temperature threshold, repeated faults
will be avoided during the automatic PMIC hardware full shutdown.
Modify the stage 2 shutdown control logic to ensure that stage 2
shutdown is enabled on all affected PMICs. Read the digital major
and minor revision registers to identify these PMICs.

Signed-off-by: David Collins <david.collins@oss.qualcomm.com>
Signed-off-by: Anjelique Melendez <anjelique.melendez@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250710224555.3047790-2-anjelique.melendez@oss.qualcomm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees. Here's my
detailed analysis:

## Key Reasons for Backporting:

### 1. **Fixes a Hardware Safety Issue**
The commit addresses a hardware protection mechanism: certain
TEMP_ALARM GEN2 PMIC peripherals require stage 2 automatic partial
shutdown to prevent "repeated faults" during the automatic full
shutdown at the stage 3 over-temperature threshold. Without this fix,
the driver can disable stage 2 shutdown on affected hardware, which
could then fault repeatedly during the stage 3 shutdown.

### 2. **Targeted Hardware-Specific Fix**
The code changes identify specific PMIC revisions that require this
protection:
```c
switch (dig_revision) {
case 0x0001:
case 0x0002:
case 0x0100:
case 0x0101:
    chip->require_stage2_shutdown = true;
    break;
}
```
This shows it's a targeted fix for known hardware issues, not a general
enhancement.
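
As a rough illustration of the revision check above, the following
standalone sketch combines the two 8-bit register values into one
16-bit revision and tests it against the four values listed in the
patch. The helper names `tm_dig_revision()` and
`tm_requires_stage2_shutdown()` are hypothetical and do not exist in the
driver, which performs this check inline in `qpnp_tm_probe()`.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack DIG_MAJOR into the high byte and DIG_MINOR
 * into the low byte, mirroring (dig_major << 8) | dig_minor in the patch.
 */
static uint16_t tm_dig_revision(uint8_t dig_major, uint8_t dig_minor)
{
	return (uint16_t)((dig_major << 8) | dig_minor);
}

/*
 * Hypothetical helper: report whether stage 2 shutdown must stay enabled.
 * Only GEN2 peripherals are checked, and only the four revisions listed
 * in the patch match.
 */
static bool tm_requires_stage2_shutdown(bool is_gen2, uint8_t dig_major,
					uint8_t dig_minor)
{
	if (!is_gen2)
		return false;

	switch (tm_dig_revision(dig_major, dig_minor)) {
	case 0x0001:
	case 0x0002:
	case 0x0100:
	case 0x0101:
		return true;
	default:
		return false;
	}
}
```

With this packing, major 1 / minor 0 maps to 0x0100 and major 0 / minor 2
maps to 0x0002, so each `case` label identifies exactly one major.minor
pair.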

### 3. **Small and Contained Change**
The fix is minimal and self-contained:
- Adds reading of DIG_MINOR register
- Adds a `require_stage2_shutdown` flag to the chip structure
- Modifies the logic in `qpnp_tm_update_critical_trip_temp()` to respect
  this flag
- Total change is 34 insertions and 9 deletions in one file, part of
  which is the rename from `S2` to `STAGE2`

### 4. **Low Risk of Regression**
- The change only affects specific PMIC revisions (0x0001, 0x0002,
  0x0100, 0x0101)
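- For other hardware, the shutdown control behavior remains unchanged;
  the only new step is the DIG_MINOR register read during probe
- Apart from the `S2` to `STAGE2` renames, the modification is additive -
  it doesn't alter existing functionality for unaffected devices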

### 5. **Prevents Hardware Malfunction**
The comment added in `qpnp_tm_probe()` states that stage 2 shutdown
must remain enabled "to avoid potential repeated faults upon reaching
over-temperature stage 3," and the commit message makes the same point.
This indicates a real hardware malfunction scenario that users could
encounter without this patch.

### 6. **Follows Stable Tree Rules**
According to stable kernel rules, this qualifies because it:
- Fixes a real bug (hardware protection failure)
- Is not a new feature
- Has minimal changes
- Addresses a specific hardware issue that affects users

### Technical Analysis:
The key change is in the `qpnp_tm_update_critical_trip_temp()` function
where the condition is modified from:
```c
if (disable_s2_shutdown)
    reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
```
to:
```c
if (disable_stage2_shutdown && !chip->require_stage2_shutdown)
    reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;
```

This ensures that for the affected PMIC revisions, stage 2 shutdown is
never disabled, providing the necessary hardware protection against
thermal events.
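
The effect of the new condition on the value written to SHUTDOWN_CTRL1
can be sketched in isolation as follows. This is a simplified model, not
the driver code: `tm_shutdown_ctrl1()` is a hypothetical function, the
bit definitions are restated with plain shifts instead of the kernel's
`BIT()` and `GENMASK()` macros, and masking the threshold is an
assumption added here for safety (the driver ORs in `chip->thresh`
directly).

```c
#include <stdbool.h>
#include <stdint.h>

/* Restated from the driver: BIT(6), BIT(3) and GENMASK(1, 0). */
#define SHUTDOWN_CTRL1_OVERRIDE_STAGE2	(1u << 6)
#define SHUTDOWN_CTRL1_RATE_25HZ	(1u << 3)
#define SHUTDOWN_CTRL1_THRESHOLD_MASK	0x3u

/*
 * Hypothetical model of the register value built at the end of
 * qpnp_tm_update_critical_trip_temp(). The override bit, which disables
 * stage 2 shutdown, is set only when the caller wants it disabled AND
 * the PMIC revision does not require it to stay enabled.
 */
static uint8_t tm_shutdown_ctrl1(uint8_t thresh, bool disable_stage2_shutdown,
				 bool require_stage2_shutdown)
{
	uint8_t reg = SHUTDOWN_CTRL1_RATE_25HZ;

	/* Assumption: mask the threshold to its two-bit field. */
	reg |= thresh & SHUTDOWN_CTRL1_THRESHOLD_MASK;

	if (disable_stage2_shutdown && !require_stage2_shutdown)
		reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;

	return reg;
}
```

For the same threshold and the same request to disable stage 2 shutdown,
the two results differ only in bit 6: it is set on unaffected PMICs and
left clear on the revisions flagged with `require_stage2_shutdown`.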

Given that this fixes a hardware-specific thermal protection issue that
could lead to system instability or damage, and the fix is minimal with
low regression risk, this commit is an excellent candidate for stable
backporting.

 drivers/thermal/qcom/qcom-spmi-temp-alarm.c | 43 ++++++++++++++++-----
 1 file changed, 34 insertions(+), 9 deletions(-)

diff --git a/drivers/thermal/qcom/qcom-spmi-temp-alarm.c b/drivers/thermal/qcom/qcom-spmi-temp-alarm.c
index c2d59cbfaea9..a575585c737b 100644
--- a/drivers/thermal/qcom/qcom-spmi-temp-alarm.c
+++ b/drivers/thermal/qcom/qcom-spmi-temp-alarm.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
  * Copyright (c) 2011-2015, 2017, 2020, The Linux Foundation. All rights reserved.
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
  */
 
 #include <linux/bitops.h>
@@ -16,6 +17,7 @@
 
 #include "../thermal_hwmon.h"
 
+#define QPNP_TM_REG_DIG_MINOR		0x00
 #define QPNP_TM_REG_DIG_MAJOR		0x01
 #define QPNP_TM_REG_TYPE		0x04
 #define QPNP_TM_REG_SUBTYPE		0x05
@@ -31,7 +33,7 @@
 #define STATUS_GEN2_STATE_MASK		GENMASK(6, 4)
 #define STATUS_GEN2_STATE_SHIFT		4
 
-#define SHUTDOWN_CTRL1_OVERRIDE_S2	BIT(6)
+#define SHUTDOWN_CTRL1_OVERRIDE_STAGE2	BIT(6)
 #define SHUTDOWN_CTRL1_THRESHOLD_MASK	GENMASK(1, 0)
 
 #define SHUTDOWN_CTRL1_RATE_25HZ	BIT(3)
@@ -78,6 +80,7 @@ struct qpnp_tm_chip {
 	/* protects .thresh, .stage and chip registers */
 	struct mutex			lock;
 	bool				initialized;
+	bool				require_stage2_shutdown;
 
 	struct iio_channel		*adc;
 	const long			(*temp_map)[THRESH_COUNT][STAGE_COUNT];
@@ -220,13 +223,13 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
 {
 	long stage2_threshold_min = (*chip->temp_map)[THRESH_MIN][1];
 	long stage2_threshold_max = (*chip->temp_map)[THRESH_MAX][1];
-	bool disable_s2_shutdown = false;
+	bool disable_stage2_shutdown = false;
 	u8 reg;
 
 	WARN_ON(!mutex_is_locked(&chip->lock));
 
 	/*
-	 * Default: S2 and S3 shutdown enabled, thresholds at
+	 * Default: Stage 2 and Stage 3 shutdown enabled, thresholds at
 	 * lowest threshold set, monitoring at 25Hz
 	 */
 	reg = SHUTDOWN_CTRL1_RATE_25HZ;
@@ -241,12 +244,12 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
 		chip->thresh = THRESH_MAX -
 			((stage2_threshold_max - temp) /
 			 TEMP_THRESH_STEP);
-		disable_s2_shutdown = true;
+		disable_stage2_shutdown = true;
 	} else {
 		chip->thresh = THRESH_MAX;
 
 		if (chip->adc)
-			disable_s2_shutdown = true;
+			disable_stage2_shutdown = true;
 		else
 			dev_warn(chip->dev,
 				 "No ADC is configured and critical temperature %d mC is above the maximum stage 2 threshold of %ld mC! Configuring stage 2 shutdown at %ld mC.\n",
@@ -255,8 +258,8 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
 
 skip:
 	reg |= chip->thresh;
-	if (disable_s2_shutdown)
-		reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
+	if (disable_stage2_shutdown && !chip->require_stage2_shutdown)
+		reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;
 
 	return qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg);
 }
@@ -350,8 +353,8 @@ static int qpnp_tm_probe(struct platform_device *pdev)
 {
 	struct qpnp_tm_chip *chip;
 	struct device_node *node;
-	u8 type, subtype, dig_major;
-	u32 res;
+	u8 type, subtype, dig_major, dig_minor;
+	u32 res, dig_revision;
 	int ret, irq;
 
 	node = pdev->dev.of_node;
@@ -403,6 +406,11 @@ static int qpnp_tm_probe(struct platform_device *pdev)
 		return dev_err_probe(&pdev->dev, ret,
 				     "could not read dig_major\n");
 
+	ret = qpnp_tm_read(chip, QPNP_TM_REG_DIG_MINOR, &dig_minor);
+	if (ret < 0)
+		return dev_err_probe(&pdev->dev, ret,
+				     "could not read dig_minor\n");
+
 	if (type != QPNP_TM_TYPE || (subtype != QPNP_TM_SUBTYPE_GEN1
 				     && subtype != QPNP_TM_SUBTYPE_GEN2)) {
 		dev_err(&pdev->dev, "invalid type 0x%02x or subtype 0x%02x\n",
@@ -416,6 +424,23 @@ static int qpnp_tm_probe(struct platform_device *pdev)
 	else
 		chip->temp_map = &temp_map_gen1;
 
+	if (chip->subtype == QPNP_TM_SUBTYPE_GEN2) {
+		dig_revision = (dig_major << 8) | dig_minor;
+		/*
+		 * Check if stage 2 automatic partial shutdown must remain
+		 * enabled to avoid potential repeated faults upon reaching
+		 * over-temperature stage 3.
+		 */
+		switch (dig_revision) {
+		case 0x0001:
+		case 0x0002:
+		case 0x0100:
+		case 0x0101:
+			chip->require_stage2_shutdown = true;
+			break;
+		}
+	}
+
 	/*
 	 * Register the sensor before initializing the hardware to be able to
 	 * read the trip points. get_temp() returns the default temperature
-- 
2.39.5

