* [PATCH 0/5] thermal/drivers/mediatek/lvts: Fixes for suspend and IRQ storm, and cleanups
@ 2024-11-25 21:20 Nícolas F. R. A. Prado
2024-11-25 21:20 ` [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend Nícolas F. R. A. Prado
` (4 more replies)
0 siblings, 5 replies; 16+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-11-25 21:20 UTC
To: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, Nícolas F. R. A. Prado, stable
Patches 1 and 2 of this series fix the issue reported by Hsin-Te Yuan
[1] where MT8192-based Chromebooks are not able to suspend/resume 10
times in a row. Either one of those patches on its own is enough to fix
the issue, but I believe both are desirable, so I've included them both
here.
Patches 3-5 fix unrelated issues that I've noticed while debugging.
Patch 3 fixes IRQ storms when the temperature sensors drop to 20
Celsius. Patches 4 and 5 are cleanups to prevent future issues.
To test this series, I've run 'rtcwake -m mem -d 60' 10 times in a row
on a MT8192-Asurada-Spherion-rev3 Chromebook and checked that the wakeup
happened 60 seconds later (+-5 seconds). I've repeated that test on 10
separate runs. Not once did the Chromebook wake up early with the series
applied.
I've also checked that during those runs, the LVTS interrupt didn't
trigger even once, while before the series it would trigger a few times
per run, generally during boot or resume.
Finally, as a sanity check I've verified that the interrupts still work
by lowering the thermal trip point to 45 Celsius and running 'stress -c
8'. Indeed they still do, and the temperature shown by the
thermal_temperature ftrace event matched the expected value.
[1] https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
Nícolas F. R. A. Prado (5):
thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold
thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold
thermal/drivers/mediatek/lvts: Start sensor interrupts disabled
thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors
drivers/thermal/mediatek/lvts_thermal.c | 103 ++++++++++++++++++++++----------
1 file changed, 72 insertions(+), 31 deletions(-)
---
base-commit: b852e1e7a0389ed6168ef1d38eb0bad71a6b11e8
change-id: 20241121-mt8192-lvts-filtered-suspend-fix-a5032ca8eceb
Best regards,
--
Nícolas F. R. A. Prado <nfraprado@collabora.com>
* [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
2024-11-25 21:20 [PATCH 0/5] thermal/drivers/mediatek/lvts: Fixes for suspend and IRQ storm, and cleanups Nícolas F. R. A. Prado
@ 2024-11-25 21:20 ` Nícolas F. R. A. Prado
2024-11-26 8:00 ` Hsin-Te Yuan
2024-11-26 9:43 ` AngeloGioacchino Del Regno
2024-11-25 21:20 ` [PATCH 2/5] thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold Nícolas F. R. A. Prado
` (3 subsequent siblings)
4 siblings, 2 replies; 16+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-11-25 21:20 UTC
To: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, Nícolas F. R. A. Prado, stable
When configured in filtered mode, the LVTS thermal controller will
monitor the temperature from the sensors and trigger an interrupt once a
thermal threshold is crossed.
Currently this is true even during suspend and resume. The problem with
that is that when enabling the internal clock of the LVTS controller in
lvts_ctrl_set_enable() during resume, the temperature reading can glitch
and appear much higher than the real one, resulting in a spurious
interrupt getting generated.
Disable the temperature monitoring and give some time for the signals to
stabilize during suspend in order to prevent such spurious interrupts.
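As a rough illustration of the register value involved (a sketch only, not the driver code): assuming MONCTL0 uses bits 0-3 to enable sensing points 0-3 and bit 9 for the single point access flow, as the comments in the diff below describe, the value written on suspend and resume could be computed as follows. The function name monctl0_value and the SENSOR_MAX constant are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))
#define SENSOR_MAX 4

/*
 * Hypothetical helper: compute the value to write to MONCTL0.
 * valid_sensor_mask has bit i set when sensor i exists on the controller.
 */
static uint32_t monctl0_value(uint8_t valid_sensor_mask, bool enable)
{
	uint32_t sensor_map = 0;
	int i;

	if (enable) {
		for (i = 0; i < SENSOR_MAX; i++) {
			if (valid_sensor_mask & BIT(i))
				sensor_map |= BIT(i);
		}
	}

	/* Bit 9 (single point access flow) stays set in both cases. */
	return sensor_map | BIT(9);
}
```

With monitoring disabled only bit 9 remains set, so no sensing point is being monitored while the controller clock is toggled.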
Cc: stable@vger.kernel.org
Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
Fixes: 8137bb90600d ("thermal/drivers/mediatek/lvts_thermal: Add suspend and resume")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
drivers/thermal/mediatek/lvts_thermal.c | 36 +++++++++++++++++++++++++++++++--
1 file changed, 34 insertions(+), 2 deletions(-)
diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
index 1997e91bb3be94a3059db619238aa5787edc7675..a92ff2325c40704adc537af6995b34f93c3b0650 100644
--- a/drivers/thermal/mediatek/lvts_thermal.c
+++ b/drivers/thermal/mediatek/lvts_thermal.c
@@ -860,6 +860,32 @@ static int lvts_ctrl_init(struct device *dev, struct lvts_domain *lvts_td,
return 0;
}
+static void lvts_ctrl_monitor_enable(struct device *dev, struct lvts_ctrl *lvts_ctrl, bool enable)
+{
+ /*
+ * Bitmaps to enable each sensor on filtered mode in the MONCTL0
+ * register.
+ */
+ u32 sensor_filt_bitmap[] = { BIT(0), BIT(1), BIT(2), BIT(3) };
+ u32 sensor_map = 0;
+ int i;
+
+ if (lvts_ctrl->mode != LVTS_MSR_FILTERED_MODE)
+ return;
+
+ if (enable) {
+ lvts_for_each_valid_sensor(i, lvts_ctrl)
+ sensor_map |= sensor_filt_bitmap[i];
+ }
+
+ /*
+ * Bits:
+ * 9: Single point access flow
+ * 0-3: Enable sensing point 0-3
+ */
+ writel(sensor_map | BIT(9), LVTS_MONCTL0(lvts_ctrl->base));
+}
+
/*
* At this point the configuration register is the only place in the
* driver where we write multiple values. Per hardware constraint,
@@ -1381,8 +1407,11 @@ static int lvts_suspend(struct device *dev)
lvts_td = dev_get_drvdata(dev);
- for (i = 0; i < lvts_td->num_lvts_ctrl; i++)
+ for (i = 0; i < lvts_td->num_lvts_ctrl; i++) {
+ lvts_ctrl_monitor_enable(dev, &lvts_td->lvts_ctrl[i], false);
+ usleep_range(100, 200);
lvts_ctrl_set_enable(&lvts_td->lvts_ctrl[i], false);
+ }
clk_disable_unprepare(lvts_td->clk);
@@ -1400,8 +1429,11 @@ static int lvts_resume(struct device *dev)
if (ret)
return ret;
- for (i = 0; i < lvts_td->num_lvts_ctrl; i++)
+ for (i = 0; i < lvts_td->num_lvts_ctrl; i++) {
lvts_ctrl_set_enable(&lvts_td->lvts_ctrl[i], true);
+ usleep_range(100, 200);
+ lvts_ctrl_monitor_enable(dev, &lvts_td->lvts_ctrl[i], true);
+ }
return 0;
}
--
2.47.0
* [PATCH 2/5] thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold
2024-11-25 21:20 [PATCH 0/5] thermal/drivers/mediatek/lvts: Fixes for suspend and IRQ storm, and cleanups Nícolas F. R. A. Prado
2024-11-25 21:20 ` [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend Nícolas F. R. A. Prado
@ 2024-11-25 21:20 ` Nícolas F. R. A. Prado
2024-11-26 9:43 ` AngeloGioacchino Del Regno
2024-11-25 21:20 ` [PATCH 3/5] thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold Nícolas F. R. A. Prado
` (2 subsequent siblings)
4 siblings, 1 reply; 16+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-11-25 21:20 UTC
To: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, Nícolas F. R. A. Prado, stable
The Stage 3 thermal threshold is currently configured during
the controller initialization to 105 Celsius. From the kernel
perspective, this configuration is harmful because:
* The stage 3 interrupt that gets triggered when the threshold is
crossed is not handled in any way by the IRQ handler; it just gets
cleared. Besides, the temperature used for stage 3 comes from the
sensors, and the critical thermal trip points described in the
Devicetree will already cause a shutdown when crossed (at a lower
temperature, of 100 Celsius, for all SoCs currently using this
driver).
* The only effect of crossing the stage 3 threshold that has been
observed is that it causes the machine to no longer be able to enter
suspend, even if the crossing was the result of a momentary glitch in
the temperature reading of a sensor (as has been observed on the
MT8192-based Chromebooks).
For those reasons, disable the Stage 3 thermal threshold configuration.
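For reference, the change to LVTS_MONINT_CONF in the diff below amounts to clearing a single bit. The sketch below only checks that arithmetic. Treating bit 31 as the stage 3 interrupt enable is an assumption drawn from this commit message, and the macro and function names are illustrative.

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Values taken from the diff: before and after this patch. */
#define MONINT_CONF_OLD UINT32_C(0x8300318C)
#define MONINT_CONF_NEW UINT32_C(0x0300318C)

/* Assumed meaning: bit 31 of MONINT enables the stage 3 interrupt. */
#define MONINT_STAGE3_EN BIT(31)

/* Return the given configuration with the stage 3 enable bit cleared. */
static uint32_t monint_without_stage3(uint32_t conf)
{
	return conf & ~MONINT_STAGE3_EN;
}
```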
Cc: stable@vger.kernel.org
Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
drivers/thermal/mediatek/lvts_thermal.c | 16 ++--------------
1 file changed, 2 insertions(+), 14 deletions(-)
diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
index a92ff2325c40704adc537af6995b34f93c3b0650..6ac33030f015c7239e36d81018d1a6893cb69ef8 100644
--- a/drivers/thermal/mediatek/lvts_thermal.c
+++ b/drivers/thermal/mediatek/lvts_thermal.c
@@ -65,7 +65,7 @@
#define LVTS_HW_FILTER 0x0
#define LVTS_TSSEL_CONF 0x13121110
#define LVTS_CALSCALE_CONF 0x300
-#define LVTS_MONINT_CONF 0x8300318C
+#define LVTS_MONINT_CONF 0x0300318C
#define LVTS_MONINT_OFFSET_SENSOR0 0xC
#define LVTS_MONINT_OFFSET_SENSOR1 0x180
@@ -91,8 +91,6 @@
#define LVTS_MSR_READ_TIMEOUT_US 400
#define LVTS_MSR_READ_WAIT_US (LVTS_MSR_READ_TIMEOUT_US / 2)
-#define LVTS_HW_TSHUT_TEMP 105000
-
#define LVTS_MINIMUM_THRESHOLD 20000
static int golden_temp = LVTS_GOLDEN_TEMP_DEFAULT;
@@ -145,7 +143,6 @@ struct lvts_ctrl {
struct lvts_sensor sensors[LVTS_SENSOR_MAX];
const struct lvts_data *lvts_data;
u32 calibration[LVTS_SENSOR_MAX];
- u32 hw_tshut_raw_temp;
u8 valid_sensor_mask;
int mode;
void __iomem *base;
@@ -837,14 +834,6 @@ static int lvts_ctrl_init(struct device *dev, struct lvts_domain *lvts_td,
*/
lvts_ctrl[i].mode = lvts_data->lvts_ctrl[i].mode;
- /*
- * The temperature to raw temperature must be done
- * after initializing the calibration.
- */
- lvts_ctrl[i].hw_tshut_raw_temp =
- lvts_temp_to_raw(LVTS_HW_TSHUT_TEMP,
- lvts_data->temp_factor);
-
lvts_ctrl[i].low_thresh = INT_MIN;
lvts_ctrl[i].high_thresh = INT_MIN;
}
@@ -919,7 +908,6 @@ static int lvts_irq_init(struct lvts_ctrl *lvts_ctrl)
* 10 : Selected sensor with bits 19-18
* 11 : Reserved
*/
- writel(BIT(16), LVTS_PROTCTL(lvts_ctrl->base));
/*
* LVTS_PROTTA : Stage 1 temperature threshold
@@ -932,8 +920,8 @@ static int lvts_irq_init(struct lvts_ctrl *lvts_ctrl)
*
* writel(0x0, LVTS_PROTTA(lvts_ctrl->base));
* writel(0x0, LVTS_PROTTB(lvts_ctrl->base));
+ * writel(0x0, LVTS_PROTTC(lvts_ctrl->base));
*/
- writel(lvts_ctrl->hw_tshut_raw_temp, LVTS_PROTTC(lvts_ctrl->base));
/*
* LVTS_MONINT : Interrupt configuration register
--
2.47.0
* [PATCH 3/5] thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold
2024-11-25 21:20 [PATCH 0/5] thermal/drivers/mediatek/lvts: Fixes for suspend and IRQ storm, and cleanups Nícolas F. R. A. Prado
2024-11-25 21:20 ` [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend Nícolas F. R. A. Prado
2024-11-25 21:20 ` [PATCH 2/5] thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold Nícolas F. R. A. Prado
@ 2024-11-25 21:20 ` Nícolas F. R. A. Prado
2024-11-26 9:43 ` AngeloGioacchino Del Regno
2024-11-25 21:20 ` [PATCH 4/5] thermal/drivers/mediatek/lvts: Start sensor interrupts disabled Nícolas F. R. A. Prado
2024-11-25 21:20 ` [PATCH 5/5] thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors Nícolas F. R. A. Prado
4 siblings, 1 reply; 16+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-11-25 21:20 UTC
To: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, Nícolas F. R. A. Prado, stable
In order to get working interrupts, a low offset value needs to be
configured. The minimum value for it is 20 Celsius, which is what is
configured when there's no lower thermal trip (i.e. the thermal core
passes -INT_MAX as low trip temperature). However, when the temperature
gets that low and fluctuates around that value it causes an interrupt
storm.
Prevent that interrupt storm by not enabling the low offset interrupt if
the low threshold is the minimum one.
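The resulting policy can be sketched as a pure function (an illustration, not the driver code). The bit positions are taken from the diff below; monint_update, struct sensor_thresh, and the NO_LOW_TRIP constant are hypothetical names, and -INT_MAX stands for "no lower trip" as described above.

```c
#include <limits.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))
#define SENSOR_MAX 4
#define NO_LOW_TRIP (-INT_MAX)

/* Hypothetical per-sensor threshold pair, in millidegrees Celsius. */
struct sensor_thresh {
	int low;
	int high;
};

static const uint32_t high_offset_masks[SENSOR_MAX] = {
	BIT(3), BIT(8), BIT(13), BIT(25),
};
static const uint32_t low_offset_masks[SENSOR_MAX] = {
	BIT(2), BIT(7), BIT(12), BIT(24),
};

/* Update a MONINT value from the controller-wide and per-sensor thresholds. */
static uint32_t monint_update(uint32_t value, const struct sensor_thresh *sensors,
			      int ctrl_low, int ctrl_high)
{
	int i;

	for (i = 0; i < SENSOR_MAX; i++) {
		uint32_t both = low_offset_masks[i] | high_offset_masks[i];

		if (sensors[i].low != ctrl_low || sensors[i].high != ctrl_high) {
			/* This sensor does not own the active thresholds. */
			value &= ~both;
		} else if (ctrl_low == NO_LOW_TRIP) {
			/* No lower trip: keep only the high offset interrupt. */
			value &= ~low_offset_masks[i];
			value |= high_offset_masks[i];
		} else {
			value |= both;
		}
	}

	return value;
}
```

Note that each pair of low and high bits ORs to the corresponding old combined mask (for example BIT(2) | BIT(3) is 0xC), so the behaviour only differs from before when there is no lower trip.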
Cc: stable@vger.kernel.org
Fixes: 77354eaef821 ("thermal/drivers/mediatek/lvts_thermal: Don't leave threshold zeroed")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
drivers/thermal/mediatek/lvts_thermal.c | 48 ++++++++++++++++++++++++---------
1 file changed, 35 insertions(+), 13 deletions(-)
diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
index 6ac33030f015c7239e36d81018d1a6893cb69ef8..2271023f090df82fbdd0b5755bb34879e58b0533 100644
--- a/drivers/thermal/mediatek/lvts_thermal.c
+++ b/drivers/thermal/mediatek/lvts_thermal.c
@@ -67,10 +67,14 @@
#define LVTS_CALSCALE_CONF 0x300
#define LVTS_MONINT_CONF 0x0300318C
-#define LVTS_MONINT_OFFSET_SENSOR0 0xC
-#define LVTS_MONINT_OFFSET_SENSOR1 0x180
-#define LVTS_MONINT_OFFSET_SENSOR2 0x3000
-#define LVTS_MONINT_OFFSET_SENSOR3 0x3000000
+#define LVTS_MONINT_OFFSET_HIGH_SENSOR0 BIT(3)
+#define LVTS_MONINT_OFFSET_HIGH_SENSOR1 BIT(8)
+#define LVTS_MONINT_OFFSET_HIGH_SENSOR2 BIT(13)
+#define LVTS_MONINT_OFFSET_HIGH_SENSOR3 BIT(25)
+#define LVTS_MONINT_OFFSET_LOW_SENSOR0 BIT(2)
+#define LVTS_MONINT_OFFSET_LOW_SENSOR1 BIT(7)
+#define LVTS_MONINT_OFFSET_LOW_SENSOR2 BIT(12)
+#define LVTS_MONINT_OFFSET_LOW_SENSOR3 BIT(24)
#define LVTS_INT_SENSOR0 0x0009001F
#define LVTS_INT_SENSOR1 0x001203E0
@@ -326,11 +330,17 @@ static int lvts_get_temp(struct thermal_zone_device *tz, int *temp)
static void lvts_update_irq_mask(struct lvts_ctrl *lvts_ctrl)
{
- u32 masks[] = {
- LVTS_MONINT_OFFSET_SENSOR0,
- LVTS_MONINT_OFFSET_SENSOR1,
- LVTS_MONINT_OFFSET_SENSOR2,
- LVTS_MONINT_OFFSET_SENSOR3,
+ u32 high_offset_masks[] = {
+ LVTS_MONINT_OFFSET_HIGH_SENSOR0,
+ LVTS_MONINT_OFFSET_HIGH_SENSOR1,
+ LVTS_MONINT_OFFSET_HIGH_SENSOR2,
+ LVTS_MONINT_OFFSET_HIGH_SENSOR3,
+ };
+ u32 low_offset_masks[] = {
+ LVTS_MONINT_OFFSET_LOW_SENSOR0,
+ LVTS_MONINT_OFFSET_LOW_SENSOR1,
+ LVTS_MONINT_OFFSET_LOW_SENSOR2,
+ LVTS_MONINT_OFFSET_LOW_SENSOR3,
};
u32 value = 0;
int i;
@@ -339,10 +349,22 @@ static void lvts_update_irq_mask(struct lvts_ctrl *lvts_ctrl)
for (i = 0; i < ARRAY_SIZE(masks); i++) {
if (lvts_ctrl->sensors[i].high_thresh == lvts_ctrl->high_thresh
- && lvts_ctrl->sensors[i].low_thresh == lvts_ctrl->low_thresh)
- value |= masks[i];
- else
- value &= ~masks[i];
+ && lvts_ctrl->sensors[i].low_thresh == lvts_ctrl->low_thresh) {
+ /*
+ * The minimum threshold needs to be configured in the
+ * OFFSETL register to get working interrupts, but we
+ * don't actually want to generate interrupts when
+ * crossing it.
+ */
+ if (lvts_ctrl->low_thresh == -INT_MAX) {
+ value &= ~low_offset_masks[i];
+ value |= high_offset_masks[i];
+ } else {
+ value |= low_offset_masks[i] | high_offset_masks[i];
+ }
+ } else {
+ value &= ~(low_offset_masks[i] | high_offset_masks[i]);
+ }
}
writel(value, LVTS_MONINT(lvts_ctrl->base));
--
2.47.0
* [PATCH 4/5] thermal/drivers/mediatek/lvts: Start sensor interrupts disabled
2024-11-25 21:20 [PATCH 0/5] thermal/drivers/mediatek/lvts: Fixes for suspend and IRQ storm, and cleanups Nícolas F. R. A. Prado
` (2 preceding siblings ...)
2024-11-25 21:20 ` [PATCH 3/5] thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold Nícolas F. R. A. Prado
@ 2024-11-25 21:20 ` Nícolas F. R. A. Prado
2024-11-26 9:43 ` AngeloGioacchino Del Regno
2024-11-25 21:20 ` [PATCH 5/5] thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors Nícolas F. R. A. Prado
4 siblings, 1 reply; 16+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-11-25 21:20 UTC
To: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, Nícolas F. R. A. Prado
Interrupts are enabled per sensor in lvts_update_irq_mask() as needed,
so there's no point in enabling all of them during initialization. Change
the MONINT register initial value so all sensor interrupts start
disabled.
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
drivers/thermal/mediatek/lvts_thermal.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
index 2271023f090df82fbdd0b5755bb34879e58b0533..90f305fa6fb659ae9e3db0faf1a406ef1500adf2 100644
--- a/drivers/thermal/mediatek/lvts_thermal.c
+++ b/drivers/thermal/mediatek/lvts_thermal.c
@@ -65,7 +65,6 @@
#define LVTS_HW_FILTER 0x0
#define LVTS_TSSEL_CONF 0x13121110
#define LVTS_CALSCALE_CONF 0x300
-#define LVTS_MONINT_CONF 0x0300318C
#define LVTS_MONINT_OFFSET_HIGH_SENSOR0 BIT(3)
#define LVTS_MONINT_OFFSET_HIGH_SENSOR1 BIT(8)
@@ -951,7 +950,7 @@ static int lvts_irq_init(struct lvts_ctrl *lvts_ctrl)
* The LVTS_MONINT register layout is the same as the LVTS_MONINTSTS
* register, except we set the bits to enable the interrupt.
*/
- writel(LVTS_MONINT_CONF, LVTS_MONINT(lvts_ctrl->base));
+ writel(0, LVTS_MONINT(lvts_ctrl->base));
return 0;
}
--
2.47.0
* [PATCH 5/5] thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors
2024-11-25 21:20 [PATCH 0/5] thermal/drivers/mediatek/lvts: Fixes for suspend and IRQ storm, and cleanups Nícolas F. R. A. Prado
` (3 preceding siblings ...)
2024-11-25 21:20 ` [PATCH 4/5] thermal/drivers/mediatek/lvts: Start sensor interrupts disabled Nícolas F. R. A. Prado
@ 2024-11-25 21:20 ` Nícolas F. R. A. Prado
2024-11-26 9:43 ` AngeloGioacchino Del Regno
4 siblings, 1 reply; 16+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-11-25 21:20 UTC
To: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, Nícolas F. R. A. Prado
Only sensors that are valid need to have their interrupt enable status
updated based on their thresholds. Use the lvts_for_each_valid_sensor()
helper in lvts_update_irq_mask() to ignore invalid sensors.
Currently, since the invalid sensors will always contain zeroed out
thresholds (from kzalloc), they will always get their interrupts
disabled in this loop. So this commit doesn't change the resulting
interrupt configuration, but it slightly optimizes the loop by skipping
the invalid sensors, avoids potential future surprises if at some point
memory is no longer allocated for invalid sensors, and makes the
code more obvious.
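A minimal sketch of the idea, assuming validity is tracked as a per-controller bitmask (a valid_sensor_mask field is visible in struct lvts_ctrl in the context of patch 2). The macro below is an illustrative stand-in, not necessarily how lvts_for_each_valid_sensor() is defined in the driver, and count_valid_sensors is a hypothetical helper.

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))
#define SENSOR_MAX 4

/* Illustrative macro: run the loop body only for sensors set in mask. */
#define for_each_valid_sensor(i, mask) \
	for ((i) = 0; (i) < SENSOR_MAX; (i)++) \
		if (!((mask) & BIT(i))) \
			continue; \
		else

/* Count how many loop iterations actually run for a given mask. */
static int count_valid_sensors(uint8_t mask)
{
	int count = 0;
	int i;

	for_each_valid_sensor(i, mask)
		count++;

	return count;
}
```

A controller with only sensors 0 and 2 populated (mask 0x5) would run the loop body twice instead of four times.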
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
drivers/thermal/mediatek/lvts_thermal.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
index 90f305fa6fb659ae9e3db0faf1a406ef1500adf2..ed72ede040f3b22a60fbdb44fb9bd4f2e29db6ab 100644
--- a/drivers/thermal/mediatek/lvts_thermal.c
+++ b/drivers/thermal/mediatek/lvts_thermal.c
@@ -346,7 +346,7 @@ static void lvts_update_irq_mask(struct lvts_ctrl *lvts_ctrl)
value = readl(LVTS_MONINT(lvts_ctrl->base));
- for (i = 0; i < ARRAY_SIZE(masks); i++) {
+ lvts_for_each_valid_sensor(i, lvts_ctrl) {
if (lvts_ctrl->sensors[i].high_thresh == lvts_ctrl->high_thresh
&& lvts_ctrl->sensors[i].low_thresh == lvts_ctrl->low_thresh) {
/*
--
2.47.0
* Re: [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
2024-11-25 21:20 ` [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend Nícolas F. R. A. Prado
@ 2024-11-26 8:00 ` Hsin-Te Yuan
2024-11-26 13:37 ` Nícolas F. R. A. Prado
2024-11-26 9:43 ` AngeloGioacchino Del Regno
1 sibling, 1 reply; 16+ messages in thread
From: Hsin-Te Yuan @ 2024-11-26 8:00 UTC
To: Nícolas F. R. A. Prado
Cc: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, Alexandre Mergnat,
Balsam CHIHI, kernel, linux-pm, linux-kernel, linux-arm-kernel,
linux-mediatek, Hsin-Te Yuan, Chen-Yu Tsai,
Bernhard Rosenkränzer, Rafael J. Wysocki, stable
On Tue, Nov 26, 2024 at 5:21 AM Nícolas F. R. A. Prado
<nfraprado@collabora.com> wrote:
>
> When configured in filtered mode, the LVTS thermal controller will
> monitor the temperature from the sensors and trigger an interrupt once a
> thermal threshold is crossed.
>
> Currently this is true even during suspend and resume. The problem with
> that is that when enabling the internal clock of the LVTS controller in
> lvts_ctrl_set_enable() during resume, the temperature reading can glitch
> and appear much higher than the real one, resulting in a spurious
> interrupt getting generated.
>
This sounds weird to me. On my end, the symptom is that the device
sometimes cannot suspend. To be more precise,
`echo mem > /sys/power/state` returns almost immediately. I think the
IRQ is more likely to be triggered during suspend.
> Disable the temperature monitoring and give some time for the signals to
> stabilize during suspend in order to prevent such spurious interrupts.
>
> Cc: stable@vger.kernel.org
> Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
> Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
> Fixes: 8137bb90600d ("thermal/drivers/mediatek/lvts_thermal: Add suspend and resume")
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> ---
> drivers/thermal/mediatek/lvts_thermal.c | 36 +++++++++++++++++++++++++++++++--
> 1 file changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
> index 1997e91bb3be94a3059db619238aa5787edc7675..a92ff2325c40704adc537af6995b34f93c3b0650 100644
> --- a/drivers/thermal/mediatek/lvts_thermal.c
> +++ b/drivers/thermal/mediatek/lvts_thermal.c
> @@ -860,6 +860,32 @@ static int lvts_ctrl_init(struct device *dev, struct lvts_domain *lvts_td,
> return 0;
> }
>
> +static void lvts_ctrl_monitor_enable(struct device *dev, struct lvts_ctrl *lvts_ctrl, bool enable)
> +{
> + /*
> + * Bitmaps to enable each sensor on filtered mode in the MONCTL0
> + * register.
> + */
> + u32 sensor_filt_bitmap[] = { BIT(0), BIT(1), BIT(2), BIT(3) };
> + u32 sensor_map = 0;
> + int i;
> +
> + if (lvts_ctrl->mode != LVTS_MSR_FILTERED_MODE)
> + return;
> +
> + if (enable) {
> + lvts_for_each_valid_sensor(i, lvts_ctrl)
> + sensor_map |= sensor_filt_bitmap[i];
> + }
> +
> + /*
> + * Bits:
> + * 9: Single point access flow
> + * 0-3: Enable sensing point 0-3
> + */
> + writel(sensor_map | BIT(9), LVTS_MONCTL0(lvts_ctrl->base));
> +}
> +
> /*
> * At this point the configuration register is the only place in the
> * driver where we write multiple values. Per hardware constraint,
> @@ -1381,8 +1407,11 @@ static int lvts_suspend(struct device *dev)
>
> lvts_td = dev_get_drvdata(dev);
>
> - for (i = 0; i < lvts_td->num_lvts_ctrl; i++)
> + for (i = 0; i < lvts_td->num_lvts_ctrl; i++) {
> + lvts_ctrl_monitor_enable(dev, &lvts_td->lvts_ctrl[i], false);
> + usleep_range(100, 200);
> lvts_ctrl_set_enable(&lvts_td->lvts_ctrl[i], false);
> + }
>
> clk_disable_unprepare(lvts_td->clk);
>
> @@ -1400,8 +1429,11 @@ static int lvts_resume(struct device *dev)
> if (ret)
> return ret;
>
> - for (i = 0; i < lvts_td->num_lvts_ctrl; i++)
> + for (i = 0; i < lvts_td->num_lvts_ctrl; i++) {
> lvts_ctrl_set_enable(&lvts_td->lvts_ctrl[i], true);
> + usleep_range(100, 200);
> + lvts_ctrl_monitor_enable(dev, &lvts_td->lvts_ctrl[i], true);
> + }
>
> return 0;
> }
>
> --
> 2.47.0
>
* Re: [PATCH 5/5] thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors
2024-11-25 21:20 ` [PATCH 5/5] thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors Nícolas F. R. A. Prado
@ 2024-11-26 9:43 ` AngeloGioacchino Del Regno
0 siblings, 0 replies; 16+ messages in thread
From: AngeloGioacchino Del Regno @ 2024-11-26 9:43 UTC
To: Nícolas F. R. A. Prado, Rafael J. Wysocki, Daniel Lezcano,
Zhang Rui, Lukasz Luba, Matthias Brugger, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki
On 25/11/24 22:20, Nícolas F. R. A. Prado wrote:
> Only sensors that are valid need to have their interrupts enable status
> updated based on their thresholds. Use the lvts_for_each_valid_sensor()
> helper in lvts_update_irq_mask() to ignore invalid sensors.
>
> Currently, since the invalid sensors will always contain zeroed out
> thresholds (from kzalloc), they will always get their interrupts
> disabled on this loop. So this commit doesn't change the resulting
> interrupts configuration, but it slightly optimizes the loop by skipping
> the invalid sensors, avoids potential future surprises if at some point
> memory is no longer allocated for invalid sensors, as well as makes the
> code more obvious.
>
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
That's lovely.
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
* Re: [PATCH 4/5] thermal/drivers/mediatek/lvts: Start sensor interrupts disabled
2024-11-25 21:20 ` [PATCH 4/5] thermal/drivers/mediatek/lvts: Start sensor interrupts disabled Nícolas F. R. A. Prado
@ 2024-11-26 9:43 ` AngeloGioacchino Del Regno
0 siblings, 0 replies; 16+ messages in thread
From: AngeloGioacchino Del Regno @ 2024-11-26 9:43 UTC
To: Nícolas F. R. A. Prado, Rafael J. Wysocki, Daniel Lezcano,
Zhang Rui, Lukasz Luba, Matthias Brugger, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki
On 25/11/24 22:20, Nícolas F. R. A. Prado wrote:
> Interrupts are enabled per sensor in lvts_update_irq_mask() as needed,
> there's no point in enabling all of them during initialization. Change
> the MONINT register initial value so all sensor interrupts start
> disabled.
>
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
I definitely agree
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
* Re: [PATCH 3/5] thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold
2024-11-25 21:20 ` [PATCH 3/5] thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold Nícolas F. R. A. Prado
@ 2024-11-26 9:43 ` AngeloGioacchino Del Regno
0 siblings, 0 replies; 16+ messages in thread
From: AngeloGioacchino Del Regno @ 2024-11-26 9:43 UTC
To: Nícolas F. R. A. Prado, Rafael J. Wysocki, Daniel Lezcano,
Zhang Rui, Lukasz Luba, Matthias Brugger, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, stable
On 25/11/24 22:20, Nícolas F. R. A. Prado wrote:
> In order to get working interrupts, a low offset value needs to be
> configured. The minimum value for it is 20 Celsius, which is what is
> configured when there's no lower thermal trip (ie the thermal core
> passes -INT_MAX as low trip temperature). However, when the temperature
> gets that low and fluctuates around that value it causes an interrupt
> storm.
>
> Prevent that interrupt storm by not enabling the low offset interrupt if
> the low threshold is the minimum one.
>
> Cc: stable@vger.kernel.org
> Fixes: 77354eaef821 ("thermal/drivers/mediatek/lvts_thermal: Don't leave threshold zeroed")
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> ---
> drivers/thermal/mediatek/lvts_thermal.c | 48 ++++++++++++++++++++++++---------
> 1 file changed, 35 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
> index 6ac33030f015c7239e36d81018d1a6893cb69ef8..2271023f090df82fbdd0b5755bb34879e58b0533 100644
> --- a/drivers/thermal/mediatek/lvts_thermal.c
> +++ b/drivers/thermal/mediatek/lvts_thermal.c
> @@ -67,10 +67,14 @@
> #define LVTS_CALSCALE_CONF 0x300
> #define LVTS_MONINT_CONF 0x0300318C
>
> -#define LVTS_MONINT_OFFSET_SENSOR0 0xC
> -#define LVTS_MONINT_OFFSET_SENSOR1 0x180
> -#define LVTS_MONINT_OFFSET_SENSOR2 0x3000
> -#define LVTS_MONINT_OFFSET_SENSOR3 0x3000000
> +#define LVTS_MONINT_OFFSET_HIGH_SENSOR0 BIT(3)
Yeah it's longer, but that's more readable:
#define LVTS_MONINT_OFFSET_HIGH_INTEN_SENSOR0
...because what this bit does is enable the high offset interrupt for
sensing point 0 (which in this driver we call sensor 0).
That name would (imo) make it far less likely that anyone needs a datasheet to
understand what is actually going on with that setting :-)
> +#define LVTS_MONINT_OFFSET_HIGH_SENSOR1 BIT(8)
> +#define LVTS_MONINT_OFFSET_HIGH_SENSOR2 BIT(13)
> +#define LVTS_MONINT_OFFSET_HIGH_SENSOR3 BIT(25)
> +#define LVTS_MONINT_OFFSET_LOW_SENSOR0 BIT(2)
Of course, the comment is valid for the LOW ones as well!
Everything else is good for me, and since it is just about simple renaming, I can
already give you my
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> +#define LVTS_MONINT_OFFSET_LOW_SENSOR1 BIT(7)
> +#define LVTS_MONINT_OFFSET_LOW_SENSOR2 BIT(12)
> +#define LVTS_MONINT_OFFSET_LOW_SENSOR3 BIT(24)
>
> #define LVTS_INT_SENSOR0 0x0009001F
> #define LVTS_INT_SENSOR1 0x001203E0
> @@ -326,11 +330,17 @@ static int lvts_get_temp(struct thermal_zone_device *tz, int *temp)
>
> static void lvts_update_irq_mask(struct lvts_ctrl *lvts_ctrl)
> {
> - u32 masks[] = {
> - LVTS_MONINT_OFFSET_SENSOR0,
> - LVTS_MONINT_OFFSET_SENSOR1,
> - LVTS_MONINT_OFFSET_SENSOR2,
> - LVTS_MONINT_OFFSET_SENSOR3,
> + u32 high_offset_masks[] = {
> + LVTS_MONINT_OFFSET_HIGH_SENSOR0,
> + LVTS_MONINT_OFFSET_HIGH_SENSOR1,
> + LVTS_MONINT_OFFSET_HIGH_SENSOR2,
> + LVTS_MONINT_OFFSET_HIGH_SENSOR3,
> + };
> + u32 low_offset_masks[] = {
> + LVTS_MONINT_OFFSET_LOW_SENSOR0,
> + LVTS_MONINT_OFFSET_LOW_SENSOR1,
> + LVTS_MONINT_OFFSET_LOW_SENSOR2,
> + LVTS_MONINT_OFFSET_LOW_SENSOR3,
> };
> u32 value = 0;
> int i;
> @@ -339,10 +349,22 @@ static void lvts_update_irq_mask(struct lvts_ctrl *lvts_ctrl)
>
> for (i = 0; i < ARRAY_SIZE(masks); i++) {
> if (lvts_ctrl->sensors[i].high_thresh == lvts_ctrl->high_thresh
> - && lvts_ctrl->sensors[i].low_thresh == lvts_ctrl->low_thresh)
> - value |= masks[i];
> - else
> - value &= ~masks[i];
> + && lvts_ctrl->sensors[i].low_thresh == lvts_ctrl->low_thresh) {
> + /*
> + * The minimum threshold needs to be configured in the
> + * OFFSETL register to get working interrupts, but we
> + * don't actually want to generate interrupts when
> + * crossing it.
> + */
> + if (lvts_ctrl->low_thresh == -INT_MAX) {
> + value &= ~low_offset_masks[i];
> + value |= high_offset_masks[i];
> + } else {
> + value |= low_offset_masks[i] | high_offset_masks[i];
> + }
> + } else {
> + value &= ~(low_offset_masks[i] | high_offset_masks[i]);
> + }
> }
>
> writel(value, LVTS_MONINT(lvts_ctrl->base));
>
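The mask selection in the hunk above can be modelled in plain userspace C.
This is only a sketch under stated assumptions: BIT(), struct sensor_thresh
and irq_mask() are local stand-ins, not driver code, and the bit positions
for HIGH_SENSOR0 and HIGH_SENSOR1 (3 and 8) are assumed from the pattern of
the other defines, since they are not quoted in this part of the thread.
Because value starts at 0, the "&= ~mask" branches of the patch reduce to
simply not setting the bit.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define BIT(n)		(UINT32_C(1) << (n))	/* local stand-in for the kernel macro */
#define NUM_SENSORS	4

/* Hypothetical, simplified per-sensor threshold state. */
struct sensor_thresh {
	int high;
	int low;
};

/*
 * Compute the interrupt enable mask: a sensor gets its interrupts enabled
 * only if its thresholds match the ones programmed in the controller, and
 * the low offset interrupt is left out when the low threshold is the
 * "minimum" sentinel (-INT_MAX).
 */
static uint32_t irq_mask(const struct sensor_thresh s[NUM_SENSORS],
			 int ctrl_high, int ctrl_low)
{
	/* Bits 3 and 8 are assumed; 13 and 25 are quoted in the patch. */
	static const uint32_t high_bits[NUM_SENSORS] = {
		BIT(3), BIT(8), BIT(13), BIT(25)
	};
	static const uint32_t low_bits[NUM_SENSORS] = {
		BIT(2), BIT(7), BIT(12), BIT(24)
	};
	uint32_t value = 0;
	int i;

	assert(s);

	for (i = 0; i < NUM_SENSORS; i++) {
		if (s[i].high != ctrl_high || s[i].low != ctrl_low)
			continue;

		value |= high_bits[i];
		if (ctrl_low != -INT_MAX)
			value |= low_bits[i];
	}

	return value;
}
```

With a controller low threshold of -INT_MAX, only the high offset bits of
the matching sensors end up set, which is the behaviour the comment in the
patch describes.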
* Re: [PATCH 2/5] thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold
2024-11-25 21:20 ` [PATCH 2/5] thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold Nícolas F. R. A. Prado
@ 2024-11-26 9:43 ` AngeloGioacchino Del Regno
0 siblings, 0 replies; 16+ messages in thread
From: AngeloGioacchino Del Regno @ 2024-11-26 9:43 UTC (permalink / raw)
To: Nícolas F. R. A. Prado, Rafael J. Wysocki, Daniel Lezcano,
Zhang Rui, Lukasz Luba, Matthias Brugger, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, stable
On 25/11/24 22:20, Nícolas F. R. A. Prado wrote:
> The Stage 3 thermal threshold is currently configured during
> the controller initialization to 105 Celsius. From the kernel
> perspective, this configuration is harmful because:
> * The stage 3 interrupt that gets triggered when the threshold is
> crossed is not handled in any way by the IRQ handler, it just gets
> cleared. Besides, the temperature used for stage 3 comes from the
> sensors, and the critical thermal trip points described in the
> Devicetree will already cause a shutdown when crossed (at a lower
> temperature, of 100 Celsius, for all SoCs currently using this
> driver).
> * The only effect of crossing the stage 3 threshold that has been
> observed is that it causes the machine to no longer be able to enter
> suspend. Even if that was a result of a momentary glitch in the
> temperature reading of a sensor (as has been observed on the
> MT8192-based Chromebooks).
>
> For those reasons, disable the Stage 3 thermal threshold configuration.
>
> Cc: stable@vger.kernel.org
> Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
> Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
> Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
* Re: [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
2024-11-25 21:20 ` [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend Nícolas F. R. A. Prado
2024-11-26 8:00 ` Hsin-Te Yuan
@ 2024-11-26 9:43 ` AngeloGioacchino Del Regno
2024-11-26 13:19 ` Nícolas F. R. A. Prado
1 sibling, 1 reply; 16+ messages in thread
From: AngeloGioacchino Del Regno @ 2024-11-26 9:43 UTC (permalink / raw)
To: Nícolas F. R. A. Prado, Rafael J. Wysocki, Daniel Lezcano,
Zhang Rui, Lukasz Luba, Matthias Brugger, Alexandre Mergnat,
Balsam CHIHI
Cc: kernel, linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, stable
On 25/11/24 22:20, Nícolas F. R. A. Prado wrote:
> When configured in filtered mode, the LVTS thermal controller will
> monitor the temperature from the sensors and trigger an interrupt once a
> thermal threshold is crossed.
>
> Currently this is true even during suspend and resume. The problem with
> that is that when enabling the internal clock of the LVTS controller in
> lvts_ctrl_set_enable() during resume, the temperature reading can glitch
> and appear much higher than the real one, resulting in a spurious
> interrupt getting generated.
>
> Disable the temperature monitoring and give some time for the signals to
> stabilize during suspend in order to prevent such spurious interrupts.
>
> Cc: stable@vger.kernel.org
> Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
> Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
> Fixes: 8137bb90600d ("thermal/drivers/mediatek/lvts_thermal: Add suspend and resume")
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> ---
> drivers/thermal/mediatek/lvts_thermal.c | 36 +++++++++++++++++++++++++++++++--
> 1 file changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
> index 1997e91bb3be94a3059db619238aa5787edc7675..a92ff2325c40704adc537af6995b34f93c3b0650 100644
> --- a/drivers/thermal/mediatek/lvts_thermal.c
> +++ b/drivers/thermal/mediatek/lvts_thermal.c
> @@ -860,6 +860,32 @@ static int lvts_ctrl_init(struct device *dev, struct lvts_domain *lvts_td,
> return 0;
> }
>
> +static void lvts_ctrl_monitor_enable(struct device *dev, struct lvts_ctrl *lvts_ctrl, bool enable)
> +{
> + /*
> + * Bitmaps to enable each sensor on filtered mode in the MONCTL0
> + * register.
> + */
> + u32 sensor_filt_bitmap[] = { BIT(0), BIT(1), BIT(2), BIT(3) };
> + u32 sensor_map = 0;
> + int i;
> +
> + if (lvts_ctrl->mode != LVTS_MSR_FILTERED_MODE)
> + return;
> +
That's easier and shorter:
static void lvts_ctrl_monitor_enable( .... )
{
/* Bitmap to enable each sensor on filtered mode in the MONCTL0 register */
const u32 sensor_map = GENMASK(3, 0);
if (lvts_ctrl->mode != LVTS_MSR_FILTERED_MODE)
return;
/* Bits 0-3: Sensing points - Bit 9: Single point access flow */
if (enable)
writel(sensor_map | BIT(9), LVTS_MONCTL0(lvts_ctrl->base));
else
writel(BIT(9), LVTS_MONCTL0 ....
}
Cheers,
Angelo
> + if (enable) {
> + lvts_for_each_valid_sensor(i, lvts_ctrl)
> + sensor_map |= sensor_filt_bitmap[i];
> + }
> +
> + /*
> + * Bits:
> + * 9: Single point access flow
> + * 0-3: Enable sensing point 0-3
> + */
> + writel(sensor_map | BIT(9), LVTS_MONCTL0(lvts_ctrl->base));
> +}
> +
> /*
> * At this point the configuration register is the only place in the
> * driver where we write multiple values. Per hardware constraint,
> @@ -1381,8 +1407,11 @@ static int lvts_suspend(struct device *dev)
>
> lvts_td = dev_get_drvdata(dev);
>
> - for (i = 0; i < lvts_td->num_lvts_ctrl; i++)
> + for (i = 0; i < lvts_td->num_lvts_ctrl; i++) {
> + lvts_ctrl_monitor_enable(dev, &lvts_td->lvts_ctrl[i], false);
> + usleep_range(100, 200);
> lvts_ctrl_set_enable(&lvts_td->lvts_ctrl[i], false);
> + }
>
> clk_disable_unprepare(lvts_td->clk);
>
> @@ -1400,8 +1429,11 @@ static int lvts_resume(struct device *dev)
> if (ret)
> return ret;
>
> - for (i = 0; i < lvts_td->num_lvts_ctrl; i++)
> + for (i = 0; i < lvts_td->num_lvts_ctrl; i++) {
> lvts_ctrl_set_enable(&lvts_td->lvts_ctrl[i], true);
> + usleep_range(100, 200);
> + lvts_ctrl_monitor_enable(dev, &lvts_td->lvts_ctrl[i], true);
> + }
>
> return 0;
> }
>
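The ordering that the suspend and resume hunks above introduce can be
illustrated with a toy userspace model. Everything here is hypothetical
(struct ctrl_model, the helpers and the risky_enables counter are not part
of the driver); the model only assumes what the commit message states, that
a glitch can produce a spurious interrupt when the controller is enabled
while monitoring is active.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one LVTS controller. */
struct ctrl_model {
	bool ctrl_on;		/* models lvts_ctrl_set_enable() state */
	bool monitor_on;	/* models lvts_ctrl_monitor_enable() state */
	int risky_enables;	/* controller enabled while monitoring was on */
};

static void set_ctrl(struct ctrl_model *c, bool on)
{
	assert(c);
	if (on && c->monitor_on)
		c->risky_enables++;
	c->ctrl_on = on;
}

static void set_monitor(struct ctrl_model *c, bool on)
{
	assert(c);
	c->monitor_on = on;
}

/* Ordering before the patch: monitoring is never touched. */
static void suspend_unpatched(struct ctrl_model *c) { set_ctrl(c, false); }
static void resume_unpatched(struct ctrl_model *c)  { set_ctrl(c, true); }

/* Ordering from the patch; the driver sleeps 100-200 us between the steps. */
static void suspend_patched(struct ctrl_model *c)
{
	set_monitor(c, false);
	set_ctrl(c, false);
}

static void resume_patched(struct ctrl_model *c)
{
	set_ctrl(c, true);
	set_monitor(c, true);
}
```

In this model a patched suspend/resume cycle never enables the controller
with monitoring active, while the unpatched ordering does so on every
resume.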
* Re: [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
2024-11-26 9:43 ` AngeloGioacchino Del Regno
@ 2024-11-26 13:19 ` Nícolas F. R. A. Prado
2024-11-26 14:38 ` AngeloGioacchino Del Regno
0 siblings, 1 reply; 16+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-11-26 13:19 UTC (permalink / raw)
To: AngeloGioacchino Del Regno
Cc: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, Alexandre Mergnat, Balsam CHIHI, kernel,
linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, stable
On Tue, Nov 26, 2024 at 10:43:55AM +0100, AngeloGioacchino Del Regno wrote:
> On 25/11/24 22:20, Nícolas F. R. A. Prado wrote:
> > When configured in filtered mode, the LVTS thermal controller will
> > monitor the temperature from the sensors and trigger an interrupt once a
> > thermal threshold is crossed.
> >
> > Currently this is true even during suspend and resume. The problem with
> > that is that when enabling the internal clock of the LVTS controller in
> > lvts_ctrl_set_enable() during resume, the temperature reading can glitch
> > and appear much higher than the real one, resulting in a spurious
> > interrupt getting generated.
> >
> > Disable the temperature monitoring and give some time for the signals to
> > stabilize during suspend in order to prevent such spurious interrupts.
> >
> > Cc: stable@vger.kernel.org
> > Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
> > Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
> > Fixes: 8137bb90600d ("thermal/drivers/mediatek/lvts_thermal: Add suspend and resume")
> > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> > ---
> > drivers/thermal/mediatek/lvts_thermal.c | 36 +++++++++++++++++++++++++++++++--
> > 1 file changed, 34 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
> > index 1997e91bb3be94a3059db619238aa5787edc7675..a92ff2325c40704adc537af6995b34f93c3b0650 100644
> > --- a/drivers/thermal/mediatek/lvts_thermal.c
> > +++ b/drivers/thermal/mediatek/lvts_thermal.c
> > @@ -860,6 +860,32 @@ static int lvts_ctrl_init(struct device *dev, struct lvts_domain *lvts_td,
> > return 0;
> > }
> > +static void lvts_ctrl_monitor_enable(struct device *dev, struct lvts_ctrl *lvts_ctrl, bool enable)
> > +{
> > + /*
> > + * Bitmaps to enable each sensor on filtered mode in the MONCTL0
> > + * register.
> > + */
> > + u32 sensor_filt_bitmap[] = { BIT(0), BIT(1), BIT(2), BIT(3) };
> > + u32 sensor_map = 0;
> > + int i;
> > +
> > + if (lvts_ctrl->mode != LVTS_MSR_FILTERED_MODE)
> > + return;
> > +
>
> That's easier and shorter:
>
> static void lvts_ctrl_monitor_enable( .... )
> {
> /* Bitmap to enable each sensor on filtered mode in the MONCTL0 register */
> const u32 sensor_map = GENMASK(3, 0);
>
> if (lvts_ctrl->mode != LVTS_MSR_FILTERED_MODE)
> return;
>
> /* Bits 0-3: Sensing points - Bit 9: Single point access flow */
> if (enable)
> writel(sensor_map | BIT(9), LVTS_MONCTL0(lvts_ctrl->base));
Wait, no, here you're enabling all the sensors in the controller. We only want
to enable the ones that are valid, otherwise we might get garbage data and IRQs
from sensors that aren't actually there. That's why I use the
lvts_for_each_valid_sensor() helper in this patch.
Thanks,
Nícolas
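The difference between the two approaches can be sketched in userspace C.
This is an illustration only: BIT() and GENMASK() are local stand-ins for
the kernel macros, and the valid[] array and monctl0_value() are
hypothetical replacements for whatever lvts_for_each_valid_sensor()
iterates over in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros of the same name. */
#define BIT(n)		(UINT32_C(1) << (n))
#define GENMASK(h, l)	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

#define MAX_SENSORS		4
#define SINGLE_POINT_ACCESS	BIT(9)	/* "Single point access flow" bit */

/*
 * Build the value written to MONCTL0: set a sensing-point bit only for
 * sensors flagged as valid, and only when monitoring is being enabled.
 */
static uint32_t monctl0_value(const bool valid[MAX_SENSORS], bool enable)
{
	uint32_t sensor_map = 0;
	int i;

	assert(valid);

	if (enable) {
		for (i = 0; i < MAX_SENSORS; i++) {
			if (valid[i])
				sensor_map |= BIT(i);
		}
	}

	return sensor_map | SINGLE_POINT_ACCESS;
}
```

For a controller where only sensors 0 and 1 are valid, this gives 0x203 on
enable, whereas GENMASK(3, 0) | BIT(9) would give 0x20f and also turn on
sensing points 2 and 3.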
* Re: [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
2024-11-26 8:00 ` Hsin-Te Yuan
@ 2024-11-26 13:37 ` Nícolas F. R. A. Prado
2024-11-27 7:27 ` Hsin-Te Yuan
0 siblings, 1 reply; 16+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-11-26 13:37 UTC (permalink / raw)
To: Hsin-Te Yuan
Cc: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, Alexandre Mergnat,
Balsam CHIHI, kernel, linux-pm, linux-kernel, linux-arm-kernel,
linux-mediatek, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, stable
On Tue, Nov 26, 2024 at 04:00:42PM +0800, Hsin-Te Yuan wrote:
> On Tue, Nov 26, 2024 at 5:21 AM Nícolas F. R. A. Prado
> <nfraprado@collabora.com> wrote:
> >
> > When configured in filtered mode, the LVTS thermal controller will
> > monitor the temperature from the sensors and trigger an interrupt once a
> > thermal threshold is crossed.
> >
> > Currently this is true even during suspend and resume. The problem with
> > that is that when enabling the internal clock of the LVTS controller in
> > lvts_ctrl_set_enable() during resume, the temperature reading can glitch
> > and appear much higher than the real one, resulting in a spurious
> > interrupt getting generated.
> >
> This sounds weird to me. On my end, the symptom is that the device
> sometimes cannot suspend.
> To be more precise, `echo mem > /sys/power/state` returns almost
> immediately. I think the irq is more
> likely to be triggered during suspension.
Hi Hsin-Te,
please also check the first paragraph of the cover letter, and patch 2, that
should clarify it. But anyway, I can explain it here too:
The issue you observed is caused by two things combined:
* When resuming with filtered mode enabled, the sensor temperature
reading can glitch, appearing much higher. (fixed by this patch)
* Since the Stage 3 threshold is enabled and configured to take the maximum
reading from the sensors, it will be triggered by that glitch and bring the
system into a state where it can no longer suspend; it will just resume right
away. (fixed by patch 2)
So currently, every so often, during resume both these things will happen, and
any future suspend will resume right away. That's why this was never observed by
me when testing a single suspend/resume. It only breaks on resume, and only
affects future suspends, so you need to test multiple suspend/resumes on the
same run to observe this issue.
And also since both things are needed to cause this issue, if you apply only
patch 1 or only patch 2, it will already fix the issue.
Hope this clarifies it.
Thanks,
Nícolas
* Re: [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
2024-11-26 13:19 ` Nícolas F. R. A. Prado
@ 2024-11-26 14:38 ` AngeloGioacchino Del Regno
0 siblings, 0 replies; 16+ messages in thread
From: AngeloGioacchino Del Regno @ 2024-11-26 14:38 UTC (permalink / raw)
To: Nícolas F. R. A. Prado
Cc: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, Alexandre Mergnat, Balsam CHIHI, kernel,
linux-pm, linux-kernel, linux-arm-kernel, linux-mediatek,
Hsin-Te Yuan, Chen-Yu Tsai, Bernhard Rosenkränzer,
Rafael J. Wysocki, stable
On 26/11/24 14:19, Nícolas F. R. A. Prado wrote:
> On Tue, Nov 26, 2024 at 10:43:55AM +0100, AngeloGioacchino Del Regno wrote:
>> On 25/11/24 22:20, Nícolas F. R. A. Prado wrote:
>>> When configured in filtered mode, the LVTS thermal controller will
>>> monitor the temperature from the sensors and trigger an interrupt once a
>>> thermal threshold is crossed.
>>>
>>> Currently this is true even during suspend and resume. The problem with
>>> that is that when enabling the internal clock of the LVTS controller in
>>> lvts_ctrl_set_enable() during resume, the temperature reading can glitch
>>> and appear much higher than the real one, resulting in a spurious
>>> interrupt getting generated.
>>>
>>> Disable the temperature monitoring and give some time for the signals to
>>> stabilize during suspend in order to prevent such spurious interrupts.
>>>
>>> Cc: stable@vger.kernel.org
>>> Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
>>> Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
>>> Fixes: 8137bb90600d ("thermal/drivers/mediatek/lvts_thermal: Add suspend and resume")
>>> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
>>> ---
>>> drivers/thermal/mediatek/lvts_thermal.c | 36 +++++++++++++++++++++++++++++++--
>>> 1 file changed, 34 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
>>> index 1997e91bb3be94a3059db619238aa5787edc7675..a92ff2325c40704adc537af6995b34f93c3b0650 100644
>>> --- a/drivers/thermal/mediatek/lvts_thermal.c
>>> +++ b/drivers/thermal/mediatek/lvts_thermal.c
>>> @@ -860,6 +860,32 @@ static int lvts_ctrl_init(struct device *dev, struct lvts_domain *lvts_td,
>>> return 0;
>>> }
>>> +static void lvts_ctrl_monitor_enable(struct device *dev, struct lvts_ctrl *lvts_ctrl, bool enable)
>>> +{
>>> + /*
>>> + * Bitmaps to enable each sensor on filtered mode in the MONCTL0
>>> + * register.
>>> + */
>>> + u32 sensor_filt_bitmap[] = { BIT(0), BIT(1), BIT(2), BIT(3) };
>>> + u32 sensor_map = 0;
>>> + int i;
>>> +
>>> + if (lvts_ctrl->mode != LVTS_MSR_FILTERED_MODE)
>>> + return;
>>> +
>>
>> That's easier and shorter:
>>
>> static void lvts_ctrl_monitor_enable( .... )
>> {
>> /* Bitmap to enable each sensor on filtered mode in the MONCTL0 register */
>> const u32 sensor_map = GENMASK(3, 0);
>>
>> if (lvts_ctrl->mode != LVTS_MSR_FILTERED_MODE)
>> return;
>>
>> /* Bits 0-3: Sensing points - Bit 9: Single point access flow */
>> if (enable)
>> writel(sensor_map | BIT(9), LVTS_MONCTL0(lvts_ctrl->base));
>
> Wait, no, here you're enabling all the sensors in the controller. We only want
> to enable the ones that are valid, otherwise we might get garbage data and IRQs
> from sensors that aren't actually there. That's why I use the
> lvts_for_each_valid_sensor() helper in this patch.
>
Whoa, my brain actually missed the lvts_for_each_valid_sensor()!
Okay no, then you're right - sorry for the bad example! In that case, though, I
still have one more comment.
You can constify sensor_filt_bitmap, and since the values never go higher than
BIT(3), you should also be able to save some memory by turning that into a u8:
const u8 sensor_filt_bitmap[] = { BIT(0), BIT(1), BIT(2), BIT(3) };
...and then I assume that a valid sensor can never have an index higher than 3
(that is, I assume that there's no way the loop tries to read past the upper
boundary of the array).
In which case - after at least constifying the sensor_filt_bitmap array, for v2
feel free to add my
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
...and sorry again for the initial miss :-)
Cheers,
Angelo
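The two points above (a const u8 table and the index bound) can be checked
with a small userspace sketch. It is not driver code: uint8_t stands in for
the kernel's u8, BIT() and ARRAY_SIZE() are local stand-ins for the kernel
macros, and sensor_filt_bit() is a hypothetical accessor added only to make
the bound explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros of the same name. */
#define BIT(n)		(1U << (n))
#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Constified table; every value fits in 8 bits since the highest is BIT(3). */
static const uint8_t sensor_filt_bitmap[] = { BIT(0), BIT(1), BIT(2), BIT(3) };

/* Hypothetical accessor: valid indices are 0..3, anything else is a bug. */
static uint8_t sensor_filt_bit(size_t idx)
{
	assert(idx < ARRAY_SIZE(sensor_filt_bitmap));
	return sensor_filt_bitmap[idx];
}
```

With four entries the table takes 4 bytes as u8 instead of 16 as u32, and
the assertion documents that index 3 is the last one that may be read.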
* Re: [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
2024-11-26 13:37 ` Nícolas F. R. A. Prado
@ 2024-11-27 7:27 ` Hsin-Te Yuan
0 siblings, 0 replies; 16+ messages in thread
From: Hsin-Te Yuan @ 2024-11-27 7:27 UTC (permalink / raw)
To: Nícolas F. R. A. Prado
Cc: Hsin-Te Yuan, Rafael J. Wysocki, Daniel Lezcano, Zhang Rui,
Lukasz Luba, Matthias Brugger, AngeloGioacchino Del Regno,
Alexandre Mergnat, Balsam CHIHI, kernel, linux-pm, linux-kernel,
linux-arm-kernel, linux-mediatek, Chen-Yu Tsai,
Bernhard Rosenkränzer, Rafael J. Wysocki, stable
On Tue, Nov 26, 2024 at 9:37 PM Nícolas F. R. A. Prado
<nfraprado@collabora.com> wrote:
>
> On Tue, Nov 26, 2024 at 04:00:42PM +0800, Hsin-Te Yuan wrote:
> > On Tue, Nov 26, 2024 at 5:21 AM Nícolas F. R. A. Prado
> > <nfraprado@collabora.com> wrote:
> > >
> > > When configured in filtered mode, the LVTS thermal controller will
> > > monitor the temperature from the sensors and trigger an interrupt once a
> > > thermal threshold is crossed.
> > >
> > > Currently this is true even during suspend and resume. The problem with
> > > that is that when enabling the internal clock of the LVTS controller in
> > > lvts_ctrl_set_enable() during resume, the temperature reading can glitch
> > > and appear much higher than the real one, resulting in a spurious
> > > interrupt getting generated.
> > >
> > This sounds weird to me. On my end, the symptom is that the device
> > sometimes cannot suspend.
> > To be more precise, `echo mem > /sys/power/state` returns almost
> > immediately. I think the irq is more
> > likely to be triggered during suspension.
>
> Hi Hsin-Te,
>
> please also check the first paragraph of the cover letter, and patch 2, that
> should clarify it. But anyway, I can explain it here too:
>
> The issue you observed is caused by two things combined:
> * When resuming with filtered mode enabled, the sensor temperature
> reading can glitch, appearing much higher. (fixed by this patch)
> * Since the Stage 3 threshold is enabled and configured to take the maximum
> reading from the sensors, it will be triggered by that glitch and bring the
> system into a state where it can no longer suspend; it will just resume right
> away. (fixed by patch 2)
>
> So currently, every so often, during resume both these things will happen, and
> any future suspend will resume right away. That's why this was never observed by
> me when testing a single suspend/resume. It only breaks on resume, and only
> affects future suspends, so you need to test multiple suspend/resumes on the
> same run to observe this issue.
>
> And also since both things are needed to cause this issue, if you apply only
> patch 1 or only patch 2, it will already fix the issue.
>
> Hope this clarifies it.
>
> Thanks,
> Nícolas
Thanks for the explanation!
Regards,
Hsin-Te
end of thread, other threads:[~2024-11-27 7:27 UTC | newest]
Thread overview: 16+ messages
2024-11-25 21:20 [PATCH 0/5] thermal/drivers/mediatek/lvts: Fixes for suspend and IRQ storm, and cleanups Nícolas F. R. A. Prado
2024-11-25 21:20 ` [PATCH 1/5] thermal/drivers/mediatek/lvts: Disable monitor mode during suspend Nícolas F. R. A. Prado
2024-11-26 8:00 ` Hsin-Te Yuan
2024-11-26 13:37 ` Nícolas F. R. A. Prado
2024-11-27 7:27 ` Hsin-Te Yuan
2024-11-26 9:43 ` AngeloGioacchino Del Regno
2024-11-26 13:19 ` Nícolas F. R. A. Prado
2024-11-26 14:38 ` AngeloGioacchino Del Regno
2024-11-25 21:20 ` [PATCH 2/5] thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold Nícolas F. R. A. Prado
2024-11-26 9:43 ` AngeloGioacchino Del Regno
2024-11-25 21:20 ` [PATCH 3/5] thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold Nícolas F. R. A. Prado
2024-11-26 9:43 ` AngeloGioacchino Del Regno
2024-11-25 21:20 ` [PATCH 4/5] thermal/drivers/mediatek/lvts: Start sensor interrupts disabled Nícolas F. R. A. Prado
2024-11-26 9:43 ` AngeloGioacchino Del Regno
2024-11-25 21:20 ` [PATCH 5/5] thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors Nícolas F. R. A. Prado
2024-11-26 9:43 ` AngeloGioacchino Del Regno