From: Zhang Rui <rui.zhang@intel.com>
To: rjw@rjwysocki.net, kvalo@kernel.org, alexandre.belloni@bootlin.com
Cc: linux-pm@vger.kernel.org, linux-acpi@vger.kernel.org,
linux-rtc@vger.kernel.org, linux-wireless@vger.kernel.org,
daniel.lezcano@linaro.org, merez@codeaurora.org,
mat.jonczyk@o2.pl, sumeet.r.pawnikar@intel.com,
len.brown@intel.com
Subject: [PATCH 2/7] thermal: intel: pch: enhance overheat handling
Date: Thu, 5 May 2022 09:58:09 +0800
Message-ID: <20220505015814.3727692-3-rui.zhang@intel.com>
In-Reply-To: <20220505015814.3727692-1-rui.zhang@intel.com>

Commit ef63b043ac86 ("thermal: intel: pch: fix S0ix failure due to PCH
temperature above threshold") introduced a delay loop mechanism that allows
the PCH temperature to drop below the threshold during suspend so that it
does not block S0ix. The default overall delay timeout is 1 second.

However, in practice, we found that the time it takes to cool the PCH down
below the threshold depends heavily on the initial PCH temperature when the
delay starts, as well as on the ambient temperature.

In some cases, the 1 second delay is not sufficient. As a result, the
system stays in a shallower power state like PCx instead of S0ix and
drains the battery without the user noticing.

To make sure S0ix is not blocked by PCH overheating, we
1. expand the default overall timeout to 60 seconds.
2. require the temperature to be strictly below the threshold; being equal
   to it is no longer sufficient.
3. move the delay to the .suspend_noirq phase instead, in order to
   a) do the cooling delay with a more quiescent system
   b) be aware of wakeup events during the long delay, because some wakeup
      events (ACPI power button press, USB mouse, etc.) become valid only
      in the .suspend_noirq phase and later.

This may lengthen suspend time, but only in cases where the system
overheats and Linux would previously have entered a shallower s2idle state,
say, PCx instead of S0ix.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
---
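
Note for readers: the loop semantics described in the changelog can be
modelled in user space. The sketch below is not the driver code; it is a
minimal simulation under stated assumptions. The names simulate_cooling(),
max_cooling_delay_ms(), cool_per_iter and wake_at_iter are hypothetical, and
the fixed temperature drop per iteration is an illustrative stand-in for
re-reading the sensor. Only the defaults (100 ms per iteration, 10 vs. 600
iterations), the strict "<" comparison and the wakeup check come from the
patch.

```c
#include <assert.h>

/* Why the modelled loop stopped (hypothetical names, not from the driver). */
enum cool_result { COOL_BELOW_THRESHOLD, COOL_WAKEUP_PENDING, COOL_STILL_HOT };

struct cool_outcome {
	enum cool_result result;
	unsigned int iterations;	/* delay iterations actually spent */
	unsigned int waited_ms;		/* iterations * delay_ms */
};

/* Upper bound on the cooling delay: delay_cnt iterations of delay_ms each. */
static unsigned int max_cooling_delay_ms(unsigned int delay_cnt,
					 unsigned int delay_ms)
{
	return delay_cnt * delay_ms;
}

/*
 * Model of the cooling loop after this patch.
 * cool_per_iter: assumed temperature drop per delay iteration (illustrative).
 * wake_at_iter:  iteration at which a wakeup event becomes pending, 0 = never;
 *                stands in for pm_wakeup_pending() returning true.
 */
static struct cool_outcome simulate_cooling(int start_temp, int thr_temp,
					    int cool_per_iter,
					    unsigned int wake_at_iter,
					    unsigned int delay_cnt,
					    unsigned int delay_ms)
{
	struct cool_outcome out = { COOL_STILL_HOT, 0, 0 };
	unsigned int cnt = 1;
	int cur_temp = start_temp;

	assert(cool_per_iter >= 0);

	while (cnt <= delay_cnt) {
		/* Strictly below: being equal to the threshold keeps waiting. */
		if (cur_temp < thr_temp) {
			out.result = COOL_BELOW_THRESHOLD;
			return out;
		}

		/* Give up the delay early if a wakeup event is pending. */
		if (wake_at_iter && cnt >= wake_at_iter) {
			out.result = COOL_WAKEUP_PENDING;
			return out;
		}

		/* The real driver delays here; the model only accounts time. */
		out.iterations++;
		out.waited_ms += delay_ms;
		cur_temp -= cool_per_iter;
		cnt++;
	}

	if (cur_temp < thr_temp)
		out.result = COOL_BELOW_THRESHOLD;
	return out;
}
```

With these assumptions, a PCH that starts 20 degrees above the threshold and
cools by one degree per iteration is still hot after the old 10-iteration
budget (1000 ms), but drops below the threshold after 21 iterations (2100 ms)
when 600 iterations are allowed. A start temperature equal to the threshold
costs one extra iteration because of the strict comparison.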
 drivers/thermal/intel/intel_pch_thermal.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/thermal/intel/intel_pch_thermal.c b/drivers/thermal/intel/intel_pch_thermal.c
index 527c91f5960b..b7b32e2f5ae2 100644
--- a/drivers/thermal/intel/intel_pch_thermal.c
+++ b/drivers/thermal/intel/intel_pch_thermal.c
@@ -70,8 +70,8 @@ static unsigned int delay_timeout = 100;
 module_param(delay_timeout, int, 0644);
 MODULE_PARM_DESC(delay_timeout, "amount of time delay for each iteration.");
 
-/* Number of iterations for cooling delay, 10 counts by default for now */
-static unsigned int delay_cnt = 10;
+/* Number of iterations for cooling delay, 600 counts by default for now */
+static unsigned int delay_cnt = 600;
 module_param(delay_cnt, int, 0644);
 MODULE_PARM_DESC(delay_cnt, "total number of iterations for time delay.");
 
@@ -193,10 +193,11 @@ static int pch_wpt_get_temp(struct pch_thermal_device *ptd, int *temp)
 	return 0;
 }
 
+/* Cool the PCH when it's overheat in .suspend_noirq phase */
 static int pch_wpt_suspend(struct pch_thermal_device *ptd)
 {
 	u8 tsel;
-	u8 pch_delay_cnt = 1;
+	int pch_delay_cnt = 1;
 	u16 pch_thr_temp, pch_cur_temp;
 
 	/* Shutdown the thermal sensor if it is not enabled by BIOS */
@@ -233,7 +234,10 @@ static int pch_wpt_suspend(struct pch_thermal_device *ptd)
 	 * which helps to indentify the reason why S0ix entry was rejected.
 	 */
 	while (pch_delay_cnt <= delay_cnt) {
-		if (pch_cur_temp <= pch_thr_temp)
+		if (pch_cur_temp < pch_thr_temp)
+			break;
+
+		if (pm_wakeup_pending())
 			break;
 
 		dev_warn(&ptd->pdev->dev,
@@ -245,7 +249,7 @@ static int pch_wpt_suspend(struct pch_thermal_device *ptd)
 		pch_delay_cnt++;
 	}
 
-	if (pch_cur_temp > pch_thr_temp)
+	if (pch_cur_temp >= pch_thr_temp)
 		dev_warn(&ptd->pdev->dev,
 			"CPU-PCH is hot [%dC] even after delay, continue to suspend. S0ix might fail\n",
 			pch_cur_temp);
@@ -455,7 +459,7 @@ static void intel_pch_thermal_remove(struct pci_dev *pdev)
 	pci_disable_device(pdev);
 }
 
-static int intel_pch_thermal_suspend(struct device *device)
+static int intel_pch_thermal_suspend_noirq(struct device *device)
 {
 	struct pch_thermal_device *ptd = dev_get_drvdata(device);
 
@@ -495,7 +499,7 @@ static const struct pci_device_id intel_pch_thermal_id[] = {
 MODULE_DEVICE_TABLE(pci, intel_pch_thermal_id);
 
 static const struct dev_pm_ops intel_pch_pm_ops = {
-	.suspend = intel_pch_thermal_suspend,
+	.suspend_noirq = intel_pch_thermal_suspend_noirq,
 	.resume = intel_pch_thermal_resume,
 };
 
--
2.17.1

Thread overview: 30+ messages
2022-05-05 1:58 [PATCH 0/7] PM: Solution for S0ix failure caused by PCH overheating Zhang Rui
2022-05-05 1:58 ` [PATCH 1/7] PM: wakeup: expose pm_wakeup_pending to modules Zhang Rui
2022-05-05 1:58 ` Zhang Rui [this message]
2022-05-17 15:02 ` [PATCH 2/7] thermal: intel: pch: enhance overheat handling Rafael J. Wysocki
2022-05-05 1:58 ` [PATCH 3/7] thermal: intel: pch: improve the cooling delay log Zhang Rui
2022-05-05 1:58 ` [PATCH 4/7] ACPI: video: improve PM notifer callback Zhang Rui
2022-05-05 1:58 ` [PATCH 5/7] wil6210: remove debug message for unsupported PM event Zhang Rui
2022-05-05 4:38 ` Kalle Valo
2022-05-05 5:24 ` Zhang Rui
2022-05-06 14:04 ` Kalle Valo
2022-05-07 1:23 ` Zhang Rui
2022-05-05 1:58 ` [PATCH 6/7] PM: suspend: introduce PM_SUSPEND_LATE event Zhang Rui
2022-05-05 1:58 ` [PATCH 7/7] rtc: cmos: Add suspend/resume endurance testing hook Zhang Rui
2022-05-06 21:46 ` Alexandre Belloni
2022-05-07 2:00 ` Zhang Rui
2022-05-07 7:31 ` Alexandre Belloni
2022-05-07 7:41 ` Zhang Rui
2022-05-16 7:50 ` Zhang Rui
2022-05-17 15:14 ` Rafael J. Wysocki
2022-05-18 14:44 ` Zhang Rui
2022-05-18 15:02 ` Rafael J. Wysocki
2022-05-18 16:07 ` Zhang Rui
2022-05-19 2:33 ` Len Brown
2022-05-19 10:56 ` Rafael J. Wysocki
2022-05-05 8:22 ` [PATCH 0/7] PM: Solution for S0ix failure caused by PCH overheating Oliver Neukum
2022-05-05 12:02 ` Rafael J. Wysocki
2022-05-05 15:18 ` Zhang Rui
2022-05-17 15:11 ` Rafael J. Wysocki
2022-05-17 17:07 ` Alexandre Belloni
2022-05-18 14:11 ` Zhang Rui