linux-pm.vger.kernel.org archive mirror
From: Alexandre Belloni <alexandre.belloni@bootlin.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	kvalo@kernel.org, Linux PM <linux-pm@vger.kernel.org>,
	ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
	linux-rtc@vger.kernel.org,
	"open list:NETWORKING DRIVERS (WIRELESS)" 
	<linux-wireless@vger.kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	merez@codeaurora.org, mat.jonczyk@o2.pl,
	Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>,
	Len Brown <len.brown@intel.com>
Subject: Re: [PATCH 0/7] PM: Solution for S0ix failure caused by PCH overheating
Date: Tue, 17 May 2022 19:07:37 +0200
Message-ID: <YoPWWYOMqshE06vr@mail.local>
In-Reply-To: <CAJZ5v0gj=zOF-9sMaWtWbPm7051W31qbJsK25bceeAe693Wj3g@mail.gmail.com>

On 17/05/2022 17:11:05+0200, Rafael J. Wysocki wrote:
> On Thu, May 5, 2022 at 3:58 AM Zhang Rui <rui.zhang@intel.com> wrote:
> >
> > On some Intel client platforms like SKL/KBL/CNL/CML, there is a
> > PCH thermal sensor that monitors the PCH temperature and blocks the system
> > from entering S0ix in case it overheats.
> >
> > Commit ef63b043ac86 ("thermal: intel: pch: fix S0ix failure due to PCH
> > temperature above threshold") introduces a delay loop to cool the
> > temperature down for this purpose.
> >
> > However, in practice, we found that the time it takes to cool the PCH down
> > below threshold highly depends on the initial PCH temperature when the
> > delay starts, as well as the ambient temperature.
> >
> > For example, on a Dell XPS 9360 laptop, the problem can be triggered
> > 1. when it is suspended with heavy workload running.
> > or
> > 2. when it is moved from New Hampshire to Florida.
> >
> > In these cases, the 1 second delay is not sufficient. As a result, the
> > system stays in a shallower power state like PCx instead of S0ix, and
> > drains the battery without the user noticing.
> >
> > In this patch series, we first fix the problem in patches 1/7 ~ 3/7, by
> > 1. expanding the default overall cooling delay timeout to 60 seconds.
> > 2. making sure the temperature is below threshold rather than equal to it.
> > 3. moving the delay to the .suspend_noirq phase instead, in order to
> >    a) do the cooling when the system is in a more quiescent state
> >    b) be aware of wakeup events during the long delay, because some wakeup
> >       events (ACPI power button press, USB mouse, etc.) become valid only
> >       in the .suspend_noirq phase and later.
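
For readers following the thread, the three changes listed above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the code from the patches: `cool_before_suspend()`, `struct cool_env` and the simulated sensor are hypothetical names made up for the illustration, and the real driver would sleep between polls and consult the kernel's `pm_wakeup_pending()` rather than a counter.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the cooling wait; not the driver code. */
enum cool_result { COOL_OK, COOL_TIMEOUT, COOL_WAKEUP };

struct cool_env {
    int temp_mc;          /* simulated PCH temperature, millidegrees C */
    int drop_per_poll_mc; /* simulated cooling per polling interval */
    int wakeup_at_poll;   /* poll index at which a wakeup arrives, -1 for none */
};

/* Stand-in for reading the thermal sensor. */
static int read_temp_mc(const struct cool_env *env)
{
    return env->temp_mc;
}

/* Stand-in for a wakeup check (assumption: modelled as a poll counter). */
static bool wakeup_pending(const struct cool_env *env, int poll)
{
    return env->wakeup_at_poll >= 0 && poll >= env->wakeup_at_poll;
}

enum cool_result cool_before_suspend(struct cool_env *env, int threshold_mc,
                                     int timeout_ms, int poll_ms)
{
    int waited_ms = 0;
    int poll = 0;

    for (;;) {
        /* Succeed only when strictly below the threshold, not equal to it. */
        if (read_temp_mc(env) < threshold_mc)
            return COOL_OK;
        /* A wakeup event ends the wait so the suspend can be aborted. */
        if (wakeup_pending(env, poll))
            return COOL_WAKEUP;
        /* Give up once the overall budget (60 s in the series) is spent. */
        if (waited_ms >= timeout_ms)
            return COOL_TIMEOUT;
        /* Model one polling interval passing; a driver would sleep here. */
        env->temp_mc -= env->drop_per_poll_mc;
        waited_ms += poll_ms;
        poll++;
    }
}
```

With a 45000 m°C threshold and a sensor starting at 50000 m°C that drops 1000 m°C per poll, the model keeps waiting at exactly 45000 and returns `COOL_OK` only once the reading reaches 44000, which is the "below rather than equal" behaviour from item 2.
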
> >
> > However, this potentially long delay introduces a problem for our suspend
> > stress automation test, because the delay makes it hard to predict how
> > much time it takes to suspend the system.
> > As we want to do as many suspend iterations as possible in a limited time,
> > setting a 60+ second RTC alarm for a suspend that usually takes less
> > than 1 second is overkill.
> >
> > Thus, in patches 4/7 ~ 7/7, an RTC driver hook is introduced, which cancels
> > the armed RTC alarm at the beginning of suspend and then rearms the RTC
> > alarm with a short interval (say, 2 seconds) right before the system is
> > suspended.
> >
> > By running
> >  # echo 2 > /sys/module/rtc_cmos/parameters/rtc_wake_override_sec
> > before suspend, the system can be resumed by RTC alarm right after it is
> > suspended, no matter how much time the suspend really takes.
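
The cancel-then-rearm idea described above can be illustrated with a small userspace model. It is a sketch under assumptions, not the rtc-cmos change itself: `struct wake_alarm`, `alarm_cancel_early()` and `alarm_rearm_late()` are hypothetical names, and the choice to rearm only when an alarm was armed beforehand is an assumption drawn from the wording "cancels the armed rtc alarm".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a wake alarm; times are absolute seconds. */
struct wake_alarm {
    bool armed;
    int64_t expires_s;
    unsigned int override_s; /* 0 means the override is disabled */
};

/*
 * Beginning of suspend: cancel the alarm that user space armed.
 * Returns true if an alarm was cancelled and must be rearmed later.
 */
bool alarm_cancel_early(struct wake_alarm *a)
{
    if (a->override_s == 0 || !a->armed)
        return false;
    a->armed = false;
    return true;
}

/*
 * Right before the system is suspended: rearm relative to the current
 * time, so the wake delay no longer depends on how long suspend took.
 */
void alarm_rearm_late(struct wake_alarm *a, bool cancelled, int64_t now_s)
{
    if (!cancelled)
        return;
    a->armed = true;
    a->expires_s = now_s + (int64_t)a->override_s;
}
```

In this model, an alarm originally set for t=160 with a 2 second override, cancelled at t=100 and rearmed at t=140 after a 40 second cooling delay, ends up expiring at t=142.
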
> >
> > This patch series has been tested on the same Dell XPS 9360 laptop and
> > S0ix is 100% achieved across 1000+ s2idle iterations.
> 
> Overall, the first three patches in the series can go in without the
> rest, so let's put them into a separate series.
> 
> Patch [4/7] doesn't depend on the first three ones, so it can go in by itself.
> 
> Patch [5/7] is to be dropped anyway as per the earlier discussion.
> 
> Patch [6/7] is only needed to apply patch [7/7] which is controversial.
> 
> I think that we can drop or defer patches [6-7/7] for now.

I don't think 7/7 is really useful in the upstream kernel, so I don't plan
to apply it.

-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

Thread overview: 30+ messages
2022-05-05  1:58 [PATCH 0/7] PM: Solution for S0ix failure caused by PCH overheating Zhang Rui
2022-05-05  1:58 ` [PATCH 1/7] PM: wakeup: expose pm_wakeup_pending to modules Zhang Rui
2022-05-05  1:58 ` [PATCH 2/7] thermal: intel: pch: enhance overheat handling Zhang Rui
2022-05-17 15:02   ` Rafael J. Wysocki
2022-05-05  1:58 ` [PATCH 3/7] thermal: intel: pch: improve the cooling delay log Zhang Rui
2022-05-05  1:58 ` [PATCH 4/7] ACPI: video: improve PM notifer callback Zhang Rui
2022-05-05  1:58 ` [PATCH 5/7] wil6210: remove debug message for unsupported PM event Zhang Rui
2022-05-05  4:38   ` Kalle Valo
2022-05-05  5:24     ` Zhang Rui
2022-05-06 14:04       ` Kalle Valo
2022-05-07  1:23         ` Zhang Rui
2022-05-05  1:58 ` [PATCH 6/7] PM: suspend: introduce PM_SUSPEND_LATE event Zhang Rui
2022-05-05  1:58 ` [PATCH 7/7] rtc: cmos: Add suspend/resume endurance testing hook Zhang Rui
2022-05-06 21:46   ` Alexandre Belloni
2022-05-07  2:00     ` Zhang Rui
2022-05-07  7:31       ` Alexandre Belloni
2022-05-07  7:41         ` Zhang Rui
2022-05-16  7:50           ` Zhang Rui
2022-05-17 15:14   ` Rafael J. Wysocki
2022-05-18 14:44     ` Zhang Rui
2022-05-18 15:02       ` Rafael J. Wysocki
2022-05-18 16:07         ` Zhang Rui
2022-05-19  2:33           ` Len Brown
2022-05-19 10:56             ` Rafael J. Wysocki
2022-05-05  8:22 ` [PATCH 0/7] PM: Solution for S0ix failure caused by PCH overheating Oliver Neukum
2022-05-05 12:02   ` Rafael J. Wysocki
2022-05-05 15:18     ` Zhang Rui
2022-05-17 15:11 ` Rafael J. Wysocki
2022-05-17 17:07   ` Alexandre Belloni [this message]
2022-05-18 14:11   ` Zhang Rui
