All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexandre Belloni <alexandre.belloni@bootlin.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	kvalo@kernel.org, Linux PM <linux-pm@vger.kernel.org>,
	ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
	linux-rtc@vger.kernel.org,
	"open list:NETWORKING DRIVERS (WIRELESS)" 
	<linux-wireless@vger.kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	merez@codeaurora.org, mat.jonczyk@o2.pl,
	Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>,
	Len Brown <len.brown@intel.com>
Subject: Re: [PATCH 0/7] PM: Solution for S0ix failure caused by PCH overheating
Date: Tue, 17 May 2022 19:07:37 +0200	[thread overview]
Message-ID: <YoPWWYOMqshE06vr@mail.local> (raw)
In-Reply-To: <CAJZ5v0gj=zOF-9sMaWtWbPm7051W31qbJsK25bceeAe693Wj3g@mail.gmail.com>

On 17/05/2022 17:11:05+0200, Rafael J. Wysocki wrote:
> On Thu, May 5, 2022 at 3:58 AM Zhang Rui <rui.zhang@intel.com> wrote:
> >
> > On some Intel client platforms like SKL/KBL/CNL/CML, there is a
> > PCH thermal sensor that monitors the PCH temperature and blocks the system
> > from entering S0ix in case it overheats.
> >
> > Commit ef63b043ac86 ("thermal: intel: pch: fix S0ix failure due to PCH
> > temperature above threshold") introduces a delay loop to cool the
> > temperature down for this purpose.
> >
> > However, in practice, we found that the time it takes to cool the PCH down
> > below threshold highly depends on the initial PCH temperature when the
> > delay starts, as well as the ambient temperature.
> >
> > For example, on a Dell XPS 9360 laptop, the problem can be triggered
> > 1. when it is suspended with heavy workload running.
> > or
> > 2. when it is moved from New Hampshire to Florida.
> >
> > In these cases, the 1 second delay is not sufficient. As a result, the
> > system stays in a shallower power state like PCx instead of S0ix, and
> > drains the battery power, without user' notice.
> >
> > In this patch series, we first fix the problem in patch 1/7 ~ 3/7, by
> > 1. expand the default overall cooling delay timeout to 60 seconds.
> > 2. make sure the temperature is below threshold rather than equal to it.
> > 3. move the delay to .suspend_noirq phase instead, in order to
> >    a) do the cooling when the system is in a more quiescent state
> >    b) be aware of wakeup events during the long delay, because some wakeup
> >       events (ACPI Power button Press, USB mouse, etc) become valid only
> >       in .suspend_noirq phase and later.
> >
> > However, this potential long delay introduces a problem to our suspend
> > stress automation test, because the delay makes it hard to predict how
> > much time it takes to suspend the system.
> > As we want to do as much suspend iterations as possible in limited time,
> > setting a 60+ seconds rtc alarm for suspend which usually takes shorter
> > than 1 second is far beyond overkill.
> >
> > Thus, in patch 4/7 ~ 7/7, a rtc driver hook is introduced, which cancels
> > the armed rtc alarm in the beginning of suspend and then rearm the rtc
> > alarm with a short interval (say, 2 second) right before system suspended.
> >
> > By running
> >  # echo 2 > /sys/module/rtc_cmos/parameters/rtc_wake_override_sec
> > before suspend, the system can be resumed by RTC alarm right after it is
> > suspended, no matter how much time the suspend really takes.
> >
> > This patch series has been tested on the same Dell XPS 9360 laptop and
> > S0ix is 100% achieved across 1000+ s2idle iterations.
> 
> Overall, the first three patches in the series can go in without the
> rest, so let's put them into a separate series.
> 
> Patch [4/7] doesn't depend on the first three ones, so it can go in by itself.
> 
> Patch [5/7] is to be dropped anyway as per the earlier discussion.
> 
> Patch [6/7] is only needed to apply patch [7/7] which is controversial.
> 
> I think that we can drop or defer patches [6-7/7] for now.

I don't think 7/7 is really useful in the upstream kernel, I don't plan
to apply it

-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

  reply	other threads:[~2022-05-17 17:09 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-05  1:58 [PATCH 0/7] PM: Solution for S0ix failure caused by PCH overheating Zhang Rui
2022-05-05  1:58 ` [PATCH 1/7] PM: wakeup: expose pm_wakeup_pending to modules Zhang Rui
2022-05-05  1:58 ` [PATCH 2/7] thermal: intel: pch: enhance overheat handling Zhang Rui
2022-05-17 15:02   ` Rafael J. Wysocki
2022-05-05  1:58 ` [PATCH 3/7] thermal: intel: pch: improve the cooling delay log Zhang Rui
2022-05-05  1:58 ` [PATCH 4/7] ACPI: video: improve PM notifer callback Zhang Rui
2022-05-05  1:58 ` [PATCH 5/7] wil6210: remove debug message for unsupported PM event Zhang Rui
2022-05-05  4:38   ` Kalle Valo
2022-05-05  5:24     ` Zhang Rui
2022-05-06 14:04       ` Kalle Valo
2022-05-07  1:23         ` Zhang Rui
2022-05-05  1:58 ` [PATCH 6/7] PM: suspend: introduce PM_SUSPEND_LATE event Zhang Rui
2022-05-05  1:58 ` [PATCH 7/7] rtc: cmos: Add suspend/resume endurance testing hook Zhang Rui
2022-05-06 21:46   ` Alexandre Belloni
2022-05-07  2:00     ` Zhang Rui
2022-05-07  7:31       ` Alexandre Belloni
2022-05-07  7:41         ` Zhang Rui
2022-05-16  7:50           ` Zhang Rui
2022-05-17 15:14   ` Rafael J. Wysocki
2022-05-18 14:44     ` Zhang Rui
2022-05-18 15:02       ` Rafael J. Wysocki
2022-05-18 16:07         ` Zhang Rui
2022-05-19  2:33           ` Len Brown
2022-05-19 10:56             ` Rafael J. Wysocki
2022-05-05  8:22 ` [PATCH 0/7] PM: Solution for S0ix failure caused by PCH overheating Oliver Neukum
2022-05-05 12:02   ` Rafael J. Wysocki
2022-05-05 15:18     ` Zhang Rui
2022-05-17 15:11 ` Rafael J. Wysocki
2022-05-17 17:07   ` Alexandre Belloni [this message]
2022-05-18 14:11   ` Zhang Rui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YoPWWYOMqshE06vr@mail.local \
    --to=alexandre.belloni@bootlin.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=kvalo@kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-rtc@vger.kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=mat.jonczyk@o2.pl \
    --cc=merez@codeaurora.org \
    --cc=rafael@kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=rui.zhang@intel.com \
    --cc=sumeet.r.pawnikar@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.