* [PATCH][RFC] ACPI throttling: Save/restore tstate for each CPUs across suspend/resume
@ 2016-11-14 17:44 Chen Yu
2016-11-22 23:03 ` Rafael J. Wysocki
0 siblings, 1 reply; 4+ messages in thread
From: Chen Yu @ 2016-11-14 17:44 UTC (permalink / raw)
To: linux-acpi
Cc: Rui Wang, linux-kernel, Chen Yu, Len Brown, Rafael J. Wysocki,
Pavel Machek, Matthew Garrett, Rui Zhang, linux-pm
This is a trial version and any comments are appreciated.
Previously a bug was reported that on certain Broadwell
platforms, after resuming from S3, the CPU is running at
an anomalously low speed, due to BIOS has enabled the
throttling across S3. The solution to this is to introduce
a quirk framework to save/restore tstate MSR register
around suspend/resume, in Commit 7a9c2dd08ead ("x86/pm:
Introduce quirk framework to save/restore extra MSR
registers around suspend/resume").
However more and more reports show that other platforms also
experienced the same issue, because some BIOSes would like to
adjust the tstate if he thinks the temperature is too high.
To deal with this situation, the Linux uses a compensation strategy
that, the thermal management leverages thermal_pm_notify() upon resume
to check if the Processors inside the thermal zone should be throttled
or not, thus tstate would be re-evaluated. Unfortunately on these bogus
platforms, none of the Processors are inside any thermal zones due
to BIOS's implementation. Thus tstate for Processors never has a
chance to be brought back to normal.
This patch tries to save/restore tstate on receiving the
PM_SUSPEND_PREPARE and PM_POST_SUSPEND, to be more specific,
the tstate is saved after thermal_pm_notify(PM_SUSPEND_PREPARE)
is called, while it's restored before thermal_pm_notify(PM_POST_SUSPEND),
in this way the thermal zone would adjust the tstate eventually and
also help adjust the tstate for Processors which do not have
thermal zone bound. Thus it does not imapct the old semantics.
Another concern is that, each CPU should take care of the
save/restore operation, thus this patch uses percpu workqueue
to achieve this.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=90041
Reported-by: Matthew Garrett <mjg59@srcf.ucam.org>
Reported-by: Kadir <kadir@colakoglu.nl>
Cc: Len Brown <lenb@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
drivers/acpi/processor_throttling.c | 70 +++++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/drivers/acpi/processor_throttling.c b/drivers/acpi/processor_throttling.c
index d51ca1c..8ddc7d6 100644
--- a/drivers/acpi/processor_throttling.c
+++ b/drivers/acpi/processor_throttling.c
@@ -29,6 +29,7 @@
#include <linux/sched.h>
#include <linux/cpufreq.h>
#include <linux/acpi.h>
+#include <linux/suspend.h>
#include <acpi/processor.h>
#include <asm/io.h>
#include <asm/uaccess.h>
@@ -758,6 +759,75 @@ static int acpi_throttling_wrmsr(u64 value)
}
return ret;
}
+
+#ifdef CONFIG_PM_SLEEP
+static DEFINE_PER_CPU(u64, tstate_msr);
+
+static long tstate_pm_fn(void *data)
+{
+ u64 value;
+ bool save = *(bool *)data;
+
+ if (save) {
+ acpi_throttling_rdmsr(&value);
+ this_cpu_write(tstate_msr, value);
+ } else {
+ value = this_cpu_read(tstate_msr);
+ if (value)
+ acpi_throttling_wrmsr(value);
+ }
+ return 0;
+}
+
+static void tstate_check(unsigned long mode, bool suspend)
+{
+ int cpu;
+ bool save;
+
+ if (suspend && mode == PM_SUSPEND_PREPARE)
+ save = true;
+ else if (!suspend && mode == PM_POST_SUSPEND)
+ save = false;
+ else
+ return;
+
+ get_online_cpus();
+ for_each_online_cpu(cpu)
+ work_on_cpu(cpu, tstate_pm_fn, &save);
+ put_online_cpus();
+}
+
+static int tstate_suspend(struct notifier_block *nb,
+ unsigned long mode, void *_unused)
+{
+ tstate_check(mode, true);
+ return 0;
+}
+
+static int tstate_resume(struct notifier_block *nb,
+ unsigned long mode, void *_unused)
+{
+ tstate_check(mode, false);
+ return 0;
+}
+
+static int __init tstate_pm_init(void)
+{
+ /*
+ * tstate_suspend should save tstate after
+ * thermal zone's update in thermal_pm_notify,
+ * vice versa tstate_resume restore tstate before
+ * thermal_pm_notify, thus the thermal framework
+ * has a chance to re-adjust tstate according to the
+ * temperature trend.
+ */
+ pm_notifier(tstate_suspend, -1);
+ pm_notifier(tstate_resume, 1);
+ return 0;
+}
+
+core_initcall(tstate_pm_init);
+#endif
#else
static int acpi_throttling_rdmsr(u64 *value)
{
--
2.7.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH][RFC] ACPI throttling: Save/restore tstate for each CPUs across suspend/resume
2016-11-14 17:44 [PATCH][RFC] ACPI throttling: Save/restore tstate for each CPUs across suspend/resume Chen Yu
@ 2016-11-22 23:03 ` Rafael J. Wysocki
2016-11-29 1:27 ` Chen, Yu C
0 siblings, 1 reply; 4+ messages in thread
From: Rafael J. Wysocki @ 2016-11-22 23:03 UTC (permalink / raw)
To: Chen Yu
Cc: ACPI Devel Maling List, Rui Wang, Linux Kernel Mailing List,
Len Brown, Rafael J. Wysocki, Pavel Machek, Matthew Garrett,
Rui Zhang, Linux PM
On Mon, Nov 14, 2016 at 6:44 PM, Chen Yu <yu.c.chen@intel.com> wrote:
> This is a trial version and any comments are appreciated.
>
> Previously a bug was reported that on certain Broadwell
> platforms, after resuming from S3, the CPU is running at
> an anomalously low speed, due to BIOS has enabled the
> throttling across S3. The solution to this is to introduce
> a quirk framework to save/restore tstate MSR register
> around suspend/resume, in Commit 7a9c2dd08ead ("x86/pm:
> Introduce quirk framework to save/restore extra MSR
> registers around suspend/resume").
>
> However more and more reports show that other platforms also
> experienced the same issue, because some BIOSes would like to
> adjust the tstate if he thinks the temperature is too high.
> To deal with this situation, the Linux uses a compensation strategy
> that, the thermal management leverages thermal_pm_notify() upon resume
> to check if the Processors inside the thermal zone should be throttled
> or not, thus tstate would be re-evaluated. Unfortunately on these bogus
> platforms, none of the Processors are inside any thermal zones due
> to BIOS's implementation. Thus tstate for Processors never has a
> chance to be brought back to normal.
>
> This patch tries to save/restore tstate on receiving the
> PM_SUSPEND_PREPARE and PM_POST_SUSPEND, to be more specific,
> the tstate is saved after thermal_pm_notify(PM_SUSPEND_PREPARE)
> is called, while it's restored before thermal_pm_notify(PM_POST_SUSPEND),
> in this way the thermal zone would adjust the tstate eventually and
> also help adjust the tstate for Processors which do not have
> thermal zone bound. Thus it does not imapct the old semantics.
>
> Another concern is that, each CPU should take care of the
> save/restore operation, thus this patch uses percpu workqueue
> to achieve this.
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=90041
> Reported-by: Matthew Garrett <mjg59@srcf.ucam.org>
> Reported-by: Kadir <kadir@colakoglu.nl>
> Cc: Len Brown <lenb@kernel.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Matthew Garrett <mjg59@srcf.ucam.org>
> Cc: Rui Zhang <rui.zhang@intel.com>
> Cc: linux-pm@vger.kernel.org
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
> drivers/acpi/processor_throttling.c | 70 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 70 insertions(+)
>
> diff --git a/drivers/acpi/processor_throttling.c b/drivers/acpi/processor_throttling.c
> index d51ca1c..8ddc7d6 100644
> --- a/drivers/acpi/processor_throttling.c
> +++ b/drivers/acpi/processor_throttling.c
> @@ -29,6 +29,7 @@
> #include <linux/sched.h>
> #include <linux/cpufreq.h>
> #include <linux/acpi.h>
> +#include <linux/suspend.h>
> #include <acpi/processor.h>
> #include <asm/io.h>
> #include <asm/uaccess.h>
> @@ -758,6 +759,75 @@ static int acpi_throttling_wrmsr(u64 value)
> }
> return ret;
> }
> +
> +#ifdef CONFIG_PM_SLEEP
> +static DEFINE_PER_CPU(u64, tstate_msr);
Call it saved_tstate_msr maybe?
> +
> +static long tstate_pm_fn(void *data)
> +{
> + u64 value;
> + bool save = *(bool *)data;
> +
> + if (save) {
> + acpi_throttling_rdmsr(&value);
> + this_cpu_write(tstate_msr, value);
> + } else {
> + value = this_cpu_read(tstate_msr);
> + if (value)
> + acpi_throttling_wrmsr(value);
> + }
> + return 0;
> +}
I would split the above into two functions, one for saving and one for
restoring ->
> +
> +static void tstate_check(unsigned long mode, bool suspend)
> +{
> + int cpu;
> + bool save;
> +
> + if (suspend && mode == PM_SUSPEND_PREPARE)
> + save = true;
> + else if (!suspend && mode == PM_POST_SUSPEND)
> + save = false;
> + else
> + return;
> +
> + get_online_cpus();
> + for_each_online_cpu(cpu)
-> and decide here which one to invoke.
> + work_on_cpu(cpu, tstate_pm_fn, &save);
Does work_on_cpu() wait for the work to complete?
> + put_online_cpus();
> +}
> +
> +static int tstate_suspend(struct notifier_block *nb,
> + unsigned long mode, void *_unused)
> +{
> + tstate_check(mode, true);
> + return 0;
> +}
> +
> +static int tstate_resume(struct notifier_block *nb,
> + unsigned long mode, void *_unused)
> +{
> + tstate_check(mode, false);
> + return 0;
> +}
> +
> +static int __init tstate_pm_init(void)
> +{
> + /*
> + * tstate_suspend should save tstate after
> + * thermal zone's update in thermal_pm_notify,
> + * vice versa tstate_resume restore tstate before
> + * thermal_pm_notify, thus the thermal framework
> + * has a chance to re-adjust tstate according to the
> + * temperature trend.
> + */
> + pm_notifier(tstate_suspend, -1);
> + pm_notifier(tstate_resume, 1);
I don't think this is going to do what you really want.
Each of these notifiers is going to be invoked during both suspend and
resume, so I guess you only need one notifier?
> + return 0;
> +}
> +
> +core_initcall(tstate_pm_init);
> +#endif
> #else
> static int acpi_throttling_rdmsr(u64 *value)
> {
> --
Thanks,
Rafael
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [PATCH][RFC] ACPI throttling: Save/restore tstate for each CPUs across suspend/resume
2016-11-22 23:03 ` Rafael J. Wysocki
@ 2016-11-29 1:27 ` Chen, Yu C
2016-11-29 1:50 ` Rafael J. Wysocki
0 siblings, 1 reply; 4+ messages in thread
From: Chen, Yu C @ 2016-11-29 1:27 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: ACPI Devel Maling List, Rui Wang, Linux Kernel Mailing List,
Len Brown, Pavel Machek, Matthew Garrett, Zhang, Rui, Linux PM
Hi,
> -----Original Message-----
> From: rjwysocki@gmail.com [mailto:rjwysocki@gmail.com] On Behalf Of
> Rafael J. Wysocki
> Sent: Wednesday, November 23, 2016 7:03 AM
> To: Chen, Yu C
> Cc: ACPI Devel Maling List; Rui Wang; Linux Kernel Mailing List; Len Brown;
> Rafael J. Wysocki; Pavel Machek; Matthew Garrett; Zhang, Rui; Linux PM
> Subject: Re: [PATCH][RFC] ACPI throttling: Save/restore tstate for each CPUs
> across suspend/resume
>
> On Mon, Nov 14, 2016 at 6:44 PM, Chen Yu <yu.c.chen@intel.com> wrote:
> > This is a trial version and any comments are appreciated.
> >
> > Previously a bug was reported that on certain Broadwell platforms,
> > after resuming from S3, the CPU is running at an anomalously low
> > speed, due to BIOS has enabled the throttling across S3. The solution
> > to this is to introduce a quirk framework to save/restore tstate MSR
> > register around suspend/resume, in Commit 7a9c2dd08ead ("x86/pm:
> > Introduce quirk framework to save/restore extra MSR registers around
> > suspend/resume").
> >
> > However more and more reports show that other platforms also
> > experienced the same issue, because some BIOSes would like to adjust
> > the tstate if he thinks the temperature is too high.
> > To deal with this situation, the Linux uses a compensation strategy
> > that, the thermal management leverages thermal_pm_notify() upon resume
> > to check if the Processors inside the thermal zone should be throttled
> > or not, thus tstate would be re-evaluated. Unfortunately on these
> > bogus platforms, none of the Processors are inside any thermal zones
> > due to BIOS's implementation. Thus tstate for Processors never has a
> > chance to be brought back to normal.
> >
> > This patch tries to save/restore tstate on receiving the
> > PM_SUSPEND_PREPARE and PM_POST_SUSPEND, to be more specific, the
> > tstate is saved after thermal_pm_notify(PM_SUSPEND_PREPARE)
> > is called, while it's restored before
> > thermal_pm_notify(PM_POST_SUSPEND),
> > in this way the thermal zone would adjust the tstate eventually and
> > also help adjust the tstate for Processors which do not have thermal
> > zone bound. Thus it does not imapct the old semantics.
> >
> > Another concern is that, each CPU should take care of the save/restore
> > operation, thus this patch uses percpu workqueue to achieve this.
> >
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=90041
> > Reported-by: Matthew Garrett <mjg59@srcf.ucam.org>
> > Reported-by: Kadir <kadir@colakoglu.nl>
> > Cc: Len Brown <lenb@kernel.org>
> > Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> > Cc: Pavel Machek <pavel@ucw.cz>
> > Cc: Matthew Garrett <mjg59@srcf.ucam.org>
> > Cc: Rui Zhang <rui.zhang@intel.com>
> > Cc: linux-pm@vger.kernel.org
> > Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> > ---
> > drivers/acpi/processor_throttling.c | 70
> > +++++++++++++++++++++++++++++++++++++
> > 1 file changed, 70 insertions(+)
> >
> > diff --git a/drivers/acpi/processor_throttling.c
> > b/drivers/acpi/processor_throttling.c
> > index d51ca1c..8ddc7d6 100644
> > --- a/drivers/acpi/processor_throttling.c
> > +++ b/drivers/acpi/processor_throttling.c
> > @@ -29,6 +29,7 @@
> > #include <linux/sched.h>
> > #include <linux/cpufreq.h>
> > #include <linux/acpi.h>
> > +#include <linux/suspend.h>
> > #include <acpi/processor.h>
> > #include <asm/io.h>
> > #include <asm/uaccess.h>
> > @@ -758,6 +759,75 @@ static int acpi_throttling_wrmsr(u64 value)
> > }
> > return ret;
> > }
> > +
> > +#ifdef CONFIG_PM_SLEEP
> > +static DEFINE_PER_CPU(u64, tstate_msr);
>
> Call it saved_tstate_msr maybe?
OK.
>
> > +
> > +static long tstate_pm_fn(void *data)
> > +{
> > + u64 value;
> > + bool save = *(bool *)data;
> > +
> > + if (save) {
> > + acpi_throttling_rdmsr(&value);
> > + this_cpu_write(tstate_msr, value);
> > + } else {
> > + value = this_cpu_read(tstate_msr);
> > + if (value)
> > + acpi_throttling_wrmsr(value);
> > + }
> > + return 0;
> > +}
>
> I would split the above into two functions, one for saving and one for restoring ->
>
OK.
> > +
> > +static void tstate_check(unsigned long mode, bool suspend) {
> > + int cpu;
> > + bool save;
> > +
> > + if (suspend && mode == PM_SUSPEND_PREPARE)
> > + save = true;
> > + else if (!suspend && mode == PM_POST_SUSPEND)
> > + save = false;
> > + else
> > + return;
> > +
> > + get_online_cpus();
> > + for_each_online_cpu(cpu)
>
> -> and decide here which one to invoke.
OK.
>
> > + work_on_cpu(cpu, tstate_pm_fn, &save);
>
> Does work_on_cpu() wait for the work to complete?
>
Yes, it might increase the suspend/resume time, a 'queue_work_on' might be better?
> > + put_online_cpus();
> > +}
> > +
> > +static int tstate_suspend(struct notifier_block *nb,
> > + unsigned long mode, void *_unused) {
> > + tstate_check(mode, true);
> > + return 0;
> > +}
> > +
> > +static int tstate_resume(struct notifier_block *nb,
> > + unsigned long mode, void *_unused) {
> > + tstate_check(mode, false);
> > + return 0;
> > +}
> > +
> > +static int __init tstate_pm_init(void) {
> > + /*
> > + * tstate_suspend should save tstate after
> > + * thermal zone's update in thermal_pm_notify,
> > + * vice versa tstate_resume restore tstate before
> > + * thermal_pm_notify, thus the thermal framework
> > + * has a chance to re-adjust tstate according to the
> > + * temperature trend.
> > + */
> > + pm_notifier(tstate_suspend, -1);
> > + pm_notifier(tstate_resume, 1);
>
> I don't think this is going to do what you really want.
>
> Each of these notifiers is going to be invoked during both suspend and resume,
Yes,
> so I guess you only need one notifier?
Here's my original thought: tstate_suspend needs to be invoked after
thermal_pm_notify, which has a priority of '0',
so the notifier of tstate_suspend should be lower than '0',
thus '-1'. And the same for tstate_resume,
it should be invoked before thermal_pm_notify,
thus priority is '1' ?
>
> > + return 0;
> > +}
> > +
> > +core_initcall(tstate_pm_init);
> > +#endif
> > #else
> > static int acpi_throttling_rdmsr(u64 *value) {
> > --
>
> Thanks,
> Rafael
Thanks,
Yu
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH][RFC] ACPI throttling: Save/restore tstate for each CPUs across suspend/resume
2016-11-29 1:27 ` Chen, Yu C
@ 2016-11-29 1:50 ` Rafael J. Wysocki
0 siblings, 0 replies; 4+ messages in thread
From: Rafael J. Wysocki @ 2016-11-29 1:50 UTC (permalink / raw)
To: Chen, Yu C
Cc: Rafael J. Wysocki, ACPI Devel Maling List, Rui Wang,
Linux Kernel Mailing List, Len Brown, Pavel Machek,
Matthew Garrett, Zhang, Rui, Linux PM
On Tue, Nov 29, 2016 at 2:27 AM, Chen, Yu C <yu.c.chen@intel.com> wrote:
> Hi,
>> -----Original Message-----
>> From: rjwysocki@gmail.com [mailto:rjwysocki@gmail.com] On Behalf Of
>> Rafael J. Wysocki
>> Sent: Wednesday, November 23, 2016 7:03 AM
[cut]
>>
>> > + work_on_cpu(cpu, tstate_pm_fn, &save);
>>
>> Does work_on_cpu() wait for the work to complete?
>>
> Yes, it might increase the suspend/resume time, a 'queue_work_on' might be better?
Not really, because you'd need to wait before doing the
put_online_cpus() anyway.
>> > + put_online_cpus();
>> > +}
>> > +
>> > +static int tstate_suspend(struct notifier_block *nb,
>> > + unsigned long mode, void *_unused) {
>> > + tstate_check(mode, true);
>> > + return 0;
>> > +}
>> > +
>> > +static int tstate_resume(struct notifier_block *nb,
>> > + unsigned long mode, void *_unused) {
>> > + tstate_check(mode, false);
>> > + return 0;
>> > +}
>> > +
>> > +static int __init tstate_pm_init(void) {
>> > + /*
>> > + * tstate_suspend should save tstate after
>> > + * thermal zone's update in thermal_pm_notify,
>> > + * vice versa tstate_resume restore tstate before
>> > + * thermal_pm_notify, thus the thermal framework
>> > + * has a chance to re-adjust tstate according to the
>> > + * temperature trend.
>> > + */
>> > + pm_notifier(tstate_suspend, -1);
>> > + pm_notifier(tstate_resume, 1);
>>
>> I don't think this is going to do what you really want.
>>
>> Each of these notifiers is going to be invoked during both suspend and resume,
> Yes,
>
>> so I guess you only need one notifier?
>
> Here's my original thought: tstate_suspend needs to be invoked after
> thermal_pm_notify, which has a priority of '0',
> so the notifier of tstate_suspend should be lower than '0',
> thus '-1'. And the same for tstate_resume,
> it should be invoked before thermal_pm_notify,
> thus priority is '1' ?
If there's a dependency like that, it needs to be explicit. That is,
thermal_pm_notify() needs to invoke your new code in the right order
with respect to what it does already.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2016-11-29 1:50 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-14 17:44 [PATCH][RFC] ACPI throttling: Save/restore tstate for each CPUs across suspend/resume Chen Yu
2016-11-22 23:03 ` Rafael J. Wysocki
2016-11-29 1:27 ` Chen, Yu C
2016-11-29 1:50 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).