* [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch
@ 2015-10-06 21:49 Prarit Bhargava
2015-10-06 22:43 ` Rafael J. Wysocki
0 siblings, 1 reply; 21+ messages in thread
From: Prarit Bhargava @ 2015-10-06 21:49 UTC (permalink / raw)
To: linux-kernel
Cc: Prarit Bhargava, Kristen Carlson Accardi, Rafael J. Wysocki,
Viresh Kumar, linux-pm
Intel CPUs will not enter higher p-states when after switching from the
performance governor to the powersave governor, until
/sys/devices/system/cpu/intel_pstate/min_perf_pct is set to a low value.
This differs from previous behaviour in which a switch to the powersave
governor would result in a low default value for min_perf_pct.
The behavior of the powersave governor changed after commit a04759924e25
("[cpufreq] intel_pstate: honor user space min_perf_pct override on
resume"). The commit introduced tracking of performance percentage
changes via sysfs in order to restore userspace changes during
suspend/resume. The problem occurs because the global values of the newly
introduced max_sysfs_pct and min_sysfs_pct are not reset on a governor
change and this causes the new governor to inherit the previous governor's
settings.
This patch sets max_sysfs_pct to 100 and min_sysfs_pct to 0 on a governor
change which fixes the problem with governor switching. These changes
also make the initial calculations for max_perf_pct and min_perf_pct
slightly simpler.
Before patch:
[root@intel-skylake-y-01 power]# cpupower frequency-set -g performance
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
100
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100
[root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
100
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100
After patch:
[root@intel-skylake-y-01 power]# cpupower frequency-set -g performance
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
100
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100
[root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
14
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100
Also note that I have tested suspend/resume (using CONFIG_PM_DEBUG):
[root@intel-skylake-y-01 power]# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct
100
50
[root@intel-skylake-y-01 power]# echo devices > /sys/power/pm_test
[root@intel-skylake-y-01 power]# echo platform > /sys/power/disk
[root@intel-skylake-y-01 power]# echo disk > /sys/power/state
[root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct
100
50
Fixes: a04759924e25 ("[cpufreq] intel_pstate: honor user space min_perf_pct override on resume")
Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
---
drivers/cpufreq/intel_pstate.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 3af9dd7..bb24458 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -986,6 +986,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
if (!policy->cpuinfo.max_freq)
return -ENODEV;
+ limits.min_sysfs_pct = 0;
+ limits.max_sysfs_pct = 100;
+
if (policy->policy == CPUFREQ_POLICY_PERFORMANCE &&
policy->max >= policy->cpuinfo.max_freq) {
limits.min_policy_pct = 100;
@@ -1004,9 +1007,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
limits.max_policy_pct = clamp_t(int, limits.max_policy_pct, 0 , 100);
/* Normalize user input to [min_policy_pct, max_policy_pct] */
- limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct);
+ limits.min_perf_pct = limits.min_policy_pct;
limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct);
- limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct);
+ limits.max_perf_pct = limits.max_sysfs_pct;
limits.max_perf_pct = max(limits.min_policy_pct, limits.max_perf_pct);
/* Make sure min_perf_pct <= max_perf_pct */
--
1.7.9.3
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-06 21:49 [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch Prarit Bhargava @ 2015-10-06 22:43 ` Rafael J. Wysocki 2015-10-06 23:06 ` Rafael J. Wysocki 0 siblings, 1 reply; 21+ messages in thread From: Rafael J. Wysocki @ 2015-10-06 22:43 UTC (permalink / raw) To: Prarit Bhargava, Kristen Carlson Accardi Cc: linux-kernel, Viresh Kumar, linux-pm On Tuesday, October 06, 2015 05:49:07 PM Prarit Bhargava wrote: > Intel CPUs will not enter higher p-states when after switching from the > performance governor to the powersave governor, until > /sys/devices/system/cpu/intel_pstate/min_perf_pct is set to a low value. > This differs from previous behaviour in which a switch to the powersave > governor would result in a low default value for min_perf_pct. > > The behavior of the powersave governor changed after commit a04759924e25 > ("[cpufreq] intel_pstate: honor user space min_perf_pct override on > resume"). The commit introduced tracking of performance percentage > changes via sysfs in order to restore userspace changes during > suspend/resume. The problem occurs because the global values of the newly > introduced max_sysfs_pct and min_sysfs_pct are not reset on a governor > change and this causes the new governor to inherit the previous governor's > settings. > > This patch sets max_sysfs_pct to 100 and min_sysfs_pct to 0 on a governor > change which fixes the problem with governor switching. These changes > also make the initial calculations for max_perf_pct and min_perf_pct > slightly simpler. > > Before patch: > [root@intel-skylake-y-01 power]# cpupower frequency-set -g performance > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct > 100 > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct > 100 > [root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct > 100 > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct > 100 > > After patch: > [root@intel-skylake-y-01 power]# cpupower frequency-set -g performance > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct > 100 > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct > 100 > [root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct > 14 > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct > 100 > > Also note that I have tested suspend/resume (using CONFIG_PM_DEBUG): > [root@intel-skylake-y-01 power]# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct > 100 > 50 > [root@intel-skylake-y-01 power]# echo devices > /sys/power/pm_test > [root@intel-skylake-y-01 power]# echo platform > /sys/power/disk > [root@intel-skylake-y-01 power]# echo disk > /sys/power/state > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct > 100 > 50 > > Fixes: a04759924e25 ("[cpufreq] intel_pstate: honor user space min_perf_pct override on resume") > Cc: Kristen Carlson Accardi <kristen@linux.intel.com> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> > Cc: Viresh Kumar <viresh.kumar@linaro.org> > Cc: linux-pm@vger.kernel.org > Signed-off-by: Prarit Bhargava <prarit@redhat.com> > --- > drivers/cpufreq/intel_pstate.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c > index 3af9dd7..bb24458 100644 > --- a/drivers/cpufreq/intel_pstate.c > +++ b/drivers/cpufreq/intel_pstate.c > @@ -986,6 +986,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) > if (!policy->cpuinfo.max_freq) > return -ENODEV; > > + limits.min_sysfs_pct = 0; > + limits.max_sysfs_pct = 100; > + > if (policy->policy == CPUFREQ_POLICY_PERFORMANCE && > policy->max >= policy->cpuinfo.max_freq) { > limits.min_policy_pct = 100; > @@ -1004,9 +1007,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) > limits.max_policy_pct = clamp_t(int, limits.max_policy_pct, 0 , 100); > > /* Normalize user input to [min_policy_pct, max_policy_pct] */ > - limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct); > + limits.min_perf_pct = limits.min_policy_pct; > limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct); > - limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct); > + limits.max_perf_pct = limits.max_sysfs_pct; > limits.max_perf_pct = max(limits.min_policy_pct, limits.max_perf_pct); > > /* Make sure min_perf_pct <= max_perf_pct */ > Makes sense to me. Kristen? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-06 22:43 ` Rafael J. Wysocki @ 2015-10-06 23:06 ` Rafael J. Wysocki 2015-10-07 6:51 ` Doug Smythies ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Rafael J. Wysocki @ 2015-10-06 23:06 UTC (permalink / raw) To: Prarit Bhargava Cc: Kristen Carlson Accardi, linux-kernel, Viresh Kumar, linux-pm On Wednesday, October 07, 2015 12:43:55 AM Rafael J. Wysocki wrote: > On Tuesday, October 06, 2015 05:49:07 PM Prarit Bhargava wrote: > > Intel CPUs will not enter higher p-states when after switching from the > > performance governor to the powersave governor, until > > /sys/devices/system/cpu/intel_pstate/min_perf_pct is set to a low value. > > This differs from previous behaviour in which a switch to the powersave > > governor would result in a low default value for min_perf_pct. > > > > The behavior of the powersave governor changed after commit a04759924e25 > > ("[cpufreq] intel_pstate: honor user space min_perf_pct override on > > resume"). The commit introduced tracking of performance percentage > > changes via sysfs in order to restore userspace changes during > > suspend/resume. The problem occurs because the global values of the newly > > introduced max_sysfs_pct and min_sysfs_pct are not reset on a governor > > change and this causes the new governor to inherit the previous governor's > > settings. > > > > This patch sets max_sysfs_pct to 100 and min_sysfs_pct to 0 on a governor > > change which fixes the problem with governor switching. These changes > > also make the initial calculations for max_perf_pct and min_perf_pct > > slightly simpler. > > > > Before patch: > > [root@intel-skylake-y-01 power]# cpupower frequency-set -g performance > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct > > 100 > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct > > 100 > > [root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct > > 100 > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct > > 100 > > > > After patch: > > [root@intel-skylake-y-01 power]# cpupower frequency-set -g performance > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct > > 100 > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct > > 100 > > [root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct > > 14 > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct > > 100 > > > > Also note that I have tested suspend/resume (using CONFIG_PM_DEBUG): > > [root@intel-skylake-y-01 power]# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct > > 100 > > 50 > > [root@intel-skylake-y-01 power]# echo devices > /sys/power/pm_test > > [root@intel-skylake-y-01 power]# echo platform > /sys/power/disk > > [root@intel-skylake-y-01 power]# echo disk > /sys/power/state > > [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct > > 100 > > 50 > > > > Fixes: a04759924e25 ("[cpufreq] intel_pstate: honor user space min_perf_pct override on resume") > > Cc: Kristen Carlson Accardi <kristen@linux.intel.com> > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> > > Cc: Viresh Kumar <viresh.kumar@linaro.org> > > Cc: linux-pm@vger.kernel.org > > Signed-off-by: Prarit Bhargava <prarit@redhat.com> > > --- > > drivers/cpufreq/intel_pstate.c | 7 +++++-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c > > index 3af9dd7..bb24458 100644 > > --- a/drivers/cpufreq/intel_pstate.c > > +++ b/drivers/cpufreq/intel_pstate.c > > @@ -986,6 +986,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) > > if (!policy->cpuinfo.max_freq) > > return -ENODEV; > > > > + limits.min_sysfs_pct = 0; > > + limits.max_sysfs_pct = 100; > > + > > if (policy->policy == CPUFREQ_POLICY_PERFORMANCE && > > policy->max >= policy->cpuinfo.max_freq) { > > limits.min_policy_pct = 100; > > @@ -1004,9 +1007,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) > > limits.max_policy_pct = clamp_t(int, limits.max_policy_pct, 0 , 100); > > > > /* Normalize user input to [min_policy_pct, max_policy_pct] */ > > - limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct); > > + limits.min_perf_pct = limits.min_policy_pct; > > limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct); > > - limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct); > > + limits.max_perf_pct = limits.max_sysfs_pct; On a second thought, isn't that always 100? If so, doesn't it basically discard limits.max_policy_pct? > > limits.max_perf_pct = max(limits.min_policy_pct, limits.max_perf_pct); > > > > /* Make sure min_perf_pct <= max_perf_pct */ > > Thanks, Rafael ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-06 23:06 ` Rafael J. Wysocki @ 2015-10-07 6:51 ` Doug Smythies 2015-10-07 9:59 ` Prarit Bhargava 2015-10-07 11:38 ` Prarit Bhargava 2015-10-07 12:18 ` Prarit Bhargava 2 siblings, 1 reply; 21+ messages in thread From: Doug Smythies @ 2015-10-07 6:51 UTC (permalink / raw) To: 'Prarit Bhargava' Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki', Doug Smythies On 2015.09.06 16:48 Rafael J. Wysocki wrote: > On Wednesday, October 07, 2015 12:43:55 AM Rafael J. Wysocki wrote: >> On Tuesday, October 06, 2015 05:49:07 PM Prarit Bhargava wrote: >>> Intel CPUs will not enter higher p-states when after switching from the >>> performance governor to the powersave governor, until >>> /sys/devices/system/cpu/intel_pstate/min_perf_pct is set to a low value. It works properly for me. Isn't the root issue here an incompatibility between tools/power/cpupower/utils/cpufreq-set.c and drivers/cpufreq/intel_pstate.c? (see experiment results below, where I do not use "cpupower") I am not familiar with tools/power/cpupower/utils/cpufreq-set.c, but will look at it more tomorrow. >>> This differs from previous behaviour in which a switch to the powersave >>> governor would result in a low default value for min_perf_pct. >>> >>> The behavior of the powersave governor changed after commit a04759924e25 >>> ("[cpufreq] intel_pstate: honor user space min_perf_pct override on >>> resume"). The commit introduced tracking of performance percentage >>> changes via sysfs in order to restore userspace changes during >>> suspend/resume. The problem occurs because the global values of the newly >>> introduced max_sysfs_pct and min_sysfs_pct are not reset on a governor >>> change and this causes the new governor to inherit the previous governor's >>> settings. >>> >>> This patch sets max_sysfs_pct to 100 and min_sysfs_pct to 0 on a governor >>> change which fixes the problem with governor switching. These changes >>> also make the initial calculations for max_perf_pct and min_perf_pct >>> slightly simpler. >>> >>> Before patch: >>> [root@intel-skylake-y-01 power]# cpupower frequency-set -g performance >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct >>> 100 And before patch I get, using primitives and not cpupower: Executive Summary: Everything works fine (or at least as I thought it was supposed to). root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave ... /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 /sys/devices/system/cpu/intel_pstate/min_perf_pct:42 root@s15:/home/doug/temp# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct root@s15:/home/doug/temp# echo 80 > /sys/devices/system/cpu/intel_pstate/max_perf_pct root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 root@s15:/home/doug/temp# for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "performance" > $file; done root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 /sys/devices/system/cpu/intel_pstate/min_perf_pct:100 root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:performance ... /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:performance root@s15:/home/doug/temp# for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "powersave" > $file; done root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave ... /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 >>> >>> After patch: >>> [root@intel-skylake-y-01 power]# cpupower frequency-set -g performance >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> 14 >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct >>> 100 >>> And after the patch I get, using primitives and not cpupower: Executive Summary: Settings go back to default, and user settings are lost. This is not how I thought things were supposed to behave, but I'm not actually sure. root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave ... /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 /sys/devices/system/cpu/intel_pstate/min_perf_pct:42 root@s15:/home/doug/temp# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct root@s15:/home/doug/temp# echo 80 > /sys/devices/system/cpu/intel_pstate/max_perf_pct root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 root@s15:/home/doug/temp# for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "performance" > $file; done root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:performance ... /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:performance root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 /sys/devices/system/cpu/intel_pstate/min_perf_pct:100 root@s15:/home/doug/temp# for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "powersave" > $file; done root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave ... /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 /sys/devices/system/cpu/intel_pstate/min_perf_pct:42 >>> Also note that I have tested suspend/resume (using CONFIG_PM_DEBUG): >>> [root@intel-skylake-y-01 power]# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct >>> 100 >>> 50 >>> [root@intel-skylake-y-01 power]# echo devices > /sys/power/pm_test >>> [root@intel-skylake-y-01 power]# echo platform > /sys/power/disk >>> [root@intel-skylake-y-01 power]# echo disk > /sys/power/state >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct >>> 100 >>> 50 Before Patch, I get: root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 root@s15:/home/doug/temp# pm-suspend ... root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 After Patch, I get: root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 root@s15:/home/doug/temp# pm-suspend ... root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 /sys/devices/system/cpu/intel_pstate/min_perf_pct:42 >>> >>> Fixes: a04759924e25 ("[cpufreq] intel_pstate: honor user space min_perf_pct override on resume") >>> Cc: Kristen Carlson Accardi <kristen@linux.intel.com> >>> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> >>> Cc: Viresh Kumar <viresh.kumar@linaro.org> >>> Cc: linux-pm@vger.kernel.org >>> Signed-off-by: Prarit Bhargava <prarit@redhat.com> >>> --- >>> drivers/cpufreq/intel_pstate.c | 7 +++++-- >>> 1 file changed, 5 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c >>> index 3af9dd7..bb24458 100644 >>> --- a/drivers/cpufreq/intel_pstate.c >>> +++ b/drivers/cpufreq/intel_pstate.c >>> @@ -986,6 +986,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) >>> if (!policy->cpuinfo.max_freq) >>> return -ENODEV; >>> >>> + limits.min_sysfs_pct = 0; >>> + limits.max_sysfs_pct = 100; >>> + >>> if (policy->policy == CPUFREQ_POLICY_PERFORMANCE && >>> policy->max >= policy->cpuinfo.max_freq) { >>> limits.min_policy_pct = 100; >>> @@ -1004,9 +1007,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) >>> limits.max_policy_pct = clamp_t(int, limits.max_policy_pct, 0 , 100); >>> >>> /* Normalize user input to [min_policy_pct, max_policy_pct] */ >>> - limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct); >>> + limits.min_perf_pct = limits.min_policy_pct; >>> limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct); >>> - limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct); >>> + limits.max_perf_pct = limits.max_sysfs_pct; > > On a second thought, isn't that always 100? If so, doesn't it basically discard > limits.max_policy_pct? > Yes, I think so, see above. >>> limits.max_perf_pct = max(limits.min_policy_pct, limits.max_perf_pct); >>> >>> /* Make sure min_perf_pct <= max_perf_pct */ >>> Kernels used: 4.3-rc4 and same plus this patch. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 6:51 ` Doug Smythies @ 2015-10-07 9:59 ` Prarit Bhargava 2015-10-07 14:04 ` Doug Smythies 0 siblings, 1 reply; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 9:59 UTC (permalink / raw) To: Doug Smythies Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki' On 10/07/2015 02:51 AM, Doug Smythies wrote: > > And before patch I get, using primitives and not cpupower: > Executive Summary: Everything works fine (or at least as I thought it was supposed to). > > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave > ... > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* > /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct:42 > root@s15:/home/doug/temp# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct > root@s15:/home/doug/temp# echo 80 > /sys/devices/system/cpu/intel_pstate/max_perf_pct > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* > /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 > /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 > root@s15:/home/doug/temp# for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "performance" > $file; done > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* > /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct:100 > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:performance The switch has to be from performance to powersave. Switch again and you'll see the problem. I can also reproduce this without using 'cpupower'. P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 9:59 ` Prarit Bhargava @ 2015-10-07 14:04 ` Doug Smythies 2015-10-07 14:10 ` Prarit Bhargava 2015-10-07 14:34 ` Prarit Bhargava 0 siblings, 2 replies; 21+ messages in thread From: Doug Smythies @ 2015-10-07 14:04 UTC (permalink / raw) To: 'Prarit Bhargava' Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki', Doug Smythies On 2015.10.07 03:00 Prarit Bhargava wrote: > On 10/07/2015 02:51 AM, Doug Smythies wrote: >> >> And before patch I get, using primitives and not cpupower: >> Executive Summary: Everything works fine (or at least as I thought it was supposed to). >> >> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor >> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave >> ... >> /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave >> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* >> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 >> /sys/devices/system/cpu/intel_pstate/min_perf_pct:42 >> root@s15:/home/doug/temp# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >> root@s15:/home/doug/temp# echo 80 > /sys/devices/system/cpu/intel_pstate/max_perf_pct >> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* >> /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 >> /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 >> root@s15:/home/doug/temp# for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "performance" > $file; done >> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* >> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 >> /sys/devices/system/cpu/intel_pstate/min_perf_pct:100 >> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor >> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:performance > > The switch has to be from performance to powersave. Switch again and you'll see > the problem. Yes, of course, and I did. The first "powersave" listing was just to show the setup. Continue reading my original reply, also added back below: The "..." stuff is just where I deleted 6 of the 8 CPUs saying the same thing. root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 /sys/devices/system/cpu/intel_pstate/min_perf_pct:100 root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:performance ... /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:performance root@s15:/home/doug/temp# for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "powersave" > $file; done root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave ... /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 > I can also reproduce this without using 'cpupower'. Interesting. Then maybe it is a difference between legacy mode and Hardware P state (HWP) mode. I do not have an HWP capable processor to test. I compiled cpupower, but get the same results as I get using primitives, meaning things work properly before the patch and resort to defaults after the patch. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 14:04 ` Doug Smythies @ 2015-10-07 14:10 ` Prarit Bhargava 2015-10-07 15:40 ` Doug Smythies 2015-10-07 14:34 ` Prarit Bhargava 1 sibling, 1 reply; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 14:10 UTC (permalink / raw) To: Doug Smythies Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki' On 10/07/2015 10:04 AM, Doug Smythies wrote: > > On 2015.10.07 03:00 Prarit Bhargava wrote: >> On 10/07/2015 02:51 AM, Doug Smythies wrote: >>> >>> And before patch I get, using primitives and not cpupower: >>> Executive Summary: Everything works fine (or at least as I thought it was supposed to). >>> >>> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor >>> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave >>> ... >>> /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave >>> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* >>> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 >>> /sys/devices/system/cpu/intel_pstate/min_perf_pct:42 >>> root@s15:/home/doug/temp# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> root@s15:/home/doug/temp# echo 80 > /sys/devices/system/cpu/intel_pstate/max_perf_pct >>> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* >>> /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 >>> /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 >>> root@s15:/home/doug/temp# for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "performance" > $file; done >>> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* >>> /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 >>> /sys/devices/system/cpu/intel_pstate/min_perf_pct:100 >>> root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor >>> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:performance >> >> The switch has to be from performance to powersave. Switch again and you'll see >> the problem. > > Yes, of course, and I did. > The first "powersave" listing was just to show the setup. > Continue reading my original reply, also added back below: > The "..." stuff is just where I deleted 6 of the 8 CPUs saying the same thing. > > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* > /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct:100 > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:performance > ... > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:performance > root@s15:/home/doug/temp# for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "powersave" > $file; done > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave > ... > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* > /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 > /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 > >> I can also reproduce this without using 'cpupower'. > > Interesting. Then maybe it is a difference between legacy mode > and Hardware P state (HWP) mode. I do not have an HWP capable processor > to test. I have intel_pstate=no_hwp so that doesn't explain things. I will send you a printk debug patch shortly to figure out why your system doesn't see this problem. I have _6_ different variants of intel processors that see this same issue. What is your model #? P. > > I compiled cpupower, but get the same results as I get using primitives, > meaning things work properly before the patch and resort to defaults > after the patch. > > ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 14:10 ` Prarit Bhargava @ 2015-10-07 15:40 ` Doug Smythies 2015-10-07 15:46 ` Prarit Bhargava 0 siblings, 1 reply; 21+ messages in thread From: Doug Smythies @ 2015-10-07 15:40 UTC (permalink / raw) To: 'Prarit Bhargava' Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki', Doug Smythies On 2015.09.07 07:11 Prarit Bhargava wrote: Hi Prarit, > I have intel_pstate=no_hwp so that doesn't explain things. I will send you a > printk debug patch shortly to figure out why your system doesn't see this > problem. I have _6_ different variants of intel processors that see this same > issue. Interesting. > > What is your model #? Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz While probably not relevant, it is the identical processor that the previous maintainer of the intel_pstate driver had on his main test computer. As mentioned in my original reply, I am using kernel 4.3-rc4 from kernel.org mainline. For kernel configuration I steal the Ubuntu kernel config. O.K. meanwhile your printk debug patch e-mail came: On 2015.10.07 07:34 Prarit Bhargava wrote: > Doug, please apply and test. I never get the "store_min_perf_pct" stuff during startup. I get: [ 2.662092] Intel P-state driver initializing. [ 2.662110] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 2.681792] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 2.681825] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 2.681834] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 2.681935] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 2.681959] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 2.682004] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 2.682024] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 80.044329] intel_pstate_set_policy[1020] min_perf_pct = 42 [ 80.044331] intel_pstate_set_policy[1023] min_perf_pct = 42 [ 80.044331] intel_pstate_set_policy[1028] min_perf_pct = 42 [ 80.044343] intel_pstate_set_policy[1020] min_perf_pct = 42 [ 80.044344] intel_pstate_set_policy[1023] min_perf_pct = 42 [ 80.044345] intel_pstate_set_policy[1028] min_perf_pct = 42 [ 80.044354] intel_pstate_set_policy[1020] min_perf_pct = 42 [ 80.044355] intel_pstate_set_policy[1023] min_perf_pct = 42 [ 80.044355] intel_pstate_set_policy[1028] min_perf_pct = 42 [ 80.044364] intel_pstate_set_policy[1020] min_perf_pct = 42 [ 80.044365] intel_pstate_set_policy[1023] min_perf_pct = 42 [ 80.044366] intel_pstate_set_policy[1028] min_perf_pct = 42 [ 80.044375] intel_pstate_set_policy[1020] min_perf_pct = 42 [ 80.044376] intel_pstate_set_policy[1023] min_perf_pct = 42 [ 80.044376] intel_pstate_set_policy[1028] min_perf_pct = 42 [ 80.044385] intel_pstate_set_policy[1020] min_perf_pct = 42 [ 80.044386] intel_pstate_set_policy[1023] min_perf_pct = 42 [ 80.044387] intel_pstate_set_policy[1028] min_perf_pct = 42 [ 80.044396] intel_pstate_set_policy[1020] min_perf_pct = 42 [ 80.044397] intel_pstate_set_policy[1023] min_perf_pct = 42 [ 80.044397] intel_pstate_set_policy[1028] min_perf_pct = 42 [ 80.044406] intel_pstate_set_policy[1020] min_perf_pct = 42 [ 80.044407] intel_pstate_set_policy[1023] min_perf_pct = 42 [ 80.044408] intel_pstate_set_policy[1028] min_perf_pct = 42 Note that the stuff at 80 seconds is from a startup script that changes the governor to powersave after a 60 second delay. It is an Ubuntu default script. (/etc/init.d/ondemand), but merely does that same thing as this: for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo "powersave" > $file; done I do not know why things are done 3 times. Now, continuing: echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct [ 1411.436249] store_min_perf_pct[453] min_sysfs_pct = 50 [ 1411.436253] store_min_perf_pct[456] min_perf_pct = 50 [ 1411.436255] store_min_perf_pct[459] min_perf_pct = 50 [ 1411.436256] store_min_perf_pct[462] min_perf_pct = 50 cpupower frequency-set -g performance [ 1425.879832] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 1425.879981] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 1425.880049] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 1425.880178] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 1425.880324] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 1425.880467] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 1425.880534] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 1425.880594] intel_pstate_set_policy[1001] min_perf_pct = 100 cpupower frequency-set -g powersave [ 1437.677179] intel_pstate_set_policy[1020] min_perf_pct = 50 [ 1437.677181] intel_pstate_set_policy[1023] min_perf_pct = 50 [ 1437.677182] intel_pstate_set_policy[1028] min_perf_pct = 50 [ 1437.677205] intel_pstate_set_policy[1020] min_perf_pct = 50 [ 1437.677206] intel_pstate_set_policy[1023] min_perf_pct = 50 [ 1437.677206] intel_pstate_set_policy[1028] min_perf_pct = 50 [ 1437.677224] intel_pstate_set_policy[1020] min_perf_pct = 50 [ 1437.677225] intel_pstate_set_policy[1023] min_perf_pct = 50 [ 1437.677226] intel_pstate_set_policy[1028] min_perf_pct = 50 [ 1437.677244] intel_pstate_set_policy[1020] min_perf_pct = 50 [ 1437.677245] intel_pstate_set_policy[1023] min_perf_pct = 50 [ 1437.677245] intel_pstate_set_policy[1028] min_perf_pct = 50 [ 1437.677270] intel_pstate_set_policy[1020] min_perf_pct = 50 [ 1437.677271] intel_pstate_set_policy[1023] min_perf_pct = 50 [ 1437.677279] intel_pstate_set_policy[1028] min_perf_pct = 50 [ 1437.677296] intel_pstate_set_policy[1020] min_perf_pct = 50 [ 1437.677297] intel_pstate_set_policy[1023] min_perf_pct = 50 [ 1437.677297] intel_pstate_set_policy[1028] min_perf_pct = 50 [ 1437.677316] intel_pstate_set_policy[1020] min_perf_pct = 50 [ 1437.677317] intel_pstate_set_policy[1023] min_perf_pct = 50 [ 1437.677318] intel_pstate_set_policy[1028] min_perf_pct = 50 [ 1437.677350] intel_pstate_set_policy[1020] min_perf_pct = 50 [ 1437.677351] intel_pstate_set_policy[1023] min_perf_pct = 50 [ 1437.677352] intel_pstate_set_policy[1028] min_perf_pct = 50 Do we agree or disagree that the root issue seems to be (from your test)?: \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 ... Doug ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 15:40 ` Doug Smythies @ 2015-10-07 15:46 ` Prarit Bhargava 2015-10-07 18:52 ` Doug Smythies 0 siblings, 1 reply; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 15:46 UTC (permalink / raw) To: Doug Smythies Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki' On 10/07/2015 11:40 AM, Doug Smythies wrote: > > Do we agree or disagree that the root issue seems to be (from your test)?: > > \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct > > [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 > [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 > [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 > [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 Yep, and it appears to be done by default in Fedora & RHEL :/ ... the issue is still the same IMO that min_sysfs_pct & max_sysfs_pct are not cleared on a governor switch. P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 15:46 ` Prarit Bhargava @ 2015-10-07 18:52 ` Doug Smythies 2015-10-07 20:40 ` Prarit Bhargava 2015-10-07 21:31 ` Prarit Bhargava 0 siblings, 2 replies; 21+ messages in thread From: Doug Smythies @ 2015-10-07 18:52 UTC (permalink / raw) To: 'Prarit Bhargava' Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki', Doug Smythies On 2015.10.07 08:46 Prarit Bhargava wrote: > On 10/07/2015 11:40 AM, Doug Smythies wrote: >> >> Do we agree or disagree that the root issue seems to be (from your test)?: >> >> \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >> >> [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 >> [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 >> [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 >> [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 > > Yep, and it appears to be done by default in Fedora & RHEL :/ ... the issue is > still the same IMO that min_sysfs_pct & max_sysfs_pct are not cleared on a > governor switch. Clearing them will break some other things. For example, and as shown in my original reply, resume from suspend. Why? Because, at least on my computer, the governor is changed to "performance" during suspend, and the "powersave" governor is restored sometime during resume. The users wants the settings they had before the suspend. Continuing with that printk debug kernel from earlier: pm-suspend: [12599.912028] intel_pstate_set_policy[1001] min_perf_pct = 100 [12599.913781] intel_pstate_set_policy[1001] min_perf_pct = 100 [12599.915343] intel_pstate_set_policy[1001] min_perf_pct = 100 [12599.916877] intel_pstate_set_policy[1001] min_perf_pct = 100 [12599.918444] intel_pstate_set_policy[1001] min_perf_pct = 100 [12599.919686] intel_pstate_set_policy[1001] min_perf_pct = 100 [12599.920932] intel_pstate_set_policy[1001] min_perf_pct = 100 [12599.922191] intel_pstate_set_policy[1001] min_perf_pct = 100 Then push the power button, i.e. resume: [12609.953358] intel_pstate_set_policy[1020] min_perf_pct = 50 [12609.953360] intel_pstate_set_policy[1023] min_perf_pct = 50 [12609.953361] intel_pstate_set_policy[1028] min_perf_pct = 50 [12609.953796] intel_pstate_set_policy[1020] min_perf_pct = 50 [12609.953797] intel_pstate_set_policy[1023] min_perf_pct = 50 [12609.953798] intel_pstate_set_policy[1028] min_perf_pct = 50 [12609.954209] intel_pstate_set_policy[1020] min_perf_pct = 50 [12609.954210] intel_pstate_set_policy[1023] min_perf_pct = 50 [12609.954211] intel_pstate_set_policy[1028] min_perf_pct = 50 [12609.954619] intel_pstate_set_policy[1020] min_perf_pct = 50 [12609.954620] intel_pstate_set_policy[1023] min_perf_pct = 50 [12609.954621] intel_pstate_set_policy[1028] min_perf_pct = 50 [12609.955028] intel_pstate_set_policy[1020] min_perf_pct = 50 [12609.955029] intel_pstate_set_policy[1023] min_perf_pct = 50 [12609.955030] intel_pstate_set_policy[1028] min_perf_pct = 50 [12609.955431] intel_pstate_set_policy[1020] min_perf_pct = 50 [12609.955432] intel_pstate_set_policy[1023] min_perf_pct = 50 [12609.955433] intel_pstate_set_policy[1028] min_perf_pct = 50 [12609.955833] intel_pstate_set_policy[1020] min_perf_pct = 50 [12609.955834] intel_pstate_set_policy[1023] min_perf_pct = 50 [12609.955835] intel_pstate_set_policy[1028] min_perf_pct = 50 [12609.956234] intel_pstate_set_policy[1020] min_perf_pct = 50 [12609.956235] intel_pstate_set_policy[1023] min_perf_pct = 50 [12609.956235] intel_pstate_set_policy[1028] min_perf_pct = 50 The below is copied from my original reply: Before Patch, I get: root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 root@s15:/home/doug/temp# pm-suspend ... root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 After Patch, I get: root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 root@s15:/home/doug/temp# pm-suspend ... root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 /sys/devices/system/cpu/intel_pstate/min_perf_pct:42 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 18:52 ` Doug Smythies @ 2015-10-07 20:40 ` Prarit Bhargava 2015-10-07 21:31 ` Prarit Bhargava 1 sibling, 0 replies; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 20:40 UTC (permalink / raw) To: Doug Smythies Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki' On 10/07/2015 02:52 PM, Doug Smythies wrote: > On 2015.10.07 08:46 Prarit Bhargava wrote: >> On 10/07/2015 11:40 AM, Doug Smythies wrote: >>> >>> Do we agree or disagree that the root issue seems to be (from your test)?: >>> >>> \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> >>> [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 >>> [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 >>> [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 >>> [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 >> >> Yep, and it appears to be done by default in Fedora & RHEL :/ ... the issue is >> still the same IMO that min_sysfs_pct & max_sysfs_pct are not cleared on a >> governor switch. > > Clearing them will break some other things. For example, and as > shown in my original reply, resume from suspend. > > Why? Because, at least on my computer, the governor is changed to > "performance" during suspend, and the "powersave" governor is > restored sometime during resume. The users wants the settings they had > before the suspend. Hmm ... okay. I'm going to try again with a new patch and I'll add an additional test of starting with the powersave governor, changing the min_perf_pct, and then doing a suspend/resume. P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 18:52 ` Doug Smythies 2015-10-07 20:40 ` Prarit Bhargava @ 2015-10-07 21:31 ` Prarit Bhargava 2015-10-07 22:05 ` Rafael J. Wysocki 1 sibling, 1 reply; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 21:31 UTC (permalink / raw) To: Doug Smythies Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki' On 10/07/2015 02:52 PM, Doug Smythies wrote: > On 2015.10.07 08:46 Prarit Bhargava wrote: >> On 10/07/2015 11:40 AM, Doug Smythies wrote: >>> >>> Do we agree or disagree that the root issue seems to be (from your test)?: >>> >>> \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> >>> [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 >>> [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 >>> [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 >>> [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 >> >> Yep, and it appears to be done by default in Fedora & RHEL :/ ... the issue is >> still the same IMO that min_sysfs_pct & max_sysfs_pct are not cleared on a >> governor switch. > > Clearing them will break some other things. For example, and as > shown in my original reply, resume from suspend. > > Why? Because, at least on my computer, the governor is changed to > "performance" during suspend, and the "powersave" governor is > restored sometime during resume. The users wants the settings they had > before the suspend. > Looking at this in more detail after having tested on a Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz in Fedora and RHEL. I have a feeling that the switch you're seeing (poweersave->performance, suspend ... resume, performance->powersave) is occurring in userspace, and not as a result of the kernel. IMO if userspace changes the governor, all bets are off on maintaining max_sysfs_pct and min_sysfs_pct. Here's something I cannot figure out (because I do not have an Ubuntu install). *Why* is Ubuntu making the governor switch during suspend/resume? Is it because of archaic brokeness they were trying to paper over? > Continuing with that printk debug kernel from earlier: > > pm-suspend: > > [12599.912028] intel_pstate_set_policy[1001] min_perf_pct = 100 > [12599.913781] intel_pstate_set_policy[1001] min_perf_pct = 100 > [12599.915343] intel_pstate_set_policy[1001] min_perf_pct = 100 > [12599.916877] intel_pstate_set_policy[1001] min_perf_pct = 100 > [12599.918444] intel_pstate_set_policy[1001] min_perf_pct = 100 > [12599.919686] intel_pstate_set_policy[1001] min_perf_pct = 100 > [12599.920932] intel_pstate_set_policy[1001] min_perf_pct = 100 > [12599.922191] intel_pstate_set_policy[1001] min_perf_pct = 100 > > Then push the power button, i.e. resume: > > [12609.953358] intel_pstate_set_policy[1020] min_perf_pct = 50 > [12609.953360] intel_pstate_set_policy[1023] min_perf_pct = 50 > [12609.953361] intel_pstate_set_policy[1028] min_perf_pct = 50 > [12609.953796] intel_pstate_set_policy[1020] min_perf_pct = 50 > [12609.953797] intel_pstate_set_policy[1023] min_perf_pct = 50 > [12609.953798] intel_pstate_set_policy[1028] min_perf_pct = 50 > [12609.954209] intel_pstate_set_policy[1020] min_perf_pct = 50 > [12609.954210] intel_pstate_set_policy[1023] min_perf_pct = 50 > [12609.954211] intel_pstate_set_policy[1028] min_perf_pct = 50 > [12609.954619] intel_pstate_set_policy[1020] min_perf_pct = 50 > [12609.954620] intel_pstate_set_policy[1023] min_perf_pct = 50 > [12609.954621] intel_pstate_set_policy[1028] min_perf_pct = 50 > [12609.955028] intel_pstate_set_policy[1020] min_perf_pct = 50 > [12609.955029] intel_pstate_set_policy[1023] min_perf_pct = 50 > [12609.955030] intel_pstate_set_policy[1028] min_perf_pct = 50 > [12609.955431] intel_pstate_set_policy[1020] min_perf_pct = 50 > [12609.955432] intel_pstate_set_policy[1023] min_perf_pct = 50 > [12609.955433] intel_pstate_set_policy[1028] min_perf_pct = 50 > [12609.955833] intel_pstate_set_policy[1020] min_perf_pct = 50 > [12609.955834] intel_pstate_set_policy[1023] min_perf_pct = 50 > [12609.955835] intel_pstate_set_policy[1028] min_perf_pct = 50 > [12609.956234] intel_pstate_set_policy[1020] min_perf_pct = 50 > [12609.956235] intel_pstate_set_policy[1023] min_perf_pct = 50 > [12609.956235] intel_pstate_set_policy[1028] min_perf_pct = 50 > > The below is copied from my original reply: > > Before Patch, I get: > > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* > /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 > /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 > root@s15:/home/doug/temp# pm-suspend > ... > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* > /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 > /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 > > After Patch, I get: > > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* > /sys/devices/system/cpu/intel_pstate/max_perf_pct:80 > /sys/devices/system/cpu/intel_pstate/min_perf_pct:50 > root@s15:/home/doug/temp# pm-suspend > ... > root@s15:/home/doug/temp# grep . /sys/devices/system/cpu/intel_pstate/*_perf_* > /sys/devices/system/cpu/intel_pstate/max_perf_pct:100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct:42 Here's what I get after the patch (again, on Fedora which appears to let the kernel do it's thing during suspend/resume) on the same processor you are using (a Sandybridge, Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz), and using the powersave governor, [root@intel-skylake-y-01 ~]# cpupower frequency-info analyzing CPU 0: driver: intel_pstate CPUs which run at the same hardware frequency: 0 CPUs which need to have their frequency coordinated by software: 0 maximum transition latency: 0.97 ms. hardware limits: 400 MHz - 2.70 GHz available cpufreq governors: performance, powersave current policy: frequency should be within 400 MHz and 2.70 GHz. The governor "powersave" may decide which speed to use within this range. current CPU frequency is 800 MHz (asserted by call to hardware). boost state support: Supported: yes Active: yes [root@intel-skylake-y-01 ~]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct 100 14 [root@intel-skylake-y-01 ~]# echo devices > /sys/power/pm_test; echo platform > /sys/power/disk; sleep 1; echo disk > /sys/power/state [root@intel-skylake-y-01 ~]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct 100 14 Even if I manually change the max & min, [root@intel-skylake-y-01 ~]# echo 80 > /sys/devices/system/cpu/intel_pstate/max_perf_pct [root@intel-skylake-y-01 ~]# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct [root@intel-skylake-y-01 ~]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct 80 50 [root@intel-skylake-y-01 ~]# echo devices > /sys/power/pm_test; echo platform > /sys/power/disk; sleep 1; echo disk > /sys/power/state [root@intel-skylake-y-01 ~]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct 80 50 [root@intel-skylake-y-01 ~]# Everything works. It doesn't work on Ubuntu because userspace is doing something weird. Let's figure out why that is -- anyone know who works on s/r @ Ubuntu? P. > > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 21:31 ` Prarit Bhargava @ 2015-10-07 22:05 ` Rafael J. Wysocki 2015-10-07 22:26 ` Doug Smythies 2015-10-07 23:08 ` Prarit Bhargava 0 siblings, 2 replies; 21+ messages in thread From: Rafael J. Wysocki @ 2015-10-07 22:05 UTC (permalink / raw) To: Prarit Bhargava Cc: Doug Smythies, 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm On Wednesday, October 07, 2015 05:31:25 PM Prarit Bhargava wrote: > > On 10/07/2015 02:52 PM, Doug Smythies wrote: > > On 2015.10.07 08:46 Prarit Bhargava wrote: > >> On 10/07/2015 11:40 AM, Doug Smythies wrote: > >>> > >>> Do we agree or disagree that the root issue seems to be (from your test)?: > >>> > >>> \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct > >>> > >>> [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 > >>> [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 > >>> [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 > >>> [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 > >> > >> Yep, and it appears to be done by default in Fedora & RHEL :/ ... the issue is > >> still the same IMO that min_sysfs_pct & max_sysfs_pct are not cleared on a > >> governor switch. > > > > Clearing them will break some other things. For example, and as > > shown in my original reply, resume from suspend. > > > > Why? Because, at least on my computer, the governor is changed to > > "performance" during suspend, and the "powersave" governor is > > restored sometime during resume. The users wants the settings they had > > before the suspend. > > > > Looking at this in more detail after having tested on a Intel(R) Core(TM) > i7-2600 CPU @ 3.40GHz in Fedora and RHEL. > > I have a feeling that the switch you're seeing (poweersave->performance, suspend > ... resume, performance->powersave) is occurring in userspace, and not as a > result of the kernel. IMO if userspace changes the governor, all bets are off > on maintaining max_sysfs_pct and min_sysfs_pct. > > Here's something I cannot figure out (because I do not have an Ubuntu install). > *Why* is Ubuntu making the governor switch during suspend/resume? Is it > because of archaic brokeness they were trying to paper over? That's not limited to Ubuntu, pm-utils has been doing that forever. I have no idea why has it been doing that, though. I guess the reason was to "speed up" PM transitions (in case it started when you were in a low-frequency P-state and then there was no time to bump it up before things got too far). Thanks, Rafael ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 22:05 ` Rafael J. Wysocki @ 2015-10-07 22:26 ` Doug Smythies 2015-10-07 23:17 ` Prarit Bhargava 2015-10-08 0:13 ` Prarit Bhargava 2015-10-07 23:08 ` Prarit Bhargava 1 sibling, 2 replies; 21+ messages in thread From: Doug Smythies @ 2015-10-07 22:26 UTC (permalink / raw) To: 'Prarit Bhargava' Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki', Doug Smythies On 2015.10.07 15:06 Rafael J. Wysocki wrote: > On Wednesday, October 07, 2015 05:31:25 PM Prarit Bhargava wrote: >> On 10/07/2015 02:52 PM, Doug Smythies wrote: >>> On 2015.10.07 08:46 Prarit Bhargava wrote: >>>> On 10/07/2015 11:40 AM, Doug Smythies wrote: >>>>> >>>>> Do we agree or disagree that the root issue seems to be (from your test)?: >>>>> >>>>> \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >>>>> >>>>> [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 >>>>> [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 >>>>> [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 >>>>> [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 >>>> >>>> Yep, and it appears to be done by default in Fedora & RHEL :/ ... the issue is >>>> still the same IMO that min_sysfs_pct & max_sysfs_pct are not cleared on a >>>> governor switch. >>> >>> Clearing them will break some other things. For example, and as >>> shown in my original reply, resume from suspend. >>> >>> Why? Because, at least on my computer, the governor is changed to >>> "performance" during suspend, and the "powersave" governor is >>> restored sometime during resume. The users wants the settings they had >>> before the suspend. >>> >> Looking at this in more detail after having tested on a Intel(R) Core(TM) >> i7-2600 CPU @ 3.40GHz in Fedora and RHEL. >> >> I have a feeling that the switch you're seeing (poweersave->performance, suspend >> ... resume, performance->powersave) is occurring in userspace, and not as a >> result of the kernel. Agreed. It is pm-suspend doing it. >> IMO if userspace changes the governor, all bets are off >> on maintaining max_sysfs_pct and min_sysfs_pct. >> >> Here's something I cannot figure out (because I do not have an Ubuntu install). >> *Why* is Ubuntu making the governor switch during suspend/resume? Is it >> because of archaic brokeness they were trying to paper over? >> That's not limited to Ubuntu, pm-utils has been doing that forever. Agreed. This in pm-utils, and not limited to Ubuntu. We can ignore this issue if everyone wants, but I can envision bug reports. > I have no idea why has it been doing that, though. I guess the reason > was to "speed up" PM transitions (in case it started when you were in a > low-frequency P-state and then there was no time to bump it up before > things got too far). I have no idea either, but the stated theory seems sound. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 22:26 ` Doug Smythies @ 2015-10-07 23:17 ` Prarit Bhargava 2015-10-08 0:13 ` Prarit Bhargava 1 sibling, 0 replies; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 23:17 UTC (permalink / raw) To: Doug Smythies Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki' On 10/07/2015 06:26 PM, Doug Smythies wrote: > On 2015.10.07 15:06 Rafael J. Wysocki wrote: >> On Wednesday, October 07, 2015 05:31:25 PM Prarit Bhargava wrote: >>> On 10/07/2015 02:52 PM, Doug Smythies wrote: >>>> On 2015.10.07 08:46 Prarit Bhargava wrote: >>>>> On 10/07/2015 11:40 AM, Doug Smythies wrote: >>>>>> >>>>>> Do we agree or disagree that the root issue seems to be (from your test)?: >>>>>> >>>>>> \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >>>>>> >>>>>> [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 >>>>>> [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 >>>>>> [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 >>>>>> [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 >>>>> >>>>> Yep, and it appears to be done by default in Fedora & RHEL :/ ... the issue is >>>>> still the same IMO that min_sysfs_pct & max_sysfs_pct are not cleared on a >>>>> governor switch. >>>> >>>> Clearing them will break some other things. For example, and as >>>> shown in my original reply, resume from suspend. >>>> >>>> Why? Because, at least on my computer, the governor is changed to >>>> "performance" during suspend, and the "powersave" governor is >>>> restored sometime during resume. The users wants the settings they had >>>> before the suspend. >>>> >>> Looking at this in more detail after having tested on a Intel(R) Core(TM) >>> i7-2600 CPU @ 3.40GHz in Fedora and RHEL. >>> >>> I have a feeling that the switch you're seeing (poweersave->performance, suspend >>> ... resume, performance->powersave) is occurring in userspace, and not as a >>> result of the kernel. > Agreed. It is pm-suspend doing it. > >>> IMO if userspace changes the governor, all bets are off >>> on maintaining max_sysfs_pct and min_sysfs_pct. >>> >>> Here's something I cannot figure out (because I do not have an Ubuntu install). >>> *Why* is Ubuntu making the governor switch during suspend/resume? Is it >>> because of archaic brokeness they were trying to paper over? > >>> That's not limited to Ubuntu, pm-utils has been doing that forever. > > Agreed. This in pm-utils, and not limited to Ubuntu. > We can ignore this issue if everyone wants, but I can envision bug reports. > Yeah -- I'd rather not just ignore it. Sorry to repeat myself but I just sent out a reply that said I'm going to submit a patch to pm-utils that blacklists the intel-pstate driver. Suspend/Resume is known to work with intel-pstate and we shouldn't be doing anything fancy there. I will put it on my TODO list to also later on take a look at acpi-cpufreq. >> I have no idea why has it been doing that, though. I guess the reason >> was to "speed up" PM transitions (in case it started when you were in a >> low-frequency P-state and then there was no time to bump it up before >> things got too far). > > I have no idea either, but the stated theory seems sound. Looking at various comments here and there online, it sounds like they wanted to put the system into a "sane state" before attempting to suspend/resume. The code has been untouched since 2008. We've come a long way and I'm willing to argue that we don't need it anymore. P. > > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 22:26 ` Doug Smythies 2015-10-07 23:17 ` Prarit Bhargava @ 2015-10-08 0:13 ` Prarit Bhargava 1 sibling, 0 replies; 21+ messages in thread From: Prarit Bhargava @ 2015-10-08 0:13 UTC (permalink / raw) To: Doug Smythies Cc: 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm, 'Rafael J. Wysocki' On 10/07/2015 06:26 PM, Doug Smythies wrote: > On 2015.10.07 15:06 Rafael J. Wysocki wrote: >> On Wednesday, October 07, 2015 05:31:25 PM Prarit Bhargava wrote: >>> On 10/07/2015 02:52 PM, Doug Smythies wrote: >>>> On 2015.10.07 08:46 Prarit Bhargava wrote: >>>>> On 10/07/2015 11:40 AM, Doug Smythies wrote: >>>>>> >>>>>> Do we agree or disagree that the root issue seems to be (from your test)?: >>>>>> >>>>>> \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >>>>>> >>>>>> [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 >>>>>> [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 >>>>>> [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 >>>>>> [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 >>>>> >>>>> Yep, and it appears to be done by default in Fedora & RHEL :/ ... the issue is >>>>> still the same IMO that min_sysfs_pct & max_sysfs_pct are not cleared on a >>>>> governor switch. >>>> >>>> Clearing them will break some other things. For example, and as >>>> shown in my original reply, resume from suspend. >>>> >>>> Why? Because, at least on my computer, the governor is changed to >>>> "performance" during suspend, and the "powersave" governor is >>>> restored sometime during resume. The users wants the settings they had >>>> before the suspend. >>>> >>> Looking at this in more detail after having tested on a Intel(R) Core(TM) >>> i7-2600 CPU @ 3.40GHz in Fedora and RHEL. >>> >>> I have a feeling that the switch you're seeing (poweersave->performance, suspend >>> ... resume, performance->powersave) is occurring in userspace, and not as a >>> result of the kernel. > Agreed. It is pm-suspend doing it. > >>> IMO if userspace changes the governor, all bets are off >>> on maintaining max_sysfs_pct and min_sysfs_pct. >>> >>> Here's something I cannot figure out (because I do not have an Ubuntu install). >>> *Why* is Ubuntu making the governor switch during suspend/resume? Is it >>> because of archaic brokeness they were trying to paper over? > >>> That's not limited to Ubuntu, pm-utils has been doing that forever. > > Agreed. This in pm-utils, and not limited to Ubuntu. > We can ignore this issue if everyone wants, but I can envision bug reports. > >> I have no idea why has it been doing that, though. I guess the reason >> was to "speed up" PM transitions (in case it started when you were in a >> low-frequency P-state and then there was no time to bump it up before >> things got too far). > > I have no idea either, but the stated theory seems sound. Doug, can you also apply (sorry for the cut-and-paste) diff --git a/pm/sleep.d/94cpufreq b/pm/sleep.d/94cpufreq index 6807681..2c83e8e 100755 --- a/pm/sleep.d/94cpufreq +++ b/pm/sleep.d/94cpufreq @@ -16,6 +16,8 @@ hibernate_cpufreq() gov="$x/cpufreq/scaling_governor" # if we do not have a scaling_governor file, skip. [ -f "$gov" ] || continue + # Is the governor known to work? + [ "$gov" == "intel_pstate" ] && continue # if our temporary governor is not available, skip. grep -q "$TEMPORARY_CPUFREQ_GOVERNOR" \ "$x/cpufreq/scaling_available_governors" || continue and re-test? Thanks, P. > > ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 22:05 ` Rafael J. Wysocki 2015-10-07 22:26 ` Doug Smythies @ 2015-10-07 23:08 ` Prarit Bhargava 1 sibling, 0 replies; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 23:08 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Doug Smythies, 'Kristen Carlson Accardi', linux-kernel, 'Viresh Kumar', linux-pm On 10/07/2015 06:05 PM, Rafael J. Wysocki wrote: > On Wednesday, October 07, 2015 05:31:25 PM Prarit Bhargava wrote: >> >> On 10/07/2015 02:52 PM, Doug Smythies wrote: >>> On 2015.10.07 08:46 Prarit Bhargava wrote: >>>> On 10/07/2015 11:40 AM, Doug Smythies wrote: >>>>> >>>>> Do we agree or disagree that the root issue seems to be (from your test)?: >>>>> >>>>> \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >>>>> >>>>> [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 >>>>> [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 >>>>> [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 >>>>> [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 >>>> >>>> Yep, and it appears to be done by default in Fedora & RHEL :/ ... the issue is >>>> still the same IMO that min_sysfs_pct & max_sysfs_pct are not cleared on a >>>> governor switch. >>> >>> Clearing them will break some other things. For example, and as >>> shown in my original reply, resume from suspend. >>> >>> Why? Because, at least on my computer, the governor is changed to >>> "performance" during suspend, and the "powersave" governor is >>> restored sometime during resume. The users wants the settings they had >>> before the suspend. >>> >> >> Looking at this in more detail after having tested on a Intel(R) Core(TM) >> i7-2600 CPU @ 3.40GHz in Fedora and RHEL. >> >> I have a feeling that the switch you're seeing (poweersave->performance, suspend >> ... resume, performance->powersave) is occurring in userspace, and not as a >> result of the kernel. IMO if userspace changes the governor, all bets are off >> on maintaining max_sysfs_pct and min_sysfs_pct. >> >> Here's something I cannot figure out (because I do not have an Ubuntu install). >> *Why* is Ubuntu making the governor switch during suspend/resume? Is it >> because of archaic brokeness they were trying to paper over? > > That's not limited to Ubuntu, pm-utils has been doing that forever. > > I have no idea why has it been doing that, though. I guess the reason > was to "speed up" PM transitions (in case it started when you were in a > low-frequency P-state and then there was no time to bump it up before > things got too far). Nuts :/. Okay, it looks like git://anongit.freedesktop.org/git/pm-utils pm/sleep.d/94cpufreq is the culprit here. suspend/resume is known to work with the intel_pstate driver and IMO should be blacklisted. I'll install Ubuntu and do some testing. P. > > Thanks, > Rafael > ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 14:04 ` Doug Smythies 2015-10-07 14:10 ` Prarit Bhargava @ 2015-10-07 14:34 ` Prarit Bhargava 1 sibling, 0 replies; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 14:34 UTC (permalink / raw) To: dsmythies Cc: kristen, linux-kernel, viresh.kumar, linux-pm, rjw, Prarit Bhargava Doug, please apply and test. My result is: \# initial load of driver, set to PERFORMANCE governor by default. [ 21.406525] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 21.421288] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 21.436038] intel_pstate_set_policy[1001] min_perf_pct = 100 [ 21.450505] intel_pstate_set_policy[1001] min_perf_pct = 100 \# echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct [ 21.483436] store_min_perf_pct[453] min_sysfs_pct = 100 [ 21.489373] store_min_perf_pct[456] min_perf_pct = 100 [ 21.495203] store_min_perf_pct[459] min_perf_pct = 100 [ 21.501050] store_min_perf_pct[462] min_perf_pct = 100 \# cpupower -c all frequency-set -g powersave [ 214.187212] intel_pstate_set_policy[1023] min_perf_pct = 100 [ 214.193643] intel_pstate_set_policy[1028] min_perf_pct = 100 [ 214.200205] intel_pstate_set_policy[1020] min_perf_pct = 100 [ 214.206626] intel_pstate_set_policy[1023] min_perf_pct = 100 [ 214.213064] intel_pstate_set_policy[1028] min_perf_pct = 100 [ 214.219601] intel_pstate_set_policy[1020] min_perf_pct = 100 [ 214.226008] intel_pstate_set_policy[1023] min_perf_pct = 100 [ 214.232471] intel_pstate_set_policy[1028] min_perf_pct = 100 [ 214.238990] intel_pstate_set_policy[1020] min_perf_pct = 100 [ 214.245408] intel_pstate_set_policy[1023] min_perf_pct = 100 [ 214.251845] intel_pstate_set_policy[1028] min_perf_pct = 100 \# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct 100 100 Thanks! P. --- drivers/cpufreq/intel_pstate.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index 3af9dd7..95f6b9f 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -450,9 +450,17 @@ static ssize_t store_min_perf_pct(struct kobject *a, struct attribute *b, return -EINVAL; limits.min_sysfs_pct = clamp_t(int, input, 0 , 100); + printk("%s[%d] min_sysfs_pct = %d\n", __FUNCTION__, __LINE__, + limits.min_sysfs_pct); limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct); + printk("%s[%d] min_perf_pct = %d\n", __FUNCTION__, __LINE__, + limits.min_perf_pct); limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct); + printk("%s[%d] min_perf_pct = %d\n", __FUNCTION__, __LINE__, + limits.min_perf_pct); limits.min_perf_pct = min(limits.max_perf_pct, limits.min_perf_pct); + printk("%s[%d] min_perf_pct = %d\n", __FUNCTION__, __LINE__, + limits.min_perf_pct); limits.min_perf = div_fp(int_tofp(limits.min_perf_pct), int_tofp(100)); if (hwp_active) @@ -990,6 +998,8 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) policy->max >= policy->cpuinfo.max_freq) { limits.min_policy_pct = 100; limits.min_perf_pct = 100; + printk("%s[%d] min_perf_pct = %d\n", __FUNCTION__, __LINE__, + limits.min_perf_pct); limits.min_perf = int_tofp(1); limits.max_policy_pct = 100; limits.max_perf_pct = 100; @@ -1007,10 +1017,16 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct); limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct); limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct); + printk("%s[%d] min_perf_pct = %d\n", __FUNCTION__, __LINE__, + limits.min_perf_pct); limits.max_perf_pct = max(limits.min_policy_pct, limits.max_perf_pct); + printk("%s[%d] min_perf_pct = %d\n", __FUNCTION__, __LINE__, + limits.min_perf_pct); /* Make sure min_perf_pct <= max_perf_pct */ limits.min_perf_pct = min(limits.max_perf_pct, limits.min_perf_pct); + printk("%s[%d] min_perf_pct = %d\n", __FUNCTION__, __LINE__, + limits.min_perf_pct); limits.min_perf = div_fp(int_tofp(limits.min_perf_pct), int_tofp(100)); limits.max_perf = div_fp(int_tofp(limits.max_perf_pct), int_tofp(100)); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-06 23:06 ` Rafael J. Wysocki 2015-10-07 6:51 ` Doug Smythies @ 2015-10-07 11:38 ` Prarit Bhargava 2015-10-07 12:18 ` Prarit Bhargava 2 siblings, 0 replies; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 11:38 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Kristen Carlson Accardi, linux-kernel, Viresh Kumar, linux-pm On 10/06/2015 07:06 PM, Rafael J. Wysocki wrote: > On Wednesday, October 07, 2015 12:43:55 AM Rafael J. Wysocki wrote: >> On Tuesday, October 06, 2015 05:49:07 PM Prarit Bhargava wrote: >>> Intel CPUs will not enter higher p-states when after switching from the >>> performance governor to the powersave governor, until >>> /sys/devices/system/cpu/intel_pstate/min_perf_pct is set to a low value. >>> This differs from previous behaviour in which a switch to the powersave >>> governor would result in a low default value for min_perf_pct. >>> >>> The behavior of the powersave governor changed after commit a04759924e25 >>> ("[cpufreq] intel_pstate: honor user space min_perf_pct override on >>> resume"). The commit introduced tracking of performance percentage >>> changes via sysfs in order to restore userspace changes during >>> suspend/resume. The problem occurs because the global values of the newly >>> introduced max_sysfs_pct and min_sysfs_pct are not reset on a governor >>> change and this causes the new governor to inherit the previous governor's >>> settings. >>> >>> This patch sets max_sysfs_pct to 100 and min_sysfs_pct to 0 on a governor >>> change which fixes the problem with governor switching. These changes >>> also make the initial calculations for max_perf_pct and min_perf_pct >>> slightly simpler. >>> >>> Before patch: >>> [root@intel-skylake-y-01 power]# cpupower frequency-set -g performance >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct >>> 100 >>> >>> After patch: >>> [root@intel-skylake-y-01 power]# cpupower frequency-set -g performance >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct >>> 100 >>> [root@intel-skylake-y-01 power]# cpupower frequency-set -g powersave >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> 14 >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct >>> 100 >>> >>> Also note that I have tested suspend/resume (using CONFIG_PM_DEBUG): >>> [root@intel-skylake-y-01 power]# echo 50 > /sys/devices/system/cpu/intel_pstate/min_perf_pct >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct >>> 100 >>> 50 >>> [root@intel-skylake-y-01 power]# echo devices > /sys/power/pm_test >>> [root@intel-skylake-y-01 power]# echo platform > /sys/power/disk >>> [root@intel-skylake-y-01 power]# echo disk > /sys/power/state >>> [root@intel-skylake-y-01 power]# cat /sys/devices/system/cpu/intel_pstate/*_perf_pct >>> 100 >>> 50 >>> >>> Fixes: a04759924e25 ("[cpufreq] intel_pstate: honor user space min_perf_pct override on resume") >>> Cc: Kristen Carlson Accardi <kristen@linux.intel.com> >>> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> >>> Cc: Viresh Kumar <viresh.kumar@linaro.org> >>> Cc: linux-pm@vger.kernel.org >>> Signed-off-by: Prarit Bhargava <prarit@redhat.com> >>> --- >>> drivers/cpufreq/intel_pstate.c | 7 +++++-- >>> 1 file changed, 5 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c >>> index 3af9dd7..bb24458 100644 >>> --- a/drivers/cpufreq/intel_pstate.c >>> +++ b/drivers/cpufreq/intel_pstate.c >>> @@ -986,6 +986,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) >>> if (!policy->cpuinfo.max_freq) >>> return -ENODEV; >>> >>> + limits.min_sysfs_pct = 0; >>> + limits.max_sysfs_pct = 100; >>> + >>> if (policy->policy == CPUFREQ_POLICY_PERFORMANCE && >>> policy->max >= policy->cpuinfo.max_freq) { >>> limits.min_policy_pct = 100; >>> @@ -1004,9 +1007,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) >>> limits.max_policy_pct = clamp_t(int, limits.max_policy_pct, 0 , 100); >>> >>> /* Normalize user input to [min_policy_pct, max_policy_pct] */ >>> - limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct); >>> + limits.min_perf_pct = limits.min_policy_pct; >>> limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct); >>> - limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct); >>> + limits.max_perf_pct = limits.max_sysfs_pct; > > On a second thought, isn't that always 100? If so, doesn't it basically discard > limits.max_policy_pct? Looking at it, yes. And that's definitely an unintended consequence of this patch :). I'll take a closer look. I thought it should be permissible to set a range of (min_perf_pct, max_perf_pct) while changing p-states and I thought the purpose of max_perf_pct was to set the higher percentage limit. P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-06 23:06 ` Rafael J. Wysocki 2015-10-07 6:51 ` Doug Smythies 2015-10-07 11:38 ` Prarit Bhargava @ 2015-10-07 12:18 ` Prarit Bhargava 2015-10-07 14:50 ` Prarit Bhargava 2 siblings, 1 reply; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 12:18 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Kristen Carlson Accardi, linux-kernel, Viresh Kumar, linux-pm On 10/06/2015 07:06 PM, Rafael J. Wysocki wrote: > On Wednesday, October 07, 2015 12:43:55 AM Rafael J. Wysocki wrote: >> On Tuesday, October 06, 2015 05:49:07 PM Prarit Bhargava wrote: <snip> >>> >>> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c >>> index 3af9dd7..bb24458 100644 >>> --- a/drivers/cpufreq/intel_pstate.c >>> +++ b/drivers/cpufreq/intel_pstate.c >>> @@ -986,6 +986,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) >>> if (!policy->cpuinfo.max_freq) >>> return -ENODEV; >>> >>> + limits.min_sysfs_pct = 0; >>> + limits.max_sysfs_pct = 100; >>> + >>> if (policy->policy == CPUFREQ_POLICY_PERFORMANCE && >>> policy->max >= policy->cpuinfo.max_freq) { >>> limits.min_policy_pct = 100; >>> @@ -1004,9 +1007,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) >>> limits.max_policy_pct = clamp_t(int, limits.max_policy_pct, 0 , 100); >>> >>> /* Normalize user input to [min_policy_pct, max_policy_pct] */ >>> - limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct); >>> + limits.min_perf_pct = limits.min_policy_pct; >>> limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct); >>> - limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct); >>> + limits.max_perf_pct = limits.max_sysfs_pct; > > On a second thought, isn't that always 100? If so, doesn't it basically discard > limits.max_policy_pct? Wow :) I think you're right and that definitely was an unintended consequence of this patch. I also see that I can clean up the intel_pstate_set_policy() code a bit more. I'll submit a 2-part v2. P. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch 2015-10-07 12:18 ` Prarit Bhargava @ 2015-10-07 14:50 ` Prarit Bhargava 0 siblings, 0 replies; 21+ messages in thread From: Prarit Bhargava @ 2015-10-07 14:50 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Kristen Carlson Accardi, linux-kernel, Viresh Kumar, linux-pm On 10/07/2015 08:18 AM, Prarit Bhargava wrote: > > > On 10/06/2015 07:06 PM, Rafael J. Wysocki wrote: >> On Wednesday, October 07, 2015 12:43:55 AM Rafael J. Wysocki wrote: >>> On Tuesday, October 06, 2015 05:49:07 PM Prarit Bhargava wrote: > > <snip> > >>>> >>>> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c >>>> index 3af9dd7..bb24458 100644 >>>> --- a/drivers/cpufreq/intel_pstate.c >>>> +++ b/drivers/cpufreq/intel_pstate.c >>>> @@ -986,6 +986,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) >>>> if (!policy->cpuinfo.max_freq) >>>> return -ENODEV; >>>> >>>> + limits.min_sysfs_pct = 0; >>>> + limits.max_sysfs_pct = 100; >>>> + >>>> if (policy->policy == CPUFREQ_POLICY_PERFORMANCE && >>>> policy->max >= policy->cpuinfo.max_freq) { >>>> limits.min_policy_pct = 100; >>>> @@ -1004,9 +1007,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) >>>> limits.max_policy_pct = clamp_t(int, limits.max_policy_pct, 0 , 100); >>>> >>>> /* Normalize user input to [min_policy_pct, max_policy_pct] */ >>>> - limits.min_perf_pct = max(limits.min_policy_pct, limits.min_sysfs_pct); >>>> + limits.min_perf_pct = limits.min_policy_pct; >>>> limits.min_perf_pct = min(limits.max_policy_pct, limits.min_perf_pct); >>>> - limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct); >>>> + limits.max_perf_pct = limits.max_sysfs_pct; >> >> On a second thought, isn't that always 100? If so, doesn't it basically discard >> limits.max_policy_pct? > > Wow :) I think you're right and that definitely was an unintended consequence > of this patch. I also see that I can clean up the intel_pstate_set_policy() > code a bit more. I'll submit a 2-part v2. Kristen, There is a question here and I'm going to need your input on the answer. The question is whether or not a user can exceed the max_policy_pct or drop below the min_policy_pct values via sysfs? If "No, a user should not be able to exceed or go below those values", then IMO limits.max_sysfs_pct should default to limits.max_policy_pct, and limits.min_sysfs_pct should default to limits.min_policy_pct. and I will adjust my patch accordingly. P. ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2015-10-08 0:13 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-10-06 21:49 [PATCH] cpufreq, intel_pstate, set max_sysfs_pct and min_sysfs_pct on governor switch Prarit Bhargava 2015-10-06 22:43 ` Rafael J. Wysocki 2015-10-06 23:06 ` Rafael J. Wysocki 2015-10-07 6:51 ` Doug Smythies 2015-10-07 9:59 ` Prarit Bhargava 2015-10-07 14:04 ` Doug Smythies 2015-10-07 14:10 ` Prarit Bhargava 2015-10-07 15:40 ` Doug Smythies 2015-10-07 15:46 ` Prarit Bhargava 2015-10-07 18:52 ` Doug Smythies 2015-10-07 20:40 ` Prarit Bhargava 2015-10-07 21:31 ` Prarit Bhargava 2015-10-07 22:05 ` Rafael J. Wysocki 2015-10-07 22:26 ` Doug Smythies 2015-10-07 23:17 ` Prarit Bhargava 2015-10-08 0:13 ` Prarit Bhargava 2015-10-07 23:08 ` Prarit Bhargava 2015-10-07 14:34 ` Prarit Bhargava 2015-10-07 11:38 ` Prarit Bhargava 2015-10-07 12:18 ` Prarit Bhargava 2015-10-07 14:50 ` Prarit Bhargava
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).