public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Renninger <trenn@suse.de>
To: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>,
	Philippe Longepe <philippe.longepe@linux.intel.com>,
	linux-pm@vger.kernel.org, rafael.j.wysocki@intel.com,
	Prarit Bhargava <prarit@redhat.com>,
	viresh.kumar@linaro.org
Subject: Re: [PATCH V6 1/3] cpufreq: intel_pstate: configurable algorithm to get target pstate
Date: Thu, 10 Dec 2015 14:04:46 +0100	[thread overview]
Message-ID: <8633351.YrHIUtRzE5@skinner> (raw)
In-Reply-To: <1449692513.3240.231.camel@spandruv-desk3.jf.intel.com>

On Wednesday, December 09, 2015 12:21:53 PM Srinivas Pandruvada wrote:
> On Wed, 2015-12-09 at 15:34 +0100, Thomas Renninger wrote:
> > On Tuesday, December 08, 2015 10:02:23 AM Srinivas Pandruvada wrote:
> > > On Tue, 2015-12-08 at 16:27 +0100, Thomas Renninger wrote:
> > > > Hi,
> > > > 
> > > > There may be atoms clustered in server platforms and there may be
> > > > performance oriented CPU models built into laptops.
> > > 
> > > You mean BYT/CHT ATOM used as servers not Avaton, which has different
> > > id.
> > 
> > I have no idea.
> > And since Intel does not build mainboards anymore afaik or most
> > are built by OEMs, you also have no idea which CPUs end up on which kind
> > of system, right?
> > 
> > Default policy to go for performance, IO, workload based switching or
> > whatever tunables are modified should be, as you said, platform oriented.
> > 
> > So instead of calling the params:
> > silvermont, airmont, knl_params (knights landing, right?), whatever CPU...
> 
> > Better call them:
> These param structure is much more than just policy specification. They
> have callback to get limits. For example, silvermont and airmont has
> different scaling methods. So they can't be common to all.
Sure this is HW specific.

> The function
> to get next state can be common. There is no airmont/silvermont name
> attached to this (get_target_pstate_use_cpu_load or
> get_target_pstate_use_performance). So the default callback
> ".get_target_pstate" can be changed based on preference.
> 
> This is the order I am thinking of in the order of priority high to
> low :
> - User policy (either command line or via cpu-freq scaling_governor)
> - ACPI
> - Pickup defaults based on CPU ID.

Why by CPU ID? This doesn't make sense to me and unnecessarily complicates
things.

But, please do not submit anything half baken:
a new pstate ondemand governor or different
workload optizmized algorithms inside this governor, or later possibly as
separate governors, defaults based on CPU ID or platform type and
then later userspace tunable...

We should not only think, but implement a final concept and submit altogether.
Otherwise there are several kernels flying around which do implement half
of the implementation and the confusion is perfect.

I would also be interested in some figures.
Are there any measurements how many power is saved, how big the performance
overhead is, etc?

I formerly used the tools/power/cpupower/bench/ tool to find out worst case 
workload scenarios between ondemand and performance governor with different
tuning knobs.
It could even do some nice plots showing performance issues when increasing
polling time, etc., IIRC.
Are these new RAPL registers fine grained enough to be used to measure
power consumption on different workloads with different dynamic cpufreq
algorithms/tunables?

On a platform which exposes itself as "Enterprise Server" or "Performance
Server" (ACPI PM Profile, cmp with chapter 5.2.9 of ACPI spec) I expect 
default will be the current performance governor, which means dynamic CPU 
frequency off and run on high frequency statically, right?

> The current patch-set only addresses the cpu id based selection.
> BTW changing a pid set point can be done from user space without
> changing code.

You mean:
/sys/kernel/debug/pstate_snb/p_gain_pct
/sys/kernel/debug/pstate_snb/setpoint
/sys/kernel/debug/pstate_snb/deadband
/sys/kernel/debug/pstate_snb/i_gain_pct
/sys/kernel/debug/pstate_snb/d_gain_pct
/sys/kernel/debug/pstate_snb/sample_rate_ms

These should better not be picked as API by userspace tools...
 
> > I had a quick look, but providing whole governors seem to be a lot
> > overhead. But with some luck you do not have to fill up whole structures
> > you have to pass to cpufreq_register_governor(..).
> 
> Adding a external governor is easy (Your RFC patch already does that).
> But inside cpufreq core logic there is different processing based on
> whether there are internal governor or cpufreq governor. So that needs
> to be addressed first.

Yep, not sure that is perfect.
I also try to have a closer look.
Quite some open questions...

        Thomas

  reply	other threads:[~2015-12-10 13:04 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-04 16:40 [PATCH V6 0/3] cpufreq: intel_pstate: account non C0 time Philippe Longepe
2015-12-04 16:40 ` Philippe Longepe
2015-12-04 16:40 ` [PATCH V6 1/3] cpufreq: intel_pstate: configurable algorithm to get target pstate Philippe Longepe
2015-12-08 15:27   ` Thomas Renninger
2015-12-08 18:02     ` Srinivas Pandruvada
2015-12-09 14:34       ` Thomas Renninger
2015-12-09 20:21         ` Srinivas Pandruvada
2015-12-10 13:04           ` Thomas Renninger [this message]
2015-12-10 17:28             ` Srinivas Pandruvada
2015-12-14 15:13               ` Thomas Renninger
2015-12-14 18:20                 ` Srinivas Pandruvada
2015-12-15 14:24                   ` Thomas Renninger
2015-12-15 17:59                     ` Len Brown
2015-12-16 10:25                       ` Thomas Renninger
2015-12-15 18:10                     ` Srinivas Pandruvada
2015-12-10 22:01             ` Rafael J. Wysocki
2015-12-14 16:14               ` Stephane Gasparini
2015-12-14 16:36                 ` Stephane Gasparini
2015-12-14 22:13                 ` Doug Smythies
2015-12-15 10:30                   ` Philippe Longepe
2015-12-15 13:06                     ` Stephane Gasparini
2015-12-15 23:34                     ` Doug Smythies
2015-12-16  9:49                       ` Stephane Gasparini
2015-12-14 16:22               ` Thomas Renninger
2015-12-14 16:38                 ` Stephane Gasparini
2015-12-14 22:06                 ` Rafael J. Wysocki
2015-12-15 14:13                   ` Thomas Renninger
2015-12-04 16:40 ` [PATCH " Philippe Longepe
2015-12-04 17:35   ` Srinivas Pandruvada
2015-12-10  0:45     ` Rafael J. Wysocki
2015-12-10  0:19       ` Srinivas Pandruvada
2015-12-10  0:51         ` Rafael J. Wysocki
2015-12-04 16:40 ` [PATCH V6 2/3] cpufreq: intel_pstate: account for non C0 time Philippe Longepe
2015-12-04 16:40 ` [PATCH " Philippe Longepe
2015-12-04 16:40 ` [PATCH V6 3/3] cpufreq: intel_pstate: Account for IO wait time Philippe Longepe
2015-12-04 16:40 ` Philippe Longepe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8633351.YrHIUtRzE5@skinner \
    --to=trenn@suse.de \
    --cc=lenb@kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=philippe.longepe@linux.intel.com \
    --cc=prarit@redhat.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox