From: "Doug Smythies" <dsmythies@telus.net>
To: "'Rafael J. Wysocki'" <rjw@rjwysocki.net>
Cc: 'Srinivas Pandruvada' <srinivas.pandruvada@linux.intel.com>,
	'LKML' <linux-kernel@vger.kernel.org>,
	'Jonathan Corbet' <corbet@lwn.net>,
	'Linux PM' <linux-pm@vger.kernel.org>,
	Doug Smythies <dsmythies@telus.net>
Subject: RE: [PATCH 5/5] cpufreq: intel_pstate: Document the current behavior and user interface
Date: Sun, 26 Mar 2017 23:32:37 -0700
Message-ID: <001f01d2a6c3$ef7f0c00$ce7d2400$@net>
In-Reply-To: qpybcdyUgZGlLqpydcsQsL

On 2017.03.22 16:32 Rafael J. Wysocki wrote:

I realize that there is a tradeoff between a succinct document
and having to write a full book, but I have a couple of
comments anyhow.

> Add a document describing the current behavior and user space
> interface of the intel_pstate driver in the RST format and
> drop the existing outdated intel_pstate.txt document.

... [cut]...

> +The second variant of the ``powersave`` P-state selection algorithm, used in all
> +of the other cases (generally, on processors from the Core line, so it is
> +referred to as the "Core" algorithm), is based on the values read from the APERF
> +and MPERF feedback registers alone

And target pstate over the last sample interval.

> and it does not really take CPU utilization
> +into account explicitly.  Still, it causes the CPU P-state to ramp up very
> +quickly in response to increased utilization which is generally desirable in
> +server environments.

It will only ramp up quickly if another CPU has already ramped up such that the
effective pstate is much higher than the target, giving a very high "load"
(actually scaled_busy). See comments further down.

... [cut]...

> +Turbo P-states Support
> +======================
...
> +Some processors allow multiple cores to be in turbo P-states at the same time,
> +but the maximum P-state that can be set for them generally depends on the number
> +of cores running concurrently.  The maximum turbo P-state that can be set for 3
> +cores at the same time usually is lower than the analogous maximum P-state for
> +2 cores, which in turn usually is lower than the maximum turbo P-state that can
> +be set for 1 core.  The one-core maximum turbo P-state is thus the maximum
> +supported one overall.

The above segment was retained because it is relevant to footnote 1 below.

...[cut]...

> +For example, the default values of the PID controller parameters for the Sandy
> +Bridge generation of processors are
> +
> +| ``deadband`` = 0
> +| ``d_gain_pct`` = 0
> +| ``i_gain_pct`` = 0
> +| ``p_gain_pct`` = 20
> +| ``sample_rate_ms`` = 10
> +| ``setpoint`` = 97
> +
> +If the derivative and integral coefficients in the PID algorithm are both equal
> +to 0 (which is the case above), the next P-State value will be equal to:
> +
> +  ``current_pstate`` - ((``setpoint`` - ``current_load``) * ``p_gain_pct``)
> +
> +where ``current_pstate`` is the P-state currently set for the given CPU and
> +``current_load`` is the current load estimate for it based on the current values
> +of feedback registers.

While mentioned earlier, it should be emphasized again here that this
"current_load" might be, and very often is, very different from
the actual load on the CPU. It can be as high as the ratio of the maximum
P-state to the minimum P-state, i.e. for my older i7 processor it can be
38/16 * 100% = 237.5%. For more recent processors, that maximum can be much
higher. This is how this control algorithm can achieve a very rapid ramp
of pstate on a CPU that was previously idle, with these settings, and when
other CPUs were already active and ramped up.
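
As a rough illustration of that upper bound, here is a minimal Python sketch.
The function name is hypothetical; the formula is just the ratio stated above,
not code taken from the driver.

```python
def max_scaled_load_pct(max_pstate: int, min_pstate: int) -> float:
    """Upper bound on the "load" value: max P-state / min P-state, in percent."""
    return max_pstate / min_pstate * 100.0


# Older i7 from the text: minimum P-state 16, maximum P-state 38.
print(max_scaled_load_pct(38, 16))
```

A processor with a wider gap between its minimum and maximum P-states gives a
proportionally larger bound.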

> +
> +If ``current_pstate`` is 8 (in the internal representation used by
> +``intel_pstate``) and ``current_load`` is 100 (in percent), the next P-state
> +value will be:
> +
> +	8 - ((97 - 100) * 0.2) = 8.6
> +
> +which will be rounded up to 9, so the P-state value goes up by 1 in this case.
> +If the load does not change during the next interval between invocations of the
> +driver's utilization update callback for the CPU in question, the P-state value
> +will go up by 1 again and so on, as long as the load exceeds the ``setpoint``
> +value (or until the maximum P-state is reached).

No, only if the "load" is at least setpoint + 0.5/p_gain, or for these
settings, 97 + 0.5/0.2 = 99.5. Below that, the result rounds back to the
current P-state. The point being that p_gain and setpoint affect
each other in terms of system response.
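
A minimal Python sketch of that threshold, assuming the proportional-only
formula quoted above and round-to-nearest on the result (the rounding rule is
an assumption made here to match the 99.5 figure; the function names are
hypothetical and this is not the driver's fixed-point code):

```python
import math


def next_pstate(current_pstate: int, load: float,
                setpoint: float = 97.0, p_gain_pct: float = 20.0) -> int:
    """Proportional-only step: current - (setpoint - load) * p_gain."""
    raw = current_pstate - (setpoint - load) * (p_gain_pct / 100.0)
    # Assumption: the fractional result is rounded to the nearest integer.
    return math.floor(raw + 0.5)


def step_up_threshold(setpoint: float = 97.0, p_gain_pct: float = 20.0) -> float:
    """Smallest "load" that moves the P-state up by one under that rounding."""
    return setpoint + 0.5 / (p_gain_pct / 100.0)


for load in (98.0, 99.0, 99.5, 100.0):
    print(load, next_pstate(8, load))
print(step_up_threshold())
```

With these settings a "load" of 98 or 99 leaves the P-state at 8, even though
both exceed the setpoint of 97.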

I suggest a fast ramp-up example would be worthwhile here. Something like:
Minimum pstate = 16; Maximum pstate = 38.

Current pstate = 16,
Effective pstate over the last interval, due to another CPU = 38
"load" = 237.5%

16 - ((97-237.5) * 0.2) = 44.1, which would be clamped to 38.
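
The same arithmetic as a small Python sketch, assuming the proportional-only
formula from the quoted document and a simple clamp to the minimum and maximum
P-states (the helper names are hypothetical):

```python
def raw_next_pstate(current_pstate: float, load: float,
                    setpoint: float = 97.0, p_gain: float = 0.2) -> float:
    """Unclamped proportional-only result."""
    return current_pstate - (setpoint - load) * p_gain


def clamp_pstate(value: float, min_pstate: int, max_pstate: int) -> float:
    """Limit the requested P-state to the supported range."""
    return max(min_pstate, min(max_pstate, value))


raw = raw_next_pstate(16, 237.5)   # roughly 44.1
target = clamp_pstate(raw, 16, 38)
print(raw, target)
```

The CPU goes from the minimum to the maximum P-state in a single step.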

Footnote 1: Readers might argue that, due to multiple cores being active
at one time, we would never actually get a "load" of 237.5 in the above example.
That is true, but it can get very close. For simplicity of the example, the
suggestion is to ignore it.

A fast ramp-up example from real trace data:

mperf: 9806829 cycles
aperf: 10936506 cycles
tsc: 99803828 cycles
freq: 3.7916 GHz ; effective pstate 37.9
old target pstate: 16
duration: 29.26 milliseconds
load (actual): 9.83%
"load" (scaled_busy): 236
New target pstate: 38
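
A minimal Python sketch that rechecks some of the trace figures from the raw
counters alone. It assumes the actual load is mperf/tsc and that the TSC rate
can be estimated as tsc/duration; the variable names are hypothetical, and the
scaled_busy value is not recomputed because the scaling details are not given
in this message.

```python
mperf = 9_806_829        # cycles counted while the CPU was not idle
aperf = 10_936_506       # cycles actually delivered while not idle
tsc = 99_803_828         # TSC cycles over the sample
duration_s = 29.26e-3    # sample duration in seconds

actual_load_pct = 100.0 * mperf / tsc      # fraction of time not idle
aperf_mperf_ratio = aperf / mperf          # > 1 means running above the TSC rate
tsc_rate_ghz = tsc / duration_s / 1e9      # estimated TSC frequency

print(f"load (actual): {actual_load_pct:.2f}%")
print(f"aperf/mperf:   {aperf_mperf_ratio:.4f}")
print(f"TSC rate:      {tsc_rate_ghz:.2f} GHz")
```

The CPU was busy for under 10% of the sample, yet the "load" seen by the
algorithm was 236.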
  
... Doug
