From: Valentin Schneider <valentin.schneider@arm.com>
To: Francisco Jerez <currojerez@riseup.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	"Pandruvada, Srinivas" <srinivas.pandruvada@intel.com>,
	linux-pm@vger.kernel.org, intel-gfx@lists.freedesktop.org,
	chris.p.wilson@intel.com,
	"Vivi, Rodrigo" <rodrigo.vivi@intel.com>,
	rui.zhang@intel.com, daniel.lezcano@linaro.org,
	amit.kucheria@verdurent.com, Lukasz Luba <Lukasz.Luba@arm.com>
Subject: Re: [RFC] GPU-bound energy efficiency improvements for the intel_pstate driver (v2.99)
Date: Fri, 15 May 2020 19:09:52 +0100
Message-ID: <jhjv9kxqdcf.mognet@arm.com>
In-Reply-To: <87a72at44d.fsf@riseup.net>


On 15/05/20 01:48, Francisco Jerez wrote:
> Valentin Schneider <valentin.schneider@arm.com> writes:
>
>> (+Lukasz)
>>
>> On 11/05/20 22:01, Francisco Jerez wrote:
>>>> What I'm missing is an explanation for why this isn't using the
>>>> infrastructure that was built for these kinds of things? The thermal
>>>> framework was, AFAIU, supposed to help with these things, and the IPA
>>>> thing in particular is used by ARM to do exactly this GPU/CPU power
>>>> budget thing.
>>>>
>>>> If thermal/IPA is found wanting, why aren't we improving that?
>>>
>>> The GPU/CPU power budget "thing" is only a positive side effect of this
>>> series on some TDP-bound systems.  Its ultimate purpose is improving the
>>> energy efficiency of workloads which have a bottleneck on a device other
>>> than the CPU, by giving the bottlenecking device driver some influence
>>> over the response latency of CPUFREQ governors via a PM QoS interface.
>>> This seems to be completely outside the scope of the thermal framework
>>> and IPA AFAIU.
>>>
>>
>> It's been a while since I've stared at IPA, but it does sound vaguely
>> familiar.
>>
>> When thermally constrained, IPA figures out a budget and splits it between
>> actors (cpufreq and devfreq devices) depending on how much juice they are
>> asking for; see cpufreq_get_requested_power() and
>> devfreq_cooling_get_requested_power(). There's also some weighting involved.
>>
>
> I'm aware of those.  The main problem is that the current mechanism
> for IPA to figure out the requested power of each actor is based on a
> rough estimate of their past power consumption: an actor that was
> operating in a highly energy-inefficient regime will end up requesting
> more power than another actor with the same load but more
> energy-efficient behavior.

Right, we do mix load (busy time, for both cpufreq and devfreq devices
AFAIR) and current state (freq) into one single power value.
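
Concretely, the estimate boils down to something like the sketch below
(illustrative only, not the actual kernel code; em_power_at() stands in
for the real energy-model table lookup):

/*
 * Sketch: the power an actor "requests" is the power of its current
 * OPP scaled by its recent utilization.  An actor that has been
 * running at an unnecessarily high frequency therefore reports a
 * higher request than an efficient actor with the same load, which I
 * think is exactly the problem you're pointing at.
 */
static u32 requested_power_sketch(u32 freq_khz, u32 busy_pct)
{
	u32 opp_power_mw = em_power_at(freq_khz); /* hypothetical EM lookup */

	return opp_power_mw * busy_pct / 100;
}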

> The IPA power allocator is therefore ineffective at improving the
> energy efficiency of an agent beyond its past behavior.  Furthermore,
> it seems to *rely* on individual agents being somewhat energetically
> responsible in order for its power allocation result to be anywhere
> close to optimal.  But improving the energy efficiency of an agent
> seems useful in its own right, whether IPA is used to balance power
> across agents or not.  That's precisely the purpose of this series.
>
>> If you look at the cpufreq cooling side of things, you'll see it also uses
>> the PM QoS interface. For instance, should IPA decide to cap the CPUs
>> (perhaps because say the GPU is the one drawing most of the juice), it'll
>> lead to a maximum frequency capping request.
>>
>> So it does sound like that's what you want, only not just when thermally
>> constrained.
>
> Capping the CPU frequency from random device drivers is highly
> problematic, because the CPU is a shared resource which a number of
> different concurrent applications might be using besides the GPU
> client.  The GPU driver has no visibility over its impact on the
> performance of other applications.  And even in a single-task
> environment, in order to behave as effectively as the present series
> the GPU driver would need to monitor the utilization of *all* CPUs in
> the system and place a frequency constraint on each one of them (since
> the task scheduler may migrate the process from one CPU to another
> without notice).  Furthermore these frequency constraints would need
> to be updated frequently in order to avoid performance degradation
> whenever the balance of load between the CPU and the IO device
> fluctuates.
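
Right, and to make that burden concrete: with the per-policy freq QoS
interface that cpufreq cooling uses, a driver capping the CPUs itself
would need something like the sketch below for every policy in the
system, updated each time the CPU/GPU balance shifts (sketch only,
error handling and cleanup omitted):

#include <linux/cpufreq.h>
#include <linux/pm_qos.h>

/* One max-frequency request per policy, owned by the GPU driver. */
static struct freq_qos_request gpu_cpu_cap[NR_CPUS];

static int gpu_cap_policy(struct cpufreq_policy *policy, s32 max_khz)
{
	/* Install (and later freq_qos_update_request()) a max-freq cap. */
	return freq_qos_add_request(&policy->constraints,
				    &gpu_cpu_cap[policy->cpu],
				    FREQ_QOS_MAX, max_khz);
}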
>
> The present series attempts to move the burden of computing frequency
> constraints out of individual device drivers and into the CPUFREQ
> governor.
> Instead the device driver provides a response latency constraint when it
> encounters a bottleneck, which can be more easily derived from hardware
> and protocol characteristics than a CPU frequency.  PM QoS aggregates
> the response latency constraints provided by all applications and gives
> CPUFREQ a single response latency target compatible with all of them (so
> a device driver specifying a high latency target won't lead to
> performance degradation in a concurrent application with lower latency
> constraints).  The CPUFREQ governor then computes frequency constraints
> for each CPU core that minimize energy usage without limiting
> throughput, based on the results obtained from CPU performance counters,
> while guaranteeing that a discontinuous transition in CPU utilization
> leads to a proportional transition in the CPU frequency before the
> specified response latency has elapsed.

Right, I think I see your point there. I'm thinking the 'actual' IPA gurus
(Lukasz or even Javi) may want to have a look at this.
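
For the archives, my understanding of the intended driver-side usage
(from patch 01/11 of the series) is roughly the below; the helper names
are my guess by analogy with the existing cpu_latency_qos_*() API, so
treat this as pseudocode rather than the actual interface:

#include <linux/pm_qos.h>

static struct pm_qos_request gpu_response_req;

/*
 * Hypothetical sketch: when the GPU becomes the bottleneck, allow the
 * CPUFREQ governor to relax its scaling response to ~10ms (units
 * assumed to be microseconds), and withdraw the request once the GPU
 * is no longer saturated.  PM QoS then aggregates this with requests
 * from other clients, so the most latency-sensitive one wins.
 */
static void gpu_bottleneck_begin(void)
{
	cpu_scaling_response_qos_add_request(&gpu_response_req, 10000);
}

static void gpu_bottleneck_end(void)
{
	cpu_scaling_response_qos_remove_request(&gpu_response_req);
}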


Thread overview: 20+ messages
2020-04-28  3:22 [RFC] GPU-bound energy efficiency improvements for the intel_pstate driver (v2.99) Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 01/11] PM: QoS: Add CPU_SCALING_RESPONSE global PM QoS limit Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 02/11] drm/i915: Adjust PM QoS scaling response frequency based on GPU load Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 03/11] OPTIONAL: drm/i915: Expose PM QoS control parameters via debugfs Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 04/11] cpufreq: Define ADAPTIVE frequency governor policy Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 05/11] cpufreq: intel_pstate: Reorder intel_pstate_clear_update_util_hook() and intel_pstate_set_update_util_hook() Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 06/11] cpufreq: intel_pstate: Call intel_pstate_set_update_util_hook() once from the setpolicy hook Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 07/11] cpufreq: intel_pstate: Implement VLP controller statistics and target range calculation Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 08/11] cpufreq: intel_pstate: Implement VLP controller for HWP parts Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 09/11] cpufreq: intel_pstate: Enable VLP controller based on ACPI FADT profile and CPUID Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 10/11] OPTIONAL: cpufreq: intel_pstate: Add tracing of VLP controller status Francisco Jerez
2020-04-28  3:22 ` [PATCHv2.99 11/11] OPTIONAL: cpufreq: intel_pstate: Expose VLP controller parameters via debugfs Francisco Jerez
2020-05-11 10:57 ` [RFC] GPU-bound energy efficiency improvements for the intel_pstate driver (v2.99) Peter Zijlstra
2020-05-11 21:01   ` Francisco Jerez
2020-05-14 10:26     ` Rafael J. Wysocki
2020-05-15  0:48       ` Francisco Jerez
2020-05-14 11:50     ` Valentin Schneider
2020-05-15  0:48       ` Francisco Jerez
2020-05-15 18:09         ` Valentin Schneider [this message]
2020-05-28  9:29           ` Lukasz Luba
