From: Valentin Schneider <valentin.schneider@arm.com>
To: Francisco Jerez <currojerez@riseup.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	"Pandruvada, Srinivas" <srinivas.pandruvada@intel.com>,
	linux-pm@vger.kernel.org, intel-gfx@lists.freedesktop.org,
	chris.p.wilson@intel.com,
	"Vivi, Rodrigo" <rodrigo.vivi@intel.com>,
	rui.zhang@intel.com, daniel.lezcano@linaro.org,
	amit.kucheria@verdurent.com, Lukasz Luba <Lukasz.Luba@arm.com>
Subject: Re: [RFC] GPU-bound energy efficiency improvements for the intel_pstate driver (v2.99)
Date: Fri, 15 May 2020 19:09:52 +0100
Message-ID: <jhjv9kxqdcf.mognet@arm.com>
In-Reply-To: <87a72at44d.fsf@riseup.net>


On 15/05/20 01:48, Francisco Jerez wrote:
> Valentin Schneider <valentin.schneider@arm.com> writes:
>
>> (+Lukasz)
>>
>> On 11/05/20 22:01, Francisco Jerez wrote:
>>>> What I'm missing is an explanation for why this isn't using the
>>>> infrastructure that was built for these kinds of things? The thermal
>>>> framework was, AFAIU, supposed to help with these things, and the IPA
>>>> thing in particular is used by ARM to do exactly this GPU/CPU power
>>>> budget thing.
>>>>
>>>> If thermal/IPA is found wanting, why aren't we improving that?
>>>
>>> The GPU/CPU power budget "thing" is only a positive side effect of this
>>> series on some TDP-bound systems.  Its ultimate purpose is improving the
>>> energy efficiency of workloads which have a bottleneck on a device other
>>> than the CPU, by giving the bottlenecking device driver some influence
>>> over the response latency of CPUFREQ governors via a PM QoS interface.
>>> This seems to be completely outside the scope of the thermal framework
>>> and IPA AFAIU.
>>>
>>
>> It's been a while since I've stared at IPA, but it does sound vaguely
>> familiar.
>>
>> When thermally constrained, IPA figures out a budget and splits it between
>> actors (cpufreq and devfreq devices) depending on how much juice they are
>> asking for; see cpufreq_get_requested_power() and
>> devfreq_cooling_get_requested_power(). There's also some weighting involved.
>>
>
> I'm aware of those.  Main problem is that the current mechanism for IPA
> to figure out the requested power of each actor is based on a rough
> estimate of their past power consumption: If an actor was operating in a
> highly energy-inefficient regime, it will end up requesting more power
> than another actor with the same load but more energy-efficient
> behavior.

Right, we do mix load (busy time for either cpufreq or devfreq devices,
AFAIR) and current state (freq) into one single power value.
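
To make that concrete, here is a rough sketch of the mixing and of a
proportional budget split. It is a toy model, not the kernel code: the power
table, the function names and the numbers are all made up for illustration,
and the weighting and per-actor maximums are left out.

```python
# Hypothetical table of full-load power per frequency; the values are made up.
POWER_AT_FULL_LOAD_MW = {800: 200, 1600: 700, 2400: 1800}


def requested_power_mw(freq_mhz, busy_fraction):
    """Fold current state (frequency) and load (busy time) into one power value."""
    return POWER_AT_FULL_LOAD_MW[freq_mhz] * busy_fraction


def split_budget(requests_mw, budget_mw):
    """Share the budget in proportion to each actor's request (weights omitted)."""
    total = sum(requests_mw.values())
    if total <= budget_mw:
        return dict(requests_mw)  # everyone fits, nothing to cap
    return {name: req * budget_mw / total for name, req in requests_mw.items()}


# Two actors with the same busy fraction but different operating points.
requests = {
    "inefficient": requested_power_mw(2400, 0.5),
    "efficient": requested_power_mw(800, 0.5),
}
grants = split_budget(requests, budget_mw=500)
```

In this model the actor sitting at the higher frequency asks for more power,
and is granted more, even though both actors report the same load.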

> The IPA power allocator is therefore ineffective at improving
> the energy efficiency of an agent beyond its past behavior.
> Furthermore, it seems to *rely* on individual agents being somewhat
> energetically responsible in order for its power allocation result to be
> anywhere close to optimal.  But improving the energy efficiency of an
> agent seems useful in its own right, whether IPA is used to balance
> power across agents or not.  That's precisely the purpose of this
> series.
>
>> If you look at the cpufreq cooling side of things, you'll see it also uses
>> the PM QoS interface. For instance, should IPA decide to cap the CPUs
>> (perhaps because, say, the GPU is the one drawing most of the juice), it'll
>> lead to a maximum frequency capping request.
>>
>> So it does sound like that's what you want, only not just when thermally
>> constrained.
>
> Capping the CPU frequency from random device drivers is highly
> problematic, because the CPU is a shared resource which a number of
> different concurrent applications might be using beyond the GPU client.
> The GPU driver has no visibility over its impact on the performance of
> other applications.  And even in a single-task environment, in order to
> behave as effectively as the present series the GPU driver would need to
> monitor the utilization of *all* CPUs in the system and place a
> frequency constraint on each one of them (since there is the potential
> of the task scheduler migrating the process from one CPU to another
> without notice).  Furthermore these frequency constraints would need to
> be updated at high frequency in order to avoid performance degradation
> whenever the balance of load between CPU and IO device fluctuates.
>
> The present series attempts to move the burden of computing frequency
> constraints out of individual device drivers and into the CPUFREQ governor.
> Instead, the device driver provides a response latency constraint when it
> encounters a bottleneck, which can be more easily derived from hardware
> and protocol characteristics than a CPU frequency.  PM QoS aggregates
> the response latency constraints provided by all applications and gives
> CPUFREQ a single response latency target compatible with all of them (so
> a device driver specifying a high latency target won't lead to
> performance degradation in a concurrent application with lower latency
> constraints).  The CPUFREQ governor then computes frequency constraints
> for each CPU core that minimize energy usage without limiting
> throughput, based on the results obtained from CPU performance counters,
> while guaranteeing that a discontinuous transition in CPU utilization
> leads to a proportional transition in the CPU frequency before the
> specified response latency has elapsed.
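
As I read it, the aggregation step boils down to taking the tightest of the
requested latencies. A minimal sketch, assuming a plain minimum over the
active requests; the function name, the nanosecond units and the "no
constraint" default are my own assumptions, not taken from the series:

```python
def aggregate_response_latency_ns(requests_ns, default_ns=None):
    """Return the tightest (smallest) latency target among active requests."""
    return min(requests_ns, default=default_ns)


# Hypothetical requests: a GPU-bound driver tolerates a slow response,
# while a latency-sensitive client asks for a much faster one.
target = aggregate_response_latency_ns([50_000_000, 2_000_000])
```

With those two requests the governor would be handed the 2 ms target, so the
relaxed request from the GPU-bound driver cannot slow down the other client.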

Right, I think I see your point there. I'm thinking the 'actual' IPA gurus
(Lukasz or even Javi) may want to have a look at this.

