* Disable cpufreq on modern X86 processors
@ 2012-07-24 13:15 Thomas Renninger
2012-07-24 13:22 ` Andreas Herrmann
2012-07-24 13:23 ` Arjan van de Ven
0 siblings, 2 replies; 5+ messages in thread
From: Thomas Renninger @ 2012-07-24 13:15 UTC
To: cpufreq
Cc: arjan, Len Brown, Borislav Petkov, Andreas Herrmann,
Michael Galbraith
Hi,

I was recently pointed to performance losses that people measured
when comparing runs with and without cpufreq enabled while working
on scheduler tunables/improvements.

Depending on whether processes are bound to cores, tunables
inside the cpufreq subsystem, etc. there can be rather big
differences.

While there have been improvements (for example, polling less
often when constantly running at the highest frequency, among
others), dynamic cpufreq adjustment as currently implemented
via the ondemand/conservative governors will always cost
performance.

Arjan mentioned quite some time ago that for modern X86
processors it does not make much sense to control the
CPU frequency from the OS, because idle states are much more
efficient and should be entered as soon as possible.

Especially on bigger X86 systems with dozens or even hundreds
of cores, cpufreq polling sounds like a bad idea, all the more
so if the CPUs achieve the same or even better
performance/power results by entering C-states quickly.

I would like to come up with an init_default_governor()
or similar function which chooses the performance governor
for such CPUs.
Hm, maybe it could be a driver callback, which could then
be implemented by acpi-cpufreq (and powernow-k8 if applicable)
so that those drivers can choose the right governor for the
platform/CPU.
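To make the callback idea a bit more concrete, here is a rough userspace model in Python. It is only a sketch, not kernel code: the classes, the `deep_idle_states` flag, the `acpi_cpufreq_default` callback and the `ondemand` fallback are all hypothetical names chosen for illustration, and no such interface exists in cpufreq as far as this thread shows.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical model of the proposal; none of these names exist in the kernel.


@dataclass
class CpuInfo:
    vendor: str
    family: int
    model: int
    # Stand-in for the "cpu flag" the mail wishes for (assumption).
    deep_idle_states: bool


@dataclass
class CpufreqDriver:
    name: str
    # Optional callback: returns a governor name, or None for "no preference".
    default_governor: Optional[Callable[[CpuInfo], Optional[str]]] = None


# Assumed stand-in for whatever default the kernel was configured with.
BUILTIN_DEFAULT = "ondemand"


def init_default_governor(driver: CpufreqDriver, cpu: CpuInfo) -> str:
    """Ask the driver for a preferred governor, else keep the built-in default."""
    if driver.default_governor is not None:
        choice = driver.default_governor(cpu)
        if choice is not None:
            return choice
    return BUILTIN_DEFAULT


def acpi_cpufreq_default(cpu: CpuInfo) -> Optional[str]:
    """Example driver callback: prefer 'performance' when idle states are good."""
    return "performance" if cpu.deep_idle_states else None
```

A driver without the callback, or a callback returning None, leaves the existing default untouched, so only drivers that know their hardware would change behavior.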
Ideally, identifying the CPUs where the performance governor should
be used is a one-liner checking for a CPU flag.
But this might not be that easy: a CPU family/model list would need
maintenance if there is no CPU flag/feature to test for.

Just some ideas... If it's doable with a few lines of code without
the need to maintain/add new CPU families, I'd like to have
a better default behavior.

One main problem I am facing is measuring power consumption
under different workloads.

I can measure the power consumption in idle (deeper sleep
states entered) with the CPU frequency set to lowest and to
highest, and compare. If both are the same, the CPU is a good
candidate for not doing OS-controlled CPU frequency scaling.
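The idle comparison described above could be scripted roughly as follows. This is a minimal Python sketch under stated assumptions: the function name, the input format (lists of power readings in watts taken at idle) and the 2% tolerance are all made up for illustration; how the readings are obtained is left open.

```python
from statistics import mean


def is_candidate(idle_watts_lowest, idle_watts_highest, tolerance=0.02):
    """Return True if idle power is (nearly) the same at both frequencies.

    idle_watts_lowest / idle_watts_highest: power samples in watts taken
    while idle with the frequency pinned to lowest / highest.
    tolerance: allowed relative difference (assumed value, not from the thread).
    """
    low = mean(idle_watts_lowest)
    high = mean(idle_watts_highest)
    # Compare the averages relative to the larger of the two readings.
    return abs(high - low) <= tolerance * max(low, high)
```

For example, `is_candidate([10.0, 10.2], [10.1, 10.1])` treats the two settings as equivalent, while a large gap between the two averages would not qualify.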
What do you think?
Thanks,
Thomas
* Re: Disable cpufreq on modern X86 processors
From: Andreas Herrmann @ 2012-07-24 13:22 UTC
To: Thomas Renninger
Cc: cpufreq, arjan, Len Brown, Borislav Petkov, Michael Galbraith,
Andre Przywara
CC-ing Andre
On Tue, Jul 24, 2012 at 03:15:14PM +0200, Thomas Renninger wrote:
> Hi,
>
> I recently got pointed to performance losses measured
> with and without cpufreq enabled when people worked on
> scheduler tunables/improvements.
>
> Depending on whether processes are bound to cores, tunables
> inside the cpufreq subsystem, etc. there can be rather big
> differences.
>
> While there have been improvements (for example do not poll
> that often if constantly running at highest frequency and
> others), dynamic cpufreq adjusting as it currently is
> implemented via ondemand/conservative governors always
> will cost performance.
>
> Arjan mentioned quite some time ago, that for modern X86
> processors it does not make much sense to control the
> frequency of the CPU via OS, because idle states are
> much more efficient and should get entered asap.
>
> Especially on bigger X86 systems with dozens or even hundreds
> of cores, cpufreq polling sounds like a bad idea.
> Especially if the CPUs do achieve the same or even
> better performance/power results via entering C-states quickly.
>
> I would like to come up with an init_default_governor()
> or similar function which chooses the performance governor
> for such CPUs.
> Hm, maybe it could get a driver callback, then this one could
> be picked up by acpi-cpufreq (and powernow-k8 if applicable)
> and those drivers could choose the right governor for the
> platform/cpu.
>
> Ideally identifying the CPUs where performance governor should get
> used is a one liner checking for a cpu flag.
> But this might not get that easy? CPU family/model would need
> maintenance if there is no cpu flag/feature to test for.
>
> Just some ideas..., if it's doable with some lines of code without
> the need of maintaining/adding new cpu families, I'd like to have
> a better default behavior.
>
> One main problem I am facing is: Measuring power consumption
> in different workloads.
>
> I can measure the power consumption in idle (deeper sleep
> states entered) when CPU frequency is set to lowest and highest
> and compare. If both are the same, the CPU is a good candidate
> to not do OS controlled CPU frequency scaling.
>
> What do you think?
>
> Thanks,
>
> Thomas
>
* Re: Disable cpufreq on modern X86 processors
From: Andre Przywara @ 2012-07-26 13:39 UTC
To: Thomas Renninger
Cc: Andreas Herrmann, cpufreq, arjan, Len Brown, Borislav Petkov,
Michael Galbraith
On 07/24/2012 03:22 PM, Andreas Herrmann wrote:
> CC-ing Andre
>
> On Tue, Jul 24, 2012 at 03:15:14PM +0200, Thomas Renninger wrote:
>> Hi,
>>
>> I recently got pointed to performance losses measured
>> with and without cpufreq enabled when people worked on
>> scheduler tunables/improvements.
>>
>> Depending on whether processes are bound to cores, tunables
>> inside the cpufreq subsystem, etc. there can be rather big
>> differences.
>>
>> While there have been improvements (for example do not poll
>> that often if constantly running at highest frequency and
>> others), dynamic cpufreq adjusting as it currently is
>> implemented via ondemand/conservative governors always
>> will cost performance.
>>
>> Arjan mentioned quite some time ago, that for modern X86
>> processors it does not make much sense to control the
>> frequency of the CPU via OS, because idle states are
>> much more efficient and should get entered asap.
>>
>> Especially on bigger X86 systems with dozens or even hundreds
>> of cores, cpufreq polling sounds like a bad idea.
>> Especially if the CPUs do achieve the same or even
>> better performance/power results via entering C-states quickly.
>>
>> I would like to come up with an init_default_governor()
>> or similar function which chooses the performance governor
>> for such CPUs.
>> Hm, maybe it could get a driver callback, then this one could
>> be picked up by acpi-cpufreq (and powernow-k8 if applicable)
>> and those drivers could choose the right governor for the
>> platform/cpu.
Actually, we are currently also looking into this issue.
So I'd refrain, at least for now, from changing the default governor.
Let's see what Arjan comes up with; I'd also rather see a reworked
governor than a changed default.
To answer your questions nevertheless:
>> Ideally identifying the CPUs where performance governor should get
>> used is a one liner checking for a cpu flag.
>> But this might not get that easy? CPU family/model would need
>> maintenance if there is no cpu flag/feature to test for.
From the AMD side it looks like family >= 0x11 should do the trick. The
actual difference is the availability of per-core sleep states deeper
than C1. Family 10h and 11h have C1 and C1e, but no real deeper states.
Let me see if there is some register or bit we can reliably query to
check for this instead of relying on fragile family/model/stepping values.
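Until such a register or bit is found, the family-based check would look something like the following Python sketch. The function names are hypothetical, and the threshold is kept as a parameter because the exact boundary is not settled here (family 11h is also listed above among the families without deeper states). The parser relies only on the `vendor_id` and `cpu family` fields of /proc/cpuinfo.

```python
def parse_cpuinfo(text):
    """Return (vendor_id, family) from the first CPU entry in /proc/cpuinfo text.

    Either value is None if the corresponding field is missing.
    """
    vendor, family = None, None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between CPU entries
        key, value = key.strip(), value.strip()
        if key == "vendor_id" and vendor is None:
            vendor = value
        elif key == "cpu family" and family is None:
            family = int(value)  # /proc/cpuinfo prints the family in decimal
        if vendor is not None and family is not None:
            break
    return vendor, family


def amd_prefers_performance(vendor, family, min_family=0x11):
    """Family-based guess; min_family is an assumption to be tuned."""
    return vendor == "AuthenticAMD" and family is not None and family >= min_family
```

On a real system the text would come from reading /proc/cpuinfo; this is exactly the kind of fragile family check a feature bit would replace.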
On Phenoms I could measure much lower power usage with ondemand (or
conservative) governor on idle. This was not true for Bulldozers
anymore, so there is some truth in your idea.
Regards,
Andre.
>>
>> Just some ideas..., if it's doable with some lines of code without
>> the need of maintaining/adding new cpu families, I'd like to have
>> a better default behavior.
>>
>> One main problem I am facing is: Measuring power consumption
>> in different workloads.
>>
>> I can measure the power consumption in idle (deeper sleep
>> states entered) when CPU frequency is set to lowest and highest
>> and compare. If both are the same, the CPU is a good candidate
>> to not do OS controlled CPU frequency scaling.
>>
>> What do you think?
>>
>> Thanks,
>>
>> Thomas
>>
--
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712
* Re: Disable cpufreq on modern X86 processors
From: Arjan van de Ven @ 2012-07-26 13:57 UTC
To: Andre Przywara
Cc: Thomas Renninger, Andreas Herrmann, cpufreq, Len Brown,
Borislav Petkov, Michael Galbraith
On 7/26/2012 6:39 AM, Andre Przywara wrote:
> On Phenoms I could measure much lower power usage with ondemand (or
> conservative) governor on idle. This was not true for Bulldozers
> anymore, so there is some truth in your idea.
For Intel, the really hard split came with Nehalem, although
C1E had already changed the rules earlier.
ondemand is 10 years old, and the rules of the hardware have changed
fundamentally twice since.
(And frankly, and this may not be very popular on this list, CPUFREQ is
pretty much beyond fixing. The locking rules are completely hosed, but
more importantly, the assumption that you can or should disentangle
policy from the hardware is fundamentally flawed. The policy IS hardware
specific, because it's tuned for specific hardware behavior if you
actually optimize it.)
* Re: Disable cpufreq on modern X86 processors
From: Arjan van de Ven @ 2012-07-24 13:23 UTC
To: Thomas Renninger
Cc: cpufreq, Len Brown, Borislav Petkov, Andreas Herrmann,
Michael Galbraith
>
> Arjan mentioned quite some time ago, that for modern X86
> processors it does not make much sense to control the
> frequency of the CPU via OS, because idle states are
> much more efficient and should get entered asap.
Well, I need to qualify that.
It does not make sense to control the frequency the way we are doing it,
because during idle the frequency is zero (as is the voltage).
The ondemand logic has different assumptions.
We have been working on a new governor-like thing inside Intel, and are
close to publishing it to lkml (and this list).
The problem, as always, is that there are many workloads and one shouldn't
regress, bla bla bla, so this needs a lot of data collection (which
takes time).