* cpuidle governors
@ 2013-11-07 13:44 Jean Delvare
2013-11-07 13:54 ` Daniel Lezcano
0 siblings, 1 reply; 8+ messages in thread
From: Jean Delvare @ 2013-11-07 13:44 UTC
To: linux-pm; +Cc: Rafael J. Wysocki
Hi all,
I had to work on cpuidle recently and there are two things which caused
me trouble and I'd like to discuss.
1* Is there no documentation about how the available governors (menu and
ladder) work? I found good documentation of the general architecture and
API in Documentation/cpuidle, but I am missing a description of the
internal logic of each available governor (just like
Documentation/cpu-freq/governors.txt for cpufreq.) Also, the
documentation says that "the kernel picks the best governor based on
governor ratings" but that's pretty vague. An explanation of how the
governors are rated would be good to have. Could this be added?
2* Is there no way to specify/force the cpuidle governor at boot time? I
found the cpuidle_sysfs_switch parameter and then I can change the
governor in sysfs, but this seems overly complicated (and apparently
discouraged) in a production environment. Wouldn't it be easier to
implement a cpuidle.governor=xxx boot parameter? That way we don't have
to deal with run-time changes, but the user can still force a specific
governor.
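For reference, the sysfs workflow described in point 2 can be sketched as follows. This is only a minimal sketch, not kernel documentation: it assumes the attribute files `current_governor_ro`, `available_governors` and `current_governor` under /sys/devices/system/cpu/cpuidle (the latter two being exposed when booting with cpuidle_sysfs_switch), and the helper names and the `root` parameter are hypothetical.

```python
import os

CPUIDLE_ROOT = "/sys/devices/system/cpu/cpuidle"


def read_attr(root, name):
    """Return the stripped contents of a sysfs attribute, or None if absent."""
    try:
        with open(os.path.join(root, name)) as f:
            return f.read().strip()
    except OSError:
        return None


def current_governor(root=CPUIDLE_ROOT):
    # Assumption: the writable file is only present with cpuidle_sysfs_switch,
    # so fall back to the read-only attribute otherwise.
    return read_attr(root, "current_governor") or read_attr(root, "current_governor_ro")


def set_governor(name, root=CPUIDLE_ROOT):
    """Switch governor at run time (needs root and a writable current_governor)."""
    available = (read_attr(root, "available_governors") or "").split()
    if name not in available:
        raise ValueError("governor %r not available: %s" % (name, available))
    with open(os.path.join(root, "current_governor"), "w") as f:
        f.write(name)


if __name__ == "__main__":
    print("current cpuidle governor:", current_governor())
```

A cpuidle.governor=xxx boot parameter, as proposed above, would make the `set_governor` step unnecessary.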
Thanks,
--
Jean Delvare
Suse L3 Support
* Re: cpuidle governors
2013-11-07 13:44 cpuidle governors Jean Delvare
@ 2013-11-07 13:54 ` Daniel Lezcano
2013-11-22 7:45 ` Jean Delvare
0 siblings, 1 reply; 8+ messages in thread
From: Daniel Lezcano @ 2013-11-07 13:54 UTC
To: Jean Delvare, linux-pm; +Cc: Rafael J. Wysocki
On 11/07/2013 02:44 PM, Jean Delvare wrote:
> Hi all,
>
> I had to work on cpuidle recently and there are two things which caused
> me trouble and I'd like to discuss.
>
> 1* Is there no documentation about how the available governors (menu and
> ladder) work? I found good documentation of the general architecture and
> API in Documentation/cpuidle, but I am missing a description of the
> internal logic of each available governor (just like
> Documentation/cpu-freq/governors.txt for cpufreq.)
IMO, reviewing the code and reading the header description in the menu.c
file is the best way to understand how the governor works. For very specific
questions, try asking in the mailing list.
> Also, the
> documentation says that "the kernel picks the best governor based on
> governor ratings" but that's pretty vague. An explanation of how the
> governors are rated would be good to have. Could this be added?
Yeah, actually they are rated but depending on the system configuration
one fits better than the other one. Tickless system => menu governor,
Periodic system => ladder governor. Using a tickless system with the
ladder governor is less efficient from a power saving POV.
> 2* Is there no way to specify/force the cpuidle governor at boot time? I
> found the cpuidle_sysfs_switch parameter and then I can change the
> governor in sysfs, but this seems overly complicated (and apparently
> discouraged) in a production environment. Wouldn't it be easier to
> implement a cpuidle.governor=xxx boot parameter? That way we don't have
> to deal with run-time changes, but the user can still force a specific
> governor.
Yes, that may make sense.
Thanks
-- Daniel
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
* Re: cpuidle governors
2013-11-07 13:54 ` Daniel Lezcano
@ 2013-11-22 7:45 ` Jean Delvare
2013-11-22 15:52 ` Rafael J. Wysocki
0 siblings, 1 reply; 8+ messages in thread
From: Jean Delvare @ 2013-11-22 7:45 UTC
To: Daniel Lezcano; +Cc: linux-pm, Rafael J. Wysocki
Hi Daniel,
Thanks for your fast reply and sorry for my slow one :(
On Thursday 07 November 2013 at 14:54 +0100, Daniel Lezcano wrote:
> On 11/07/2013 02:44 PM, Jean Delvare wrote:
> > Hi all,
> >
> > I had to work on cpuidle recently and there are two things which caused
> > me trouble and I'd like to discuss.
> >
> > 1* Is there no documentation about how the available governors (menu and
> > ladder) work? I found good documentation of the general architecture and
> > API in Documentation/cpuidle, but I am missing a description of the
> > internal logic of each available governor (just like
> > Documentation/cpu-freq/governors.txt for cpufreq.)
>
> IMO, reviewing the code and reading the header description in the menu.c
> file is the best way to understand how the governor works.
OK, I'll look at the code then. But I still believe this should be
documented for clarity.
> For very specific
> questions, try asking in the mailing list.
I'm doing that right now ;)
> > Also, the
> > documentation says that "the kernel picks the best governor based on
> > governor ratings" but that's pretty vague. An explanation of how the
> > governors are rated would be good to have. Could this be added?
>
> Yeah, actually they are rated but depending on the system configuration
> one fits better than the other one. Tickless system => menu governor,
> Periodic system => ladder governor. Using a tickless system with the
> ladder governor is less efficient from a power saving POV.
My original issue is somewhat related to this. One customer reported to
us that booting with nohz=off breaks cpuidle. My own testing revealed:
* That a kernel built without NO_HZ still gets cpuidle governor "menu".
This contradicts your statement above.
* That a NO_HZ kernel booted with nohz=off behaves differently than a
kernel built without NO_HZ with regards to cpuidle. Both use the "menu"
governor by default (while I understand they should rather not), but in
the latter case deep C states are reached while in the former they never
are. This smells like a second bug.
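One way to compare the two configurations is to look at the per-state counters in sysfs. The following is a minimal sketch, assuming the per-CPU `cpuidle/stateN/{name,usage,time}` attributes; the function names are hypothetical and the script only reads counters, it does not explain why a state is or is not selected.

```python
import glob
import os


def read_idle_states(cpu_dir):
    """Return [(name, usage, time_us), ...] for one CPU, shallowest state first."""
    pattern = os.path.join(cpu_dir, "cpuidle", "state[0-9]*")
    # Sort numerically so that state10 does not come before state2.
    paths = sorted(glob.glob(pattern), key=lambda p: int(os.path.basename(p)[5:]))
    states = []
    for path in paths:
        values = {}
        for attr in ("name", "usage", "time"):
            with open(os.path.join(path, attr)) as f:
                values[attr] = f.read().strip()
        states.append((values["name"], int(values["usage"]), int(values["time"])))
    return states


def deepest_state_used(states):
    """Name of the deepest state entered at least once, or None."""
    used = [name for name, usage, _ in states if usage > 0]
    return used[-1] if used else None


if __name__ == "__main__":
    cpu0 = "/sys/devices/system/cpu/cpu0"
    for name, usage, time_us in read_idle_states(cpu0):
        print("%-10s entered %10d times, %12d us total" % (name, usage, time_us))
```

Running this on an idle system under each configuration (NO_HZ=n, and NO_HZ=y booted with nohz=off) would show whether the usage counters of the deep states increase.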
I would appreciate if both bugs could get fixed. I can fill out bugzilla
entries if it helps.
Thanks,
--
Jean Delvare
Suse L3 Support
* Re: cpuidle governors
2013-11-22 7:45 ` Jean Delvare
@ 2013-11-22 15:52 ` Rafael J. Wysocki
2013-11-22 16:14 ` Daniel Lezcano
2013-11-22 18:14 ` Jean Delvare
0 siblings, 2 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2013-11-22 15:52 UTC
To: Jean Delvare, Daniel Lezcano; +Cc: linux-pm
On 11/22/2013 8:45 AM, Jean Delvare wrote:
> Hi Daniel,
>
> Thanks for your fast reply and sorry for my slow one :(
>
> On Thursday 07 November 2013 at 14:54 +0100, Daniel Lezcano wrote:
>> On 11/07/2013 02:44 PM, Jean Delvare wrote:
>>> Hi all,
>>>
>>> I had to work on cpuidle recently and there are two things which caused
>>> me trouble and I'd like to discuss.
>>>
>>> 1* Is there no documentation about how the available governors (menu and
>>> ladder) work? I found good documentation of the general architecture and
>>> API in Documentation/cpuidle, but I am missing a description of the
>>> internal logic of each available governor (just like
>>> Documentation/cpu-freq/governors.txt for cpufreq.)
>> IMO, reviewing the code and reading the header description in the menu.c
>> file is the best way to understand how the governor works.
> OK, I'll look at the code then. But I still believe this should be
> documented for clarity.
>
>> For very specific
>> questions, try asking in the mailing list.
> I'm doing that right now ;)
>
>>> Also, the
>>> documentation says that "the kernel picks the best governor based on
>>> governor ratings" but that's pretty vague. An explanation of how the
>>> governors are rated would be good to have. Could this be added?
>> Yeah, actually they are rated but depending on the system configuration
>> one fits better than the other one. Tickless system => menu governor,
>> Periodic system => ladder governor. Using a tickless system with the
>> ladder governor is less efficient from a power saving POV.
> My original issue is somewhat related to this. One customer reported to
> us that booting with nohz=off breaks cpuidle. My own testing revealed:
>
> * That a kernel built without NO_HZ still gets cpuidle governor "menu".
> This contradicts your statement above.
> * That a NO_HZ kernel booted with nohz=off behaves differently than a
> kernel built without NO_HZ with regards to cpuidle. Both use the "menu"
> governor by default (while I understand they should rather not), but in
> the latter case deep C states are reached while in the former they never
> are. This smells like a second bug.
>
> I would appreciate if both bugs could get fixed.
Yes, it looks like we have two separate bugs there.
> I can fill out bugzilla entries if it helps.
Please do, that helps a lot.
Thanks,
Rafael
* Re: cpuidle governors
2013-11-22 15:52 ` Rafael J. Wysocki
@ 2013-11-22 16:14 ` Daniel Lezcano
2013-11-22 18:06 ` Jean Delvare
2013-11-22 18:14 ` Jean Delvare
1 sibling, 1 reply; 8+ messages in thread
From: Daniel Lezcano @ 2013-11-22 16:14 UTC
To: Rafael J. Wysocki, Jean Delvare; +Cc: linux-pm
On 11/22/2013 04:52 PM, Rafael J. Wysocki wrote:
> On 11/22/2013 8:45 AM, Jean Delvare wrote:
>> Hi Daniel,
>>
>> Thanks for your fast reply and sorry for my slow one :(
>>
>> On Thursday 07 November 2013 at 14:54 +0100, Daniel Lezcano wrote:
>>> On 11/07/2013 02:44 PM, Jean Delvare wrote:
>>>> Hi all,
>>>>
>>>> I had to work on cpuidle recently and there are two things which caused
>>>> me trouble and I'd like to discuss.
>>>>
>>>> 1* Is there no documentation about how the available governors (menu and
>>>> ladder) work? I found good documentation of the general architecture and
>>>> API in Documentation/cpuidle, but I am missing a description of the
>>>> internal logic of each available governor (just like
>>>> Documentation/cpu-freq/governors.txt for cpufreq.)
>>> IMO, reviewing the code and reading the header description in the menu.c
>>> file is the best way to understand how the governor works.
>> OK, I'll look at the code then. But I still believe this should be
>> documented for clarity.
>>
>>> For very specific
>>> questions, try asking in the mailing list.
>> I'm doing that right now ;)
>>
>>>> Also, the
>>>> documentation says that "the kernel picks the best governor based on
>>>> governor ratings" but that's pretty vague. An explanation of how the
>>>> governors are rated would be good to have. Could this be added?
>>> Yeah, actually they are rated but depending on the system configuration
>>> one fits better than the other one. Tickless system => menu governor,
>>> Periodic system => ladder governor. Using a tickless system with the
>>> ladder governor is less efficient from a power saving POV.
>> My original issue is somewhat related to this. One customer reported to
>> us that booting with nohz=off breaks cpuidle. My own testing revealed:
>>
>> * That a kernel built without NO_HZ still gets cpuidle governor "menu".
>> This contradicts your statement above.
>> * That a NO_HZ kernel booted with nohz=off behaves differently than a
>> kernel built without NO_HZ with regards to cpuidle. Both use the "menu"
>> governor by default (while I understand they should rather not), but in
>> the latter case deep C states are reached while in the former they never
>> are. This smells like a second bug.
>>
>> I would appreciate if both bugs could get fixed.
>
> Yes, it looks like we have two separate bugs there.
Actually, the first one is a bug but not the second one.
I made some changes to select by default the menu governor with NO_HZ
and the ladder governor without NO_HZ and wanted to remove the unneeded
governor from the Kconfig. But we left it as it was to keep the old
behavior. Unfortunately, the governor rating decision will always go
in favor of the menu governor, as it is the best-rated one, even if we
are *not* using NO_HZ. So the only way to prevent this is to set the
'cpuidle_sysfs_switch' option on the kernel command line and then set
the ladder governor from userspace once the system has booted.
A fix could be to remove from the configuration the governor which does
not suit the NO_HZ option.
Another fix would be to play with the rating and change them depending
on the NO_HZ option.
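To illustrate the rating issue and the second proposed fix, here is a toy model in Python. It is a sketch under stated assumptions, not the kernel code: the ratings (menu 20, ladder 10) and the "a newly registered governor replaces the current one only if rated strictly higher" rule are believed to match the cpuidle core of that period but should be checked against the sources, and `tick_aware` is a purely hypothetical way of adjusting the ratings.

```python
from collections import namedtuple

Governor = namedtuple("Governor", "name rating")

# Assumed static ratings; verify against menu.c and ladder.c.
MENU = Governor("menu", 20)
LADDER = Governor("ladder", 10)


def pick_governor(governors):
    """Model registration: a later governor wins only if rated strictly higher."""
    current = None
    for gov in governors:
        if current is None or current.rating < gov.rating:
            current = gov
    return current


def tick_aware(gov, tickless):
    """Hypothetical fix: demote the governor that does not suit the tick mode."""
    suits = (gov.name == "menu") == tickless
    return gov if suits else gov._replace(rating=gov.rating // 4)


if __name__ == "__main__":
    # Static ratings: menu wins whatever the tick mode.
    print(pick_governor([LADDER, MENU]).name)
    # Tick-aware ratings on a periodic-tick system: ladder wins.
    print(pick_governor([tick_aware(g, False) for g in (LADDER, MENU)]).name)
```

Because the adjustment is applied when the ratings are evaluated, the tick mode could be taken from the running system rather than from the build configuration.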
Concerning the second bug, it is not a bug but totally normal. On a
periodic tick system (aka NO_HZ=no), the periodic tick duration prevents
the CPU from entering a deep idle state. On your system, the target
residency of the state which is never reached is presumably greater than
the periodic tick duration.
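The arithmetic behind this argument can be sketched as follows. This is a simplified model, assuming that on a periodic-tick system an idle period never lasts longer than one tick (1/HZ seconds), so a state whose target residency exceeds that is never worth entering. The state table and its residency values are hypothetical, not taken from any real platform.

```python
def tick_period_us(hz):
    """Length of one periodic tick in microseconds."""
    return 1000000 // hz


def reachable_states(states, hz):
    """Names of states whose target residency fits within one tick period.

    states: list of (name, target_residency_us), shallowest first.
    """
    limit = tick_period_us(hz)
    return [name for name, residency in states if residency <= limit]


# Hypothetical idle states: (name, target residency in microseconds).
STATES = [("C1", 2), ("C3", 80), ("C6", 800), ("Cdeep", 5000)]

if __name__ == "__main__":
    for hz in (100, 250, 1000):
        print("HZ=%d (tick %d us): %s"
              % (hz, tick_period_us(hz), reachable_states(STATES, hz)))
```

With HZ=250 the tick period is 4000 us, so in this model only the hypothetical "Cdeep" state (5000 us) would be out of reach.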
Hope that helps.
-- Daniel
>
>> I can fill out bugzilla entries if it helps.
>
> Please do, that helps a lot.
>
> Thanks,
> Rafael
>
* Re: cpuidle governors
2013-11-22 16:14 ` Daniel Lezcano
@ 2013-11-22 18:06 ` Jean Delvare
2013-11-22 18:17 ` Daniel Lezcano
0 siblings, 1 reply; 8+ messages in thread
From: Jean Delvare @ 2013-11-22 18:06 UTC
To: Daniel Lezcano; +Cc: Rafael J. Wysocki, linux-pm
Hi Daniel,
On Friday 22 November 2013 at 17:14 +0100, Daniel Lezcano wrote:
> On 11/22/2013 04:52 PM, Rafael J. Wysocki wrote:
> > On 11/22/2013 8:45 AM, Jean Delvare wrote:
> > >> My own testing revealed:
> >>
> >> * That a kernel built without NO_HZ still gets cpuidle governor "menu".
> >> This contradicts your statement above.
> >> * That a NO_HZ kernel booted with nohz=off behaves differently than a
> >> kernel built without NO_HZ with regards to cpuidle. Both use the "menu"
> >> governor by default (while I understand they should rather not), but in
> >> the latter case deep C states are reached while in the former they never
> >> are. This smells like a second bug.
> >>
> >> I would appreciate if both bugs could get fixed.
> >
> > Yes, it looks like we have two separate bugs there.
>
> Actually, the first one is a bug but not the second one.
>
> I made some changes to select by default the menu governor with NO_HZ
> and the ladder governor without NO_HZ and wanted to remove the unneeded
> governor from the Kconfig. But we left it as it was to keep the old
> behavior. Unfortunately, the governor rating decision will always go
> in favor of the menu governor, as it is the best-rated one, even if we
> are *not* using NO_HZ. So the only way to prevent this is to set the
> 'cpuidle_sysfs_switch' option on the kernel command line and then set
> the ladder governor from userspace once the system has booted.
That's what I ended up doing, yes, but I don't like it, because 1* it's
not as straightforward as a cpuidle.governor=xxx option and 2* I don't
think I should have to change the governor manually in the first place.
> A fix could be to remove from the configuration the governor which does
> not suit the NO_HZ option.
That's not a good idea because nohz=off can be passed on the command
line to reenable the ticking at runtime. So the governor must be decided
at runtime too.
> Another fix would be to play with the rating and change them depending
> on the NO_HZ option.
Yes, I think this is better, because then you can honor nohz=off.
> Concerning the second bug, it is not a bug but totally normal. On a
> periodic tick system (aka NO_HZ=no), the periodic tick duration prevents
> the CPU from entering a deep idle state. On your system, the target
> residency of the state which is never reached is presumably greater than
> the periodic tick duration.
Not sure if you read what I wrote properly. The menu governor _does_
work at least to some degree without NO_HZ, even if it is less efficient
than with NO_HZ (no surprise here.) What bothers me is that NO_HZ=y +
nohz=off behaves differently than NO_HZ=n. I believe they should behave
the same.
I'll file bugs for both.
Thanks,
--
Jean Delvare
Suse L3 Support
* Re: cpuidle governors
2013-11-22 15:52 ` Rafael J. Wysocki
2013-11-22 16:14 ` Daniel Lezcano
@ 2013-11-22 18:14 ` Jean Delvare
1 sibling, 0 replies; 8+ messages in thread
From: Jean Delvare @ 2013-11-22 18:14 UTC
To: Rafael J. Wysocki; +Cc: Daniel Lezcano, linux-pm
On Friday 22 November 2013 at 16:52 +0100, Rafael J. Wysocki wrote:
> On 11/22/2013 8:45 AM, Jean Delvare wrote:
> > * That a kernel built without NO_HZ still gets cpuidle governor "menu".
> > This contradicts your statement above.
> > * That a NO_HZ kernel booted with nohz=off behaves differently than a
> > kernel built without NO_HZ with regards to cpuidle. Both use the "menu"
> > governor by default (while I understand they should rather not), but in
> > the latter case deep C states are reached while in the former they never
> > are. This smells like a second bug.
> >
> > I would appreciate if both bugs could get fixed.
>
> Yes, it looks like we have two separate bugs there.
>
> > I can fill out bugzilla entries if it helps.
>
> Please do, that helps a lot.
Done:
Bug 65531 - Cpuidle governor menu selected despite NO_HZ not being enabled
https://bugzilla.kernel.org/show_bug.cgi?id=65531
Bug 65541 - Booting with nohz=off breaks the menu cpuidle governor
https://bugzilla.kernel.org/show_bug.cgi?id=65541
--
Jean Delvare
Suse L3 Support
* Re: cpuidle governors
2013-11-22 18:06 ` Jean Delvare
@ 2013-11-22 18:17 ` Daniel Lezcano
0 siblings, 0 replies; 8+ messages in thread
From: Daniel Lezcano @ 2013-11-22 18:17 UTC
To: Jean Delvare; +Cc: Rafael J. Wysocki, linux-pm
On 11/22/2013 07:06 PM, Jean Delvare wrote:
> Hi Daniel,
>
> On Friday 22 November 2013 at 17:14 +0100, Daniel Lezcano wrote:
>> On 11/22/2013 04:52 PM, Rafael J. Wysocki wrote:
>>> On 11/22/2013 8:45 AM, Jean Delvare wrote:
>>>> My own testing revealed:
>>>>
>>>> * That a kernel built without NO_HZ still gets cpuidle governor "menu".
>>>> This contradicts your statement above.
>>>> * That a NO_HZ kernel booted with nohz=off behaves differently than a
>>>> kernel built without NO_HZ with regards to cpuidle. Both use the "menu"
>>>> governor by default (while I understand they should rather not), but in
>>>> the latter case deep C states are reached while in the former they never
>>>> are. This smells like a second bug.
>>>>
>>>> I would appreciate if both bugs could get fixed.
>>>
>>> Yes, it looks like we have two separate bugs there.
>>
>> Actually, the first one is a bug but not the second one.
>>
>> I made some changes to select by default the menu governor with NO_HZ
>> and the ladder governor without NO_HZ and wanted to remove the unneeded
>> governor from the Kconfig. But we left it as it was to keep the old
>> behavior. Unfortunately, the governor rating decision will always go
>> in favor of the menu governor, as it is the best-rated one, even if we
>> are *not* using NO_HZ. So the only way to prevent this is to set the
>> 'cpuidle_sysfs_switch' option on the kernel command line and then set
>> the ladder governor from userspace once the system has booted.
>
> That's what I ended up doing, yes, but I don't like it, because 1* it's
> not as straightforward as a cpuidle.governor=xxx option and 2* I don't
> think I should have to change the governor manually in the first place.
>
>> A fix could be to remove from the configuration the governor which does
>> not suit the NO_HZ option.
>
> That's not a good idea because nohz=off can be passed on the command
> line to reenable the ticking at runtime. So the governor must be decided
> at runtime too.
>
>> Another fix would be to play with the rating and change them depending
>> on the NO_HZ option.
>
> Yes, I think this is better, because then you can honor nohz=off.
Yes, good point.
>> Concerning the second bug, it is not a bug but totally normal. On a
>> periodic tick system (aka NO_HZ=no), the periodic tick duration prevents
>> the CPU from entering a deep idle state. On your system, the target
>> residency of the state which is never reached is presumably greater than
>> the periodic tick duration.
>
> Not sure if you read what I wrote properly. The menu governor _does_
> work at least to some degree without NO_HZ, even if it is less efficient
> than with NO_HZ (no surprise here.) What bothers me is that NO_HZ=y +
> nohz=off behaves differently than NO_HZ=n. I believe they should behave
> the same.
Yes, you are right, I misunderstood it.
Thanks
-- Daniel
Thread overview: 8+ messages
2013-11-07 13:44 cpuidle governors Jean Delvare
2013-11-07 13:54 ` Daniel Lezcano
2013-11-22 7:45 ` Jean Delvare
2013-11-22 15:52 ` Rafael J. Wysocki
2013-11-22 16:14 ` Daniel Lezcano
2013-11-22 18:06 ` Jean Delvare
2013-11-22 18:17 ` Daniel Lezcano
2013-11-22 18:14 ` Jean Delvare