* cpufreq on demand governor sampling rate restricted to HZ even on NO_HZ kernels
@ 2009-01-30 14:59 Thomas Renninger
2009-01-30 17:28 ` Pallipadi, Venkatesh
From: Thomas Renninger @ 2009-01-30 14:59 UTC
To: cpufreq; +Cc: linux-kernel, venkatesh.pallipadi
Hi,
depending on the HZ setting, the ondemand governor is currently
limited to polling the CPU load and adjusting the frequency (the
sampling_rate sysfs variable) no more often than every:

  HZ=100    200ms
  HZ=250     80ms
  HZ=1000    20ms

This limit does not take NO_HZ into account, which looks wrong.
If it is intentional, can someone give me a pointer? I'd like
to understand why.
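
For reference, the three figures above are what you get if the floor is a
fixed number of timer ticks. The sketch below only reproduces that
arithmetic; the constant name MIN_SAMPLING_TICKS and the value 20 are
assumptions inferred from the numbers in this mail, not quoted from the
governor source.

```python
# Sketch: reproduce the HZ -> minimum sampling interval figures.
# Assumption: the floor is a fixed number of timer ticks. 20 ticks matches
# the numbers in the mail; MIN_SAMPLING_TICKS is an illustrative name.

MIN_SAMPLING_TICKS = 20


def tick_length_us(hz):
    """Length of one timer tick in microseconds for a given HZ."""
    return 1_000_000 // hz


def min_sampling_interval_us(hz):
    """Smallest sampling interval allowed when the floor is tick-based."""
    return MIN_SAMPLING_TICKS * tick_length_us(hz)


if __name__ == "__main__":
    for hz in (100, 250, 1000):
        ms = min_sampling_interval_us(hz) // 1000
        print(f"HZ={hz}: minimum interval {ms} ms")
```

Under that assumption the floor scales purely with the tick length, which
is why it ignores whether NO_HZ is in use.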
If NO_HZ can/should go down to 20ms polling and below (current
CPUs are able to switch fast enough that the ondemand governor
would calculate a default polling interval below 80ms for them),
this would hurt with respect to C-states at some point.

For performance reasons, one wants to poll as often as possible; for
power-saving reasons (C-states), one wants to poll as seldom as
possible.

I wonder whether it makes sense to dynamically adjust the polling
interval (e.g. by a hint (and initial wakeup) from the scheduler, or
by taking C-states into account) to:
- increase the sampling rate, e.g. based on context switching
  activity
- lower the sampling rate when the system is idle (to gain
  full C-state efficiency)
Or in what other way could deep C-states be taken into account
with respect to ondemand polling?
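
To make the idea concrete, here is a minimal sketch of one possible
adjustment policy: halve the interval under load, double it when idle, and
clamp to fixed bounds. Everything in it is hypothetical (the function name,
the thresholds, and the bounds); it is not how the ondemand governor
behaves.

```python
# Sketch of a hypothetical dynamic polling policy, not existing governor code.
# All thresholds and bounds below are made-up illustration values.

def next_interval_us(current_us, load_pct,
                     min_us=20_000, max_us=200_000,
                     busy_pct=80, idle_pct=10):
    """Return the next polling interval in microseconds.

    Poll faster when the CPU is busy (better responsiveness) and slower
    when it is nearly idle (fewer wakeups, longer C-state residency).
    """
    if load_pct >= busy_pct:
        return max(min_us, current_us // 2)
    if load_pct <= idle_pct:
        return min(max_us, current_us * 2)
    return current_us


if __name__ == "__main__":
    interval = 80_000
    for load in (95, 95, 50, 5, 5, 5):
        interval = next_interval_us(interval, load)
        print(f"load {load:3d}% -> next interval {interval // 1000} ms")
```

The cost of such a policy is that after a long idle phase the interval sits
at its maximum, so the first reaction to new load is delayed by up to that
maximum.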
Thanks,
Thomas
* Re: cpufreq on demand governor sampling rate restricted to HZ even on NO_HZ kernels
From: Pallipadi, Venkatesh @ 2009-01-30 17:28 UTC
To: Thomas Renninger; +Cc: cpufreq@vger.kernel.org, linux-kernel@vger.kernel.org
On Fri, 2009-01-30 at 06:59 -0800, Thomas Renninger wrote:
> Hi,
>
> depending on the HZ setting, the ondemand governor is currently
> limited to polling the CPU load and adjusting the frequency (the
> sampling_rate sysfs variable) no more often than every:
>
>   HZ=100    200ms
>   HZ=250     80ms
>   HZ=1000    20ms
>
> This limit does not take NO_HZ into account, which looks wrong.
> If it is intentional, can someone give me a pointer? I'd like
> to understand why.
>
That is wrong. The ondemand governor should not limit sampling_rate
based on HZ when NO_HZ is configured. With NO_HZ, the idle statistics
are not limited by the HZ rate, because we have idle micro-accounting.
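
One way to see why tick-based idle statistics tie the floor to HZ: if idle
time is counted in whole ticks, a sampling window of N ticks can only
express N + 1 different load values. The sketch below is a simplified model
of that granularity argument, with illustrative function names; it is not
taken from the kernel's accounting code.

```python
# Simplified model: how many distinct load values can one sampling window
# express? Illustrative only; names are not from the kernel.

def load_levels_tick_based(window_us, hz):
    """Idle time counted in whole ticks: 0..N busy ticks -> N + 1 levels."""
    ticks_in_window = window_us * hz // 1_000_000
    return ticks_in_window + 1


def load_levels_micro_accounting(window_us):
    """Idle time counted in microseconds: 0..window_us busy microseconds."""
    return window_us + 1


if __name__ == "__main__":
    window_us = 20_000  # a 20ms sampling window
    for hz in (100, 250, 1000):
        levels = load_levels_tick_based(window_us, hz)
        print(f"HZ={hz}: {levels} load levels in a 20ms window")
    print(f"micro-accounting: {load_levels_micro_accounting(window_us)} levels")
```

In this model a 20ms window at HZ=100 can only report 0%, 50%, or 100%
load, while microsecond accounting makes the resolution independent of HZ.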
> If NO_HZ can/should go down to 20ms polling and below (current
> CPUs are able to switch fast enough that the ondemand governor
> would calculate a default polling interval below 80ms for them),
> this would hurt with respect to C-states at some point.
>
> For performance reasons, one wants to poll as often as possible; for
> power-saving reasons (C-states), one wants to poll as seldom as
> possible.
>
> I wonder whether it makes sense to dynamically adjust the polling
> interval (e.g. by a hint (and initial wakeup) from the scheduler, or
> by taking C-states into account) to:
> - increase the sampling rate, e.g. based on context switching
>   activity
> - lower the sampling rate when the system is idle (to gain
>   full C-state efficiency)
> Or in what other way could deep C-states be taken into account
> with respect to ondemand polling?
>
ondemand polling uses a deferrable timer and hence will not be called
frequently on a totally idle CPU. The main reason we did not make
sampling_rate dynamic is that it increases the ondemand response time
to a sudden increase in load, which most workloads do not like.
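
The effect of a deferrable timer can be modelled in a few lines: when the
CPU is idle, the timer does not wake it up on its own and is serviced at
the next wakeup that happens for some other reason. This is a toy model
with made-up names, not the kernel timer API.

```python
# Toy model of deferrable-timer behaviour; not the kernel timer API.

def deferrable_run_time_us(due_us, cpu_idle, next_wakeup_us):
    """Return the time at which a deferrable timer is actually serviced.

    due_us         -- nominal expiry time of the timer
    cpu_idle       -- True if the CPU is idle when the timer becomes due
    next_wakeup_us -- time of the first non-deferrable event at or after
                      due_us (only relevant when the CPU is idle)
    """
    if cpu_idle:
        # The timer piggybacks on the next wakeup instead of causing one.
        return max(due_us, next_wakeup_us)
    return due_us


if __name__ == "__main__":
    # Busy CPU: sampling happens on schedule.
    print(deferrable_run_time_us(20_000, False, 500_000))
    # Idle CPU: sampling waits for the wakeup at 500ms.
    print(deferrable_run_time_us(20_000, True, 500_000))
```

So a short sampling_rate costs extra wakeups only while the CPU is awake
anyway, and the first sample after an idle period runs at the wakeup that
ends it.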
Thanks,
Venki
* Re: cpufreq on demand governor sampling rate restricted to HZ even on NO_HZ kernels
From: Thomas Renninger @ 2009-02-03 17:04 UTC
To: Pallipadi, Venkatesh
Cc: cpufreq@vger.kernel.org, linux-kernel@vger.kernel.org
On Friday 30 January 2009 18:28:16 Pallipadi, Venkatesh wrote:
> On Fri, 2009-01-30 at 06:59 -0800, Thomas Renninger wrote:
> > Hi,
> >
> > depending on the HZ setting, the ondemand governor is currently
> > limited to polling the CPU load and adjusting the frequency (the
> > sampling_rate sysfs variable) no more often than every:
> >
> >   HZ=100    200ms
> >   HZ=250     80ms
> >   HZ=1000    20ms
> >
> > This limit does not take NO_HZ into account, which looks wrong.
> > If it is intentional, can someone give me a pointer? I'd like
> > to understand why.
>
> That is wrong.
I think I got it now. I first thought my assumptions above were wrong.
Double-checking tells me that they are right, and that you agree the
ondemand minimum sampling rate is wrong. Is that correct?

Can a system fall back to periodic timers once NO_HZ is active?
Or does NO_HZ stay active once the nohz=off boot param and the timer
requirements have been evaluated?

If so, a rather low minimum could simply be used in ondemand when NO_HZ
is active, for checking what is allowed to be written to
ondemand/sampling_rate, and that's it.

What would be a sane minimum sampling rate value for the ondemand
governor to set the deferrable timer to?
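
A rough sketch of that check, assuming the current floor is 20 ticks (which
matches the figures at the top of the thread). The NO_HZ floor of 10ms is
only a placeholder, since the right value is exactly the open question, and
all names here are illustrative rather than taken from the governor.

```python
# Sketch of the proposed write check for ondemand/sampling_rate.
# Assumptions: tick-based floor of 20 ticks; NOHZ_FLOOR_US is a placeholder
# value chosen for illustration, not a recommendation.

TICK_FLOOR_TICKS = 20
NOHZ_FLOOR_US = 10_000


def sampling_floor_us(hz, nohz_active):
    """Lowest sampling_rate (in microseconds) that a write may set."""
    if nohz_active:
        return NOHZ_FLOOR_US
    return TICK_FLOOR_TICKS * (1_000_000 // hz)


def store_sampling_rate(requested_us, hz, nohz_active):
    """Accept the requested value or reject it, as a sysfs store would."""
    floor_us = sampling_floor_us(hz, nohz_active)
    if requested_us < floor_us:
        raise ValueError(f"sampling_rate {requested_us} below minimum {floor_us}")
    return requested_us


if __name__ == "__main__":
    print(sampling_floor_us(250, nohz_active=False))
    print(sampling_floor_us(250, nohz_active=True))
```

Whether the NO_HZ branch may be chosen once at governor start depends on
the answer to the fallback question above.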
> The ondemand governor should not limit sampling_rate
> based on HZ when NO_HZ is configured. With NO_HZ, the idle statistics
> are not limited by the HZ rate, because we have idle micro-accounting.
>
> > If NO_HZ can/should go down to 20ms polling and below (current
> > CPUs are able to switch fast enough that the ondemand governor
> > would calculate a default polling interval below 80ms for them),
> > this would hurt with respect to C-states at some point.
> >
> > For performance reasons, one wants to poll as often as possible; for
> > power-saving reasons (C-states), one wants to poll as seldom as
> > possible.
> >
> > I wonder whether it makes sense to dynamically adjust the polling
> > interval (e.g. by a hint (and initial wakeup) from the scheduler, or
> > by taking C-states into account) to:
> > - increase the sampling rate, e.g. based on context switching
> >   activity
> > - lower the sampling rate when the system is idle (to gain
> >   full C-state efficiency)
> > Or in what other way could deep C-states be taken into account
> > with respect to ondemand polling?
>
> ondemand polling uses a deferrable timer and hence will not be called
> frequently on a totally idle CPU. The main reason we did not make
> sampling_rate dynamic is that it increases the ondemand response time
> to a sudden increase in load, which most workloads do not like.
Neat. I didn't know about the deferrable timer, thanks.
Thomas