* [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
@ 2006-06-18 15:10 Thomas Gleixner
2006-06-18 16:35 ` Michal Piotrowski
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Thomas Gleixner @ 2006-06-18 15:10 UTC (permalink / raw)
To: LKML; +Cc: Andrew Morton, john stultz, Ingo Molnar, Con Kolivas
We are pleased to announce the 2.6.17 based release of our high-res
timers kernel feature, upon which we based a tickless kernel (dyntick)
implementation and a 'dynamic HZ' feature as well:
http://www.tglx.de/projects/hrtimers/2.6.17/
The easiest way to try these features is to apply the combo patch to
vanilla 2.6.17. The patching order is:
http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.17.tar.bz2
http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick1.patch
A broken out patch series is available too:
http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick1.patches.tar.bz2
The high-res timers feature (CONFIG_HIGH_RES_TIMERS) enables POSIX
timers and nanosleep() to be as accurate as the hardware allows (around
1usec on typical hardware). This feature is transparent - if enabled it
just makes these timers much more accurate than the current HZ
resolution. It is based on the Generic Time Of Day patchset from John
Stultz and it in essence finishes what we started with the
kernel/hrtimers.c code in 2.6.16.
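A rough way to observe the effect from user space is to request a short sleep and measure how late the wakeup arrives. The sketch below is an illustration only and is not part of the patchset; the helper names `ts_diff_ns` and `sleep_overshoot_ns` are made up for this example. It uses the standard POSIX calls `clock_gettime()` and `clock_nanosleep()`. On a kernel limited to HZ resolution the overshoot would be expected to be on the order of a jiffy, and much smaller with high-res timers enabled, though the exact numbers depend on the hardware and load.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Nanoseconds elapsed from reading a to reading b. */
static long long ts_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (long long)(b->tv_sec - a->tv_sec) * 1000000000LL
         + (b->tv_nsec - a->tv_nsec);
}

/*
 * Sleep for request_ns (must be below one second) and return how many
 * nanoseconds later than requested the caller woke up, or -1 on error.
 */
long long sleep_overshoot_ns(long request_ns)
{
    struct timespec req = { .tv_sec = 0, .tv_nsec = request_ns };
    struct timespec before, after;

    if (clock_gettime(CLOCK_MONOTONIC, &before) != 0)
        return -1;
    /* Relative sleep; returns an error number (e.g. EINTR), not -1. */
    if (clock_nanosleep(CLOCK_MONOTONIC, 0, &req, NULL) != 0)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &after) != 0)
        return -1;

    return ts_diff_ns(&before, &after) - request_ns;
}
```

Calling `sleep_overshoot_ns(1000000L)` a few hundred times and looking at the distribution gives a better picture than a single sample.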
The tickless kernel feature (CONFIG_NO_HZ) enables 'on-demand' timer
interrupts: if there is no timer to be expired for say 1.5 seconds when
the system goes idle, then the system will stay totally idle for 1.5
seconds. This should bring cooler CPUs and power savings: on our (x86)
testboxes we have measured the effective IRQ rate to go from HZ to 1-2
timer interrupts per second.
This feature is implemented by driving 'low res timer wheel' processing
via special per-CPU high-res timers, which timers are reprogrammed to
the next-low-res-timer-expires interval. This tickless-kernel design is
SMP-safe in a natural way and has been developed on SMP systems from
the beginning.
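The saving can be illustrated with a small model, which is a sketch under simplifying assumptions and not the patch's implementation: with a periodic tick an idle CPU takes one interrupt per jiffy, whereas with an on-demand tick it only takes one interrupt per distinct pending expiry. The function names and the sorted-array representation of pending timers are hypothetical.

```c
#include <stddef.h>

/* Periodic tick: one timer interrupt per jiffy of idle time. */
unsigned long periodic_irqs(unsigned long idle_jiffies)
{
    return idle_jiffies;
}

/*
 * On-demand tick: the timer hardware is reprogrammed to the earliest
 * pending expiry, so only distinct expiry times inside the idle window
 * cost an interrupt.
 *
 * expiries[] holds offsets in jiffies from the start of the idle period
 * and must be sorted in ascending order.
 */
unsigned long ondemand_irqs(const unsigned long *expiries, size_t n,
                            unsigned long idle_jiffies)
{
    unsigned long irqs = 0;

    for (size_t i = 0; i < n && expiries[i] < idle_jiffies; i++) {
        /* Timers sharing an expiry are served by the same interrupt. */
        if (i == 0 || expiries[i] != expiries[i - 1])
            irqs++;
    }
    return irqs;
}
```

With HZ=1000, three seconds of idle and timers pending at 1500, 1500 and 2500 jiffies, the model gives 3000 periodic interrupts against 2 on-demand ones.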
Note: while our code should be similar in behavior to the existing
dynticks kernel patch from Con, it is a fundamentally different design
(being based on the high-res timers support and APIs) and is thus a
different implementation. We reused one area of dynticks: we integrated
and improved the 'timer top' profiling tool (CONFIG_TIMER_INFO).
When running the kernel then there's a 'timeout granularity'
runtime tunable parameter as well, under:
/proc/sys/kernel/timeout_granularity
it defaults to 1, meaning that CONFIG_HZ is the granularity of timers.
For example, if CONFIG_HZ is 1000 and timeout_granularity is set to 10,
then low-res timers will be expired every 10 jiffies (every 10 msecs),
thus the effective granularity of low-res timers is 100 HZ. Thus this
feature implements nonintrusive dynamic HZ in essence, without touching
the HZ macro itself.
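The arithmetic in this example can be written out as a short sketch. The helper names are hypothetical and only restate the numbers above; they say nothing about how the patch itself rounds individual timer expiries.

```c
#include <assert.h>

/* Effective low-res timer frequency for a given HZ and granularity. */
unsigned int effective_hz(unsigned int hz, unsigned int granularity)
{
    assert(granularity > 0);
    return hz / granularity;
}

/* Interval between low-res timer expiry runs, in milliseconds. */
unsigned int expiry_interval_msecs(unsigned int hz, unsigned int granularity)
{
    assert(hz > 0);
    return granularity * 1000u / hz;
}
```

For CONFIG_HZ=1000 and timeout_granularity=10 this yields 100 HZ and a 10 msec interval; with the default granularity of 1 it yields 1000 HZ and 1 msec.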
Supported platforms: high-res timers and tickless work on x86 (x86_64,
PPC and ARM ports are in the works). Other platforms should still work
fine with the usual HZ frequency timer tick.
Naturally, we'd like these features to be integrated into the upstream
kernel as well.
Bugreports and suggestions are welcome,
Thomas, Ingo
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-18 15:10 [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ Thomas Gleixner
@ 2006-06-18 16:35 ` Michal Piotrowski
2006-06-18 18:28 ` Ingo Molnar
2006-06-18 19:50 ` Thomas Gleixner
2006-06-18 23:47 ` Roman Zippel
2006-06-19 5:21 ` Con Kolivas
2 siblings, 2 replies; 29+ messages in thread
From: Michal Piotrowski @ 2006-06-18 16:35 UTC (permalink / raw)
To: tglx; +Cc: LKML, Andrew Morton, john stultz, Ingo Molnar, Con Kolivas
Hi Thomas,
Thomas Gleixner wrote:
> We are pleased to announce the 2.6.17 based release of our high-res
> timers kernel feature, upon which we based a tickless kernel (dyntick)
> implementation and a 'dynamic HZ' feature as well:
>
> http://www.tglx.de/projects/hrtimers/2.6.17/
>
[snip]
I get a lot of
WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/sound/pci/ac97/snd-ac97-codec.ko needs unknown symbol msecs_to_jiffies
WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/net/skge.ko needs unknown symbol jiffies_to_msecs
WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/cpufreq/cpufreq_ondemand.ko needs unknown symbol jiffies_to_usecs
etc...
warnings.
Here is a small fix.
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
diff -uprN -X linux-work/Documentation/dontdiff linux-work-clean/kernel/time.c linux-work/kernel/time.c
--- linux-work-clean/kernel/time.c 2006-06-18 18:16:29.000000000 +0200
+++ linux-work/kernel/time.c 2006-06-18 18:25:42.000000000 +0200
@@ -660,6 +660,8 @@ unsigned int jiffies_to_msecs(const unsi
#endif
}
+EXPORT_SYMBOL(jiffies_to_msecs);
+
unsigned int jiffies_to_usecs(const unsigned long j)
{
#if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ)
@@ -671,6 +673,8 @@ unsigned int jiffies_to_usecs(const unsi
#endif
}
+EXPORT_SYMBOL(jiffies_to_usecs);
+
/*
* When we convert to jiffies then we interpret incoming values
* the following way:
@@ -724,6 +728,9 @@ unsigned long msecs_to_jiffies(const uns
return (m * HZ + MSEC_PER_SEC - 1) / MSEC_PER_SEC;
#endif
}
+
+EXPORT_SYMBOL(msecs_to_jiffies);
+
unsigned long usecs_to_jiffies(const unsigned int u)
{
if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
@@ -737,6 +744,8 @@ unsigned long usecs_to_jiffies(const uns
#endif
}
+EXPORT_SYMBOL(usecs_to_jiffies);
+
/*
* The TICK_NSEC - 1 rounds up the value to the next resolution. Note
* that a remainder subtract here would not do the right thing as the
@@ -830,6 +839,8 @@ clock_t jiffies_to_clock_t(long x)
#endif
}
+EXPORT_SYMBOL(jiffies_to_clock_t);
+
unsigned long clock_t_to_jiffies(unsigned long x)
{
#if (HZ % USER_HZ)==0
@@ -850,6 +861,8 @@ unsigned long clock_t_to_jiffies(unsigne
#endif
}
+EXPORT_SYMBOL(clock_t_to_jiffies);
+
u64 jiffies_64_to_clock_t(u64 x)
{
#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-18 16:35 ` Michal Piotrowski
@ 2006-06-18 18:28 ` Ingo Molnar
2006-06-19 16:35 ` Michal Piotrowski
2006-06-18 19:50 ` Thomas Gleixner
1 sibling, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2006-06-18 18:28 UTC (permalink / raw)
To: Michal Piotrowski; +Cc: tglx, LKML, Andrew Morton, john stultz, Con Kolivas
* Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/sound/pci/ac97/snd-ac97-codec.ko needs unknown symbol msecs_to_jiffies
> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/net/skge.ko needs unknown symbol jiffies_to_msecs
> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/cpufreq/cpufreq_ondemand.ko needs unknown symbol jiffies_to_usecs
> etc...
>
> warnings.
>
> Here is a small fix.
thanks. I've uploaded the current combo patch to:
http://redhat.com/~mingo/high-res-timers-dyntick/hres-dyntick-combo-2.6.17-2.patch
(this also includes work-in-progress x86_64 bits - the .config options
offered by dynticks are not yet functional there.)
Ingo
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-18 16:35 ` Michal Piotrowski
2006-06-18 18:28 ` Ingo Molnar
@ 2006-06-18 19:50 ` Thomas Gleixner
2006-06-19 12:09 ` Con Kolivas
1 sibling, 1 reply; 29+ messages in thread
From: Thomas Gleixner @ 2006-06-18 19:50 UTC (permalink / raw)
To: Michal Piotrowski
Cc: LKML, Andrew Morton, john stultz, Ingo Molnar, Con Kolivas
Michal,
On Sun, 2006-06-18 at 18:35 +0200, Michal Piotrowski wrote:
> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/sound/pci/ac97/snd-ac97-codec.ko needs unknown symbol msecs_to_jiffies
> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/net/skge.ko needs unknown symbol jiffies_to_msecs
> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/cpufreq/cpufreq_ondemand.ko needs unknown symbol jiffies_to_usecs
>
> Here is a small fix.
Applied, thanks.
New patch available at:
http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick2.patch
http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick2.patches.tar.bz2
tglx
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-18 15:10 [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ Thomas Gleixner
2006-06-18 16:35 ` Michal Piotrowski
@ 2006-06-18 23:47 ` Roman Zippel
2006-06-19 12:50 ` Ingo Molnar
2006-06-19 5:21 ` Con Kolivas
2 siblings, 1 reply; 29+ messages in thread
From: Roman Zippel @ 2006-06-18 23:47 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Andrew Morton, john stultz, Ingo Molnar, Con Kolivas
Hi,
On Sun, 18 Jun 2006, Thomas Gleixner wrote:
> Bugreports and suggestions are welcome,
Could you please document the patches? I know it sucks compared to
hacking, but it would make a review a lot simpler.
bye, Roman
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-18 15:10 [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ Thomas Gleixner
2006-06-18 16:35 ` Michal Piotrowski
2006-06-18 23:47 ` Roman Zippel
@ 2006-06-19 5:21 ` Con Kolivas
2006-06-19 5:24 ` Con Kolivas
2006-06-19 12:26 ` Ingo Molnar
2 siblings, 2 replies; 29+ messages in thread
From: Con Kolivas @ 2006-06-19 5:21 UTC (permalink / raw)
To: tglx; +Cc: LKML, Andrew Morton, john stultz, Ingo Molnar
On Monday 19 June 2006 01:10, Thomas Gleixner wrote:
> We are pleased to announce the 2.6.17 based release of our high-res
> timers kernel feature, upon which we based a tickless kernel (dyntick)
> implementation and a 'dynamic HZ' feature as well:
>
> http://www.tglx.de/projects/hrtimers/2.6.17/
>
> The easiest way to try these features is to apply the combo patch to
> vanilla 2.6.17. The patching order is:
>
> http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.17.tar.bz2
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick1.patch
>
>
> A broken out patch series is available too:
>
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick1.patches.tar.bz2
>
>
> The high-res timers feature (CONFIG_HIGH_RES_TIMERS) enables POSIX
> timers and nanosleep() to be as accurate as the hardware allows (around
> 1usec on typical hardware). This feature is transparent - if enabled it
> just makes these timers much more accurate than the current HZ
> resolution. It is based on the Generic Time Of Day patchset from John
> Stultz and it in essence finishes what we started with the
> kernel/hrtimers.c code in 2.6.16.
>
> The tickless kernel feature (CONFIG_NO_HZ) enables 'on-demand' timer
> interrupts: if there is no timer to be expired for say 1.5 seconds when
> the system goes idle, then the system will stay totally idle for 1.5
> seconds. This should bring cooler CPUs and power savings: on our (x86)
> testboxes we have measured the effective IRQ rate to go from HZ to 1-2
> timer interrupts per second.
>
> This feature is implemented by driving 'low res timer wheel' processing
> via special per-CPU high-res timers, which timers are reprogrammed to
> the next-low-res-timer-expires interval. This tickless-kernel design is
> SMP-safe in a natural way and has been developed on SMP systems from
> the beginning.
>
> Note: while our code should be similar in behavior to the existing
> dynticks kernel patch from Con, it is a fundamentally different design
> (being based on the high-res timers support and APIs) and is thus a
> different implementation. We reused one area of dynticks: we integrated
> and improved the 'timer top' profiling tool (CONFIG_TIMER_INFO).
>
> When running the kernel then there's a 'timeout granularity'
> runtime tunable parameter as well, under:
>
> /proc/sys/kernel/timeout_granularity
>
> it defaults to 1, meaning that CONFIG_HZ is the granularity of timers.
>
> For example, if CONFIG_HZ is 1000 and timeout_granularity is set to 10,
> then low-res timers will be expired every 10 jiffies (every 10 msecs),
> thus the effective granularity of low-res timers is 100 HZ. Thus this
> feature implements nonintrusive dynamic HZ in essence, without touching
> the HZ macro itself.
>
> Supported platforms: high-res timers and tickless work on x86 (x86_64,
> PPC and ARM ports are in the works). Other platforms should still work
> fine with the usual HZ frequency timer tick.
>
> Naturally, we'd like these features to be integrated into the upstream
> kernel as well.
>
> Bugreports and suggestions are welcome,
>
> Thomas, Ingo
Nice work Thomas and Ingo.
The approach to previous dynticks that I was working on had some nasty issues
with scalability that were not addressable without a complete rewrite which
is why I abandoned the previous implementation. Your approach for using the
hires timer events is ultimately a better solution and the code base is
cleaner so I'm very pleased to see it.
A couple of comments.
One of the problems we encountered with dynticks was that using the higher
resolution timers such as TSC and HPET to adjust for timer ticks over longer
periods when skipping ticks made the overall clock drift when run for many
days, and only the PM Timer was not prone to this happening. I.e. the timers
were very accurate for short periods but over days they would drift. It could
well have been a design flaw in the dynticks I was maintaining rather than
the timers themselves but have you checked that this isn't a problem?
The other thing I note is that there is a reasonable amount of indirection in
fairly hot paths. It looks like there is scope for more local variable
storage of these indirect calls. Also if set_next_event is separated from
struct clock_event, the whole struct looks like a suitable candidate for
__read_only.
--
-ck
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 5:21 ` Con Kolivas
@ 2006-06-19 5:24 ` Con Kolivas
2006-06-19 12:26 ` Ingo Molnar
1 sibling, 0 replies; 29+ messages in thread
From: Con Kolivas @ 2006-06-19 5:24 UTC (permalink / raw)
To: tglx; +Cc: LKML, Andrew Morton, john stultz, Ingo Molnar
On Monday 19 June 2006 15:21, Con Kolivas wrote:
> struct clock_event, the whole struct looks like a suitable candidate for
> __read_only.
duh
__read_mostly I mean
--
-ck
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-18 19:50 ` Thomas Gleixner
@ 2006-06-19 12:09 ` Con Kolivas
2006-06-19 12:31 ` Thomas Gleixner
0 siblings, 1 reply; 29+ messages in thread
From: Con Kolivas @ 2006-06-19 12:09 UTC (permalink / raw)
To: tglx; +Cc: Michal Piotrowski, LKML, Andrew Morton, john stultz, Ingo Molnar
On Monday 19 June 2006 05:50, Thomas Gleixner wrote:
> Michal,
>
> On Sun, 2006-06-18 at 18:35 +0200, Michal Piotrowski wrote:
> > WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/sound/pci/ac97/snd-ac97-codec.ko needs unknown symbol msecs_to_jiffies
> > WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/net/skge.ko needs unknown symbol jiffies_to_msecs
> > WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/cpufreq/cpufreq_ondemand.ko needs unknown symbol jiffies_to_usecs
> >
> > Here is a small fix.
>
> Applied, thanks.
>
> New patch available at:
>
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick2.patch
>
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick2.patches.tar.bz2
Also suffers from:
WARNING: "timespec_to_jiffies" [fs/fuse/fuse.ko] undefined!
Here is a fix
---
kernel/time.c | 2 ++
1 files changed, 2 insertions(+)
Index: linux-2.6.17-hrt-dyntick2.patch/kernel/time.c
===================================================================
--- linux-2.6.17-hrt-dyntick2.patch.orig/kernel/time.c 2006-06-19 22:02:32.000000000 +1000
+++ linux-2.6.17-hrt-dyntick2.patch/kernel/time.c 2006-06-19 22:08:39.000000000 +1000
@@ -773,6 +773,8 @@ timespec_to_jiffies(const struct timespe
}
+EXPORT_SYMBOL(timespec_to_jiffies);
+
void
jiffies_to_timespec(const unsigned long jiffies, struct timespec *value)
{
--
-ck
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 5:21 ` Con Kolivas
2006-06-19 5:24 ` Con Kolivas
@ 2006-06-19 12:26 ` Ingo Molnar
2006-06-19 14:03 ` Con Kolivas
1 sibling, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2006-06-19 12:26 UTC (permalink / raw)
To: Con Kolivas; +Cc: tglx, LKML, Andrew Morton, john stultz, Thomas Gleixner
* Con Kolivas <kernel@kolivas.org> wrote:
> Nice work Thomas and Ingo.
>
> The approach to previous dynticks that I was working on had some nasty
> issues with scalability that were not addressable without a complete
> rewrite which is why I abandoned the previous implementation. Your
> approach for using the hires timer events is ultimately a better
> solution and the code base is cleaner so I'm very pleased to see it.
thanks!
> A couple of comments.
>
> One of the problems we encountered with dynticks was that using the
> higher resolution timers such as TSC and HPET to adjust for timer
> ticks over longer periods when skipping ticks made the overall clock
> drift when run for many days and only the PM Timer was not prone to
> this happening. ie the timers were very accurate for short periods but
> over days it would drift. It could well have been a design flaw in the
> dynticks I was maintaining rather than the timers themselves but have
> you checked that this isn't a problem?
not yet. If it's a real problem we could introduce a 'make clock events
more reliable' framework by doing something like always programming
clock event sources into periodic mode and reading their current time
offset [if possible] when the event is processed (thus compensating
for most of the drift caused by irq processing latency). But if it's not
needed it would be nice to avoid that complexity. I'm also wondering why
the PM timer was the most accurate in that regard - it's almost as slow
to program as the PIT, so i'd have expected it to show the biggest
drift.
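To make the drift mechanism concrete, here is a toy model of how per-event latency can accumulate. It is an illustration under stated assumptions, not code from either patch: the function names are hypothetical, the latency is assumed constant, and time is in arbitrary units. If each one-shot event is re-armed relative to the moment the handler runs, every event inherits the previous handler's latency; if it is re-armed relative to the previous intended expiry (which is what a periodic hardware mode does implicitly), the latency does not add up.

```c
/*
 * Time at which the n-th event (n >= 1) fires when each one-shot is
 * armed relative to "now", where the handler runs `latency` late.
 */
unsigned long long relative_rearm(unsigned int n, unsigned long long period,
                                  unsigned long long latency)
{
    unsigned long long fire = period;   /* first event armed at t = 0 */

    for (unsigned int i = 1; i < n; i++) {
        unsigned long long now = fire + latency;  /* handler runs late */
        fire = now + period;                      /* error is carried forward */
    }
    return fire;
}

/*
 * Same, but each event is armed relative to the previous intended
 * expiry, so the handler latency never enters the schedule.
 */
unsigned long long absolute_rearm(unsigned int n, unsigned long long period,
                                  unsigned long long latency)
{
    unsigned long long expiry = period;

    (void)latency;  /* delays the handler, not the next expiry */
    for (unsigned int i = 1; i < n; i++)
        expiry += period;
    return expiry;
}
```

With a period of 1000000, a latency of 2000 and 1000 events, the relative scheme ends up 999 * 2000 = 1998000 units late, while the absolute scheme stays on schedule.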
(another technique to reduce drift: we could increase the APIC-priority
of the lapic timer, making it less susceptible to drift when there are lots
of other IRQs going on.)
can you think of any other similar 'weird cases' that you saw happen
with dynticks? For example there's the 'APIC stops timer irqs when
entering C3 mode' bug - any similar weirdness we should be careful
about? [right now the patch doesn't handle the C3 mode bug, but it should
be relatively straightforward to blacklist lapic events in that case]
i'm looking at dynticks-060227.patch right now, and there seem to be a
fair amount of dyntick specific changes to ACPI's processor_idle.c code.
Do you remember what those changes were about and should we pick them up
in one way or another?
> The other thing I note is that there is a reasonable amount of
> indirection in fairly hot paths. It looks like there is scope for more
> local variable storage of these indirect calls. [...]
which function(s) were you looking at when coming to this conclusion?
clockevents_init_next_event() perhaps? [we could certainly put
'sources->nextevent' into a local variable there]
> [...] Also if set_next_event is separated from struct clock_event, the
> whole struct looks like a suitable candidate for __read_mostly.
You mean ->event_handler()? We can make all clockevent instantiations
__read_mostly right now - all of the fields of clock_event are static,
even ->event_handler() will change at most once per bootup [when we
switch from low-res into high-res mode].
Ingo
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 12:09 ` Con Kolivas
@ 2006-06-19 12:31 ` Thomas Gleixner
2006-06-19 13:05 ` Con Kolivas
2006-06-19 21:58 ` mark gross
0 siblings, 2 replies; 29+ messages in thread
From: Thomas Gleixner @ 2006-06-19 12:31 UTC (permalink / raw)
To: Con Kolivas
Cc: Michal Piotrowski, LKML, Andrew Morton, john stultz, Ingo Molnar
On Mon, 2006-06-19 at 22:09 +1000, Con Kolivas wrote:
> Also suffers from:
> WARNING: "timespec_to_jiffies" [fs/fuse/fuse.ko] undefined!
>
> Here is a fix
Doh, where is the brown paperbag shop ?
Thanks, applied.
New queue:
http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick3.patch
http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick3.patches.tar.bz2
tglx
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-18 23:47 ` Roman Zippel
@ 2006-06-19 12:50 ` Ingo Molnar
2006-06-19 13:47 ` Roman Zippel
0 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2006-06-19 12:50 UTC (permalink / raw)
To: Roman Zippel
Cc: Thomas Gleixner, LKML, Andrew Morton, john stultz, Con Kolivas
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > Bugreports and suggestions are welcome,
>
> Could you please document the patches? I know it sucks compared to
> hacking, but it would make a review a lot simpler.
yeah, we'll add some description to the patches themselves, but
otherwise i'm afraid it will be like with almost all patch submissions
on lkml: 99% of the details are in the code and people have to ask
specifically if one area or another is unclear :-|
Meanwhile the patch names should provide you with some initial info
(also, we reuse GTOD which is documented in -mm) and the splitup is
pretty clean too - but in any case please feel free to ask pointed
questions! (we happily accept documentation patches as well.)
Ingo
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 12:31 ` Thomas Gleixner
@ 2006-06-19 13:05 ` Con Kolivas
2006-06-19 13:10 ` Thomas Gleixner
2006-06-19 21:58 ` mark gross
1 sibling, 1 reply; 29+ messages in thread
From: Con Kolivas @ 2006-06-19 13:05 UTC (permalink / raw)
To: tglx; +Cc: Michal Piotrowski, LKML, Andrew Morton, john stultz, Ingo Molnar
On Monday 19 June 2006 22:31, Thomas Gleixner wrote:
> On Mon, 2006-06-19 at 22:09 +1000, Con Kolivas wrote:
> > Also suffers from:
> > WARNING: "timespec_to_jiffies" [fs/fuse/fuse.ko] undefined!
> >
> > Here is a fix
>
> Doh, where is the brown paperbag shop ?
>
> Thanks, applied.
>
> New queue:
>
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick3.patch
>
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick3.patches.tar.bz2
Question:
In clockevents.c
setup_global_clockevent and recalc_events call ret=setup_event()
and they act on ret but setup_event always returns 0
Was more planned for setup_event() ?
--
-ck
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 13:05 ` Con Kolivas
@ 2006-06-19 13:10 ` Thomas Gleixner
0 siblings, 0 replies; 29+ messages in thread
From: Thomas Gleixner @ 2006-06-19 13:10 UTC (permalink / raw)
To: Con Kolivas
Cc: Michal Piotrowski, LKML, Andrew Morton, john stultz, Ingo Molnar
On Mon, 2006-06-19 at 23:05 +1000, Con Kolivas wrote:
> In clockevents.c
> setup_global_clockevent and recalc_events call ret=setup_event()
>
> and they act on ret but setup_event always returns 0
>
> Was more planned for setup_event() ?
In a previous version we did interrupt setup in setup_event(), but it
turned out to be too much hassle to pull in the arch specific quirks via
extra function pointers.
So the return value is just a leftover.
tglx
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 12:50 ` Ingo Molnar
@ 2006-06-19 13:47 ` Roman Zippel
0 siblings, 0 replies; 29+ messages in thread
From: Roman Zippel @ 2006-06-19 13:47 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, LKML, Andrew Morton, john stultz, Con Kolivas
Hi,
On Mon, 19 Jun 2006, Ingo Molnar wrote:
> > > Bugreports and suggestions are welcome,
> >
> > Could you please document the patches? I know it sucks compared to
> > hacking, but it would make a review a lot simpler.
>
> yeah, we'll add some description to the patches themselves, but
The problem is this is not the first time I mentioned this and some
patches still have no descriptions at all! :-(
> otherwise i'm afraid it will be like with almost all patch submissions
> on lkml: 99% of the details are in the code and people have to ask
> specifically if one area or another is unclear :-|
For a lot of things this is acceptable, but if patches (e.g. clockevents) add
new generic infrastructure which affects all archs, they need
documentation (unless you also provide all the arch specific changes).
> Meanwhile the patch names should provide you with some initial info
> (also, we reuse GTOD which is documented in -mm) and the splitup is
> pretty clean too - but in any case please feel free to ask pointed
> questions! (we happily accept documentation patches as well.)
I can't do this without documentation. Without any information I'm only
wondering why it has to be this complex.
For example clockevents, I think all the special event handlers are
overkill, a simple list would do just fine. This way it may also be possible
to treat a clock as virtual interrupt source and we could share code with
interrupt code and a callback can simply be requested via request_irq().
More information about what this code actually intends to do and what it
is required to do, would help a great deal to judge alternative solutions,
but only the author of this code can really provide this information and
IMO it's really sad that this information is still lacking after being
requested multiple times.
bye, Roman
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 12:26 ` Ingo Molnar
@ 2006-06-19 14:03 ` Con Kolivas
2006-06-19 20:06 ` Thomas Gleixner
0 siblings, 1 reply; 29+ messages in thread
From: Con Kolivas @ 2006-06-19 14:03 UTC (permalink / raw)
To: Ingo Molnar; +Cc: tglx, LKML, Andrew Morton, john stultz, Thomas Gleixner
On Monday 19 June 2006 22:26, Ingo Molnar wrote:
> * Con Kolivas <kernel@kolivas.org> wrote:
> > One of the problems we encountered with dynticks was that using the
> > higher resolution timers such as TSC and HPET to adjust for timer
> > ticks over longer periods when skipping ticks made the overall clock
> > drift when run for many days and only the PM Timer was not prone to
> > this happening. ie the timers were very accurate for short periods but
> > over days it would drift. It could well have been a design flaw in the
> > dynticks I was maintaining rather than the timers themselves but have
> > you checked that this isn't a problem?
>
> not yet. If it's a real problem we could introduce a 'make clock events
> more reliable' framework by doing something like always programming
> clock event sources into periodic mode and reading their current time
> offset [if possible] when the event is processed (thus compensating
> for most of the drift caused by irq processing latency). But if it's not
> needed it would be nice to avoid that complexity. I'm also wondering why
> the PM timer was the most accurate in that regard - it's almost as slow
> to program as the PIT, so i'd have expected it to show the biggest
> drift.
>
> (another technique to reduce drift: we could increase the APIC-priority
> of the lapic timer, making it less susceptible to drift when there are lots
> of other IRQs going on.)
Better to wait and see if it was an artefact of my dodgy code for recovering
walltime and if this code doesn't have that issue.
> can you think of any other similar 'weird cases' that you saw happen
> with dynticks? For example there's the 'APIC stops timer irqs when
> entering C3 mode' bug - any similar weirdness we should be careful
> about? [right now the patch doesn't handle the C3 mode bug, but it should
> be relatively straightforward to blacklist lapic events in that case]
The hardware that also did C4 was more troublesome but for the same reasons
since it's a subset of C3. See Dominik's patches mentioned below which
address these high state transitions. There isn't anything else offhand I can
think of that I actually managed to track down :|
> i'm looking at dynticks-060227.patch right now, and there seem to be a
> fair amount of dyntick specific changes to ACPI's processor_idle.c code.
> Do you remember what those changes were about and should we pick them up
> in one way or another?
Dominik donated a lot of code to use the dynticks infrastructure to actually
implement the power savings. Just skipping ticks seemed to make very little
power difference unless we also used the knowledge from next timer interrupt
to know how long we are going to be idle and choose C state transitions
accordingly. Each patch is documented at length in the split-out patches:
C-States-1_bm_activity_improvements.patch
C-States-2_bm_activity_handling_improvement.patch
C-States-3_accounting_of_sleep_times.patch
C-States-4_dyn-ticks_tweaks.patch
http://ck.kolivas.org/patches/dyn-ticks/split-out/
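The idea of using the next timer expiry to choose a C state can be sketched as follows. This is a simplified illustration, not the logic from the C-States patches: the `struct cstate` layout, the `pick_cstate` name and the notion of a single "minimum residency" threshold per state are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical description of one idle state. */
struct cstate {
    const char *name;
    unsigned int min_residency_us;  /* idle time needed for the state to pay off */
};

/*
 * Return the index of the deepest state whose minimum residency fits
 * into the expected idle time. states[] is ordered shallow to deep and
 * states[0] is assumed to be always usable; n must be at least 1.
 */
size_t pick_cstate(const struct cstate *states, size_t n,
                   unsigned int expected_idle_us)
{
    size_t best = 0;

    for (size_t i = 1; i < n; i++) {
        if (states[i].min_residency_us <= expected_idle_us)
            best = i;
    }
    return best;
}
```

With a periodic tick the expected idle time is never longer than one jiffy, so deep states with a long residency requirement would rarely be chosen; knowing that the next timer is, say, 1.5 seconds away removes that limit.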
> > The other thing I note is that there is a reasonable amount of
> > indirection in fairly hot paths. It looks like there is scope for more
> > local variable storage of these indirect calls. [...]
>
> which function(s) were you looking at when coming to this conclusion?
> clockevents_init_next_event() perhaps? [we could certainly put
> 'sources->nextevent' into a local variable there]
From what I could see
hrtimer_restart_sched_tick() could use
struct hrtimer *sched_timer = &cpu_base->sched_timer;
clockevents_init_next_event() and clockevents_set_next_event() could use
struct clock_event *nextevt = sources->nextevt;
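As a sketch of the suggested change, the two variants below do the same work, one dereferencing `sources->nextevt` at every use and one loading it into a local first. The structures are hypothetical stand-ins written for this illustration; the real `struct clock_event` in the patch has different fields. A compiler may already hoist the load in simple cases, but a store through the pointer can force it to reload, so the explicit local is the more predictable form.

```c
/* Hypothetical stand-ins; the real structures in the patch differ. */
struct clock_event {
    unsigned long min_delta;
    unsigned long max_delta;
    unsigned long programmed;   /* last delta written to the "hardware" */
};

struct local_events {
    struct clock_event *nextevt;
};

/* Every access goes through sources->nextevt. */
void set_next_event_indirect(struct local_events *sources, unsigned long delta)
{
    if (delta < sources->nextevt->min_delta)
        delta = sources->nextevt->min_delta;
    if (delta > sources->nextevt->max_delta)
        delta = sources->nextevt->max_delta;
    sources->nextevt->programmed = delta;
}

/* The pointer is loaded once into a local variable. */
void set_next_event_cached(struct local_events *sources, unsigned long delta)
{
    struct clock_event *nextevt = sources->nextevt;

    if (delta < nextevt->min_delta)
        delta = nextevt->min_delta;
    if (delta > nextevt->max_delta)
        delta = nextevt->max_delta;
    nextevt->programmed = delta;
}
```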
> > [...] Also if set_next_event is separated from struct clock_event, the
> > whole struct looks like a suitable candidate for __read_mostly.
>
> You mean ->event_handler()? We can make all clockevent instantiations
> __read_mostly right now - all of the fields of clock_event are static,
> even ->event_handler() will change at most once per bootup [when we
> switch from low-res into high-res mode].
Great, thanks!
--
-ck
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-18 18:28 ` Ingo Molnar
@ 2006-06-19 16:35 ` Michal Piotrowski
2006-06-19 19:51 ` Thomas Gleixner
0 siblings, 1 reply; 29+ messages in thread
From: Michal Piotrowski @ 2006-06-19 16:35 UTC (permalink / raw)
To: Ingo Molnar; +Cc: tglx, LKML, Andrew Morton, john stultz, Con Kolivas
Hi,
Ingo Molnar wrote:
> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
>> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/sound/pci/ac97/snd-ac97-codec.ko needs unknown symbol msecs_to_jiffies
>> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/net/skge.ko needs unknown symbol jiffies_to_msecs
>> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/cpufreq/cpufreq_ondemand.ko needs unknown symbol jiffies_to_usecs
>> etc...
>>
>> warnings.
>>
>> Here is a small fix.
>
> thanks. I've uploaded the current combo patch to:
>
> http://redhat.com/~mingo/high-res-timers-dyntick/hres-dyntick-combo-2.6.17-2.patch
>
> (this also includes work-in-progress x86_64 bits - the .config options
> offered by dynticks are not yet functional there.)
>
> Ingo
>
Here is the last EXPORT_SYMBOL fix.
WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/cpufreq/cpufreq_stats.ko needs unknown symbol jiffies_64_to_clock_t
BTW. APM doesn't compile.
/usr/src/linux-work1/arch/i386/kernel/apm.c: In function ‘apm_do_idle’:
/usr/src/linux-work1/arch/i386/kernel/apm.c:767: error: ‘TIF_POLLING_NRFLAG’ undeclared (first use in this function)
/usr/src/linux-work1/arch/i386/kernel/apm.c:767: error: (Each undeclared identifier is reported only once
/usr/src/linux-work1/arch/i386/kernel/apm.c:767: error: for each function it appears in.)
/usr/src/linux-work1/arch/i386/kernel/apm.c: In function ‘suspend’:
/usr/src/linux-work1/arch/i386/kernel/apm.c:1193: warning: ‘pm_send_all’ is deprecated (declared at /usr/src/linux-work1/include/linux/pm_legacy.h:26)
/usr/src/linux-work1/arch/i386/kernel/apm.c:1247: warning: ‘pm_send_all’ is deprecated (declared at /usr/src/linux-work1/include/linux/pm_legacy.h:26)
/usr/src/linux-work1/arch/i386/kernel/apm.c: In function ‘check_events’:
/usr/src/linux-work1/arch/i386/kernel/apm.c:1368: warning: ‘pm_send_all’ is deprecated (declared at /usr/src/linux-work1/include/linux/pm_legacy.h:26)
make[2]: *** [arch/i386/kernel/apm.o] Error 1
make[1]: *** [arch/i386/kernel] Error 2
make: *** [_all] Error 2
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
diff -uprN -X linux-work/Documentation/dontdiff linux-work-clean/kernel/time.c linux-work/kernel/time.c
--- linux-work-clean/kernel/time.c 2006-06-19 18:21:37.000000000 +0200
+++ linux-work/kernel/time.c 2006-06-19 18:18:37.000000000 +0200
@@ -879,6 +879,8 @@ u64 jiffies_64_to_clock_t(u64 x)
return x;
}
+EXPORT_SYMBOL(jiffies_64_to_clock_t);
+
u64 nsec_to_clock_t(u64 x)
{
#if (NSEC_PER_SEC % USER_HZ) == 0
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 16:35 ` Michal Piotrowski
@ 2006-06-19 19:51 ` Thomas Gleixner
2006-06-25 13:06 ` Steven Rostedt
0 siblings, 1 reply; 29+ messages in thread
From: Thomas Gleixner @ 2006-06-19 19:51 UTC (permalink / raw)
To: Michal Piotrowski
Cc: Ingo Molnar, LKML, Andrew Morton, john stultz, Con Kolivas
On Mon, 2006-06-19 at 18:35 +0200, Michal Piotrowski wrote:
> Here is the last EXPORT_SYMBOL fix.
> WARNING: /lib/modules/2.6.17-hrt-dyntick1/kernel/drivers/cpufreq/cpufreq_stats.ko needs unknown symbol jiffies_64_to_clock_t
Thanks, fixed.
> BTW. APM doesn't compile.
>
> /usr/src/linux-work1/arch/i386/kernel/apm.c: In function ‘apm_do_idle’:
> /usr/src/linux-work1/arch/i386/kernel/apm.c:767: error: ‘TIF_POLLING_NRFLAG’ undeclared (first use in this function)
Fixed, new patch at:
http://www.tglx.de/projects/hrtimers/2.6.17/linux-2.6.17-hrt-dyntick4.patch
http://www.tglx.de/projects/hrtimers/2.6.17/linux-2.6.17-hrt-dyntick4.patches.tar.bz2
tglx
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 14:03 ` Con Kolivas
@ 2006-06-19 20:06 ` Thomas Gleixner
2006-06-19 20:57 ` ACPI C-States algorithm updates for dyn-tick Dominik Brodowski
0 siblings, 1 reply; 29+ messages in thread
From: Thomas Gleixner @ 2006-06-19 20:06 UTC (permalink / raw)
To: Con Kolivas; +Cc: Ingo Molnar, LKML, Andrew Morton, john stultz
Con,
On Tue, 2006-06-20 at 00:03 +1000, Con Kolivas wrote:
> Dominik donated a lot of code to use the dynticks infrastructure to actually
> implement the power savings. Just skipping ticks seemed to make very little
> power difference unless we also used the knowledge from next timer interrupt
> to know how long we are going to be idle and choose C state transitions
> accordingly. Each patch is documented at length in the split out
>
> C-States-1_bm_activity_improvements.patch
> C-States-2_bm_activity_handling_improvement.patch
> C-States-3_accounting_of_sleep_times.patch
> C-States-4_dyn-ticks_tweaks.patch
>
> http://ck.kolivas.org/patches/dyn-ticks/split-out/
Thanks for pointing that out. We'll look into those tomorrow.
> hrtimer_restart_sched_tick() could use
> struct hrtimer *sched_timer = &cpu_base->sched_timer;
>
> clockevents_init_next_event() and clockevents_set_next_event() could use
> struct clock_event *nextevt = sources->nextevt;
>
> > > [...] Also if set_next_event is separated from struct clock_event, the
> > > whole struct looks like a suitable candidate for __read_mostly.
> >
> > You mean ->event_handler()? We can make all clockevent instantiations
> > __read_mostly right now - all of the fields of clock_event are static,
> > even ->event_handler() will change at most once per bootup [when we
> > switch from low-res into high-res mode].
Thanks for the review.
tglx
* ACPI C-States algorithm updates for dyn-tick
2006-06-19 20:06 ` Thomas Gleixner
@ 2006-06-19 20:57 ` Dominik Brodowski
2006-06-19 21:28 ` [1/4] ACPI C-States: accounting of sleep states Dominik Brodowski
0 siblings, 1 reply; 29+ messages in thread
From: Dominik Brodowski @ 2006-06-19 20:57 UTC (permalink / raw)
To: Thomas Gleixner, len.brown
Cc: Con Kolivas, Ingo Molnar, LKML, Andrew Morton, john stultz
Hi,
On Mon, Jun 19, 2006 at 10:06:51PM +0200, Thomas Gleixner wrote:
> On Tue, 2006-06-20 at 00:03 +1000, Con Kolivas wrote:
> > Dominik donated a lot of code to use the dynticks infrastructure to actually
> > implement the power savings. Just skipping ticks seemed to make very little
> > power difference unless we also used the knowledge from next timer interrupt
> > to know how long we are going to be idle and choose C state transitions
> > accordingly. Each patch is documented at length in the split out
> >
> > C-States-1_bm_activity_improvements.patch
> > C-States-2_bm_activity_handling_improvement.patch
> > C-States-3_accounting_of_sleep_times.patch
> > C-States-4_dyn-ticks_tweaks.patch
> >
> > http://ck.kolivas.org/patches/dyn-ticks/split-out/
>
> Thanks for pointing that out. We'll look into those tomorrow.
1 to 3 were already submitted to Len, as they're useful already right now.
(Len: do you want me to re-submit, as I can't find them in a git tree right
now?) The fourth one is the only dyn-tick-specific one, and probably needs
some more tweaking, testing and benchmarking.
Dominik
* [1/4] ACPI C-States: accounting of sleep states
2006-06-19 20:57 ` ACPI C-States algorithm updates for dyn-tick Dominik Brodowski
@ 2006-06-19 21:28 ` Dominik Brodowski
2006-06-19 21:29 ` [2/4] ACPI C-States: bm_activity improvements Dominik Brodowski
0 siblings, 1 reply; 29+ messages in thread
From: Dominik Brodowski @ 2006-06-19 21:28 UTC (permalink / raw)
To: Thomas Gleixner, len.brown
Cc: Con Kolivas, Ingo Molnar, LKML, Andrew Morton, john stultz
Track the actual time spent in C-States (C2 upwards, we can't
determine this for C1), not only the number of invocations. This is
especially useful for dynamic ticks / "tickless systems", but is also
of interest on normal systems, as any interrupt activity leads to
C-States being exited, not only the timer interrupt.
The time is measured in PM timer ticks, so an increase by one corresponds
to roughly 279 nanoseconds.
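As a rough cross-check of the 279 ns figure, here is a minimal user-space sketch (not part of the patch) that converts PM timer ticks to nanoseconds. It assumes the standard ACPI PM timer frequency of 3.579545 MHz; the helper name pm_ticks_to_ns() and the macro names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* ACPI PM timer frequency in Hz (3.579545 MHz). */
#define PM_TIMER_HZ  3579545ULL
#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: convert a PM timer tick count, such as the
 * accumulated "duration" value, to nanoseconds (rounded down).
 */
static uint64_t pm_ticks_to_ns(uint64_t ticks)
{
    uint64_t secs = ticks / PM_TIMER_HZ;   /* whole seconds */
    uint64_t rem  = ticks % PM_TIMER_HZ;   /* leftover ticks, < 1 second */

    /* Make sure secs * NSEC_PER_SEC plus the remainder fits in 64 bits. */
    assert(secs <= (UINT64_MAX - NSEC_PER_SEC) / NSEC_PER_SEC);

    return secs * NSEC_PER_SEC + rem * NSEC_PER_SEC / PM_TIMER_HZ;
}
```

Splitting the value into whole seconds and a remainder avoids overflowing the intermediate product for large tick counts.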
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
drivers/acpi/processor_idle.c | 10 ++++++----
include/acpi/processor.h | 1 +
2 files changed, 7 insertions(+), 4 deletions(-)
3997a08ff5aa0553dfff81801c3690a5c91ac7bc
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 80fa434..4f166fa 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -322,8 +322,6 @@ static void acpi_processor_idle(void)
cx = &pr->power.states[ACPI_STATE_C1];
#endif
- cx->usage++;
-
/*
* Sleep:
* ------
@@ -421,6 +419,9 @@ static void acpi_processor_idle(void)
local_irq_enable();
return;
}
+ cx->usage++;
+ if ((cx->type != ACPI_STATE_C1) && (sleep_ticks > 0))
+ cx->time += sleep_ticks;
next_state = pr->power.state;
@@ -1055,9 +1056,10 @@ static int acpi_processor_power_seq_show
else
seq_puts(seq, "demotion[--] ");
- seq_printf(seq, "latency[%03d] usage[%08d]\n",
+ seq_printf(seq, "latency[%03d] usage[%08d] duration[%020llu]\n",
pr->power.states[i].latency,
- pr->power.states[i].usage);
+ pr->power.states[i].usage,
+ pr->power.states[i].time);
}
end:
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index badf027..ca0e031 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -51,6 +51,7 @@ struct acpi_processor_cx {
u32 latency_ticks;
u32 power;
u32 usage;
+ u64 time;
struct acpi_processor_cx_policy promotion;
struct acpi_processor_cx_policy demotion;
};
--
1.2.4
* [2/4] ACPI C-States: bm_activity improvements
2006-06-19 21:28 ` [1/4] ACPI C-States: accounting of sleep states Dominik Brodowski
@ 2006-06-19 21:29 ` Dominik Brodowski
2006-06-19 21:31 ` [3/4] ACPI C-States: only demote on current bus mastering activity Dominik Brodowski
0 siblings, 1 reply; 29+ messages in thread
From: Dominik Brodowski @ 2006-06-19 21:29 UTC (permalink / raw)
To: Thomas Gleixner, len.brown
Cc: Con Kolivas, Ingo Molnar, LKML, Andrew Morton, john stultz
Do not assume there was bus mastering activity if the idle handler didn't
get called, as the only reason not to enter C3-type sleep is bus master
activity that is actually going on. Bus mastering activity is only taken
into account for the "promotion" into C3-type sleep, and there only current
bus mastering activity, not pure guessing, should decide whether to enter
C3-type sleep or not.
Also, as bm_activity is a jiffy-based bitmask (bit 0: bus mastering activity
during this jiffy, bit 31: bus mastering activity 31 jiffies ago), fix the
setting of bit 0, as the idle handler might be called multiple times within
one jiffy.
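The bitmask handling described above can be sketched in user space as follows. This is only an illustration of the idea under stated assumptions, not the kernel code: struct bm_tracker and bm_tracker_update() are hypothetical stand-ins for the bm_activity and bm_check_timestamp fields and the update step in the idle handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the two fields the idle handler uses. */
struct bm_tracker {
    uint32_t bm_activity;             /* bit 0: this jiffy, bit 31: 31 jiffies ago */
    unsigned long bm_check_timestamp; /* jiffies value at the last check */
};

/*
 * Age the history by the number of jiffies since the last check, then
 * record current bus master activity in bit 0.  Jiffies in which the
 * handler was not called are recorded as "no activity" (0 bits).
 */
static void bm_tracker_update(struct bm_tracker *t, unsigned long now,
                              bool bm_status)
{
    unsigned long diff;

    assert(t != NULL);

    diff = now - t->bm_check_timestamp;
    if (diff > 31)
        diff = 31;                    /* keep the shift count below 32 */

    t->bm_activity <<= diff;

    /* OR instead of ++: a second call in the same jiffy must not carry. */
    if (bm_status)
        t->bm_activity |= 0x1;

    t->bm_check_timestamp = now;
}
```

With the old "++", two calls within one jiffy would turn 0x1 into 0x2 and so pretend the activity happened a jiffy earlier; the OR keeps bit 0 meaning "this jiffy".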
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
drivers/acpi/processor_idle.c | 18 ++++++------------
1 files changed, 6 insertions(+), 12 deletions(-)
2e1b29fabc1085e1ab5b05dcac5d59e82c633668
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 4f166fa..29470e1 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -3,7 +3,7 @@
*
* Copyright (C) 2001, 2002 Andy Grover <andrew.grover@intel.com>
* Copyright (C) 2001, 2002 Paul Diefenbaugh <paul.s.diefenbaugh@intel.com>
- * Copyright (C) 2004 Dominik Brodowski <linux@brodo.de>
+ * Copyright (C) 2004, 2005 Dominik Brodowski <linux@brodo.de>
* Copyright (C) 2004 Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
* - Added processor hotplug support
* Copyright (C) 2005 Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
@@ -261,21 +261,15 @@ static void acpi_processor_idle(void)
u32 bm_status = 0;
unsigned long diff = jiffies - pr->power.bm_check_timestamp;
- if (diff > 32)
- diff = 32;
+ if (diff > 31)
+ diff = 31;
- while (diff) {
- /* if we didn't get called, assume there was busmaster activity */
- diff--;
- if (diff)
- pr->power.bm_activity |= 0x1;
- pr->power.bm_activity <<= 1;
- }
+ pr->power.bm_activity <<= diff;
acpi_get_register(ACPI_BITREG_BUS_MASTER_STATUS,
&bm_status, ACPI_MTX_DO_NOT_LOCK);
if (bm_status) {
- pr->power.bm_activity++;
+ pr->power.bm_activity |= 0x1;
acpi_set_register(ACPI_BITREG_BUS_MASTER_STATUS,
1, ACPI_MTX_DO_NOT_LOCK);
}
@@ -287,7 +281,7 @@ static void acpi_processor_idle(void)
else if (errata.piix4.bmisx) {
if ((inb_p(errata.piix4.bmisx + 0x02) & 0x01)
|| (inb_p(errata.piix4.bmisx + 0x0A) & 0x01))
- pr->power.bm_activity++;
+ pr->power.bm_activity |= 0x1;
}
pr->power.bm_check_timestamp = jiffies;
--
1.2.4
* [3/4] ACPI C-States: only demote on current bus mastering activity
2006-06-19 21:29 ` [2/4] ACPI C-States: bm_activity improvements Dominik Brodowski
@ 2006-06-19 21:31 ` Dominik Brodowski
2006-06-19 21:33 ` [4/4 -- only for discussion] ACPI C-States: dyn-ticks-improvements (for -ck implementation) Dominik Brodowski
0 siblings, 1 reply; 29+ messages in thread
From: Dominik Brodowski @ 2006-06-19 21:31 UTC (permalink / raw)
To: Thomas Gleixner, len.brown
Cc: Con Kolivas, Ingo Molnar, LKML, Andrew Morton, john stultz
We should avoid entering C3-type sleep only if bus master activity is going
on at present, as it might be a faulty transition. As long
as the bm_activity bitmask was based on the number of calls to the ACPI
idle function, looking at previous moments made sense. Now, with it being
based on what happened this jiffy, looking at this jiffy should be
sufficient.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
drivers/acpi/processor_idle.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)
Index: linux-2.6.16-rc5-dt/drivers/acpi/processor_idle.c
===================================================================
--- linux-2.6.16-rc5-dt.orig/drivers/acpi/processor_idle.c 2006-02-27 20:32:58.000000000 +1100
+++ linux-2.6.16-rc5-dt/drivers/acpi/processor_idle.c 2006-02-27 20:32:59.000000000 +1100
@@ -287,10 +287,10 @@ static void acpi_processor_idle(void)
pr->power.bm_check_timestamp = jiffies;
/*
- * Apply bus mastering demotion policy. Automatically demote
+ * If bus mastering is or was active this jiffy, demote
* to avoid a faulty transition. Note that the processor
* won't enter a low-power state during this call (to this
- * funciton) but should upon the next.
+ * function) but should upon the next.
*
* TBD: A better policy might be to fallback to the demotion
* state (use it for this quantum only) istead of
@@ -298,7 +298,8 @@ static void acpi_processor_idle(void)
* qualification. This may, however, introduce DMA
* issues (e.g. floppy DMA transfer overrun/underrun).
*/
- if (pr->power.bm_activity & cx->demotion.threshold.bm) {
+ if ((pr->power.bm_activity & 0x1) &&
+ cx->demotion.threshold.bm) {
local_irq_enable();
next_state = cx->demotion.state;
goto end;
* [4/4 -- only for discussion] ACPI C-States: dyn-ticks-improvements (for -ck implementation)
2006-06-19 21:31 ` [3/4] ACPI C-States: only demote on current bus mastering activity Dominik Brodowski
@ 2006-06-19 21:33 ` Dominik Brodowski
0 siblings, 0 replies; 29+ messages in thread
From: Dominik Brodowski @ 2006-06-19 21:33 UTC (permalink / raw)
To: Thomas Gleixner, len.brown
Cc: Con Kolivas, Ingo Molnar, LKML, Andrew Morton, john stultz
Note: this was for Con's implementation of dynamic ticks, and probably
doesn't even compile with this new patchset. However, some ideas in this
patchset may be useful for improving the C-States algorithm:
If dyn-ticks is enabled, we can and should try to be smart when
deciding which C-State to enter.
If we're likely not to wake up soon, we can "kick back" to the high C-State
the system was in before bus mastering activity was present.
If we slept for a long period of time last time, and we're scheduled to do
so again, we can enter a higher (or even the next higher) C-State.
(fastpath, super-fastpath promotion).
If bus mastering activity was detected this jiffy, schedule an extra
early wakeup: most likely there's something to handle then anyways, and
we hope this bus mastering activity will end soon, allowing us to utilize
high C-States afterwards.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Index: working-tree/drivers/acpi/processor_idle.c
===================================================================
--- working-tree.orig/drivers/acpi/processor_idle.c
+++ working-tree/drivers/acpi/processor_idle.c
@@ -38,6 +38,7 @@
#include <linux/dmi.h>
#include <linux/moduleparam.h>
#include <linux/sched.h> /* need_resched() */
+#include <linux/dyn-tick.h>
#include <asm/io.h>
#include <asm/uaccess.h>
@@ -60,6 +61,8 @@ module_param(max_cstate, uint, 0644);
static unsigned int nocst = 0;
module_param(nocst, uint, 0000);
+#define BM_JIFFIES (HZ >= 800 ? 2 : 1)
+
/*
* bm_history -- bit-mask with a bit per jiffy of bus-master activity
* 1000 HZ: 0xFFFFFFFF: 32 jiffies = 32ms
@@ -264,6 +267,8 @@ static void acpi_processor_idle(void)
if ((pr->power.bm_activity & 0x1) &&
cx->demotion.threshold.bm) {
local_irq_enable();
+ if (!pr->power.pre_bm_state)
+ pr->power.pre_bm_state = cx;
next_state = cx->demotion.state;
goto end;
}
@@ -281,6 +286,69 @@ static void acpi_processor_idle(void)
#endif
/*
+ * Some special policy tweaks for dynamic ticks
+ */
+ if (dyn_tick_enabled()) {
+ /*
+ * Kick-back promotion: promote to C-State used before bm
+ * activity was detected if
+ * - we have a pre-bm-state
+ * - we do not have bus mastering at the moment
+ * - we're scheduled to sleep at least BM_JIFFIES now
+ */
+ if (pr->power.pre_bm_state &&
+ !(pr->power.bm_activity & 0x1) &&
+ (dyn_tick_current_skip() >= BM_JIFFIES)) {
+ local_irq_enable();
+ next_state = pr->power.pre_bm_state;
+ pr->power.pre_bm_state = NULL;
+ goto end;
+ }
+
+ /*
+ * Fast-path promotion: promote to higher state if
+ * - we can promote
+ * - there is no bm_activity this tick
+ * - we slept more than BM_JIFFIES ticks last time
+ * - we're scheduled to sleep at least BM_JIFFIES ticks
+ */
+ if (cx->promotion.state &&
+ !(pr->power.bm_activity & 0x1) &&
+ (pr->power.last_sleep > BM_JIFFIES) &&
+ (dyn_tick_current_skip() >= BM_JIFFIES)) {
+ local_irq_enable();
+ next_state = cx->promotion.state;
+ /*
+ * Super-fast-path: promote to next higher state if
+ * - we can promote
+ * - we did sleep longer than 2 * BM_JIFFIES
+ * ticks last time
+ * - we're scheduled to sleep at least 2 *
+ * BM_JIFFIES ticks
+ */
+ if ((next_state->promotion.state) &&
+ (pr->power.last_sleep > 2 * BM_JIFFIES) &&
+ (dyn_tick_current_skip() >= 2 * BM_JIFFIES))
+ next_state = next_state->promotion.state;
+ pr->power.pre_bm_state = NULL;
+ goto end;
+ }
+
+ /*
+ * Re-program if bm activity is present this jiffy -- we hope
+ * that it ends soon, so that we can go into a deeper sleep
+ * type
+ */
+ if (cx->demotion.state &&
+ (pr->power.bm_activity & 0x1) &&
+ (pr->power.bm_check_timestamp == jiffies)) {
+ dyn_early_reprogram(BM_JIFFIES);
+ }
+ }
+
+ pr->power.last_sleep = 0;
+
+ /*
* Sleep:
* ------
* Invoke the current Cx state to put the processor to sleep.
@@ -377,9 +445,13 @@ static void acpi_processor_idle(void)
local_irq_enable();
return;
}
+
+ /* Accounting */
cx->usage++;
if ((cx->type != ACPI_STATE_C1) && (sleep_ticks > 0))
cx->time += sleep_ticks;
+ pr->power.last_sleep = sleep_ticks / (PM_TIMER_FREQUENCY / HZ);
+
next_state = pr->power.state;
@@ -413,10 +485,12 @@ static void acpi_processor_idle(void)
promotion.threshold.bm)) {
next_state =
cx->promotion.state;
+ pr->power.pre_bm_state = NULL;
goto end;
}
} else {
next_state = cx->promotion.state;
+ pr->power.pre_bm_state = NULL;
goto end;
}
}
@@ -434,6 +508,7 @@ static void acpi_processor_idle(void)
cx->demotion.count++;
cx->promotion.count = 0;
if (cx->demotion.count >= cx->demotion.threshold.count) {
+ pr->power.pre_bm_state = NULL;
next_state = cx->demotion.state;
goto end;
}
Index: working-tree/include/acpi/processor.h
===================================================================
--- working-tree.orig/include/acpi/processor.h
+++ working-tree/include/acpi/processor.h
@@ -61,8 +61,10 @@ struct acpi_processor_power {
unsigned long bm_check_timestamp;
u32 default_state;
u32 bm_activity;
+ u16 last_sleep;
int count;
struct acpi_processor_cx states[ACPI_PROCESSOR_MAX_POWER];
+ struct acpi_processor_cx *pre_bm_state;
/* the _PDC objects passed by the driver, if any */
struct acpi_object_list *pdc;
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 12:31 ` Thomas Gleixner
2006-06-19 13:05 ` Con Kolivas
@ 2006-06-19 21:58 ` mark gross
2006-06-19 22:19 ` Thomas Gleixner
1 sibling, 1 reply; 29+ messages in thread
From: mark gross @ 2006-06-19 21:58 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Con Kolivas, Michal Piotrowski, LKML, Andrew Morton, john stultz,
Ingo Molnar
On Mon, Jun 19, 2006 at 02:31:44PM +0200, Thomas Gleixner wrote:
> On Mon, 2006-06-19 at 22:09 +1000, Con Kolivas wrote:
> > Also suffers from:
> > WARNING: "timespec_to_jiffies" [fs/fuse/fuse.ko] undefined!
> >
> > Here is a fix
>
> Doh, where is the brown paperbag shop ?
>
> Thanks, applied.
>
> New queue:
>
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick3.patch
>
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick3.patches.tar.bz2
>
I'm just giving this a test spin now on my desktop. Looking at uptime and cat /proc/interrupts:
~/work> uptime;cat /proc/interrupts
2:51pm up 0:28, 5 users, load average: 0.00, 0.02, 0.08
CPU0
0: 80007 XT-PIC timer
1: 1776 XT-PIC i8042
2: 0 XT-PIC cascade
8: 2 XT-PIC rtc
9: 0 XT-PIC acpi
11: 2156 XT-PIC eth0
12: 2879 XT-PIC i8042
14: 20402 XT-PIC ide0
15: 11 XT-PIC ide1
NMI: 0
LOC: 0
ERR: 0
MIS: 0
or about 47.6 timer interrupts a second.
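For reference, the figure comes from dividing the IRQ 0 count by the uptime. A minimal sketch of that arithmetic (irq_rate_per_sec() is a hypothetical helper; uptime only reports whole minutes here, so the result is approximate):

```c
#include <assert.h>

/*
 * Hypothetical helper: average interrupt rate, given a counter from
 * /proc/interrupts and the uptime in seconds.
 */
static double irq_rate_per_sec(unsigned long irq_count,
                               unsigned long uptime_sec)
{
    assert(uptime_sec > 0);
    return (double)irq_count / (double)uptime_sec;
}
```

Calling irq_rate_per_sec(80007, 28 * 60) gives the rate for the numbers above; it can then be compared against the configured HZ.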
This system is mostly idle, is this about right or should I expect even fewer timer ticks?
Is there a way to see timer stats?
FWIW it's nice to see this stuff start getting real.
thanks
--mgross
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 21:58 ` mark gross
@ 2006-06-19 22:19 ` Thomas Gleixner
2006-06-21 12:54 ` Felix Oxley
0 siblings, 1 reply; 29+ messages in thread
From: Thomas Gleixner @ 2006-06-19 22:19 UTC (permalink / raw)
To: mgross
Cc: Con Kolivas, Michal Piotrowski, LKML, Andrew Morton, john stultz,
Ingo Molnar
Mark,
On Mon, 2006-06-19 at 14:58 -0700, mark gross wrote:
> I'm just giving this a test spin now on my desktop. Looking at uptime and cat /proc/interrupts:
> ~/work> uptime;cat /proc/interrupts
> 2:51pm up 0:28, 5 users, load average: 0.00, 0.02, 0.08
> CPU0
> 0: 80007 XT-PIC timer
> 1: 1776 XT-PIC i8042
> 2: 0 XT-PIC cascade
> 8: 2 XT-PIC rtc
> 9: 0 XT-PIC acpi
> 11: 2156 XT-PIC eth0
> 12: 2879 XT-PIC i8042
> 14: 20402 XT-PIC ide0
> 15: 11 XT-PIC ide1
> NMI: 0
> LOC: 0
> ERR: 0
> MIS: 0
>
> or about 47.6 timer interrupts a second.
>
> This system is mostly idle, is this about right or should I expect even fewer timer ticks?
We did not track down all culprits of useless timer schedules, but there
are definitely a couple of user space programs which we identified, e.g.
redhat network, debian and ubuntu updates applets and similar
candidates.
> Is there a way to see timer stats?
Enable timer stats in the kernel config
$ echo start >/proc/timer_input
$ do nothing for a while
$ cat /proc/timer_info
You get something like:
Function counter - Timer Top v0.9.9
collection period: 19.7 seconds
1 0 swapper verify_tsc_freq (verify_tsc_freq)
1 6 events/0 do_cache_clean (delayed_work_timer_fn)
1 0 swapper i8042_interrupt (i8042_timer_func)
2 0 swapper neigh_periodic_timer (neigh_periodic_timer)
2 148 pdflush wb_kupdate (wb_timer_fn)
1 1 swapper init_tsc_clocksource (verify_tsc_freq)
4 6 events/0 cache_reap (delayed_work_timer_fn)
1 2508 sh get_transaction (commit_timeout)
1 0 swapper page_writeback_init (wb_timer_fn)
4 1 init schedule_timeout (process_timeout)
9 0 swapper e100_watchdog (e100_watchdog)
9 0 swapper dev_watchdog (dev_watchdog)
6 1 swapper schedule_delayed_work_on (delayed_work_timer_fn)
1 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer)
1 1 swapper i8042_probe (i8042_timer_func)
3 3317 bash schedule_timeout (process_timeout)
1 2973 ifconfig e100_up (e100_watchdog)
1 0 swapper __netdev_watchdog_up (dev_watchdog)
4 893 kirqd schedule_timeout (process_timeout)
5 0 swapper tty_flip_buffer_push (delayed_work_timer_fn)
Repeat over time to find the tick wasters.
Also:
$ cat /proc/stat | grep nohz
gives you some stats about the idle state
nohz cpu0 I:37390 S:30025 T:2108551 A:70 E: 26891
nohz cpu1 I:15480 S:12614 T:2109231 A:49 E: 14508
nohz total I:52870 S:42639 T:4217782 A:98 E:41399
where:
I: number of idle calls
S: number of idle calls, which can sleep for at least one tick
T: total time (in tick units) slept in idle calls
A: average sleep time per "can sleep" idle call (in tick units)
E: number of timer interrupt events
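If you want to post-process these lines, a small user-space sketch could look like this. It assumes the line layout is exactly as in the sample output above; struct nohz_stats and nohz_parse() are hypothetical names, not part of the patchset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One "nohz" line from /proc/stat, fields named after the legend above. */
struct nohz_stats {
    char name[16];               /* "cpu0", "cpu1", "total", ... */
    unsigned long idle_calls;    /* I */
    unsigned long sleep_calls;   /* S */
    unsigned long ticks_slept;   /* T */
    unsigned long avg_sleep;     /* A */
    unsigned long timer_events;  /* E */
};

/* Returns 0 on success, -1 if the line does not match the layout. */
static int nohz_parse(const char *line, struct nohz_stats *out)
{
    int n;

    assert(line != NULL && out != NULL);

    /* %lu skips leading blanks, so "E: 26891" and "E:41399" both parse. */
    n = sscanf(line, "nohz %15s I:%lu S:%lu T:%lu A:%lu E:%lu",
               out->name, &out->idle_calls, &out->sleep_calls,
               &out->ticks_slept, &out->avg_sleep, &out->timer_events);

    return n == 6 ? 0 : -1;
}
```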
You might also try to slow down the timer wheel activity by batching the
timeouts into multiples of the scheduler tick (1/HZ seconds) by doing
$ echo $FACTOR >/proc/sys/kernel/timeout_granularity
e.g.
FACTOR=10 batches the timer wheel timers to 10ms on a HZ=1000 kernel
FACTOR=20 batches the timer wheel timers to 40ms on a HZ=250 kernel
.....
> FWIW it's nice to see this stuff start getting real.
Thanks,
tglx
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 22:19 ` Thomas Gleixner
@ 2006-06-21 12:54 ` Felix Oxley
2006-06-21 13:07 ` Thomas Gleixner
0 siblings, 1 reply; 29+ messages in thread
From: Felix Oxley @ 2006-06-21 12:54 UTC (permalink / raw)
To: tglx
Cc: mgross, Con Kolivas, Michal Piotrowski, LKML, Andrew Morton,
john stultz, Ingo Molnar
On 19 Jun 2006, at 23:19, Thomas Gleixner wrote:
> FACTOR=20 batches the timer wheel timers to 40ms on a HZ=250 kernel
Should that read 80ms?
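A quick sketch of the arithmetic, assuming the value written to timeout_granularity is a count of scheduler ticks as described in the quoted mail (batch_interval_ms() is a hypothetical helper, not a kernel function):

```c
#include <assert.h>

/*
 * Hypothetical helper: batching interval in milliseconds for a given
 * granularity factor (in ticks) and tick rate HZ.
 */
static unsigned int batch_interval_ms(unsigned int factor, unsigned int hz)
{
    assert(hz > 0);
    return factor * 1000u / hz;   /* one tick lasts 1000/HZ ms */
}
```

With HZ=1000 a factor of 10 gives 10 ms; with HZ=250 one tick is 4 ms, so a factor of 20 gives 80 ms.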
//felix
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-21 12:54 ` Felix Oxley
@ 2006-06-21 13:07 ` Thomas Gleixner
0 siblings, 0 replies; 29+ messages in thread
From: Thomas Gleixner @ 2006-06-21 13:07 UTC (permalink / raw)
To: Felix Oxley
Cc: mgross, Con Kolivas, Michal Piotrowski, LKML, Andrew Morton,
john stultz, Ingo Molnar
On Wed, 2006-06-21 at 13:54 +0100, Felix Oxley wrote:
> On 19 Jun 2006, at 23:19, Thomas Gleixner wrote:
>
> > FACTOR=20 batches the timer wheel timers to 40ms on a HZ=250 kernel
>
> Should that read 80ms?
Doh, I lost my abacus and it's hard to find a good replacement :)
tglx
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-19 19:51 ` Thomas Gleixner
@ 2006-06-25 13:06 ` Steven Rostedt
2006-06-25 14:26 ` Thomas Gleixner
0 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2006-06-25 13:06 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Michal Piotrowski, Ingo Molnar, LKML, Andrew Morton, john stultz,
Con Kolivas
Hi Thomas,
I finally was able to try -V5. And hit the following:
WARNING: "hrtimer_stop_sched_tick" [drivers/acpi/processor.ko] undefined!
WARNING: "hrtimer_restart_sched_tick" [drivers/acpi/processor.ko] undefined!
Here's the patch.
-- Steve
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Index: linux-2.6.17/kernel/hrtimer.c
===================================================================
--- linux-2.6.17.orig/kernel/hrtimer.c 2006-06-24 18:47:22.000000000 -0400
+++ linux-2.6.17/kernel/hrtimer.c 2006-06-24 18:48:03.000000000 -0400
@@ -550,6 +550,7 @@ int hrtimer_stop_sched_tick(void)
return need_resched();
}
+EXPORT_SYMBOL_GPL(hrtimer_stop_sched_tick);
void hrtimer_restart_sched_tick(void)
{
@@ -584,6 +585,7 @@ void hrtimer_restart_sched_tick(void)
HRTIMER_ABS);
local_irq_restore(flags);
}
+EXPORT_SYMBOL_GPL(hrtimer_restart_sched_tick);
void show_no_hz_stats(struct seq_file *p)
{
* Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
2006-06-25 13:06 ` Steven Rostedt
@ 2006-06-25 14:26 ` Thomas Gleixner
0 siblings, 0 replies; 29+ messages in thread
From: Thomas Gleixner @ 2006-06-25 14:26 UTC (permalink / raw)
To: Steven Rostedt
Cc: Michal Piotrowski, Ingo Molnar, LKML, Andrew Morton, john stultz,
Con Kolivas
On Sun, 2006-06-25 at 09:06 -0400, Steven Rostedt wrote:
> Hi Thomas,
>
> I finally was able to try -V5. And hit the following:
>
> WARNING: "hrtimer_stop_sched_tick" [drivers/acpi/processor.ko] undefined!
> WARNING: "hrtimer_restart_sched_tick" [drivers/acpi/processor.ko] undefined!
Thanks, applied.
tglx