* [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
@ 2008-12-22 1:24 Frederic Weisbecker
2008-12-22 2:07 ` Frans Pop
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2008-12-22 1:24 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, Linux Kernel
Impact: avoid hanging on slow systems
While using the function graph tracer on a virtualized system, the hrtimer_interrupt
can hang the system on an infinite loop.
This can be caused on several situation where something intrusive is slowing the
system (ie: tracing) and the next clock events to program are always before the current
time.
This patch implements a reasonable compromise. If such a situation is detected, we share
the CPUs time in 1/4 to process the hrtimer interrupts. This is enough to let the system
running without serious starvation.
It has been successfully tested under VirtualBox with 1000 HZ and 100 HZ with function graph
tracer launched. On both cases, the clock events were increased until about 25 ms periodic ticks,
which means 40 HZ.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
kernel/hrtimer.c | 30 +++++++++++++++++++++++++++++-
1 files changed, 29 insertions(+), 1 deletions(-)
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index bda9cb9..02f2477 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1171,6 +1171,29 @@ static void __run_hrtimer(struct hrtimer *timer)
#ifdef CONFIG_HIGH_RES_TIMERS
+static int force_clock_reprogram;
+
+/*
+ * After 5 iteration's attempts, we consider that hrtimer_interrupt()
+ * is hanging, which could happen with something that slows the interrupt
+ * such as the tracing. Then we force the clock reprogramming for each future
+ * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
+ * threshold that we will overwrite.
+ * The next tick event will be scheduled to 3 times we currently spend on
+ * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
+ * 1/4 of their time to process the hrtimer interrupts. This is enough to
+ * let it running without serious starvation.
+ */
+
+static inline void
+hrtimer_interrupt_hanging(struct clock_event_device *dev,
+ ktime_t try_time)
+{
+ force_clock_reprogram = 1;
+ dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
+ printk(KERN_WARNING "hrtimer: interrupt too slow, "
+ "forcing clock min delta to %lu ns\n", dev->min_delta_ns);
+}
/*
* High resolution timer interrupt
* Called with interrupts disabled
@@ -1180,6 +1203,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
struct hrtimer_clock_base *base;
ktime_t expires_next, now;
+ int nr_retries = 0;
int i;
BUG_ON(!cpu_base->hres_active);
@@ -1187,6 +1211,10 @@ void hrtimer_interrupt(struct clock_event_device *dev)
dev->next_event.tv64 = KTIME_MAX;
retry:
+ /* 5 retries is enough to notice a hang */
+ if (!(++nr_retries % 5))
+ hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now));
+
now = ktime_get();
expires_next.tv64 = KTIME_MAX;
@@ -1239,7 +1267,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
/* Reprogramming necessary ? */
if (expires_next.tv64 != KTIME_MAX) {
- if (tick_program_event(expires_next, 0))
+ if (tick_program_event(expires_next, force_clock_reprogram))
goto retry;
}
}
--
1.6.0.4
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 1:24 [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging Frederic Weisbecker
@ 2008-12-22 2:07 ` Frans Pop
2008-12-22 7:00 ` Ingo Molnar
2008-12-22 9:04 ` Frédéric Weisbecker
2008-12-22 15:17 ` Cyrill Gorcunov
2008-12-27 10:53 ` Ingo Molnar
2 siblings, 2 replies; 12+ messages in thread
From: Frans Pop @ 2008-12-22 2:07 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: mingo, tglx, linux-kernel
> Impact: avoid hanging on slow systems
>
> While using the function graph tracer on a virtualized system, the
> hrtimer_interrupt can hang the system on an infinite loop.
> This can be caused on several situation where something intrusive is
> slowing the system (ie: tracing) and the next clock events to program
> are always before the current time.
> This patch implements a reasonable compromise. If such a situation is
> detected, we share the CPUs time in 1/4 to process the hrtimer
> interrupts. This is enough to let the system running without serious
> starvation.
Should there maybe also be a mechanism to allow the system to automatically
"recover" to higher (the original?) clockfrequencies, for example if the
danger of loops has passed after tracing has been disabled?
> It has been successfully tested under VirtualBox with 1000 HZ and 100
> HZ with function graph tracer launched. On both cases, the clock events
> were increased until about 25 ms periodic ticks, which means 40 HZ.
>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
> kernel/hrtimer.c | 30 +++++++++++++++++++++++++++++-
> 1 files changed, 29 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index bda9cb9..02f2477 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1171,6 +1171,29 @@ static void __run_hrtimer(struct hrtimer *timer)
>
> #ifdef CONFIG_HIGH_RES_TIMERS
>
> +static int force_clock_reprogram;
Shouldn't this be initialized to 0?
> +
> +/*
> + * After 5 iteration's attempts, we consider that hrtimer_interrupt()
> + * is hanging, which could happen with something that slows the interrupt
> + * such as the tracing. Then we force the clock reprogramming for each future
> + * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
> + * threshold that we will overwrite.
> + * The next tick event will be scheduled to 3 times we currently spend on
> + * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
> + * 1/4 of their time to process the hrtimer interrupts. This is enough to
> + * let it running without serious starvation.
> + */
> +
> +static inline void
> +hrtimer_interrupt_hanging(struct clock_event_device *dev,
> + ktime_t try_time)
> +{
> + force_clock_reprogram = 1;
> + dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
> + printk(KERN_WARNING "hrtimer: interrupt too slow, "
> + "forcing clock min delta to %lu ns\n", dev->min_delta_ns);
> +}
> /*
> * High resolution timer interrupt
> * Called with interrupts disabled
> @@ -1180,6 +1203,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
> struct hrtimer_clock_base *base;
> ktime_t expires_next, now;
> + int nr_retries = 0;
> int i;
>
> BUG_ON(!cpu_base->hres_active);
> @@ -1187,6 +1211,10 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> dev->next_event.tv64 = KTIME_MAX;
>
> retry:
> + /* 5 retries is enough to notice a hang */
> + if (!(++nr_retries % 5))
> + hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now)); +
> now = ktime_get();
>
> expires_next.tv64 = KTIME_MAX;
> @@ -1239,7 +1267,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> /* Reprogramming necessary ? */
> if (expires_next.tv64 != KTIME_MAX) {
> - if (tick_program_event(expires_next, 0))
> + if (tick_program_event(expires_next, force_clock_reprogram))
> goto retry;
> }
> }
Shouldn't force_clock_reprogram be reset to 0 after it has fired and been
handled?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 2:07 ` Frans Pop
@ 2008-12-22 7:00 ` Ingo Molnar
2008-12-22 9:00 ` Frédéric Weisbecker
2008-12-22 9:04 ` Frédéric Weisbecker
1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2008-12-22 7:00 UTC (permalink / raw)
To: Frans Pop; +Cc: Frederic Weisbecker, tglx, linux-kernel
* Frans Pop <elendil@planet.nl> wrote:
> > Impact: avoid hanging on slow systems
> >
> > While using the function graph tracer on a virtualized system, the
> > hrtimer_interrupt can hang the system on an infinite loop. This can be
> > caused on several situation where something intrusive is slowing the
> > system (ie: tracing) and the next clock events to program are always
> > before the current time. This patch implements a reasonable
> > compromise. If such a situation is detected, we share the CPUs time in
> > 1/4 to process the hrtimer interrupts. This is enough to let the
> > system running without serious starvation.
>
> Should there maybe also be a mechanism to allow the system to
> automatically "recover" to higher (the original?) clockfrequencies, for
> example if the danger of loops has passed after tracing has been
> disabled?
I dont think that's necessary - tick_dev_program_event() already includes
a gradual approach that increases the 'min delta' interval step by step -
so we should find the 'system is limping along' compromise pretty
accurately.
A system can get "magically faster" later on (if we turn off tracing or
other kernel features that impact the cost of the timer tick), and we
might not need those safety measures anymore - but here the real solution
will be to make the virtualizer faster. Taking milliseconds to process a
timer tick (be that under tracing or not) is rather slow. So the kernel
has applied the brakes and has printed a warning about it - we should do
no more.
> > +static int force_clock_reprogram;
>
> Shouldn't this be initialized to 0?
no - global or static scope variables are initialized to zero in C.
> > @@ -1239,7 +1267,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> > /* Reprogramming necessary ? */
> > if (expires_next.tv64 != KTIME_MAX) {
> > - if (tick_program_event(expires_next, 0))
> > + if (tick_program_event(expires_next, force_clock_reprogram))
> > goto retry;
> > }
> > }
>
> Shouldn't force_clock_reprogram be reset to 0 after it has fired and
> been handled?
hm, that would be interesting to see - in theory the system should become
stable after a few iterations of increasing min_delta. Frederic, is the
system still workable if you try what Frans has suggested?
Also, there's min_delta doubling in tick_dev_program_event() itself too -
that interacts with the irq-overload logic:
+ dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
Ingo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 7:00 ` Ingo Molnar
@ 2008-12-22 9:00 ` Frédéric Weisbecker
2008-12-22 9:05 ` Ingo Molnar
0 siblings, 1 reply; 12+ messages in thread
From: Frédéric Weisbecker @ 2008-12-22 9:00 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Frans Pop, tglx, linux-kernel
Hi Ingo and Frans,
2008/12/22 Ingo Molnar <mingo@elte.hu>:
>
> * Frans Pop <elendil@planet.nl> wrote:
>
>> > Impact: avoid hanging on slow systems
>> >
>> > While using the function graph tracer on a virtualized system, the
>> > hrtimer_interrupt can hang the system on an infinite loop. This can be
>> > caused on several situation where something intrusive is slowing the
>> > system (ie: tracing) and the next clock events to program are always
>> > before the current time. This patch implements a reasonable
>> > compromise. If such a situation is detected, we share the CPUs time in
>> > 1/4 to process the hrtimer interrupts. This is enough to let the
>> > system running without serious starvation.
>>
>> Should there maybe also be a mechanism to allow the system to
>> automatically "recover" to higher (the original?) clockfrequencies, for
>> example if the danger of loops has passed after tracing has been
>> disabled?
>
> I dont think that's necessary - tick_dev_program_event() already includes
> a gradual approach that increases the 'min delta' interval step by step -
> so we should find the 'system is limping along' compromise pretty
> accurately.
When you force the clock reprogramming to tick_dev_program_event(),
clockevents_program_event()
will fail a first time (like before because of tracing or whatever)
and then it will rebuild the next event:
now = ktime_get();
expires = ktime_add_ns(now, dev->min_delta_ns);
This one never failed on my tests.
So the above tricks which increases the min delta:
dev->min_delta_ns += dev->min_delta_ns >> 1
...is never reached.
But this is not enough. If I leave the things as is, the system
will success to reprogram the clock but in a delay which can be really too low.
It's almost worst: the system will not lock up anymore but will run in
an infinite and huge starvation
The effect is the same: a hang-up but without a soft lock-up warning
this time (it's worst).
That's why I'm dynamically adapting the min_delta along the time spent
to process
the hrtimer interrupt to reserve 1/4 to the latter.
This way, it is able (in theory) to solve the problem whatever the
frequency or the amount of slowness.
Note that I could have changed the things directly inside
tick_dev_program_event(), but there are multiple call
sites to this function and I couldn't be sure I'm not breaking something else.
> A system can get "magically faster" later on (if we turn off tracing or
> other kernel features that impact the cost of the timer tick), and we
> might not need those safety measures anymore - but here the real solution
> will be to make the virtualizer faster. Taking milliseconds to process a
> timer tick (be that under tracing or not) is rather slow. So the kernel
> has applied the brakes and has printed a warning about it - we should do
> no more.
>
>> > +static int force_clock_reprogram;
>>
>> Shouldn't this be initialized to 0?
>
> no - global or static scope variables are initialized to zero in C.
>
>> > @@ -1239,7 +1267,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>> > /* Reprogramming necessary ? */
>> > if (expires_next.tv64 != KTIME_MAX) {
>> > - if (tick_program_event(expires_next, 0))
>> > + if (tick_program_event(expires_next, force_clock_reprogram))
>> > goto retry;
>> > }
>> > }
>>
>> Shouldn't force_clock_reprogram be reset to 0 after it has fired and
>> been handled?
>
> hm, that would be interesting to see - in theory the system should become
> stable after a few iterations of increasing min_delta. Frederic, is the
> system still workable if you try what Frans has suggested?
No, if you don't force the clock reprogramming, and the usual case
continue to build
a clock event before the current time, the min delta will not be used
and you will get
a -ETIME error.
> Also, there's min_delta doubling in tick_dev_program_event() itself too -
> that interacts with the irq-overload logic:
>
> + dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
>
> Ingo
>
That's what I explained above, the doubling of min delta in
tick_dev_program_event() is never reached in
my case. And moreover I'm not sure it is ever reached whatever the
call sites of tick_dev_program_event()
unless min delta has a very low value...
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 2:07 ` Frans Pop
2008-12-22 7:00 ` Ingo Molnar
@ 2008-12-22 9:04 ` Frédéric Weisbecker
2008-12-22 20:41 ` Frans Pop
1 sibling, 1 reply; 12+ messages in thread
From: Frédéric Weisbecker @ 2008-12-22 9:04 UTC (permalink / raw)
To: Frans Pop; +Cc: mingo, tglx, linux-kernel
2008/12/22 Frans Pop <elendil@planet.nl>:
>> Impact: avoid hanging on slow systems
>>
>> While using the function graph tracer on a virtualized system, the
>> hrtimer_interrupt can hang the system on an infinite loop.
>> This can be caused on several situation where something intrusive is
>> slowing the system (ie: tracing) and the next clock events to program
>> are always before the current time.
>> This patch implements a reasonable compromise. If such a situation is
>> detected, we share the CPUs time in 1/4 to process the hrtimer
>> interrupts. This is enough to let the system running without serious
>> starvation.
>
> Should there maybe also be a mechanism to allow the system to automatically
> "recover" to higher (the original?) clockfrequencies, for example if the
> danger of loops has passed after tracing has been disabled?
As Ingo said, printing a warning should be enough.
The real problem lays in the fact that we don't know the source of the
slowness. It can
be tracing, it can be something else (why not an nmi bombing)....
>> It has been successfully tested under VirtualBox with 1000 HZ and 100
>> HZ with function graph tracer launched. On both cases, the clock events
>> were increased until about 25 ms periodic ticks, which means 40 HZ.
>>
>> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> ---
>> kernel/hrtimer.c | 30 +++++++++++++++++++++++++++++-
>> 1 files changed, 29 insertions(+), 1 deletions(-)
>>
>> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
>> index bda9cb9..02f2477 100644
>> --- a/kernel/hrtimer.c
>> +++ b/kernel/hrtimer.c
>> @@ -1171,6 +1171,29 @@ static void __run_hrtimer(struct hrtimer *timer)
>>
>> #ifdef CONFIG_HIGH_RES_TIMERS
>>
>> +static int force_clock_reprogram;
>
> Shouldn't this be initialized to 0?
>
>> +
>> +/*
>> + * After 5 iteration's attempts, we consider that hrtimer_interrupt()
>> + * is hanging, which could happen with something that slows the interrupt
>> + * such as the tracing. Then we force the clock reprogramming for each future
>> + * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
>> + * threshold that we will overwrite.
>> + * The next tick event will be scheduled to 3 times we currently spend on
>> + * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
>> + * 1/4 of their time to process the hrtimer interrupts. This is enough to
>> + * let it running without serious starvation.
>> + */
>> +
>> +static inline void
>> +hrtimer_interrupt_hanging(struct clock_event_device *dev,
>> + ktime_t try_time)
>> +{
>> + force_clock_reprogram = 1;
>> + dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
>> + printk(KERN_WARNING "hrtimer: interrupt too slow, "
>> + "forcing clock min delta to %lu ns\n", dev->min_delta_ns);
>> +}
>> /*
>> * High resolution timer interrupt
>> * Called with interrupts disabled
>> @@ -1180,6 +1203,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>> struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
>> struct hrtimer_clock_base *base;
>> ktime_t expires_next, now;
>> + int nr_retries = 0;
>> int i;
>>
>> BUG_ON(!cpu_base->hres_active);
>> @@ -1187,6 +1211,10 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>> dev->next_event.tv64 = KTIME_MAX;
>>
>> retry:
>> + /* 5 retries is enough to notice a hang */
>> + if (!(++nr_retries % 5))
>> + hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now)); +
>> now = ktime_get();
>>
>> expires_next.tv64 = KTIME_MAX;
>> @@ -1239,7 +1267,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>> /* Reprogramming necessary ? */
>> if (expires_next.tv64 != KTIME_MAX) {
>> - if (tick_program_event(expires_next, 0))
>> + if (tick_program_event(expires_next, force_clock_reprogram))
>> goto retry;
>> }
>> }
>
> Shouldn't force_clock_reprogram be reset to 0 after it has fired and been
> handled?
>
No, we need to keep the clock reprogramming forcing for each next
hrtimer interrupt if
we want the min delta to be used inside tick_dev_program_event().
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 9:00 ` Frédéric Weisbecker
@ 2008-12-22 9:05 ` Ingo Molnar
2008-12-22 9:24 ` Frédéric Weisbecker
0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2008-12-22 9:05 UTC (permalink / raw)
To: Frédéric Weisbecker; +Cc: Frans Pop, tglx, linux-kernel
* Frédéric Weisbecker <fweisbec@gmail.com> wrote:
> > Also, there's min_delta doubling in tick_dev_program_event() itself
> > too - that interacts with the irq-overload logic:
> >
> > + dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
> >
> > Ingo
>
> That's what I explained above, the doubling of min delta in
> tick_dev_program_event() is never reached in my case. And moreover I'm
> not sure it is ever reached whatever the call sites of
> tick_dev_program_event() unless min delta has a very low value...
yeah, that looks all rather messy and probably ineffective. Thomas, mind
to take a look?
Frederic's patch solves a real timer-irq-overload extreme situation so i
find it rather valuable. Maybe we should do Frederic's patch as-is, have
the 'softer' min_delta behavior for the usual hrtimer codepaths - and the
more agressive one for the timer tick only?
Ingo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 9:05 ` Ingo Molnar
@ 2008-12-22 9:24 ` Frédéric Weisbecker
0 siblings, 0 replies; 12+ messages in thread
From: Frédéric Weisbecker @ 2008-12-22 9:24 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Frans Pop, tglx, linux-kernel
2008/12/22 Ingo Molnar <mingo@elte.hu>:
>
> * Frédéric Weisbecker <fweisbec@gmail.com> wrote:
>
>> > Also, there's min_delta doubling in tick_dev_program_event() itself
>> > too - that interacts with the irq-overload logic:
>> >
>> > + dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
>> >
>> > Ingo
>>
>> That's what I explained above, the doubling of min delta in
>> tick_dev_program_event() is never reached in my case. And moreover I'm
>> not sure it is ever reached whatever the call sites of
>> tick_dev_program_event() unless min delta has a very low value...
>
> yeah, that looks all rather messy and probably ineffective. Thomas, mind
> to take a look?
>
> Frederic's patch solves a real timer-irq-overload extreme situation so i
> find it rather valuable. Maybe we should do Frederic's patch as-is, have
> the 'softer' min_delta behavior for the usual hrtimer codepaths - and the
> more agressive one for the timer tick only?
>
> Ingo
>
Oh a little thing.
I wonder if 5 iterations to notice a hang is not too low for embedded
systems on certain
situations.... Since embedded systems could need high resolution
timers for real time....
Actually perhaps 10 iterations would be more reasonable....
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 1:24 [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging Frederic Weisbecker
2008-12-22 2:07 ` Frans Pop
@ 2008-12-22 15:17 ` Cyrill Gorcunov
2008-12-22 15:28 ` Frédéric Weisbecker
2008-12-27 10:53 ` Ingo Molnar
2 siblings, 1 reply; 12+ messages in thread
From: Cyrill Gorcunov @ 2008-12-22 15:17 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Ingo Molnar, Thomas Gleixner, Linux Kernel
[Frederic Weisbecker - Mon, Dec 22, 2008 at 02:24:48AM +0100]
| Impact: avoid hanging on slow systems
|
| While using the function graph tracer on a virtualized system, the hrtimer_interrupt
| can hang the system on an infinite loop.
| This can be caused on several situation where something intrusive is slowing the
| system (ie: tracing) and the next clock events to program are always before the current
| time.
| This patch implements a reasonable compromise. If such a situation is detected, we share
| the CPUs time in 1/4 to process the hrtimer interrupts. This is enough to let the system
| running without serious starvation.
|
| It has been successfully tested under VirtualBox with 1000 HZ and 100 HZ with function graph
| tracer launched. On both cases, the clock events were increased until about 25 ms periodic ticks,
| which means 40 HZ.
|
| Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| Cc: Thomas Gleixner <tglx@linutronix.de>
| ---
| kernel/hrtimer.c | 30 +++++++++++++++++++++++++++++-
| 1 files changed, 29 insertions(+), 1 deletions(-)
|
| diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
| index bda9cb9..02f2477 100644
| --- a/kernel/hrtimer.c
| +++ b/kernel/hrtimer.c
| @@ -1171,6 +1171,29 @@ static void __run_hrtimer(struct hrtimer *timer)
|
| #ifdef CONFIG_HIGH_RES_TIMERS
|
| +static int force_clock_reprogram;
| +
| +/*
| + * After 5 iteration's attempts, we consider that hrtimer_interrupt()
| + * is hanging, which could happen with something that slows the interrupt
| + * such as the tracing. Then we force the clock reprogramming for each future
| + * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
| + * threshold that we will overwrite.
| + * The next tick event will be scheduled to 3 times we currently spend on
| + * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
| + * 1/4 of their time to process the hrtimer interrupts. This is enough to
| + * let it running without serious starvation.
| + */
| +
| +static inline void
| +hrtimer_interrupt_hanging(struct clock_event_device *dev,
| + ktime_t try_time)
| +{
| + force_clock_reprogram = 1;
| + dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
| + printk(KERN_WARNING "hrtimer: interrupt too slow, "
| + "forcing clock min delta to %lu ns\n", dev->min_delta_ns);
| +}
| /*
| * High resolution timer interrupt
| * Called with interrupts disabled
| @@ -1180,6 +1203,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
| struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
| struct hrtimer_clock_base *base;
| ktime_t expires_next, now;
| + int nr_retries = 0;
| int i;
|
| BUG_ON(!cpu_base->hres_active);
| @@ -1187,6 +1211,10 @@ void hrtimer_interrupt(struct clock_event_device *dev)
| dev->next_event.tv64 = KTIME_MAX;
|
| retry:
| + /* 5 retries is enough to notice a hang */
| + if (!(++nr_retries % 5))
| + hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now));
| +
| now = ktime_get();
Hi Frederic,
is it really needed to use mod operation here? Why
cant we test for plain 5 and flush it to zero then?
I mean something like
if (++nr_retries > 5) {
nr_retries = 0;
...
}
Did I miss anything?
- Cyrill -
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 15:17 ` Cyrill Gorcunov
@ 2008-12-22 15:28 ` Frédéric Weisbecker
0 siblings, 0 replies; 12+ messages in thread
From: Frédéric Weisbecker @ 2008-12-22 15:28 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: Ingo Molnar, Thomas Gleixner, Linux Kernel
2008/12/22 Cyrill Gorcunov <gorcunov@gmail.com>:
> [Frederic Weisbecker - Mon, Dec 22, 2008 at 02:24:48AM +0100]
> | Impact: avoid hanging on slow systems
> |
> | While using the function graph tracer on a virtualized system, the hrtimer_interrupt
> | can hang the system on an infinite loop.
> | This can be caused on several situation where something intrusive is slowing the
> | system (ie: tracing) and the next clock events to program are always before the current
> | time.
> | This patch implements a reasonable compromise. If such a situation is detected, we share
> | the CPUs time in 1/4 to process the hrtimer interrupts. This is enough to let the system
> | running without serious starvation.
> |
> | It has been successfully tested under VirtualBox with 1000 HZ and 100 HZ with function graph
> | tracer launched. On both cases, the clock events were increased until about 25 ms periodic ticks,
> | which means 40 HZ.
> |
> | Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> | Cc: Thomas Gleixner <tglx@linutronix.de>
> | ---
> | kernel/hrtimer.c | 30 +++++++++++++++++++++++++++++-
> | 1 files changed, 29 insertions(+), 1 deletions(-)
> |
> | diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> | index bda9cb9..02f2477 100644
> | --- a/kernel/hrtimer.c
> | +++ b/kernel/hrtimer.c
> | @@ -1171,6 +1171,29 @@ static void __run_hrtimer(struct hrtimer *timer)
> |
> | #ifdef CONFIG_HIGH_RES_TIMERS
> |
> | +static int force_clock_reprogram;
> | +
> | +/*
> | + * After 5 iteration's attempts, we consider that hrtimer_interrupt()
> | + * is hanging, which could happen with something that slows the interrupt
> | + * such as the tracing. Then we force the clock reprogramming for each future
> | + * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
> | + * threshold that we will overwrite.
> | + * The next tick event will be scheduled to 3 times we currently spend on
> | + * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
> | + * 1/4 of their time to process the hrtimer interrupts. This is enough to
> | + * let it running without serious starvation.
> | + */
> | +
> | +static inline void
> | +hrtimer_interrupt_hanging(struct clock_event_device *dev,
> | + ktime_t try_time)
> | +{
> | + force_clock_reprogram = 1;
> | + dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
> | + printk(KERN_WARNING "hrtimer: interrupt too slow, "
> | + "forcing clock min delta to %lu ns\n", dev->min_delta_ns);
> | +}
> | /*
> | * High resolution timer interrupt
> | * Called with interrupts disabled
> | @@ -1180,6 +1203,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> | struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
> | struct hrtimer_clock_base *base;
> | ktime_t expires_next, now;
> | + int nr_retries = 0;
> | int i;
> |
> | BUG_ON(!cpu_base->hres_active);
> | @@ -1187,6 +1211,10 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> | dev->next_event.tv64 = KTIME_MAX;
> |
> | retry:
> | + /* 5 retries is enough to notice a hang */
> | + if (!(++nr_retries % 5))
> | + hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now));
> | +
> | now = ktime_get();
>
> Hi Frederic,
>
> is it really needed to use mod operation here?
This is a kind of paranoid check.
But actually you are right, it is not necessary.
If we force the clock reprogramming, we will not retry again...
> Why cant we test for plain 5 and flush it to zero then?
> I mean something like
>
> if (++nr_retries > 5) {
> nr_retries = 0;
> ...
> }
>
> Did I miss anything?
>
> - Cyrill -
>
Since the clock reprogramming can't fail anymore after that, we can
just check ++nr_retries == 5 or why not
++nr_retries >= 5 if we want to stay paranoid......
I will fix it if the patched is accepted...
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 9:04 ` Frédéric Weisbecker
@ 2008-12-22 20:41 ` Frans Pop
0 siblings, 0 replies; 12+ messages in thread
From: Frans Pop @ 2008-12-22 20:41 UTC (permalink / raw)
To: Frédéric Weisbecker; +Cc: mingo, tglx, linux-kernel
On Monday 22 December 2008, Frédéric Weisbecker wrote:
> > Shouldn't force_clock_reprogram be reset to 0 after it has fired and
> > been handled?
>
> No, we need to keep the clock reprogramming forcing for each next
> hrtimer interrupt if we want the min delta to be used inside
> tick_dev_program_event().
Wouldn't that be worth a comment in the code?
Thanks a lot for the answers.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-22 1:24 [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging Frederic Weisbecker
2008-12-22 2:07 ` Frans Pop
2008-12-22 15:17 ` Cyrill Gorcunov
@ 2008-12-27 10:53 ` Ingo Molnar
2008-12-27 14:00 ` Frederic Weisbecker
2 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2008-12-27 10:53 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Thomas Gleixner, Linux Kernel
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Impact: avoid hanging on slow systems
>
> While using the function graph tracer on a virtualized system, the
> hrtimer_interrupt can hang the system on an infinite loop. This can be
> caused on several situation where something intrusive is slowing the
> system (ie: tracing) and the next clock events to program are always
> before the current time. This patch implements a reasonable compromise.
> If such a situation is detected, we share the CPUs time in 1/4 to
> process the hrtimer interrupts. This is enough to let the system running
> without serious starvation.
>
> It has been successfully tested under VirtualBox with 1000 HZ and 100 HZ
> with function graph tracer launched. On both cases, the clock events
> were increased until about 25 ms periodic ticks, which means 40 HZ.
>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
> kernel/hrtimer.c | 30 +++++++++++++++++++++++++++++-
> 1 files changed, 29 insertions(+), 1 deletions(-)
applied to tip/timers/hrtimers, thanks Frederic!
Thomas, any objections/observations?
Ingo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008-12-27 10:53 ` Ingo Molnar
@ 2008-12-27 14:00 ` Frederic Weisbecker
0 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2008-12-27 14:00 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, Linux Kernel
On Sat, Dec 27, 2008 at 11:53:34AM +0100, Ingo Molnar wrote:
>
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> > Impact: avoid hanging on slow systems
> >
> > While using the function graph tracer on a virtualized system, the
> > hrtimer_interrupt can hang the system on an infinite loop. This can be
> > caused on several situation where something intrusive is slowing the
> > system (ie: tracing) and the next clock events to program are always
> > before the current time. This patch implements a reasonable compromise.
> > If such a situation is detected, we share the CPUs time in 1/4 to
> > process the hrtimer interrupts. This is enough to let the system running
> > without serious starvation.
> >
> > It has been successfully tested under VirtualBox with 1000 HZ and 100 HZ
> > with function graph tracer launched. On both cases, the clock events
> > were increased until about 25 ms periodic ticks, which means 40 HZ.
> >
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > ---
> > kernel/hrtimer.c | 30 +++++++++++++++++++++++++++++-
> > 1 files changed, 29 insertions(+), 1 deletions(-)
>
> applied to tip/timers/hrtimers, thanks Frederic!
I will apply the comments that came along these reviews in a delta patch.
Note that I still wonder about false positive with a 5 loop check. I think
I will increase it to 10. This a more reasonable.
(I remember I had false positive but I'm not sure if that was because of early_printk
used for debugging. I didn't worry about it first, because I hadn't these false positive
on my last tests, just before I send this patch.)
>
> Thomas, any objections/observations?
>
> Ingo
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2008-12-27 14:00 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-12-22 1:24 [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging Frederic Weisbecker
2008-12-22 2:07 ` Frans Pop
2008-12-22 7:00 ` Ingo Molnar
2008-12-22 9:00 ` Frédéric Weisbecker
2008-12-22 9:05 ` Ingo Molnar
2008-12-22 9:24 ` Frédéric Weisbecker
2008-12-22 9:04 ` Frédéric Weisbecker
2008-12-22 20:41 ` Frans Pop
2008-12-22 15:17 ` Cyrill Gorcunov
2008-12-22 15:28 ` Frédéric Weisbecker
2008-12-27 10:53 ` Ingo Molnar
2008-12-27 14:00 ` Frederic Weisbecker
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.