[RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers
@ 2017-05-24 12:39 Octavian Purdila
  2017-05-26 12:55 ` Frederic Weisbecker
  0 siblings, 1 reply; 6+ messages in thread
From: Octavian Purdila @ 2017-05-24 12:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Frederic Weisbecker, Peter Zijlstra, linux-kernel,
	Leonard Crestez, Octavian Purdila

Currently, when detecting expired timers in tick_nohz_stop_sched_tick
we just schedule a new hrtimer and let its handler to schedule
TIMER_SOFITRQ.

This can lead to indefinite timer stalls if the system is busy with a
stream of interrupts that have the period shorter or about the same
with the value of the minimum delay of the current clocksource driver:

-> idle
-> IRQ (time = x)
   -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
      -> tick_nohz_restart
         -> cancel hrtimer (set clocksource event to x + max_delay)
         -> set clocksource to x + min_delay
...
-> IRQ (time = y, where y < x + min_delay)
   -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
      -> tick_nohz_restart
         -> cancel hrtimer (set clocksource event to x + max_delay)
         -> set clocksource to y + min_delay

So, instead of prodding the hrtimer interrupt, schedule TIMER_SOFTIRQ
since we know that timers are ready. The timers will run either at the
next interrupt or from ksoftirq so no hrtimer interrupt is
needed. This also avoids spurious programming of the clocksource in
this scenario.

Signed-off-by: Octavian Purdila <octavian.purdila@nxp.com>
---
 kernel/time/tick-sched.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3bcb61b..0a30278 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -727,9 +727,16 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 		 *
 		 * Only once we exit the idle loop will we re-enable the tick,
 		 * see tick_nohz_idle_exit().
+		 *
+		 * Also, make sure we schedule TIMER_SOFTIRQ now instead of
+		 * relying on the hrtimer interrupt to do it to avoid
+		 * postponing processing of expired timers. If we have a
+		 * constant stream of interrupts with a period shorter than
+		 * the minimum delay of the current clocksource we can end up
+		 * postponing the timers indefinitely.
 		 */
 		if (delta == 0) {
-			tick_nohz_restart(ts, now);
+			raise_softirq(TIMER_SOFTIRQ);
 			goto out;
 		}
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers
  2017-05-24 12:39 [RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers Octavian Purdila
@ 2017-05-26 12:55 ` Frederic Weisbecker
  2017-05-26 18:04   ` Octavian Purdila
  2017-05-31  3:08   ` Wanpeng Li
  0 siblings, 2 replies; 6+ messages in thread
From: Frederic Weisbecker @ 2017-05-26 12:55 UTC (permalink / raw)
  To: Octavian Purdila
  Cc: Thomas Gleixner, Peter Zijlstra, linux-kernel, Leonard Crestez

On Wed, May 24, 2017 at 03:39:24PM +0300, Octavian Purdila wrote:
> Currently, when detecting expired timers in tick_nohz_stop_sched_tick
> we just schedule a new hrtimer and let its handler to schedule
> TIMER_SOFITRQ.
> 
> This can lead to indefinite timer stalls if the system is busy with a
> stream of interrupts that have the period shorter or about the same
> with the value of the minimum delay of the current clocksource driver:
> 
> -> idle
> -> IRQ (time = x)
>    -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
>       -> tick_nohz_restart
>          -> cancel hrtimer (set clocksource event to x + max_delay)
>          -> set clocksource to x + min_delay
> ...
> -> IRQ (time = y, where y < x + min_delay)
>    -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
>       -> tick_nohz_restart
>          -> cancel hrtimer (set clocksource event to x + max_delay)
>          -> set clocksource to y + min_delay
> 
> So, instead of prodding the hrtimer interrupt, schedule TIMER_SOFTIRQ
> since we know that timers are ready. The timers will run either at the
> next interrupt or from ksoftirq so no hrtimer interrupt is
> needed. This also avoids spurious programming of the clocksource in
> this scenario.
> 
> Signed-off-by: Octavian Purdila <octavian.purdila@nxp.com>
> ---
>  kernel/time/tick-sched.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 3bcb61b..0a30278 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -727,9 +727,16 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  		 *
>  		 * Only once we exit the idle loop will we re-enable the tick,
>  		 * see tick_nohz_idle_exit().
> +		 *
> +		 * Also, make sure we schedule TIMER_SOFTIRQ now instead of
> +		 * relying on the hrtimer interrupt to do it to avoid
> +		 * postponing processing of expired timers. If we have a
> +		 * constant stream of interrupts with a period shorter than
> +		 * the minimum delay of the current clocksource we can end up
> +		 * postponing the timers indefinitely.
>  		 */
>  		if (delta == 0) {
> -			tick_nohz_restart(ts, now);
> +			raise_softirq(TIMER_SOFTIRQ);
>  			goto out;
>  		}
>  	}

Nice catch! And the patch should work but that actually restores a behaviour we've
removed some time ago. We wanted to get rid of that softirq raise.

So discussing this with Thomas, here is an alternate solution. That tick restart looks
like an unnecessary special case. In fact the normal path ending with hrtimer_start()
in tick_nohz_stop_sched_tick() should work if the delta deadline is 0. And if
we do so, we benefit from the optimization path:

        if (ts->tick_stopped && (expires == dev->next_event))

...which avoids the cancel/reprog game and therefore should fix that issue.

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index ed18ca5..58c257c 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -713,8 +713,6 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 	 */
 	delta = next_tick - basemono;
 	if (delta <= (u64)TICK_NSEC) {
-		tick = 0;
-
 		/*
 		 * Tell the timer code that the base is not idle, i.e. undo
 		 * the effect of get_next_timer_interrupt():
@@ -724,23 +722,8 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 		 * We've not stopped the tick yet, and there's a timer in the
 		 * next period, so no point in stopping it either, bail.
 		 */
-		if (!ts->tick_stopped)
-			goto out;
-
-		/*
-		 * If, OTOH, we did stop it, but there's a pending (expired)
-		 * timer reprogram the timer hardware to fire now.
-		 *
-		 * We will not restart the tick proper, just prod the timer
-		 * hardware into firing an interrupt to process the pending
-		 * timers. Just like tick_irq_exit() will not restart the tick
-		 * for 'normal' interrupts.
-		 *
-		 * Only once we exit the idle loop will we re-enable the tick,
-		 * see tick_nohz_idle_exit().
-		 */
-		if (delta == 0) {
-			tick_nohz_restart(ts, now);
+		if (!ts->tick_stopped) {
+			tick = 0;
 			goto out;
 		}
 	}

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers
  2017-05-26 12:55 ` Frederic Weisbecker
@ 2017-05-26 18:04   ` Octavian Purdila
  2017-05-30 21:47     ` Thomas Gleixner
  2017-05-31  3:08   ` Wanpeng Li
  1 sibling, 1 reply; 6+ messages in thread
From: Octavian Purdila @ 2017-05-26 18:04 UTC (permalink / raw)
  To: fweisbec@gmail.com
  Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org,
	peterz@infradead.org, Leonard Crestez

On Vi, 2017-05-26 at 14:55 +0200, Frederic Weisbecker wrote:
> On Wed, May 24, 2017 at 03:39:24PM +0300, Octavian Purdila wrote:
> > 
> > Currently, when detecting expired timers in
> > tick_nohz_stop_sched_tick
> > we just schedule a new hrtimer and let its handler to schedule
> > TIMER_SOFITRQ.
> > 
> > This can lead to indefinite timer stalls if the system is busy with
> > a
> > stream of interrupts that have the period shorter or about the same
> > with the value of the minimum delay of the current clocksource
> > driver:
> > 
> > -> idle
> > -> IRQ (time = x)
> >    -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
> >       -> tick_nohz_restart
> >          -> cancel hrtimer (set clocksource event to x + max_delay)
> >          -> set clocksource to x + min_delay
> > ...
> > -> IRQ (time = y, where y < x + min_delay)
> >    -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
> >       -> tick_nohz_restart
> >          -> cancel hrtimer (set clocksource event to x + max_delay)
> >          -> set clocksource to y + min_delay
> > 
> > So, instead of prodding the hrtimer interrupt, schedule
> > TIMER_SOFTIRQ
> > since we know that timers are ready. The timers will run either at
> > the
> > next interrupt or from ksoftirq so no hrtimer interrupt is
> > needed. This also avoids spurious programming of the clocksource in
> > this scenario.
> > 
> > Signed-off-by: Octavian Purdila <octavian.purdila@nxp.com>
> > ---
> >  kernel/time/tick-sched.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 3bcb61b..0a30278 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -727,9 +727,16 @@ static ktime_t
> > tick_nohz_stop_sched_tick(struct tick_sched *ts,
> >  		 *
> >  		 * Only once we exit the idle loop will we re-
> > enable the tick,
> >  		 * see tick_nohz_idle_exit().
> > +		 *
> > +		 * Also, make sure we schedule TIMER_SOFTIRQ now
> > instead of
> > +		 * relying on the hrtimer interrupt to do it to
> > avoid
> > +		 * postponing processing of expired timers. If we
> > have a
> > +		 * constant stream of interrupts with a period
> > shorter than
> > +		 * the minimum delay of the current clocksource we
> > can end up
> > +		 * postponing the timers indefinitely.
> >  		 */
> >  		if (delta == 0) {
> > -			tick_nohz_restart(ts, now);
> > +			raise_softirq(TIMER_SOFTIRQ);
> >  			goto out;
> >  		}
> >  	}
> Nice catch! And the patch should work but that actually restores a
> behaviour we've
> removed some time ago. We wanted to get rid of that softirq raise.
> 
> So discussing this with Thomas, here is an alternate solution. That
> tick restart looks
> like an unnecessary special case. In fact the normal path ending with
> hrtimer_start()
> in tick_nohz_stop_sched_tick() should work if the delta deadline is
> 0. And if
> we do so, we benefit from the optimization path:
> 
>         if (ts->tick_stopped && (expires == dev->next_event))
> 
> ...which avoids the cancel/reprog game and therefore should fix that
> issue.
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index ed18ca5..58c257c 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -713,8 +713,6 @@ static ktime_t tick_nohz_stop_sched_tick(struct
> tick_sched *ts,
>  	 */
>  	delta = next_tick - basemono;
>  	if (delta <= (u64)TICK_NSEC) {
> -		tick = 0;
> -
>  		/*
>  		 * Tell the timer code that the base is not idle,
> i.e. undo
>  		 * the effect of get_next_timer_interrupt():
> @@ -724,23 +722,8 @@ static ktime_t tick_nohz_stop_sched_tick(struct
> tick_sched *ts,
>  		 * We've not stopped the tick yet, and there's a
> timer in the
>  		 * next period, so no point in stopping it either,
> bail.
>  		 */
> -		if (!ts->tick_stopped)
> -			goto out;
> -
> -		/*
> -		 * If, OTOH, we did stop it, but there's a pending
> (expired)
> -		 * timer reprogram the timer hardware to fire now.
> -		 *
> -		 * We will not restart the tick proper, just prod
> the timer
> -		 * hardware into firing an interrupt to process the
> pending
> -		 * timers. Just like tick_irq_exit() will not
> restart the tick
> -		 * for 'normal' interrupts.
> -		 *
> -		 * Only once we exit the idle loop will we re-enable 
> the tick,
> -		 * see tick_nohz_idle_exit().
> -		 */
> -		if (delta == 0) {
> -			tick_nohz_restart(ts, now);
> +		if (!ts->tick_stopped) {
> +			tick = 0;
>  			goto out;
>  		}
>  	}
> 

Nice, it is less expensive and deletes some code :-) Thanks for the
quick fix Frederic, I confirm it solves my issue.

Tested-by: Octavian Purdila <octavian.purdila@nxp.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers
  2017-05-26 18:04   ` Octavian Purdila
@ 2017-05-30 21:47     ` Thomas Gleixner
  2017-05-31  3:37       ` Frederic Weisbecker
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2017-05-30 21:47 UTC (permalink / raw)
  To: Octavian Purdila
  Cc: fweisbec@gmail.com, linux-kernel@vger.kernel.org,
	peterz@infradead.org, Leonard Crestez

[-- Attachment #1: Type: text/plain, Size: 347 bytes --]

On Fri, 26 May 2017, Octavian Purdila wrote:
> On Vi, 2017-05-26 at 14:55 +0200, Frederic Weisbecker wrote:
> 
> Nice, it is less expensive and deletes some code :-) Thanks for the
> quick fix Frederic, I confirm it solves my issue.
> 
> Tested-by: Octavian Purdila <octavian.purdila@nxp.com>

Fredric, care to send a proper patch?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers
  2017-05-30 21:47     ` Thomas Gleixner
@ 2017-05-31  3:37       ` Frederic Weisbecker
  0 siblings, 0 replies; 6+ messages in thread
From: Frederic Weisbecker @ 2017-05-31  3:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Octavian Purdila, linux-kernel@vger.kernel.org,
	peterz@infradead.org, Leonard Crestez

On Tue, May 30, 2017 at 11:47:59PM +0200, Thomas Gleixner wrote:
> On Fri, 26 May 2017, Octavian Purdila wrote:
> > On Vi, 2017-05-26 at 14:55 +0200, Frederic Weisbecker wrote:
> > 
> > Nice, it is less expensive and deletes some code :-) Thanks for the
> > quick fix Frederic, I confirm it solves my issue.
> > 
> > Tested-by: Octavian Purdila <octavian.purdila@nxp.com>
> 
> Fredric, care to send a proper patch?

Sure, cooking that.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers
  2017-05-26 12:55 ` Frederic Weisbecker
  2017-05-26 18:04   ` Octavian Purdila
@ 2017-05-31  3:08   ` Wanpeng Li
  1 sibling, 0 replies; 6+ messages in thread
From: Wanpeng Li @ 2017-05-31  3:08 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Octavian Purdila, Thomas Gleixner, Peter Zijlstra,
	linux-kernel@vger.kernel.org, Leonard Crestez

2017-05-26 20:55 GMT+08:00 Frederic Weisbecker <fweisbec@gmail.com>:
> On Wed, May 24, 2017 at 03:39:24PM +0300, Octavian Purdila wrote:
>> Currently, when detecting expired timers in tick_nohz_stop_sched_tick
>> we just schedule a new hrtimer and let its handler to schedule
>> TIMER_SOFITRQ.
>>
>> This can lead to indefinite timer stalls if the system is busy with a
>> stream of interrupts that have the period shorter or about the same
>> with the value of the minimum delay of the current clocksource driver:
>>
>> -> idle
>> -> IRQ (time = x)
>>    -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
>>       -> tick_nohz_restart
>>          -> cancel hrtimer (set clocksource event to x + max_delay)
>>          -> set clocksource to x + min_delay
>> ...
>> -> IRQ (time = y, where y < x + min_delay)
>>    -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
>>       -> tick_nohz_restart
>>          -> cancel hrtimer (set clocksource event to x + max_delay)
>>          -> set clocksource to y + min_delay
>>
>> So, instead of prodding the hrtimer interrupt, schedule TIMER_SOFTIRQ
>> since we know that timers are ready. The timers will run either at the
>> next interrupt or from ksoftirq so no hrtimer interrupt is
>> needed. This also avoids spurious programming of the clocksource in
>> this scenario.
>>
>> Signed-off-by: Octavian Purdila <octavian.purdila@nxp.com>
>> ---
>>  kernel/time/tick-sched.c | 9 ++++++++-
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> index 3bcb61b..0a30278 100644
>> --- a/kernel/time/tick-sched.c
>> +++ b/kernel/time/tick-sched.c
>> @@ -727,9 +727,16 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>>                *
>>                * Only once we exit the idle loop will we re-enable the tick,
>>                * see tick_nohz_idle_exit().
>> +              *
>> +              * Also, make sure we schedule TIMER_SOFTIRQ now instead of
>> +              * relying on the hrtimer interrupt to do it to avoid
>> +              * postponing processing of expired timers. If we have a
>> +              * constant stream of interrupts with a period shorter than
>> +              * the minimum delay of the current clocksource we can end up
>> +              * postponing the timers indefinitely.
>>                */
>>               if (delta == 0) {
>> -                     tick_nohz_restart(ts, now);
>> +                     raise_softirq(TIMER_SOFTIRQ);
>>                       goto out;
>>               }
>>       }
>
> Nice catch! And the patch should work but that actually restores a behaviour we've
> removed some time ago. We wanted to get rid of that softirq raise.
>
> So discussing this with Thomas, here is an alternate solution. That tick restart looks
> like an unnecessary special case. In fact the normal path ending with hrtimer_start()
> in tick_nohz_stop_sched_tick() should work if the delta deadline is 0. And if
> we do so, we benefit from the optimization path:
>
>         if (ts->tick_stopped && (expires == dev->next_event))
>
> ...which avoids the cancel/reprog game and therefore should fix that issue.
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index ed18ca5..58c257c 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -713,8 +713,6 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>          */
>         delta = next_tick - basemono;
>         if (delta <= (u64)TICK_NSEC) {
> -               tick = 0;
> -
>                 /*
>                  * Tell the timer code that the base is not idle, i.e. undo
>                  * the effect of get_next_timer_interrupt():
> @@ -724,23 +722,8 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>                  * We've not stopped the tick yet, and there's a timer in the
>                  * next period, so no point in stopping it either, bail.
>                  */
> -               if (!ts->tick_stopped)
> -                       goto out;
> -
> -               /*
> -                * If, OTOH, we did stop it, but there's a pending (expired)
> -                * timer reprogram the timer hardware to fire now.
> -                *
> -                * We will not restart the tick proper, just prod the timer
> -                * hardware into firing an interrupt to process the pending
> -                * timers. Just like tick_irq_exit() will not restart the tick
> -                * for 'normal' interrupts.
> -                *
> -                * Only once we exit the idle loop will we re-enable the tick,
> -                * see tick_nohz_idle_exit().
> -                */
> -               if (delta == 0) {
> -                       tick_nohz_restart(ts, now);
> +               if (!ts->tick_stopped) {
> +                       tick = 0;
>                         goto out;
>                 }
>         }
>

Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-05-31  3:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-24 12:39 [RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers Octavian Purdila
2017-05-26 12:55 ` Frederic Weisbecker
2017-05-26 18:04   ` Octavian Purdila
2017-05-30 21:47     ` Thomas Gleixner
2017-05-31  3:37       ` Frederic Weisbecker
2017-05-31  3:08   ` Wanpeng Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox