[PATCH -rt] kernel/time: unbreak nohz in -rt

linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH -rt] kernel/time: unbreak nohz in -rt
@ 2016-03-21 19:12 Luiz Capitulino
  2016-03-29 16:25 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 3+ messages in thread
From: Luiz Capitulino @ 2016-03-21 19:12 UTC (permalink / raw)
  To: linux-rt-users; +Cc: riel, bigeasy, tglx, fweisbec

nohz support (nohz-full and nohz-idle) is currently
broken in the RT kernel. Meaning that, the tick is
never de-activated even when a core is idle or when
nohz_full= is passed.

The reason for this is that get_next_timer_interrupt()
in the RT kernel *always* returns "basem + TICK_NSEC"
which translates to "there's a timer firing in the
next tick". This causes tick_nohz_stop_sched_tick()
to never deactivate the tick.

This patch is like tylenol, it doesn't fix the problem, it
just reliefs the symptons by making tick_nohz_stop_sched_tick()
succeed if: 1. a core doesn't have any legacy timers
pending and 2. there's no hrtimer firing in the next tick.

Also, note that this issue has another side effect: it
causes the ktimersoftd thread to always take 1%-2% of CPU
time on all cores, even if they are idle. As it turns out,
the tick handling code path unconditionally raises the
TIMER_SOFTIRQ line. This is an upstream kernel behavior.
I believe people are not noticing the CPU usage because
nohz-idle papers over this problem.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 kernel/time/timer.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index fee8682..2bf49af 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1451,8 +1451,14 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 	/*
 	 * On PREEMPT_RT we cannot sleep here. As a result we can't take
 	 * the base lock to check when the next timer is pending and so
-	 * we assume the next jiffy.
+	 * we assume the next jiffy if there are active timers.
 	 */
+	local_irq_disable();
+	if (!base->active_timers) {
+		local_irq_enable();
+		return cmp_next_hrtimer_event(basem, expires);
+	}
+	local_irq_enable();
 	return basem + TICK_NSEC;
 #endif
 	spin_lock(&base->lock);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH -rt] kernel/time: unbreak nohz in -rt
  2016-03-21 19:12 [PATCH -rt] kernel/time: unbreak nohz in -rt Luiz Capitulino
@ 2016-03-29 16:25 ` Sebastian Andrzej Siewior
  2016-03-29 20:10   ` Luiz Capitulino
  0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-03-29 16:25 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: linux-rt-users, riel, tglx, fweisbec

* Luiz Capitulino | 2016-03-21 15:12:38 [-0400]:

>nohz support (nohz-full and nohz-idle) is currently
>broken in the RT kernel. Meaning that, the tick is
>never de-activated even when a core is idle or when
>nohz_full= is passed.
>
>The reason for this is that get_next_timer_interrupt()
>in the RT kernel *always* returns "basem + TICK_NSEC"
>which translates to "there's a timer firing in the
>next tick". This causes tick_nohz_stop_sched_tick()
>to never deactivate the tick.
>
>This patch is like tylenol, it doesn't fix the problem, it
>just reliefs the symptons by making tick_nohz_stop_sched_tick()
>succeed if: 1. a core doesn't have any legacy timers
>pending and 2. there's no hrtimer firing in the next tick.
>
>Also, note that this issue has another side effect: it
>causes the ktimersoftd thread to always take 1%-2% of CPU
>time on all cores, even if they are idle. As it turns out,
>the tick handling code path unconditionally raises the
>TIMER_SOFTIRQ line. This is an upstream kernel behavior.
>I believe people are not noticing the CPU usage because
>nohz-idle papers over this problem.

Unless this gets an ack-by tglx I will not consider it. Last time it was
decided that we first rework the timer wheel before getting full-nohz
fixed for -RT.
This patch disables interrupts to read an integer which should be safe
without interrupts disabled. What are the implications if the value
changes after read (say after the interrupt enable)?

>Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

Sebastian

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH -rt] kernel/time: unbreak nohz in -rt
  2016-03-29 16:25 ` Sebastian Andrzej Siewior
@ 2016-03-29 20:10   ` Luiz Capitulino
  0 siblings, 0 replies; 3+ messages in thread
From: Luiz Capitulino @ 2016-03-29 20:10 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users, riel, tglx, fweisbec

On Tue, 29 Mar 2016 18:25:56 +0200
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> * Luiz Capitulino | 2016-03-21 15:12:38 [-0400]:
> 
> >nohz support (nohz-full and nohz-idle) is currently
> >broken in the RT kernel. Meaning that, the tick is
> >never de-activated even when a core is idle or when
> >nohz_full= is passed.
> >
> >The reason for this is that get_next_timer_interrupt()
> >in the RT kernel *always* returns "basem + TICK_NSEC"
> >which translates to "there's a timer firing in the
> >next tick". This causes tick_nohz_stop_sched_tick()
> >to never deactivate the tick.
> >
> >This patch is like tylenol, it doesn't fix the problem, it
> >just reliefs the symptons by making tick_nohz_stop_sched_tick()
> >succeed if: 1. a core doesn't have any legacy timers
> >pending and 2. there's no hrtimer firing in the next tick.
> >
> >Also, note that this issue has another side effect: it
> >causes the ktimersoftd thread to always take 1%-2% of CPU
> >time on all cores, even if they are idle. As it turns out,
> >the tick handling code path unconditionally raises the
> >TIMER_SOFTIRQ line. This is an upstream kernel behavior.
> >I believe people are not noticing the CPU usage because
> >nohz-idle papers over this problem.
> 
> Unless this gets an ack-by tglx I will not consider it.

Thomas?

> Last time it was
> decided that we first rework the timer wheel before getting full-nohz
> fixed for -RT.

Is there anyone working on it? What needs to be done?

> This patch disables interrupts to read an integer which should be safe
> without interrupts disabled.

I think you're right, it shouldn't be necessary to disable
interrupts.

> What are the implications if the value
> changes after read (say after the interrupt enable)?
> 
> >Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> 
> Sebastian
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-03-29 20:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-21 19:12 [PATCH -rt] kernel/time: unbreak nohz in -rt Luiz Capitulino
2016-03-29 16:25 ` Sebastian Andrzej Siewior
2016-03-29 20:10   ` Luiz Capitulino

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).