* [PATCH -rt] kernel/time: unbreak nohz in -rt
@ 2016-03-21 19:12 Luiz Capitulino
2016-03-29 16:25 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 3+ messages in thread
From: Luiz Capitulino @ 2016-03-21 19:12 UTC (permalink / raw)
To: linux-rt-users; +Cc: riel, bigeasy, tglx, fweisbec
nohz support (nohz-full and nohz-idle) is currently
broken in the RT kernel. Meaning that, the tick is
never de-activated even when a core is idle or when
nohz_full= is passed.
The reason for this is that get_next_timer_interrupt()
in the RT kernel *always* returns "basem + TICK_NSEC"
which translates to "there's a timer firing in the
next tick". This causes tick_nohz_stop_sched_tick()
to never deactivate the tick.
This patch is like tylenol, it doesn't fix the problem, it
just reliefs the symptons by making tick_nohz_stop_sched_tick()
succeed if: 1. a core doesn't have any legacy timers
pending and 2. there's no hrtimer firing in the next tick.
Also, note that this issue has another side effect: it
causes the ktimersoftd thread to always take 1%-2% of CPU
time on all cores, even if they are idle. As it turns out,
the tick handling code path unconditionally raises the
TIMER_SOFTIRQ line. This is an upstream kernel behavior.
I believe people are not noticing the CPU usage because
nohz-idle papers over this problem.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
kernel/time/timer.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index fee8682..2bf49af 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1451,8 +1451,14 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
/*
* On PREEMPT_RT we cannot sleep here. As a result we can't take
* the base lock to check when the next timer is pending and so
- * we assume the next jiffy.
+ * we assume the next jiffy if there are active timers.
*/
+ local_irq_disable();
+ if (!base->active_timers) {
+ local_irq_enable();
+ return cmp_next_hrtimer_event(basem, expires);
+ }
+ local_irq_enable();
return basem + TICK_NSEC;
#endif
spin_lock(&base->lock);
--
2.1.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH -rt] kernel/time: unbreak nohz in -rt
2016-03-21 19:12 [PATCH -rt] kernel/time: unbreak nohz in -rt Luiz Capitulino
@ 2016-03-29 16:25 ` Sebastian Andrzej Siewior
2016-03-29 20:10 ` Luiz Capitulino
0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-03-29 16:25 UTC (permalink / raw)
To: Luiz Capitulino; +Cc: linux-rt-users, riel, tglx, fweisbec
* Luiz Capitulino | 2016-03-21 15:12:38 [-0400]:
>nohz support (nohz-full and nohz-idle) is currently
>broken in the RT kernel. Meaning that, the tick is
>never de-activated even when a core is idle or when
>nohz_full= is passed.
>
>The reason for this is that get_next_timer_interrupt()
>in the RT kernel *always* returns "basem + TICK_NSEC"
>which translates to "there's a timer firing in the
>next tick". This causes tick_nohz_stop_sched_tick()
>to never deactivate the tick.
>
>This patch is like tylenol, it doesn't fix the problem, it
>just reliefs the symptons by making tick_nohz_stop_sched_tick()
>succeed if: 1. a core doesn't have any legacy timers
>pending and 2. there's no hrtimer firing in the next tick.
>
>Also, note that this issue has another side effect: it
>causes the ktimersoftd thread to always take 1%-2% of CPU
>time on all cores, even if they are idle. As it turns out,
>the tick handling code path unconditionally raises the
>TIMER_SOFTIRQ line. This is an upstream kernel behavior.
>I believe people are not noticing the CPU usage because
>nohz-idle papers over this problem.
Unless this gets an ack-by tglx I will not consider it. Last time it was
decided that we first rework the timer wheel before getting full-nohz
fixed for -RT.
This patch disables interrupts to read an integer which should be safe
without interrupts disabled. What are the implications if the value
changes after read (say after the interrupt enable)?
>Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Sebastian
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH -rt] kernel/time: unbreak nohz in -rt
2016-03-29 16:25 ` Sebastian Andrzej Siewior
@ 2016-03-29 20:10 ` Luiz Capitulino
0 siblings, 0 replies; 3+ messages in thread
From: Luiz Capitulino @ 2016-03-29 20:10 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: linux-rt-users, riel, tglx, fweisbec
On Tue, 29 Mar 2016 18:25:56 +0200
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> * Luiz Capitulino | 2016-03-21 15:12:38 [-0400]:
>
> >nohz support (nohz-full and nohz-idle) is currently
> >broken in the RT kernel. Meaning that, the tick is
> >never de-activated even when a core is idle or when
> >nohz_full= is passed.
> >
> >The reason for this is that get_next_timer_interrupt()
> >in the RT kernel *always* returns "basem + TICK_NSEC"
> >which translates to "there's a timer firing in the
> >next tick". This causes tick_nohz_stop_sched_tick()
> >to never deactivate the tick.
> >
> >This patch is like tylenol, it doesn't fix the problem, it
> >just reliefs the symptons by making tick_nohz_stop_sched_tick()
> >succeed if: 1. a core doesn't have any legacy timers
> >pending and 2. there's no hrtimer firing in the next tick.
> >
> >Also, note that this issue has another side effect: it
> >causes the ktimersoftd thread to always take 1%-2% of CPU
> >time on all cores, even if they are idle. As it turns out,
> >the tick handling code path unconditionally raises the
> >TIMER_SOFTIRQ line. This is an upstream kernel behavior.
> >I believe people are not noticing the CPU usage because
> >nohz-idle papers over this problem.
>
> Unless this gets an ack-by tglx I will not consider it.
Thomas?
> Last time it was
> decided that we first rework the timer wheel before getting full-nohz
> fixed for -RT.
Is there anyone working on it? What needs to be done?
> This patch disables interrupts to read an integer which should be safe
> without interrupts disabled.
I think you're right, it shouldn't be necessary to disable
interrupts.
> What are the implications if the value
> changes after read (say after the interrupt enable)?
>
> >Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
>
> Sebastian
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-03-29 20:10 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-21 19:12 [PATCH -rt] kernel/time: unbreak nohz in -rt Luiz Capitulino
2016-03-29 16:25 ` Sebastian Andrzej Siewior
2016-03-29 20:10 ` Luiz Capitulino
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).