* Re: [tip:timers/urgent] timers: Unconditionally check deferrable base [not found] <tip-ed4bbf7910b28ce3c691aef28d245585eaabda06@git.kernel.org> @ 2018-01-14 22:44 ` Paul E. McKenney 2018-01-15 23:57 ` Paul E. McKenney 0 siblings, 1 reply; 3+ messages in thread From: Paul E. McKenney @ 2018-01-14 22:44 UTC (permalink / raw) To: bigeasy, fweisbec, linux-kernel, hpa, peterz, mingo, tglx, anna-maria Cc: linux-tip-commits On Sun, Jan 14, 2018 at 02:31:07PM -0800, tip-bot for Thomas Gleixner wrote: > Commit-ID: ed4bbf7910b28ce3c691aef28d245585eaabda06 > Gitweb: https://git.kernel.org/tip/ed4bbf7910b28ce3c691aef28d245585eaabda06 > Author: Thomas Gleixner <tglx@linutronix.de> > AuthorDate: Sun, 14 Jan 2018 23:19:49 +0100 > Committer: Thomas Gleixner <tglx@linutronix.de> > CommitDate: Sun, 14 Jan 2018 23:25:33 +0100 > > timers: Unconditionally check deferrable base > > When the timer base is checked for expired timers then the deferrable base > must be checked as well. This was missed when making the deferrable base > independent of base::nohz_active. Very cool, thank you! I have fired up some of the dreaded TREE01 tests. Thanx, Paul > Fixes: ced6d5c11d3e ("timers: Use deferrable base independent of base::nohz_active") > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> > Cc: Frederic Weisbecker <fweisbec@gmail.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Sebastian Siewior <bigeasy@linutronix.de> > Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> > Cc: stable@vger.kernel.org > Cc: rt@linutronix.de > --- > kernel/time/timer.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/time/timer.c b/kernel/time/timer.c > index 89a9e1b..0bcf00e 100644 > --- a/kernel/time/timer.c > +++ b/kernel/time/timer.c > @@ -1696,7 +1696,7 @@ void run_local_timers(void) > hrtimer_run_queues(); > /* Raise the softirq only if required. */ > if (time_before(jiffies, base->clk)) { > - if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active) > + if (!IS_ENABLED(CONFIG_NO_HZ_COMMON)) > return; > /* CPU is awake, so check the deferrable base. */ > base++; > ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [tip:timers/urgent] timers: Unconditionally check deferrable base 2018-01-14 22:44 ` [tip:timers/urgent] timers: Unconditionally check deferrable base Paul E. McKenney @ 2018-01-15 23:57 ` Paul E. McKenney 2018-01-16 10:27 ` Thomas Gleixner 0 siblings, 1 reply; 3+ messages in thread From: Paul E. McKenney @ 2018-01-15 23:57 UTC (permalink / raw) To: bigeasy, fweisbec, linux-kernel, hpa, peterz, mingo, tglx, anna-maria Cc: linux-tip-commits On Sun, Jan 14, 2018 at 02:44:15PM -0800, Paul E. McKenney wrote: > On Sun, Jan 14, 2018 at 02:31:07PM -0800, tip-bot for Thomas Gleixner wrote: > > Commit-ID: ed4bbf7910b28ce3c691aef28d245585eaabda06 > > Gitweb: https://git.kernel.org/tip/ed4bbf7910b28ce3c691aef28d245585eaabda06 > > Author: Thomas Gleixner <tglx@linutronix.de> > > AuthorDate: Sun, 14 Jan 2018 23:19:49 +0100 > > Committer: Thomas Gleixner <tglx@linutronix.de> > > CommitDate: Sun, 14 Jan 2018 23:25:33 +0100 > > > > timers: Unconditionally check deferrable base > > > > When the timer base is checked for expired timers then the deferrable base > > must be checked as well. This was missed when making the deferrable base > > independent of base::nohz_active. > > Very cool, thank you! I have fired up some of the dreaded TREE01 tests. And (tentatively) a change in behavior! In one case, I did not see stall warnings (at least not until the ftrace dump at test end), but just the occasional "Waylayed timer" message: [ 1316.628105] schedule_timeout: Waylayed timer base->clk: 0x1000eff23 jiffies: 0x1000f8335 base->next_expiry: 0x1000eff23 timer->flags: 0x3c800006 timer->expires 0x1000f62ca idx: f2 idx_now: 139 base->pending_map 000000080000000000000000000000000000000000000000100400002000000000000000000000000000010000000000000000000000000000000000000000000000000000000000 [ 1316.638372] Torture onoff task state: [ 1316.639774] rcu_preempt I14984 8 2 0x80000000 [ 1316.641766] Call Trace: [ 1316.642757] ? __schedule+0x33c/0x6f0 [ 1316.644109] ? trace_event_raw_event_timer_start+0x6d/0xb0 [ 1316.646042] schedule+0x37/0x90 [ 1316.647219] schedule_timeout+0x159/0x470 [ 1316.648825] ? __next_timer_interrupt+0xc0/0xc0 [ 1316.650459] rcu_gp_kthread+0x9ee/0x1090 [ 1316.652147] kthread+0xf0/0x130 [ 1316.653415] ? rcu_oom_notify+0xd0/0xd0 [ 1316.654790] ? kthread_destroy_worker+0x40/0x40 [ 1316.656470] ret_from_fork+0x1f/0x30 If I am reading the waylay message correctly, the timer was delayed by about 8.3 seconds instead of the 21 seconds required to cause an RCU CPU stall warning. I am going to rearrange the ftrace-dump code a bit and run more tests. The next one got the usual stall warning and: [ 633.772036] schedule_timeout: Waylayed timer base->clk: 0x10004c5cb jiffies: 0x10005181e base->next_expiry: 0x10004c5cb timer->flags: 0x2c00006 timer->expires 0x10004c5ca idx: b idx_now: cd base->pending_map 000000000000080000000000000000000000020000000000000020000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000 [ 633.855300] Torture onoff task state: [ 633.855300] rcu_preempt I14984 8 2 0x80000000 [ 633.855300] Call Trace: [ 633.855300] ? __schedule+0x33c/0x6f0 [ 633.855300] ? _raw_spin_lock_irqsave+0x12/0x40 [ 633.855300] schedule+0x37/0x90 [ 633.855300] rcu_gp_kthread+0xad/0x1090 [ 633.855300] kthread+0xf0/0x130 [ 633.855300] ? rcu_oom_notify+0xd0/0xd0 [ 633.855300] ? kthread_destroy_worker+0x40/0x40 [ 633.855300] ret_from_fork+0x1f/0x30 And the third one also got the usual stall warning and: [ 382.482240] schedule_timeout: Waylayed timer base->clk: 0x10000322a jiffies: 0x100014232 base->next_expiry: 0x10000322a timer->flags: 0x44000001 timer->expires 0x10000f00f idx: 110 idx_now: 115 base->pending_map 000004000000000000000000000000000000000000000000000000000100000000000000000140000000000000000400000000000000000000000000000000000000000000000000 [ 382.482245] Torture onoff task state: [ 382.482246] rcu_preempt I14984 8 2 0x80000000 [ 382.482273] Call Trace: [ 382.482280] ? __schedule+0x33c/0x6f0 [ 382.482283] ? trace_event_raw_event_timer_start+0x6d/0xb0 [ 382.482285] schedule+0x37/0x90 [ 382.482287] schedule_timeout+0x159/0x470 [ 382.482289] ? __next_timer_interrupt+0xc0/0xc0 [ 382.482292] rcu_gp_kthread+0x9ee/0x1090 [ 382.482297] kthread+0xf0/0x130 [ 382.482299] ? rcu_oom_notify+0xd0/0xd0 [ 382.482301] ? kthread_destroy_worker+0x40/0x40 [ 382.482303] ret_from_fork+0x1f/0x30 I don't have enough runs to be sure, but my current (sad) guess is that the error rate is unchanged. Thanx, Paul ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [tip:timers/urgent] timers: Unconditionally check deferrable base 2018-01-15 23:57 ` Paul E. McKenney @ 2018-01-16 10:27 ` Thomas Gleixner 0 siblings, 0 replies; 3+ messages in thread From: Thomas Gleixner @ 2018-01-16 10:27 UTC (permalink / raw) To: Paul E. McKenney Cc: bigeasy, fweisbec, linux-kernel, hpa, peterz, mingo, anna-maria, linux-tip-commits On Mon, 15 Jan 2018, Paul E. McKenney wrote: > On Sun, Jan 14, 2018 at 02:44:15PM -0800, Paul E. McKenney wrote: > > On Sun, Jan 14, 2018 at 02:31:07PM -0800, tip-bot for Thomas Gleixner wrote: > > > Commit-ID: ed4bbf7910b28ce3c691aef28d245585eaabda06 > > > Gitweb: https://git.kernel.org/tip/ed4bbf7910b28ce3c691aef28d245585eaabda06 > > > Author: Thomas Gleixner <tglx@linutronix.de> > > > AuthorDate: Sun, 14 Jan 2018 23:19:49 +0100 > > > Committer: Thomas Gleixner <tglx@linutronix.de> > > > CommitDate: Sun, 14 Jan 2018 23:25:33 +0100 > > > > > > timers: Unconditionally check deferrable base > > > > > > When the timer base is checked for expired timers then the deferrable base > > > must be checked as well. This was missed when making the deferrable base > > > independent of base::nohz_active. > > > > Very cool, thank you! I have fired up some of the dreaded TREE01 tests. > > And (tentatively) a change in behavior! That's the least thing I expected. None of this stuff should use deferrable timers. But what do I know. > In one case, I did not see stall warnings (at least not until the ftrace > dump at test end), but just the occasional "Waylayed timer" message: > > [ 1316.628105] schedule_timeout: Waylayed timer base->clk: 0x1000eff23 jiffies: 0x1000f8335 base->next_expiry: 0x1000eff23 timer->flags: 0x3c800006 timer->expires 0x1000f62ca idx: f2 idx_now: 139 base->pending_map 000000080000000000000000000000000000000000000000100400002000000000000000000000000000010000000000000000000000000000000000000000000000000000000000 > [ 1316.638372] Torture onoff task state: > [ 1316.639774] rcu_preempt I14984 8 2 0x80000000 > [ 1316.641766] Call Trace: > [ 1316.642757] ? __schedule+0x33c/0x6f0 > [ 1316.644109] ? trace_event_raw_event_timer_start+0x6d/0xb0 > [ 1316.646042] schedule+0x37/0x90 > [ 1316.647219] schedule_timeout+0x159/0x470 > [ 1316.648825] ? __next_timer_interrupt+0xc0/0xc0 > [ 1316.650459] rcu_gp_kthread+0x9ee/0x1090 > [ 1316.652147] kthread+0xf0/0x130 > [ 1316.653415] ? rcu_oom_notify+0xd0/0xd0 > [ 1316.654790] ? kthread_destroy_worker+0x40/0x40 > [ 1316.656470] ret_from_fork+0x1f/0x30 > > If I am reading the waylay message correctly, the timer was delayed by > about 8.3 seconds instead of the 21 seconds required to cause an RCU > CPU stall warning. I am going to rearrange the ftrace-dump code a bit > and run more tests. Would be interesting to stop the trace right when you detect the waylay. > [ 382.482303] ret_from_fork+0x1f/0x30 > > I don't have enough runs to be sure, but my current (sad) guess is that > the error rate is unchanged. That patch really should not make any difference. We are wading through traces and are still puzzled.... Thanks, tglx ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2018-01-16 10:28 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <tip-ed4bbf7910b28ce3c691aef28d245585eaabda06@git.kernel.org>
2018-01-14 22:44 ` [tip:timers/urgent] timers: Unconditionally check deferrable base Paul E. McKenney
2018-01-15 23:57 ` Paul E. McKenney
2018-01-16 10:27 ` Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox