From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
Anna-Maria Gleixner <anna-maria@linutronix.de>,
Sebastian Siewior <bigeasy@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [patch 0/4] timer/nohz: Fix timer/nohz woes
Date: Sat, 23 Dec 2017 17:21:20 -0800 [thread overview]
Message-ID: <20171224012120.GA4113@linux.vnet.ibm.com> (raw)
In-Reply-To: <20171222170907.GJ7829@linux.vnet.ibm.com>
On Fri, Dec 22, 2017 at 09:09:07AM -0800, Paul E. McKenney wrote:
> On Fri, Dec 22, 2017 at 03:51:11PM +0100, Thomas Gleixner wrote:
> > Paul was observing weird stalls which are hard to reproduce and decode. We
> > were finally able to reproduce and decode the wreckage on RT.
> >
> > The following series addresses the issues and hopefully nails the root
> > cause completely.
> >
> > Please review carefully and expose it to the dreaded rcu torture tests
> > which seem to be the only way to trigger it.
>
> Best Christmas present ever, thank you!!!
>
> Just started up three concurrent 10-hour runs of the infamous rcutorture
> TREE01 scenario, and will let you know how it goes!
Well, I messed up the first test and then reran it. Which had the benefit
of giving me a baseline. The rerun (with all four patches) produced
failures, so I ran it again with an additional patch of mine. I score
these tests by recording the time at first failure, or, if there is no
failure, the duration of the test. Summing the values gives the score.
And here are the scores, where 30 is a perfect score:
1. Baseline: 3.0+2.5+10=15.5
2. Four patches from Anna-Marie and Thomas: 10+2.7+1.7=14.4
3. Ditto plus the patch below: 10+4.3+10=24.3
Please note that these are nowhere near anything even resembling
statistical significance. However, they are encouraging. I will do
more runs, but also do shorter five-hour runs to increase the amount
of data per unit time. Please note also that my patch by itself never
did provide that great of an improvement, so there might be some sort
of combination effect going on here. Or maybe it is just luck, who knows?
Please note that I have not yet ported my diagnostic patches on top of
these, however, the stacks have the usual schedule_timeout() entries.
This is not too surprising from a software-engineering viewpoint:
Locating several bugs at a given point of time usually indicates that
there are more to be found. So in a sense we are lucky that the
same test triggers at least one of those additional bugs.
Thanx, Paul
------------------------------------------------------------------------
commit accb0edb85526a05b934eac49658d05ea0216fc4
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Thu Dec 7 13:18:44 2017 -0800
timers: Ensure that timer_base ->clk accounts for time offline
The timer_base ->must_forward_clk is set to indicate that the next timer
operation on that timer_base must check for passage of time. One instance
of time passage is when the timer wheel goes idle, and another is when
the corresponding CPU is offline. Note that it is not appropriate to set
->is_idle because that could result in IPIing an offline CPU. Therefore,
this commit instead sets ->must_forward_clk at CPU-offline time.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index ffebcf878fba..94cce780c574 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1875,6 +1875,7 @@ int timers_dead_cpu(unsigned int cpu)
BUG_ON(old_base->running_timer);
+ old_base->must_forward_clk = true;
for (i = 0; i < WHEEL_SIZE; i++)
migrate_timer_list(new_base, old_base->vectors + i);
next prev parent reply other threads:[~2017-12-24 1:21 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-22 14:51 [patch 0/4] timer/nohz: Fix timer/nohz woes Thomas Gleixner
2017-12-22 14:51 ` [patch 1/4] timer: Use deferrable base independent of base::nohz_active Thomas Gleixner
2017-12-25 16:24 ` Frederic Weisbecker
2017-12-29 22:45 ` [tip:timers/urgent] timers: " tip-bot for Anna-Maria Gleixner
2017-12-22 14:51 ` [patch 2/4] nohz: Prevent erroneous tick stop invocations Thomas Gleixner
2017-12-26 15:17 ` Frederic Weisbecker
2017-12-27 18:22 ` Thomas Gleixner
2017-12-27 18:24 ` Thomas Gleixner
2017-12-27 20:58 ` Thomas Gleixner
2017-12-29 16:12 ` Frederic Weisbecker
2017-12-29 22:46 ` [tip:timers/urgent] nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() tip-bot for Thomas Gleixner
2017-12-22 14:51 ` [patch 3/4] timer: Invoke timer_start_debug() where it makes sense Thomas Gleixner
2017-12-29 22:47 ` [tip:timers/urgent] timers: " tip-bot for Thomas Gleixner
2017-12-22 14:51 ` [patch 4/4] timerqueue: Document return values of timerqueue_add/del() Thomas Gleixner
2017-12-29 22:47 ` [tip:timers/urgent] " tip-bot for Thomas Gleixner
2017-12-22 17:09 ` [patch 0/4] timer/nohz: Fix timer/nohz woes Paul E. McKenney
2017-12-24 1:21 ` Paul E. McKenney [this message]
2017-12-24 1:29 ` Paul E. McKenney
2018-01-05 19:41 ` Paul E. McKenney
2018-01-06 21:18 ` Thomas Gleixner
2018-01-06 23:21 ` Paul E. McKenney
2017-12-27 20:55 ` Thomas Gleixner
2017-12-29 22:46 ` [tip:timers/urgent] timers: Reinitialize per cpu bases on hotplug tip-bot for Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171224012120.GA4113@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=anna-maria@linutronix.de \
--cc=bigeasy@linutronix.de \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).