Date: Thu, 18 Feb 2016 20:22:53 -0800
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Mathieu Desnoyers
Cc: Ross Green, Thomas Gleixner, John Stultz, Peter Zijlstra,
	linux-kernel@vger.kernel.org, Ingo Molnar, Lai Jiangshan,
	dipankar@in.ibm.com, Andrew Morton, josh@joshtriplett.org, rostedt,
	David Howells, Eric Dumazet, dvhart@linux.intel.com,
	Frédéric Weisbecker, Oleg Nesterov, pranith kumar
Subject: Re: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17
Message-ID: <20160219042253.GJ6719@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <20160217054549.GB6719@linux.vnet.ibm.com>
 <20160217192817.GA21818@linux.vnet.ibm.com>
 <20160217194554.GO6357@twins.programming.kicks-ass.net>
 <20160217202829.GO6719@linux.vnet.ibm.com>
 <20160217231945.GA21140@linux.vnet.ibm.com>
 <1568248905.2264.1455837260992.JavaMail.zimbra@efficios.com>
In-Reply-To: <1568248905.2264.1455837260992.JavaMail.zimbra@efficios.com>

On Thu, Feb 18, 2016 at 11:14:21PM +0000, Mathieu Desnoyers wrote:
> ----- On Feb 18, 2016, at 6:51 AM, Ross Green rgkernel@gmail.com wrote:
> 
> > On Thu, Feb 18, 2016 at 10:19 AM, Paul E. McKenney wrote:
> >> On Wed, Feb 17, 2016 at 12:28:29PM -0800, Paul E. McKenney wrote:
> >>> On Wed, Feb 17, 2016 at 08:45:54PM +0100, Peter Zijlstra wrote:
> >>> > On Wed, Feb 17, 2016 at 11:28:17AM -0800, Paul E. McKenney wrote:
> >>> > > On Tue, Feb 16, 2016 at 09:45:49PM -0800, Paul E. McKenney wrote:
> >>> > > > On Tue, Feb 09, 2016 at 09:11:55PM +1100, Ross Green wrote:
> >>> > > > > Continued testing with the latest linux-4.5-rc3 release.
> >>> > > > >
> >>> > > > > Please find attached a copy of traces from dmesg:
> >>> > > > >
> >>> > > > > There is a lot more debug and trace data, so hopefully this will
> >>> > > > > shed some light on what might be happening here.
> >>> > > > >
> >>> > > > > My testing remains the same: run a series of simple benchmarks,
> >>> > > > > let them run to completion, and then leave the system to idle
> >>> > > > > away with just a few daemons running.
> >>> > > > >
> >>> > > > > The self-detected stalls in this instance turned up after a day's
> >>> > > > > run time.  There were NO heavy artificial computational loads on
> >>> > > > > the machine.
> >>> > > >
> >>> > > > It does indeed look quiet on that dmesg for a good long time.
> >>> > > >
> >>> > > > The following insanely crude not-for-mainline hack -might- be producing
> >>> > > > good results in my testing.  It will take some time before I can claim
> >>> > > > statistically different results.  But please feel free to give it a go
> >>> > > > in the meantime.  (Thanks to Al Viro for pointing me in this direction.)
> >>>
> >>> > Your case was special in that it was hotplug triggering it, right?
> >>>
> >>> Yes, it has thus far only shown up with CPU hotplug enabled.
> >>>
> >>> > I was auditing the hotplug paths involved when I fell ill two weeks ago,
> >>> > and have not really made any progress on that because of that :/
> >>>
> >>> I have always said that being sick is bad for one's health, but I didn't
> >>> realize that it could be bad for the kernel's health as well.  ;-)
> >>>
> >>> > I'll go have another look, I had a vague feeling for a race back then,
> >>> > let's see if I can still remember how...
> >>>
> >>> I believe that I can -finally- get an ftrace_dump() to happen within
> >>> 10-20 milliseconds of the problem, which just might be soon enough
> >>> after the problem to gather some useful information.  I am currently
> >>> testing this theory with "ftrace trace_event=sched_waking,sched_wakeup"
> >>> boot arguments on a two-hour run.
> >>
> >> And apparently another way to greatly reduce the probability of this
> >> bug occurring is to enable ftrace.  :-/
> >>
> >> Will try longer runs.
> >>
> >> 							Thanx, Paul
> >>
> >>> If this works out, what would be a useful set of trace events for me
> >>> to capture?
> >>>
> >>> 							Thanx, Paul
> >>
> >
> > Well, I managed to catch this one on linux-4.5-rc4.
> >
> > Took over 3 days and 7 hours to hit.
> >
> > Same test as before: boot, run a series of simple benchmarks, and then
> > let the machine just idle away.
> >
> > As before, the stall is reported, AND everything keeps on running as if
> > nothing had happened.
> >
> > I notice in the task dump that the swapper is running on both CPUs.
> >
> > Does that make any sense?
> > There is around 3% of memory actually used.
> >
> > Anyway, please find attached a copy of the dmesg output.
> >
> > Hope this helps a few people fill in the missing pieces here.
> 
> What seems weird here is that all code paths in the loop
> perform a WRITE_ONCE(rsp->gp_activity, jiffies), which
> implies progress in each case:
> 
> - rcu_gp_init() does it,
> - both branches in the QS forcing loop do it, either
>   through rcu_gp_fqs() or directly.
> 
> This means the thread really is stalled, and the backtrace
> shows those threads are stalled on the following call:
> 
>     ret = wait_event_interruptible_timeout(rsp->gp_wq,
>                     rcu_gp_fqs_check_wake(rsp, &gf), j);
> 
> Since this is a *_timeout wait, whose timeout is bounded by "j"
> jiffies, which in turn is bounded by HZ, we should really not stay
> there too long, even if we are not awakened by whatever is supposed
> to awaken us.

Completely agreed on this seeming weird.  ;-)  (A rough userspace sketch
of that bounded-wait pattern is appended at the end of this message.)

> So unless I'm missing something, it would look like
> schedule_timeout() is missing its timeout there.
> 
> Perhaps we only experience this missed timeout here
> because typically there is always a wakeup coming sooner
> or later on relatively busy systems.  This one is idle
> for quite a while.
> 
> Thoughts?

I can also make this happen (infrequently) on a busy system with
rcutorture, but only with frequent CPU hotplugging.  Ross is making it
happen with pure idle.

I did manage to make this fail with ftrace running, but thus far have
not been able to get a trace that actually includes any activity for
the grace-period kthread.  Working on tightening up the tests...

							Thanx, Paul

> Thanks,
> 
> Mathieu
> 
> > 
> > Regards,
> > 
> > Ross Green
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
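
A minimal userspace analogue of the bounded-wait pattern discussed above,
appended here for reference.  This is not the kernel code: the kernel's
wait_event_interruptible_timeout(), rcu_gp_fqs_check_wake(), rsp->gp_wq,
and rsp->gp_activity are approximated with a pthread condition variable,
a trivial wake-condition stub, and a plain timestamp, all of which are
illustrative stand-ins rather than anything from the RCU implementation.
The only point it demonstrates is the one Mathieu makes: because the wait
is bounded by a timeout, every pass through the loop should record fresh
activity even if no wakeup ever arrives, so a thread stuck at such a wait
suggests the timeout itself is being lost.

/*
 * Userspace analogue (not kernel code) of a bounded-timeout wait loop:
 * a "grace-period" thread waits on a condition variable with a one-second
 * timeout and records an activity timestamp on every pass.  Nothing ever
 * signals the condition, yet the loop still makes progress, because the
 * wait is bounded.  Build with:  cc -o gpwait gpwait.c -pthread
 * (the file name "gpwait.c" is just a placeholder).
 */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

static pthread_mutex_t gp_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gp_wq = PTHREAD_COND_INITIALIZER; /* stand-in for rsp->gp_wq */
static time_t gp_activity;                              /* stand-in for rsp->gp_activity */

/* Stand-in for rcu_gp_fqs_check_wake(): pretend nobody ever requests a wakeup. */
static int fqs_check_wake(void)
{
	return 0;
}

static void *gp_thread(void *arg)
{
	struct timespec deadline;
	int i;

	(void)arg;
	for (i = 0; i < 5; i++) {
		/* Analogue of WRITE_ONCE(rsp->gp_activity, jiffies). */
		gp_activity = time(NULL);

		/* Analogue of the "j" jiffies bound: wait at most one second. */
		clock_gettime(CLOCK_REALTIME, &deadline);
		deadline.tv_sec += 1;

		pthread_mutex_lock(&gp_lock);
		while (!fqs_check_wake()) {
			/* Bounded wait: returns ETIMEDOUT even with no wakeup. */
			if (pthread_cond_timedwait(&gp_wq, &gp_lock, &deadline) == ETIMEDOUT)
				break;
		}
		pthread_mutex_unlock(&gp_lock);

		printf("pass %d: activity recorded at %ld, resumed at %ld\n",
		       i, (long)gp_activity, (long)time(NULL));
	}
	return NULL;
}

int main(void)
{
	pthread_t tid;

	pthread_create(&tid, NULL, gp_thread, NULL);
	pthread_join(tid, NULL);
	return 0;
}

Each pass prints a resume time roughly one second after the recorded
activity time, which is the forward progress the grace-period kthread's
wait is expected to make as well, wakeup or not.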