Date: Thu, 18 Feb 2016 20:22:53 -0800
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Mathieu Desnoyers
Cc: Ross Green, Thomas Gleixner, John Stultz, Peter Zijlstra,
	linux-kernel@vger.kernel.org, Ingo Molnar, Lai Jiangshan,
	dipankar@in.ibm.com, Andrew Morton, josh@joshtriplett.org, rostedt,
	David Howells, Eric Dumazet, dvhart@linux.intel.com,
	Frédéric Weisbecker, Oleg Nesterov, pranith kumar
Subject: Re: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17
Message-ID: <20160219042253.GJ6719@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <20160217054549.GB6719@linux.vnet.ibm.com>
 <20160217192817.GA21818@linux.vnet.ibm.com>
 <20160217194554.GO6357@twins.programming.kicks-ass.net>
 <20160217202829.GO6719@linux.vnet.ibm.com>
 <20160217231945.GA21140@linux.vnet.ibm.com>
 <1568248905.2264.1455837260992.JavaMail.zimbra@efficios.com>
In-Reply-To: <1568248905.2264.1455837260992.JavaMail.zimbra@efficios.com>

On Thu, Feb 18, 2016 at 11:14:21PM +0000, Mathieu Desnoyers wrote:
> ----- On Feb 18, 2016, at 6:51 AM, Ross Green rgkernel@gmail.com wrote:
> 
> > On Thu, Feb 18, 2016 at 10:19 AM, Paul E. McKenney wrote:
> >> On Wed, Feb 17, 2016 at 12:28:29PM -0800, Paul E. McKenney wrote:
> >>> On Wed, Feb 17, 2016 at 08:45:54PM +0100, Peter Zijlstra wrote:
> >>> > On Wed, Feb 17, 2016 at 11:28:17AM -0800, Paul E. McKenney wrote:
> >>> > > On Tue, Feb 16, 2016 at 09:45:49PM -0800, Paul E. McKenney wrote:
> >>> > > > On Tue, Feb 09, 2016 at 09:11:55PM +1100, Ross Green wrote:
> >>> > > > > Continued testing with the latest linux-4.5-rc3 release.
> >>> > > > >
> >>> > > > > Please find attached a copy of traces from dmesg:
> >>> > > > >
> >>> > > > > There is a lot more debug and trace data, so hopefully this will
> >>> > > > > shed some light on what might be happening here.
> >>> > > > >
> >>> > > > > My testing remains the same: run a series of simple benchmarks,
> >>> > > > > let them run to completion, and then leave the system to idle
> >>> > > > > away with just a few daemons running.
> >>> > > > >
> >>> > > > > The self-detected stalls in this instance turned up after a day's
> >>> > > > > run time.  There were NO heavy artificial computational loads on
> >>> > > > > the machine.
> >>> > > >
> >>> > > > It does indeed look quiet on that dmesg for a good long time.
> >>> > > >
> >>> > > > The following insanely crude not-for-mainline hack -might- be producing
> >>> > > > good results in my testing.  It will take some time before I can claim
> >>> > > > statistically different results.  But please feel free to give it a go
> >>> > > > in the meantime.  (Thanks to Al Viro for pointing me in this direction.)
> >>>
> >>> > Your case was special in that it was hotplug triggering it, right?
> >>>
> >>> Yes, it has thus far only shown up with CPU hotplug enabled.
> >>>
> >>> > I was auditing the hotplug paths involved when I fell ill two weeks ago,
> >>> > and have not really made any progress on that because of that :/
> >>>
> >>> I have always said that being sick is bad for one's health, but I didn't
> >>> realize that it could be bad for the kernel's health as well.  ;-)
> >>>
> >>> > I'll go have another look, I had a vague feeling for a race back then,
> >>> > let's see if I can still remember how...
> >>>
> >>> I believe that I can -finally- get an ftrace_dump() to happen within
> >>> 10-20 milliseconds of the problem, which just might be soon enough
> >>> after the problem to gather some useful information.  I am currently
> >>> testing this theory with "ftrace trace_event=sched_waking,sched_wakeup"
> >>> boot arguments on a two-hour run.
> >>
> >> And apparently another way to greatly reduce the probability of this
> >> bug occurring is to enable ftrace.  :-/
> >>
> >> Will try longer runs.
> >>
> >> 							Thanx, Paul
> >>
> >>> If this works out, what would be a useful set of trace events for me
> >>> to capture?
> >>>
> >>> 							Thanx, Paul
> >>
> >
> > Well, I managed to catch this one on linux-4.5-rc4.
> >
> > Took over 3 days and 7 hours to hit.
> >
> > Same test as before: boot, run a series of simple benchmarks, and then
> > let the machine just idle away.
> >
> > As before, the stall is reported, AND everything keeps on running as if
> > nothing had happened.
> >
> > I notice in the task dump that the swapper is running on both CPUs.
> >
> > Does that make any sense?
> > There is around 3% of memory actually used.
> >
> > Anyway, please find attached a copy of the dmesg output.
> >
> > Hope this helps a few people fill in the missing pieces here.
> 
> What seems weird here is that all code paths in the loop
> perform a WRITE_ONCE(rsp->gp_activity, jiffies), which
> implies progress in each case:
> 
> - rcu_gp_init() does it,
> - both branches in the QS forcing loop do it, either
>   through rcu_gp_fqs() or directly.
> 
> This means the thread really is stalled, and the backtrace
> shows those threads are stalled on the following call:
> 
>     ret = wait_event_interruptible_timeout(rsp->gp_wq,
>                     rcu_gp_fqs_check_wake(rsp, &gf), j);
> 
> Since this is a *_timeout wait, whose timeout is bounded by "j"
> jiffies, which in turn is bounded by HZ, we should really not stay
> there too long, even if we are not awakened by whatever is supposed
> to awaken us.

Completely agreed on this seeming weird.  ;-)  (A rough userspace sketch
of that bounded-wait pattern is appended at the end of this message.)

> So unless I'm missing something, it would look like
> schedule_timeout() is missing its timeout there.
> 
> Perhaps we only experience this missed timeout here
> because typically there is always a wakeup coming sooner
> or later on relatively busy systems.  This one is idle
> for quite a while.
> 
> Thoughts?

I can also make this happen (infrequently) on a busy system with
rcutorture, but only with frequent CPU hotplugging.  Ross is making it
happen with pure idle.

I did manage to make this fail with ftrace running, but thus far have
not been able to get a trace that actually includes any activity for
the grace-period kthread.  Working on tightening up the tests...

							Thanx, Paul

> Thanks,
> 
> Mathieu
> 
> > 
> > Regards,
> > 
> > Ross Green
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
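
A minimal userspace analogue of the bounded-wait pattern discussed above,
appended here for reference.  This is not the kernel code: the kernel's
wait_event_interruptible_timeout(), rcu_gp_fqs_check_wake(), rsp->gp_wq,
and rsp->gp_activity are approximated with a pthread condition variable,
a trivial wake-condition stub, and a plain timestamp, all of which are
illustrative stand-ins rather than anything from the RCU implementation.
The only point it demonstrates is the one Mathieu makes: because the wait
is bounded by a timeout, every pass through the loop should record fresh
activity even if no wakeup ever arrives, so a thread stuck at such a wait
suggests the timeout itself is being lost.

/*
 * Userspace analogue (not kernel code) of a bounded-timeout wait loop:
 * a "grace-period" thread waits on a condition variable with a one-second
 * timeout and records an activity timestamp on every pass.  Nothing ever
 * signals the condition, yet the loop still makes progress, because the
 * wait is bounded.  Build with:  cc -o gpwait gpwait.c -pthread
 * (the file name "gpwait.c" is just a placeholder).
 */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

static pthread_mutex_t gp_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gp_wq = PTHREAD_COND_INITIALIZER; /* stand-in for rsp->gp_wq */
static time_t gp_activity;                              /* stand-in for rsp->gp_activity */

/* Stand-in for rcu_gp_fqs_check_wake(): pretend nobody ever requests a wakeup. */
static int fqs_check_wake(void)
{
	return 0;
}

static void *gp_thread(void *arg)
{
	struct timespec deadline;
	int i;

	(void)arg;
	for (i = 0; i < 5; i++) {
		/* Analogue of WRITE_ONCE(rsp->gp_activity, jiffies). */
		gp_activity = time(NULL);

		/* Analogue of the "j" jiffies bound: wait at most one second. */
		clock_gettime(CLOCK_REALTIME, &deadline);
		deadline.tv_sec += 1;

		pthread_mutex_lock(&gp_lock);
		while (!fqs_check_wake()) {
			/* Bounded wait: returns ETIMEDOUT even with no wakeup. */
			if (pthread_cond_timedwait(&gp_wq, &gp_lock, &deadline) == ETIMEDOUT)
				break;
		}
		pthread_mutex_unlock(&gp_lock);

		printf("pass %d: activity recorded at %ld, resumed at %ld\n",
		       i, (long)gp_activity, (long)time(NULL));
	}
	return NULL;
}

int main(void)
{
	pthread_t tid;

	pthread_create(&tid, NULL, gp_thread, NULL);
	pthread_join(tid, NULL);
	return 0;
}

Each pass prints a resume time roughly one second after the recorded
activity time, which is the forward progress the grace-period kthread's
wait is expected to make as well, wakeup or not.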