From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Paul E. McKenney"
Subject: Re: rcu self-detected stall messages on OMAP3, 4 boards
Date: Sat, 22 Sep 2012 09:00:31 -0700
Message-ID: <20120922160031.GC2934@linux.vnet.ibm.com>
References: <20120920220130.GN2449@linux.vnet.ibm.com>
 <20120920232114.GO2449@linux.vnet.ibm.com>
 <20120921185827.GC2454@linux.vnet.ibm.com>
 <20120921195717.GD2454@linux.vnet.ibm.com>
 <20120921203149.GI28835@atomide.com>
 <20120921220302.GF2454@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
To: Frederic Weisbecker
Cc: Tony Lindgren, Paul Walmsley, "Bruce, Becky", "Paul E. McKenney",
 "Hilman, Kevin", "Shilimkar, Santosh", "Hunter, Jon"
List-Id: linux-omap@vger.kernel.org

On Sat, Sep 22, 2012 at 05:45:12PM +0200, Frederic Weisbecker wrote:
> 2012/9/22 Paul E. McKenney :
> > On Fri, Sep 21, 2012 at 01:31:49PM -0700, Tony Lindgren wrote:
> >> * Paul E. McKenney [120921 12:58]:
> >> >
> >> > Just to make sure I understand the combinations:
> >> >
> >> > o	All stalls have happened when running a minimal userspace.
> >> > o	CONFIG_NO_HZ=n suppresses the stalls.
> >> > o	CONFIG_RCU_FAST_NO_HZ (which depends on CONFIG_NO_HZ=y) has
> >> >	no observable effect on the stalls.
> >>
> >> The reason why you may need minimal userspace is to cut down
> >> the number of timers waking up the system with NO_HZ.
> >> Booting with init=/bin/sh might also do the trick for that.
> >
> > Good point!  This does make for a very quiet system, but does not
> > reproduce the problem under kvm, even after waiting for four minutes.
> > I will leave it for more time, but it looks like I really might need to
> > ask Linaro for remote access to a Panda.
>
> I have one. I'm currently installing Ubuntu on it and I'll try to
> manage to build
> a kernel and reproduce the issue.
>
> I'll give more news soon.

Thank you!  My bet is that you have to have a userspace that is so small
that it registers only a few (but at least one!) RCU callbacks at boot
time, then never registers any callbacks ever again.

I have coded up a crude test case, using Tony Lindgren's suggestion of
"init=/bin/sh", but I appear to have inadvertently fixed this bug in
current -rcu (git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git,
branch rcu/next).  But I have been wrong a few times already on this
particular bug...

							Thanx, Paul