From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759284AbXHEPFD (ORCPT ); Sun, 5 Aug 2007 11:05:03 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752621AbXHEPEy (ORCPT ); Sun, 5 Aug 2007 11:04:54 -0400 Received: from e1.ny.us.ibm.com ([32.97.182.141]:38428 "EHLO e1.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751250AbXHEPEx (ORCPT ); Sun, 5 Aug 2007 11:04:53 -0400 Date: Sun, 5 Aug 2007 08:04:49 -0700 From: "Paul E. McKenney" To: Steven Rostedt Cc: Ingo Molnar , Thomas Gleixner , RT , LKML Subject: Re: [BUG RT] - rcupreempt.c:133 on 2.6.23-rc1-rt7 Message-ID: <20070805150449.GA19418@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <1186289484.636.5.camel@localhost.localdomain> <1186290332.636.8.camel@localhost.localdomain> <20070805065948.GB515@elte.hu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.13 (2006-08-11) Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Aug 05, 2007 at 10:24:15AM -0400, Steven Rostedt wrote: > > -- > > On Sun, 5 Aug 2007, Ingo Molnar wrote: > > > > > * Steven Rostedt wrote: > > > > > > I don't have time to look further now, and it's something that isn't > > > > easily reproducible (Well, it happened once out of two boots). If > > > > you need me to look further, or need a config or dmesg (I have > > > > both), then just give me a holler. > > > > > > Silly me. FYI, I was running with !PREEMPT_RT, but with Hard and > > > Softirqs as threads. Must have copied the wrong config over :-/ > > > > it's still not supposed to happen ... rcu read lock nesting that deep? > > > > The code on line 133 is: > > WARN_ON_ONCE(current->rcu_read_lock_nesting > NR_CPUS); > > I have NR_CPUS set to 2 since the box I'm running this on only has > 2 cpus and I see no reason to waste more data structures. > > Is rcu read lock nesting deeper than 2? In networking, I would not be at all surprised, given things like fib_trie and netfilter usage. In addition, if rcu_read_lock() is called from hardirq or NMI/SMI, it is necessary to add the nesting levels in these environments as well. In any case, rcu_read_lock() is freely nestable, so there is no penalty for nesting pretty deeply. I must have missed this WARN_ON_ONCE() being added to rcu_read_lock() -- I did ack Daniel Walker's check for negative values of rcu_read_lock_nesting in rcu_read_unlock(), but saw no upper-limit checks. So, are you running into a situation where rcu_read_lock_nesting is growing unboundedly? I would not expect the per-task nesting level to normally be a function of the number of CPUs -- unless one was doing some sort of nested scan of RCU-protected per-CPU data structures or some such. So if you are adding this to your local build as a debug check, I would suggest a fixed limit -- but would -not- suggest putting such a check into a production build, at least not for a small limit. Thanx, Paul