From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Josh Triplett <josh@joshtriplett.org>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	niv@us.ibm.com, tglx@linutronix.de, peterz@infradead.org,
	rostedt@goodmis.org, Valdis.Kletnieks@vt.edu,
	dhowells@redhat.com, eric.dumazet@gmail.com, darren@dvhart.com,
	patches@linaro.org
Subject: Re: [PATCH RFC tip/core/rcu 05/28] lockdep: Update documentation for lock-class leak detection
Date: Thu, 3 Nov 2011 12:42:26 -0700
Message-ID: <20111103194226.GG2287@linux.vnet.ibm.com>
In-Reply-To: <20111103025716.GA2042@leaf>

On Wed, Nov 02, 2011 at 07:57:16PM -0700, Josh Triplett wrote:
> On Wed, Nov 02, 2011 at 01:30:26PM -0700, Paul E. McKenney wrote:
> > There are a number of bugs that can leak or overuse lock classes,
> > which can cause the maximum number of lock classes (currently 8191)
> > to be exceeded.  However, the documentation does not tell you how to
> > track down these problems.  This commit addresses this shortcoming.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  Documentation/lockdep-design.txt |   61 ++++++++++++++++++++++++++++++++++++++
> >  1 files changed, 61 insertions(+), 0 deletions(-)
> > 
> > diff --git a/Documentation/lockdep-design.txt b/Documentation/lockdep-design.txt
> > index abf768c..383bb23 100644
> > --- a/Documentation/lockdep-design.txt
> > +++ b/Documentation/lockdep-design.txt
> > @@ -221,3 +221,64 @@ when the chain is validated for the first time, is then put into a hash
> >  table, which hash-table can be checked in a lockfree manner. If the
> >  locking chain occurs again later on, the hash table tells us that we
> >  dont have to validate the chain again.
> > +
> > +Troubleshooting:
> > +----------------
> > +
> > +The validator tracks a maximum of MAX_LOCKDEP_KEYS number of lock classes.
> > +Exceeding this number will trigger the following lockdep warning:
> > +
> > +	(DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
> > +
> > +By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical
> > +desktop systems have less than 1,000 lock classes, so this warning
> > +normally results from lock-class leakage or failure to properly
> > +initialize locks.  These two problems are illustrated below:
> > +
> > +1.	Repeated module loading and unloading while running the validator
> > +	will result in lock-class leakage.  The issue here is that each
> > +	load of the module will create a new set of lock classes for that
> > +	module's locks, but module unloading does not remove old classes.
> 
> I'd explicitly add a parenthetical here: (see below about reusing lock
> classes for why).  I stared at this for a minute trying to think about
> why the old classes couldn't go away, before realizing this fell into
> the case you described below: removing them would require cleaning up
> any dependency chains involving them.

Done!

> > +	Therefore, if that module is loaded and unloaded repeatedly,
> > +	the number of lock classes will eventually reach the maximum.
> > +
> > +2.	Using structures such as arrays that have large numbers of
> > +	locks that are not explicitly initialized.  For example,
> > +	a hash table with 8192 buckets where each bucket has its
> > +	own spinlock_t will consume 8192 lock classes -unless- each
> > +	spinlock is initialized, for example, using spin_lock_init().
> > +	Failure to properly initialize the per-bucket spinlocks would
> > +	guarantee lock-class overflow.	In contrast, a loop that called
> > +	spin_lock_init() on each lock would place all 8192 locks into a
> > +	single lock class.
> > +
> > +	The moral of this story is that you should always explicitly
> > +	initialize your locks.
> 
> Spin locks *require* initialization, right?  Doesn't this constitute a
> bug regardless of lockdep?
> 
> If so, could we simply arrange to have lockdep scream when it encounters
> an uninitialized spinlock?

I reworded to distinguish between compile-time initialization (which will
cause lockdep to have a separate class per instance) and run-time
initialization (which will cause lockdep to have one class total).

Making lockdep scream in this case might be useful, but if I understand
correctly, that would give false positives for compile-time initialized
global locks.

> > +One might argue that the validator should be modified to allow lock
> > +classes to be reused.  However, if you are tempted to make this argument,
> > +first review the code and think through the changes that would be
> > +required, keeping in mind that the lock classes to be removed are likely
> > +to be linked into the lock-dependency graph.  This turns out to be a
> > +harder to do than to say.
> 
> Typo fix: s/to be a harder/to be harder/.

Fixed.

> > +Of course, if you do run out of lock classes, the next thing to do is
> > +to find the offending lock classes.  First, the following command gives
> > +you the number of lock classes currently in use along with the maximum:
> > +
> > +	grep "lock-classes" /proc/lockdep_stats
> > +
> > +This command produces the following output on a modest Power system:
> > +
> > +	 lock-classes:                          748 [max: 8191]
> 
> Does Power matter here?  Could this just say "a modest system"?

Good point -- true but irrelevant.  Removed "Power".
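
A rough sketch, not part of the patch, of how the two numbers on that
line could be pulled out for periodic logging.  The function name
lock_class_usage is made up for illustration, and the parsing assumes
the "lock-classes: N [max: M]" layout shown in the quoted output above.

```shell
# lock_class_usage [FILE]
# Print "<in-use> <max>" taken from the lock-classes line of FILE.
# FILE defaults to /proc/lockdep_stats.  The function name is
# hypothetical, and the field layout is assumed to match the
# "lock-classes: 748 [max: 8191]" example quoted above.
lock_class_usage() {
	awk '/lock-classes:/ {
		max = $NF              # last field, e.g. "8191]"
		sub(/\]$/, "", max)    # drop the trailing bracket
		print $2, max          # second field is the in-use count
	}' "${1:-/proc/lockdep_stats}"
}
```

Calling this at intervals and watching whether the first number keeps
climbing is one way to spot the continual increase that the next
paragraph of the patch describes.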

> > +If the number allocated (748 above) increases continually over time,
> > +then there is likely a leak.  The following command can be used to
> > +identify the leaking lock classes:
> > +
> > +	grep "BD" /proc/lockdep
> > +
> > +Run the command and save the output, then compare against the output
> > +from a later run of this command to identify the leakers.  This same
> > +output can also help you find situations where lock initialization
> > +has been omitted.
> 
> You might consider giving an example of what a lack of lock
> initialization would look like here.

Hopefully the compile-time vs. run-time distinction clears this up.
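
For the "save the output, then compare" step, something along the
following lines might do.  This is only a sketch: the function name
new_lock_classes is made up, and it assumes that the first
whitespace-separated field of each "BD" line in /proc/lockdep
identifies the lock class, so that a first field seen only in the
later snapshot marks a class created between the two runs.

```shell
# new_lock_classes OLD NEW
# Print the "BD" lines of snapshot NEW whose first field does not
# appear on any "BD" line of snapshot OLD.  Hypothetical helper; it
# assumes the first field of each such line identifies the class.
new_lock_classes() {
	awk '
	FILENAME == ARGV[1] {
		if (/BD/)
			seen[$1] = 1   # remember classes in the old snapshot
		next
	}
	/BD/ && !($1 in seen)          # class only in the new snapshot
	' "$1" "$2"
}
```

The snapshots themselves would come from the command in the patch:

	grep "BD" /proc/lockdep > /tmp/lockdep.before
	(run the suspect workload)
	grep "BD" /proc/lockdep > /tmp/lockdep.after
	new_lock_classes /tmp/lockdep.before /tmp/lockdep.after

Comparing on the first field rather than on whole lines avoids noise
from dependency counts that change between runs.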

								Thanx, Paul