public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH] lockdep: Print a nice description of an irq locking issue
@ 2011-03-29 17:14 Steven Rostedt
  2011-03-29 17:20 ` Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Steven Rostedt @ 2011-03-29 17:14 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Linus Torvalds,
	Peter Zijlstra

After having to explain lockdep interrupt locking inversions a few
times, I decided to have lockdep spit out the scenario that it is
complaining about.

-- Steve


The following patch is in:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git

    branch: tip/lockdep/devel


Steven Rostedt (1):
      lockdep: Print a nice description of an irq locking issue

----
 kernel/lockdep.c |   66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 66 insertions(+), 0 deletions(-)
---------------------------
commit 3429984fca737d0028c57d8d5c6a6b94ac3e90de
Author: Steven Rostedt <srostedt@redhat.com>
Date:   Tue Mar 29 12:55:14 2011 -0400

    lockdep: Print a nice description of an irq locking issue
    
    Locking order inversion due to interrupts is a subtle problem.
    When a locking inversion due to interrupts is discovered by lockdep,
    it currently reports something like this:
    
    [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
    
    And then writes the locks that are involved as well as back traces.
    But several developers are confused by what a HARDIRQ->safe to unsafe
    issue is all about, and sometimes even blow it off as a bug in lockdep.
    As it is not obvious when lockdep describes this about a lock that
    is never taken in interrupt context.
    
    After explaining the problems that lockdep is reporting, I decided
    to add a description of the problem in visual form. Now the following
    is shown:
    
     ---
    other info that might help us debug this:
    
     Possible interrupt unsafe locking scenario:
    
           CPU0                    CPU1
           ----                    ----
      lock(lockA);
                                   local_irq_disable();
                                   lock(&rq->lock);
                                   lock(lockA);
      <Interrupt>
        lock(&rq->lock)
    
     *** DEADLOCK ***
    
     ---
    
    The above is the case when the unsafe lock is taken while holding
    a lock taken in irq context. But when a lock is taken that also
    grabs a unsafe lock, the call chain is shown:
    
     ---
    other info that might help us debug this:
    
    Chain exists of:
      &rq->lock --> lockA --> lockC
    
     Possible interrupt unsafe locking scenario:
    
           CPU0                    CPU1
           ----                    ----
      lock(lockC);
                                   local_irq_disable();
                                   lock(&rq->lock);
                                   lock(lockA);
      <Interrupt>
        lock(&rq->lock)
    
     *** DEADLOCK ***
    
     ---
    
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 0d2058d..cc5fb5b 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -490,6 +490,18 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
 	usage[i] = '\0';
 }
 
+static int __print_lock_name(struct lock_class *class)
+{
+	char str[KSYM_NAME_LEN];
+	const char *name;
+
+	name = class->name;
+	if (!name) {
+		name = __get_key_name(class->key, str);
+	}
+	return printk("%s", name);
+}
+
 static void print_lock_name(struct lock_class *class)
 {
 	char str[KSYM_NAME_LEN], usage[LOCK_USAGE_CHARS];
@@ -1325,6 +1337,58 @@ print_shortest_lock_dependencies(struct lock_list *leaf,
 	return;
 }
 
+static void
+print_irq_lock_scenario(struct lock_list *backwards_entry,
+			struct lock_list *forwards_entry,
+			struct held_lock *next)
+{
+	struct lock_class *safe_class = backwards_entry->class;
+	struct lock_class *unsafe_class = forwards_entry->class;
+	struct lock_class *middle_class = hlock_class(next);
+
+	/*
+	 * A direct locking problem where unsafe_class lock is taken
+	 * directly by safe_class lock, then all we need to show
+	 * is the deadlock scenario, as it is obvious that the
+	 * unsafe lock is taken under the safe lock.
+	 *
+	 * But if there is a chain instead, where the safe lock takes
+	 * an intermediate lock (middle_class) where this lock is
+	 * not the same as the safe lock, then the lock chain is
+	 * used to describe the problem. Otherwise we would need
+	 * to show a different CPU case for each link in the chain
+	 * from the safe_class lock to the unsafe_class lock.
+	 */
+	if (middle_class != unsafe_class) {
+		printk("Chain exists of:\n  ");
+		__print_lock_name(safe_class);
+		printk(" --> ");
+		__print_lock_name(middle_class);
+		printk(" --> ");
+		__print_lock_name(unsafe_class);
+		printk("\n\n");
+	}
+
+	printk(" Possible interrupt unsafe locking scenario:\n\n");
+	printk("       CPU0                    CPU1\n");
+	printk("       ----                    ----\n");
+	printk("  lock(");
+	__print_lock_name(unsafe_class);
+	printk(");\n");
+	printk("                               local_irq_disable();\n");
+	printk("                               lock(");
+	__print_lock_name(safe_class);
+	printk(");\n");
+	printk("                               lock(");
+	__print_lock_name(middle_class);
+	printk(");\n");
+	printk("  <Interrupt>\n");
+	printk("    lock(");
+	__print_lock_name(safe_class);
+	printk(")\n");
+	printk("\n *** DEADLOCK ***\n\n");
+}
+
 static int
 print_bad_irq_dependency(struct task_struct *curr,
 			 struct lock_list *prev_root,
@@ -1376,6 +1440,8 @@ print_bad_irq_dependency(struct task_struct *curr,
 	print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
 
 	printk("\nother info that might help us debug this:\n\n");
+	print_irq_lock_scenario(backwards_entry, forwards_entry, next);
+
 	lockdep_print_held_locks(curr);
 
 	printk("\nthe dependencies between %s-irq-safe lock", irqclass);



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC][PATCH] lockdep: Print a nice description of an irq locking issue
  2011-03-29 17:14 [RFC][PATCH] lockdep: Print a nice description of an irq locking issue Steven Rostedt
@ 2011-03-29 17:20 ` Steven Rostedt
  2011-03-29 18:21 ` Ingo Molnar
  2011-03-29 19:53 ` Thomas Gleixner
  2 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2011-03-29 17:20 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Linus Torvalds,
	Peter Zijlstra

On Tue, 2011-03-29 at 13:14 -0400, Steven Rostedt wrote:
>    
>      ---
>     other info that might help us debug this:
>     
>     Chain exists of:
>       &rq->lock --> lockA --> lockC
>     
>      Possible interrupt unsafe locking scenario:
>     
>            CPU0                    CPU1
>            ----                    ----
>       lock(lockC);
>                                    local_irq_disable();
>                                    lock(&rq->lock);
>                                    lock(lockA);
>       <Interrupt>
>         lock(&rq->lock)
>     
>      *** DEADLOCK ***
>     

Note, the above output is real. To produce this output, I wrote a module
that created a "lockA", "lockB" and "lockC" and had the following:

        spin_lock_irq(&lockA);
        spin_lock(&lockB);
        spin_unlock(&lockB);
        spin_unlock_irq(&lockA);

        spin_lock_irq(&lockB);
        spin_lock(&lockC);
        spin_unlock(&lockC);
        spin_unlock_irq(&lockB);

        spin_lock(&lockC);
        spin_unlock(&lockC);
        ret = register_trace_sched_switch(probe_switch, NULL);

static void
probe_switch(void *ignore, struct task_struct *p, struct task_struct *n)
{
        unsigned long flags;

        spin_lock_irqsave(&lockA, flags);
        spin_unlock_irqrestore(&lockA, flags);
}


probe_switch is called via the trace_sched_switch() trace point that is
called with the rq lock held, producing the call chain that will trigger
lockdep to produce a dump.

-- Steve





^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC][PATCH] lockdep: Print a nice description of an irq locking issue
  2011-03-29 17:14 [RFC][PATCH] lockdep: Print a nice description of an irq locking issue Steven Rostedt
  2011-03-29 17:20 ` Steven Rostedt
@ 2011-03-29 18:21 ` Ingo Molnar
  2011-03-29 18:21   ` Ingo Molnar
  2011-03-29 19:53 ` Thomas Gleixner
  2 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2011-03-29 18:21 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Thomas Gleixner, Andrew Morton, Linus Torvalds,
	Peter Zijlstra


* Steven Rostedt <rostedt@goodmis.org> wrote:

> After having to explain lockdep interrupt locking inversions a few
> times, I decided to have lockdep spit out the scenario that it is
> complaining about.

Looks very useful!

Would you be interested in extending this to other lockdep splats as well? 

"What does this mean" is a very common reaction to lockdep splats on lkml, and 
it's kind of silly that lockdep does not explain things better (like your patch 
does) - it has already done the much harder job of *proving* that a locking 
sequence is problematic :-)

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC][PATCH] lockdep: Print a nice description of an irq locking issue
  2011-03-29 18:21 ` Ingo Molnar
@ 2011-03-29 18:21   ` Ingo Molnar
  2011-03-29 19:09     ` Steven Rostedt
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2011-03-29 18:21 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Thomas Gleixner, Andrew Morton, Linus Torvalds,
	Peter Zijlstra


* Ingo Molnar <mingo@elte.hu> wrote:

> * Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > After having to explain lockdep interrupt locking inversions a few
> > times, I decided to have lockdep spit out the scenario that it is
> > complaining about.
> 
> Looks very useful!
> 
> Would you be interested in extending this to other lockdep splats as well? 

In separate patches, of course.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC][PATCH] lockdep: Print a nice description of an irq locking issue
  2011-03-29 18:21   ` Ingo Molnar
@ 2011-03-29 19:09     ` Steven Rostedt
  0 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2011-03-29 19:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Thomas Gleixner, Andrew Morton, Linus Torvalds,
	Peter Zijlstra

On Tue, 2011-03-29 at 20:21 +0200, Ingo Molnar wrote:
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
> > * Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > After having to explain lockdep interrupt locking inversions a few
> > > times, I decided to have lockdep spit out the scenario that it is
> > > complaining about.
> > 
> > Looks very useful!
> > 
> > Would you be interested in extending this to other lockdep splats as well? 

Sure, that should not be a problem.

> 
> In separate patches, of course.

Of course ;)

-- Steve



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC][PATCH] lockdep: Print a nice description of an irq locking issue
  2011-03-29 17:14 [RFC][PATCH] lockdep: Print a nice description of an irq locking issue Steven Rostedt
  2011-03-29 17:20 ` Steven Rostedt
  2011-03-29 18:21 ` Ingo Molnar
@ 2011-03-29 19:53 ` Thomas Gleixner
  2 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2011-03-29 19:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Ingo Molnar, Andrew Morton, Linus Torvalds, Peter Zijlstra

On Tue, 29 Mar 2011, Steven Rostedt wrote:
>     other info that might help us debug this:
>     
>     Chain exists of:
>       &rq->lock --> lockA --> lockC
>     
>      Possible interrupt unsafe locking scenario:
>     
>            CPU0                    CPU1
>            ----                    ----
>       lock(lockC);
>                                    local_irq_disable();
>                                    lock(&rq->lock);
>                                    lock(lockA);
>       <Interrupt>
>         lock(&rq->lock)
>     
>      *** DEADLOCK ***

Cool stuff. I always have to twist my brain around those splats as
well :)

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-03-29 19:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-29 17:14 [RFC][PATCH] lockdep: Print a nice description of an irq locking issue Steven Rostedt
2011-03-29 17:20 ` Steven Rostedt
2011-03-29 18:21 ` Ingo Molnar
2011-03-29 18:21   ` Ingo Molnar
2011-03-29 19:09     ` Steven Rostedt
2011-03-29 19:53 ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox