* [PATCH] lockdep: make print_lock_name() robust against non-existing lock_class
@ 2015-04-15 13:24 Sebastian Andrzej Siewior
  2015-04-15 14:14 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-04-15 13:24 UTC
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Sebastian Andrzej Siewior

During sysrq's show-held-locks command it is possible that hlock_class()
returns NULL for a given lock. The result is then (after the warning):

|BUG: unable to handle kernel NULL pointer dereference at 0000001c
|IP: [<c1088145>] get_usage_chars+0x5/0x100
|Call Trace:
| [<c1088263>] print_lock_name+0x23/0x60
| [<c1576b57>] print_lock+0x5d/0x7e
| [<c1088314>] lockdep_print_held_locks+0x74/0xe0
| [<c1088652>] debug_show_all_locks+0x132/0x1b0
| [<c1315c48>] sysrq_handle_showlocks+0x8/0x10

This *might* happen because the thread on the other CPU drops the lock
after we have looked at ->lockdep_depth, so ->held_locks no longer
points to a lock that is held.
The fix here is to simply ignore it and continue.

Reported-by: Andreas Messerschmid <andreas@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/locking/lockdep.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index ba77ab5f64dd..260155a2cb89 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -530,6 +530,10 @@ static void print_lock_name(struct lock_class *class)
 {
 	char usage[LOCK_USAGE_CHARS];
 
+	if (!class) {
+		printk(" (<NONE>)");
+		return;
+	}
 	get_usage_chars(class, usage);
 
 	printk(" (");
-- 
2.1.4



* Re: [PATCH] lockdep: make print_lock_name() robust against non-existing lock_class
  2015-04-15 13:24 [PATCH] lockdep: make print_lock_name() robust against non-existing lock_class Sebastian Andrzej Siewior
@ 2015-04-15 14:14 ` Peter Zijlstra
  2015-04-16 14:50   ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2015-04-15 14:14 UTC
  To: Sebastian Andrzej Siewior; +Cc: linux-kernel, Ingo Molnar

On Wed, Apr 15, 2015 at 03:24:36PM +0200, Sebastian Andrzej Siewior wrote:
> During sysrq's show-held-locks command it is possible that hlock_class()
> returns NULL for a given lock. The result is then (after the warning):
> 
> |BUG: unable to handle kernel NULL pointer dereference at 0000001c
> |IP: [<c1088145>] get_usage_chars+0x5/0x100
> |Call Trace:
> | [<c1088263>] print_lock_name+0x23/0x60
> | [<c1576b57>] print_lock+0x5d/0x7e
> | [<c1088314>] lockdep_print_held_locks+0x74/0xe0
> | [<c1088652>] debug_show_all_locks+0x132/0x1b0
> | [<c1315c48>] sysrq_handle_showlocks+0x8/0x10
> 
> This *might* happen because the thread on the other CPU drops the lock
> after we have looked at ->lockdep_depth, so ->held_locks no longer
> points to a lock that is held.
> The fix here is to simply ignore it and continue.

Hmm, but in that case we might equally run into the hlock_class() debug
check which would kill all of lockdep.

Note that lock_release_nested() with CONFIG_DEBUG_LOCKDEP will actually
clear those fields.

Would something like the below work for you?

---
 kernel/locking/lockdep.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index ba77ab5f64dd..0ef89f830ff4 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -551,7 +551,18 @@ static void print_lockdep_cache(struct lockdep_map *lock)
 
 static void print_lock(struct held_lock *hlock)
 {
-	print_lock_name(hlock_class(hlock));
+	/*
+	 * We can be called locklessly through debug_show_all_locks() so be
+	 * extra careful, the hlock might have been released and cleared.
+	 */
+	unsigned int class_idx = READ_ONCE(hlock->class_idx);
+
+	if (!class_idx || (class_idx - 1) >= MAX_LOCKDEP_KEYS) {
+		printk("<RELEASED>\n");
+		return;
+	}
+
+	print_lock_name(lock_classes + class_idx - 1);
 	printk(", at: ");
 	print_ip_sym(hlock->acquire_ip);
 }
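
The range check in the proposed print_lock() can be tried out in user
space. The following is only a minimal sketch, assuming a 1-based
class_idx in which 0 means "cleared"; demo_classes, DEMO_MAX_KEYS and
demo_lookup() are hypothetical stand-ins for lock_classes,
MAX_LOCKDEP_KEYS and the lookup done in the patch, not kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-in for MAX_LOCKDEP_KEYS, kept small for the demo. */
#define DEMO_MAX_KEYS 4

/* Hypothetical stand-in for struct lock_class. */
struct demo_class {
	const char *name;
};

/* Hypothetical stand-in for the lock_classes[] table. */
static struct demo_class demo_classes[DEMO_MAX_KEYS] = {
	{ "class-a" }, { "class-b" }, { "class-c" }, { "class-d" },
};

/*
 * class_idx is 1-based: 0 means the entry was released and cleared.
 * Returns the matching class, or NULL if the index is 0 or out of range.
 */
static const struct demo_class *demo_lookup(unsigned int class_idx)
{
	if (!class_idx || (class_idx - 1) >= DEMO_MAX_KEYS)
		return NULL;

	return demo_classes + class_idx - 1;
}

/* Name to print for an index; "<RELEASED>" mirrors the patch's output. */
static const char *demo_lock_name(unsigned int class_idx)
{
	const struct demo_class *class = demo_lookup(class_idx);

	return class ? class->name : "<RELEASED>";
}
```

With this sketch, demo_lock_name(0) and demo_lock_name(DEMO_MAX_KEYS + 1)
both yield "<RELEASED>", while any index from 1 to DEMO_MAX_KEYS maps to
a table entry.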


* Re: [PATCH] lockdep: make print_lock_name() robust against non-existing lock_class
  2015-04-15 14:14 ` Peter Zijlstra
@ 2015-04-16 14:50   ` Sebastian Andrzej Siewior
  2015-04-16 15:35     ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-04-16 14:50 UTC
  To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, Andreas Messerschmid

On 04/15/2015 04:14 PM, Peter Zijlstra wrote:
> On Wed, Apr 15, 2015 at 03:24:36PM +0200, Sebastian Andrzej Siewior wrote:
>> During sysrq's show-held-locks command it is possible that hlock_class()
>> returns NULL for a given lock. The result is then (after the warning):
>>
>> |BUG: unable to handle kernel NULL pointer dereference at 0000001c
>> |IP: [<c1088145>] get_usage_chars+0x5/0x100
>> |Call Trace:
>> | [<c1088263>] print_lock_name+0x23/0x60
>> | [<c1576b57>] print_lock+0x5d/0x7e
>> | [<c1088314>] lockdep_print_held_locks+0x74/0xe0
>> | [<c1088652>] debug_show_all_locks+0x132/0x1b0
>> | [<c1315c48>] sysrq_handle_showlocks+0x8/0x10
>>
>> This *might* happen because the thread on the other CPU drops the lock
>> after we have looked at ->lockdep_depth, so ->held_locks no longer
>> points to a lock that is held.
>> The fix here is to simply ignore it and continue.
> 
> Hmm, but in that case we might equally run into the hlock_class() debug
> check which would kill all of lockdep.
> 
> Note that lock_release_nested() with CONFIG_DEBUG_LOCKDEP will actually
> clear those fields.
> 
> Would something like the below work for you?

Andreas confirmed that it works for him on v3.18 with minor adjustment.

<---
+       struct held_lock lock = READ_ONCE(*hlock);
+       unsigned int class_idx = lock.class_idx;
--->

So, yes, thanks.

> ---
>  kernel/locking/lockdep.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index ba77ab5f64dd..0ef89f830ff4 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -551,7 +551,18 @@ static void print_lockdep_cache(struct lockdep_map *lock)
>  
>  static void print_lock(struct held_lock *hlock)
>  {
> -	print_lock_name(hlock_class(hlock));
> +	/*
> +	 * We can be called locklessly through debug_show_all_locks() so be
> +	 * extra careful, the hlock might have been released and cleared.
> +	 */
> +	unsigned int class_idx = READ_ONCE(hlock->class_idx);
> +
> +	if (!class_idx || (class_idx - 1) >= MAX_LOCKDEP_KEYS) {
> +		printk("<RELEASED>\n");
> +		return;
> +	}
> +
> +	print_lock_name(lock_classes + class_idx - 1);
>  	printk(", at: ");
>  	print_ip_sym(hlock->acquire_ip);
>  }
> 
Sebastian


* Re: [PATCH] lockdep: make print_lock_name() robust against non-existing lock_class
  2015-04-16 14:50   ` Sebastian Andrzej Siewior
@ 2015-04-16 15:35     ` Peter Zijlstra
  2015-04-16 15:39       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2015-04-16 15:35 UTC
  To: Sebastian Andrzej Siewior; +Cc: linux-kernel, Ingo Molnar, Andreas Messerschmid

On Thu, Apr 16, 2015 at 04:50:21PM +0200, Sebastian Andrzej Siewior wrote:

> Andreas confirmed that it works for him on v3.18 with minor adjustment.
> 
> <---
> +       struct held_lock lock = READ_ONCE(*hlock);
> +       unsigned int class_idx = lock.class_idx;
> --->
> 

I'm confused by the need for that. What was the failure with the
proposed patch?


* Re: [PATCH] lockdep: make print_lock_name() robust against non-existing lock_class
  2015-04-16 15:35     ` Peter Zijlstra
@ 2015-04-16 15:39       ` Sebastian Andrzej Siewior
  2015-04-16 15:50         ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-04-16 15:39 UTC
  To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, Andreas Messerschmid

* Peter Zijlstra | 2015-04-16 17:35:03 [+0200]:

>On Thu, Apr 16, 2015 at 04:50:21PM +0200, Sebastian Andrzej Siewior wrote:
>
>> Andreas confirmed that it works for him on v3.18 with minor adjustment.
>> 
>> <---
>> +       struct held_lock lock = READ_ONCE(*hlock);
>> +       unsigned int class_idx = lock.class_idx;
>> --->
>> 
>
>I'm confused by the need for that. What was the failure with the
>proposed patch?

It was tested on v3.18, there might have been a change between v3.18 &
4.0. The patch as-is did not compile:

 In file included from arch/x86/include/asm/current.h:4:0,
                  from include/linux/mutex.h:13,
                  from kernel/locking/lockdep.c:29:
 kernel/locking/lockdep.c: In function ‘print_lock’:
 kernel/locking/lockdep.c:558:37: error: ‘typeof’ applied to a bit-field
   unsigned int class_idx = READ_ONCE(hlock->class_idx);
                                      ^
 include/linux/compiler.h:262:20: note: in definition of macro ‘READ_ONCE’
   ({ union { typeof(x) __val; char __c[1]; } __u; __read_once_size(&(x), __u.__c, sizeof(x)); __u.__val; })
                     ^
 include/linux/compiler.h:262:11: error: cannot take address of bit-field ‘class_idx’
   ({ union { typeof(x) __val; char __c[1]; } __u; __read_once_size(&(x), __u.__c, sizeof(x)); __u.__val; })
            ^
 kernel/locking/lockdep.c:558:27: note: in expansion of macro ‘READ_ONCE’ 
   unsigned int class_idx = READ_ONCE(hlock->class_idx);
                            ^
 include/linux/compiler.h:262:88: error: ‘sizeof’ applied to a bit-field
   ({ union { typeof(x) __val; char __c[1]; } __u; __read_once_size(&(x), __u.__c, sizeof(x)); __u.__val; })
                                                                                         ^
 kernel/locking/lockdep.c:558:27: note: in expansion of macro ‘READ_ONCE’ 
   unsigned int class_idx = READ_ONCE(hlock->class_idx);
                            ^
 scripts/Makefile.build:258: recipe for target 'kernel/locking/lockdep.o' failed 
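
All three errors come from the same limitation: a bit-field has no
address, and neither typeof nor sizeof can be applied to it, so a macro
built on those operators cannot take hlock->class_idx as its argument.
Below is a rough user-space sketch of the whole-structure copy used in
the adjustment. struct demo_hlock, demo_barrier() and demo_snapshot()
are hypothetical names, the field widths are assumptions, and the
barrier uses the GCC/Clang inline-asm spelling.

```c
#include <string.h>

/* Compiler barrier as GCC/Clang spell it; an assumption of this sketch. */
#define demo_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical, trimmed-down stand-in for struct held_lock. */
struct demo_hlock {
	unsigned long acquire_ip;
	unsigned int class_idx:13;	/* bit-field: no &, typeof or sizeof */
	unsigned int read:2;
};

/*
 * Take one copy of the whole structure, fenced by compiler barriers so
 * the copy is neither merged with nor replaced by later reads.
 */
static struct demo_hlock demo_snapshot(const struct demo_hlock *hlock)
{
	struct demo_hlock snap;

	demo_barrier();
	memcpy(&snap, hlock, sizeof(snap));
	demo_barrier();

	return snap;
}

/* Read the bit-field from the private copy, never from *hlock again. */
static unsigned int demo_class_idx(const struct demo_hlock *hlock)
{
	struct demo_hlock snap = demo_snapshot(hlock);

	return snap.class_idx;
}
```

The copy is only a snapshot against compiler re-reads; it does not make
the read atomic with respect to a concurrent writer.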

Sebastian


* Re: [PATCH] lockdep: make print_lock_name() robust against non-existing lock_class
  2015-04-16 15:39       ` Sebastian Andrzej Siewior
@ 2015-04-16 15:50         ` Peter Zijlstra
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2015-04-16 15:50 UTC
  To: Sebastian Andrzej Siewior; +Cc: linux-kernel, Ingo Molnar, Andreas Messerschmid

On Thu, Apr 16, 2015 at 05:39:36PM +0200, Sebastian Andrzej Siewior wrote:
> * Peter Zijlstra | 2015-04-16 17:35:03 [+0200]:
> 
> >On Thu, Apr 16, 2015 at 04:50:21PM +0200, Sebastian Andrzej Siewior wrote:
> >
> >> Andreas confirmed that it works for him on v3.18 with minor adjustment.
> >> 
> >> <---
> >> +       struct held_lock lock = READ_ONCE(*hlock);
> >> +       unsigned int class_idx = lock.class_idx;
> >> --->
> >> 
> >
> >I'm confused by the need for that. What was the failure with the
> >proposed patch?
> 
> It was tested on v3.18, there might have been a change between v3.18 &
> 4.0. The patch as-is did not compile:

Yeah, I might not have compiled it..

>  In file included from arch/x86/include/asm/current.h:4:0,
>                   from include/linux/mutex.h:13,
>                   from kernel/locking/lockdep.c:29:
>  kernel/locking/lockdep.c: In function ‘print_lock’:
>  kernel/locking/lockdep.c:558:37: error: ‘typeof’ applied to a bit-field
>    unsigned int class_idx = READ_ONCE(hlock->class_idx);

Ah! Indeed so, copying all of the hlock is overdoing it a bit but would
work I suppose.

Thanks!
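
A lighter alternative to copying the whole hlock might be to read only
the bit-field into a local variable and follow that with a compiler
barrier, so the checks below cannot be turned into a second read of the
field. This is a sketch under that assumption, not a tested kernel
patch; demo_barrier(), struct demo_hlock, DEMO_MAX_KEYS and
demo_stable_class() are hypothetical names.

```c
/* Compiler barrier as GCC/Clang spell it; an assumption of this sketch. */
#define demo_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for MAX_LOCKDEP_KEYS, kept small for the demo. */
#define DEMO_MAX_KEYS 16U

/* Hypothetical, trimmed-down stand-in for struct held_lock. */
struct demo_hlock {
	unsigned long acquire_ip;
	unsigned int class_idx:13;
};

/*
 * Returns the 0-based class index, or -1 if the entry looks released
 * (class_idx == 0) or the index is out of range.
 */
static int demo_stable_class(const struct demo_hlock *hlock)
{
	/* Plain read: a READ_ONCE()-style macro cannot take a bit-field. */
	unsigned int class_idx = hlock->class_idx;

	/* Keep the compiler from re-reading hlock->class_idx below. */
	demo_barrier();

	if (!class_idx || (class_idx - 1) >= DEMO_MAX_KEYS)
		return -1;

	return (int)(class_idx - 1);
}
```

Both the validity check and the later table lookup then use the same
local value, which is the property the whole-structure copy was
providing.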

