* lockdep report in hibernate code
@ 2007-10-22 14:11 Johannes Berg
2007-10-22 22:39 ` Rafael J. Wysocki
0 siblings, 1 reply; 13+ messages in thread
From: Johannes Berg @ 2007-10-22 14:11 UTC (permalink / raw)
To: linux-pm
I just got this on my quad powermac which then failed to hibernate
(separate mail coming soon):
Oct 22 15:37:59 quad kernel: [ 362.782679] [ INFO: possible circular locking dependency detected ]
Oct 22 15:37:59 quad kernel: [ 362.782687] 2.6.23-g55b70a03-dirty #268
Oct 22 15:37:59 quad kernel: [ 362.782696] -------------------------------------------------------
Oct 22 15:37:59 quad kernel: [ 362.782704] pm-hibernate/4231 is trying to acquire lock:
Oct 22 15:37:59 quad kernel: [ 362.782710] (pm_mutex){--..}, at: [<c00000000008ae28>] .disk_store+0x70/0x17c
Oct 22 15:37:59 quad kernel: [ 362.782742]
Oct 22 15:37:59 quad kernel: [ 362.782744] but task is already holding lock:
Oct 22 15:37:59 quad kernel: [ 362.782751] (&buffer->mutex){--..}, at: [<c00000000013a620>] .sysfs_write_file+0x58/0x19c
Oct 22 15:37:59 quad kernel: [ 362.782771]
Oct 22 15:37:59 quad kernel: [ 362.782773] which lock already depends on the new lock.
Oct 22 15:37:59 quad kernel: [ 362.782776]
Oct 22 15:37:59 quad kernel: [ 362.782782]
Oct 22 15:37:59 quad kernel: [ 362.782783] the existing dependency chain (in reverse order) is:
Oct 22 15:37:59 quad kernel: [ 362.782795]
Oct 22 15:37:59 quad kernel: [ 362.782797] -> #1 (&buffer->mutex){--..}:
Oct 22 15:37:59 quad kernel: [ 362.782831] [<c00000000007d190>] .__lock_acquire+0xcf0/0xf60
Oct 22 15:37:59 quad kernel: [ 362.782910] [<c00000000007d4d0>] .lock_acquire+0xd0/0x11c
Oct 22 15:37:59 quad kernel: [ 362.782982] [<c0000000003e0ed8>] .mutex_lock_nested+0x150/0x3e8
Oct 22 15:37:59 quad kernel: [ 362.783061] [<c00000000013a7bc>] .sysfs_read_file+0x58/0x1a0
Oct 22 15:37:59 quad kernel: [ 362.783137] [<c0000000000dd130>] .vfs_read+0xd8/0x1b0
Oct 22 15:37:59 quad kernel: [ 362.783211] [<c0000000000dd960>] .sys_read+0x5c/0xac
Oct 22 15:37:59 quad kernel: [ 362.783285] [<c000000000008014>] .try_name+0x88/0x260
Oct 22 15:37:59 quad kernel: [ 362.783362] [<c000000000008430>] .name_to_dev_t+0x244/0x2e8
Oct 22 15:37:59 quad kernel: [ 362.783439] [<c00000000008b33c>] .software_resume+0x7c/0x200
Oct 22 15:37:59 quad kernel: [ 362.783516] [<c00000000054e8f8>] .kernel_init+0x214/0x3e8
Oct 22 15:37:59 quad kernel: [ 362.783588] [<c000000000024cdc>] .kernel_thread+0x4c/0x68
Oct 22 15:37:59 quad kernel: [ 362.783623]
Oct 22 15:37:59 quad kernel: [ 362.783625] -> #0 (pm_mutex){--..}:
Oct 22 15:37:59 quad kernel: [ 362.783659] [<c00000000007d088>] .__lock_acquire+0xbe8/0xf60
Oct 22 15:37:59 quad kernel: [ 362.783733] [<c00000000007d4d0>] .lock_acquire+0xd0/0x11c
Oct 22 15:37:59 quad kernel: [ 362.783810] [<c0000000003e0ed8>] .mutex_lock_nested+0x150/0x3e8
Oct 22 15:37:59 quad kernel: [ 362.783884] [<c00000000008ae28>] .disk_store+0x70/0x17c
Oct 22 15:37:59 quad kernel: [ 362.783960] [<c00000000013a160>] .subsys_attr_store+0x58/0x70
Oct 22 15:37:59 quad kernel: [ 362.784033] [<c00000000013a6f4>] .sysfs_write_file+0x12c/0x19c
Oct 22 15:37:59 quad kernel: [ 362.784104] [<c0000000000dcf80>] .vfs_write+0xd8/0x1b0
Oct 22 15:37:59 quad kernel: [ 362.784179] [<c0000000000dda0c>] .sys_write+0x5c/0xac
Oct 22 15:37:59 quad kernel: [ 362.784255] [<c000000000007550>] syscall_exit+0x0/0x40
Oct 22 15:37:59 quad kernel: [ 362.784331]
Oct 22 15:37:59 quad kernel: [ 362.784333] other info that might help us debug this:
Oct 22 15:37:59 quad kernel: [ 362.784336]
Oct 22 15:37:59 quad kernel: [ 362.784349] 1 lock held by pm-hibernate/4231:
Oct 22 15:37:59 quad kernel: [ 362.784360] #0: (&buffer->mutex){--..}, at: [<c00000000013a620>] .sysfs_write_file+0x58/0x19c
Oct 22 15:37:59 quad kernel: [ 362.784411]
Oct 22 15:37:59 quad kernel: [ 362.784413] stack backtrace:
Oct 22 15:37:59 quad kernel: [ 362.784423] Call Trace:
Oct 22 15:37:59 quad kernel: [ 362.784434] [c000000010783650] [c00000000000e758] .show_stack+0x78/0x1a4 (unreliable)
Oct 22 15:37:59 quad kernel: [ 362.784473] [c000000010783700] [c00000000000e8a4] .dump_stack+0x20/0x34
Oct 22 15:37:59 quad kernel: [ 362.784503] [c000000010783780] [c00000000007a74c] .print_circular_bug_tail+0x88/0xac
Oct 22 15:37:59 quad kernel: [ 362.784535] [c000000010783850] [c00000000007d088] .__lock_acquire+0xbe8/0xf60
Oct 22 15:37:59 quad kernel: [ 362.784567] [c000000010783940] [c00000000007d4d0] .lock_acquire+0xd0/0x11c
Oct 22 15:37:59 quad kernel: [ 362.784599] [c000000010783a00] [c0000000003e0ed8] .mutex_lock_nested+0x150/0x3e8
Oct 22 15:37:59 quad kernel: [ 362.784619] [c000000010783af0] [c00000000008ae28] .disk_store+0x70/0x17c
Oct 22 15:37:59 quad kernel: [ 362.784650] [c000000010783ba0] [c00000000013a160] .subsys_attr_store+0x58/0x70
Oct 22 15:37:59 quad kernel: [ 362.784680] [c000000010783c20] [c00000000013a6f4] .sysfs_write_file+0x12c/0x19c
Oct 22 15:37:59 quad kernel: [ 362.784712] [c000000010783ce0] [c0000000000dcf80] .vfs_write+0xd8/0x1b0
Oct 22 15:37:59 quad kernel: [ 362.784742] [c000000010783d80] [c0000000000dda0c] .sys_write+0x5c/0xac
Oct 22 15:37:59 quad kernel: [ 362.784772] [c000000010783e30] [c000000000007550] syscall_exit+0x0/0x40
* Re: lockdep report in hibernate code
2007-10-22 14:11 lockdep report in hibernate code Johannes Berg
@ 2007-10-22 22:39 ` Rafael J. Wysocki
2007-10-23 10:34 ` Johannes Berg
0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2007-10-22 22:39 UTC (permalink / raw)
To: Johannes Berg; +Cc: linux-pm
On Monday, 22 October 2007 16:11, Johannes Berg wrote:
> I just got this on my quad powermac which then failed to hibernate
> (separate mail coming soon):
>
> Oct 22 15:37:59 quad kernel: [ 362.782679] [ INFO: possible circular locking dependency detected ]
> Oct 22 15:37:59 quad kernel: [ 362.782687] 2.6.23-g55b70a03-dirty #268
> Oct 22 15:37:59 quad kernel: [ 362.782696] -------------------------------------------------------
> Oct 22 15:37:59 quad kernel: [ 362.782704] pm-hibernate/4231 is trying to acquire lock:
> Oct 22 15:37:59 quad kernel: [ 362.782710] (pm_mutex){--..}, at: [<c00000000008ae28>] .disk_store+0x70/0x17c
> Oct 22 15:37:59 quad kernel: [ 362.782742]
> Oct 22 15:37:59 quad kernel: [ 362.782744] but task is already holding lock:
> Oct 22 15:37:59 quad kernel: [ 362.782751] (&buffer->mutex){--..}, at: [<c00000000013a620>] .sysfs_write_file+0x58/0x19c
> Oct 22 15:37:59 quad kernel: [ 362.782771]
> Oct 22 15:37:59 quad kernel: [ 362.782773] which lock already depends on the new lock.
That's strange and almost certainly not true.
> Oct 22 15:37:59 quad kernel: [ 362.782776]
> Oct 22 15:37:59 quad kernel: [ 362.782782]
> Oct 22 15:37:59 quad kernel: [ 362.782783] the existing dependency chain (in reverse order) is:
> Oct 22 15:37:59 quad kernel: [ 362.782795]
> Oct 22 15:37:59 quad kernel: [ 362.782797] -> #1 (&buffer->mutex){--..}:
> Oct 22 15:37:59 quad kernel: [ 362.782831] [<c00000000007d190>] .__lock_acquire+0xcf0/0xf60
> Oct 22 15:37:59 quad kernel: [ 362.782910] [<c00000000007d4d0>] .lock_acquire+0xd0/0x11c
> Oct 22 15:37:59 quad kernel: [ 362.782982] [<c0000000003e0ed8>] .mutex_lock_nested+0x150/0x3e8
> Oct 22 15:37:59 quad kernel: [ 362.783061] [<c00000000013a7bc>] .sysfs_read_file+0x58/0x1a0
> Oct 22 15:37:59 quad kernel: [ 362.783137] [<c0000000000dd130>] .vfs_read+0xd8/0x1b0
> Oct 22 15:37:59 quad kernel: [ 362.783211] [<c0000000000dd960>] .sys_read+0x5c/0xac
> Oct 22 15:37:59 quad kernel: [ 362.783285] [<c000000000008014>] .try_name+0x88/0x260
> Oct 22 15:37:59 quad kernel: [ 362.783362] [<c000000000008430>] .name_to_dev_t+0x244/0x2e8
> Oct 22 15:37:59 quad kernel: [ 362.783439] [<c00000000008b33c>] .software_resume+0x7c/0x200
> Oct 22 15:37:59 quad kernel: [ 362.783516] [<c00000000054e8f8>] .kernel_init+0x214/0x3e8
> Oct 22 15:37:59 quad kernel: [ 362.783588] [<c000000000024cdc>] .kernel_thread+0x4c/0x68
> Oct 22 15:37:59 quad kernel: [ 362.783623]
> Oct 22 15:37:59 quad kernel: [ 362.783625] -> #0 (pm_mutex){--..}:
> Oct 22 15:37:59 quad kernel: [ 362.783659] [<c00000000007d088>] .__lock_acquire+0xbe8/0xf60
> Oct 22 15:37:59 quad kernel: [ 362.783733] [<c00000000007d4d0>] .lock_acquire+0xd0/0x11c
> Oct 22 15:37:59 quad kernel: [ 362.783810] [<c0000000003e0ed8>] .mutex_lock_nested+0x150/0x3e8
> Oct 22 15:37:59 quad kernel: [ 362.783884] [<c00000000008ae28>] .disk_store+0x70/0x17c
> Oct 22 15:37:59 quad kernel: [ 362.783960] [<c00000000013a160>] .subsys_attr_store+0x58/0x70
> Oct 22 15:37:59 quad kernel: [ 362.784033] [<c00000000013a6f4>] .sysfs_write_file+0x12c/0x19c
> Oct 22 15:37:59 quad kernel: [ 362.784104] [<c0000000000dcf80>] .vfs_write+0xd8/0x1b0
> Oct 22 15:37:59 quad kernel: [ 362.784179] [<c0000000000dda0c>] .sys_write+0x5c/0xac
> Oct 22 15:37:59 quad kernel: [ 362.784255] [<c000000000007550>] syscall_exit+0x0/0x40
> Oct 22 15:37:59 quad kernel: [ 362.784331]
> Oct 22 15:37:59 quad kernel: [ 362.784333] other info that might help us debug this:
> Oct 22 15:37:59 quad kernel: [ 362.784336]
> Oct 22 15:37:59 quad kernel: [ 362.784349] 1 lock held by pm-hibernate/4231:
> Oct 22 15:37:59 quad kernel: [ 362.784360] #0: (&buffer->mutex){--..}, at: [<c00000000013a620>] .sysfs_write_file+0x58/0x19c
> Oct 22 15:37:59 quad kernel: [ 362.784411]
> Oct 22 15:37:59 quad kernel: [ 362.784413] stack backtrace:
> Oct 22 15:37:59 quad kernel: [ 362.784423] Call Trace:
> Oct 22 15:37:59 quad kernel: [ 362.784434] [c000000010783650] [c00000000000e758] .show_stack+0x78/0x1a4 (unreliable)
> Oct 22 15:37:59 quad kernel: [ 362.784473] [c000000010783700] [c00000000000e8a4] .dump_stack+0x20/0x34
> Oct 22 15:37:59 quad kernel: [ 362.784503] [c000000010783780] [c00000000007a74c] .print_circular_bug_tail+0x88/0xac
> Oct 22 15:37:59 quad kernel: [ 362.784535] [c000000010783850] [c00000000007d088] .__lock_acquire+0xbe8/0xf60
> Oct 22 15:37:59 quad kernel: [ 362.784567] [c000000010783940] [c00000000007d4d0] .lock_acquire+0xd0/0x11c
> Oct 22 15:37:59 quad kernel: [ 362.784599] [c000000010783a00] [c0000000003e0ed8] .mutex_lock_nested+0x150/0x3e8
> Oct 22 15:37:59 quad kernel: [ 362.784619] [c000000010783af0] [c00000000008ae28] .disk_store+0x70/0x17c
> Oct 22 15:37:59 quad kernel: [ 362.784650] [c000000010783ba0] [c00000000013a160] .subsys_attr_store+0x58/0x70
> Oct 22 15:37:59 quad kernel: [ 362.784680] [c000000010783c20] [c00000000013a6f4] .sysfs_write_file+0x12c/0x19c
> Oct 22 15:37:59 quad kernel: [ 362.784712] [c000000010783ce0] [c0000000000dcf80] .vfs_write+0xd8/0x1b0
> Oct 22 15:37:59 quad kernel: [ 362.784742] [c000000010783d80] [c0000000000dda0c] .sys_write+0x5c/0xac
> Oct 22 15:37:59 quad kernel: [ 362.784772] [c000000010783e30] [c000000000007550] syscall_exit+0x0/0x40
>
>
--
"Premature optimization is the root of all evil." - Donald Knuth
* Re: lockdep report in hibernate code
2007-10-22 22:39 ` Rafael J. Wysocki
@ 2007-10-23 10:34 ` Johannes Berg
2007-10-23 21:39 ` Rafael J. Wysocki
0 siblings, 1 reply; 13+ messages in thread
From: Johannes Berg @ 2007-10-23 10:34 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: linux-pm
On Tue, 2007-10-23 at 00:39 +0200, Rafael J. Wysocki wrote:
> > Oct 22 15:37:59 quad kernel: [ 362.782679] [ INFO: possible circular locking dependency detected ]
> > Oct 22 15:37:59 quad kernel: [ 362.782687] 2.6.23-g55b70a03-dirty #268
> > Oct 22 15:37:59 quad kernel: [ 362.782696] -------------------------------------------------------
> > Oct 22 15:37:59 quad kernel: [ 362.782704] pm-hibernate/4231 is trying to acquire lock:
> > Oct 22 15:37:59 quad kernel: [ 362.782710] (pm_mutex){--..}, at: [<c00000000008ae28>] .disk_store+0x70/0x17c
> > Oct 22 15:37:59 quad kernel: [ 362.782742]
> > Oct 22 15:37:59 quad kernel: [ 362.782744] but task is already holding lock:
> > Oct 22 15:37:59 quad kernel: [ 362.782751] (&buffer->mutex){--..}, at: [<c00000000013a620>] .sysfs_write_file+0x58/0x19c
> > Oct 22 15:37:59 quad kernel: [ 362.782771]
> > Oct 22 15:37:59 quad kernel: [ 362.782773] which lock already depends on the new lock.
>
> That's strange and almost certainly not true.
Uh, are you saying lockdep got it wrong? Hard to imagine. See, it tells
you:
> Oct 22 15:37:59 quad kernel: [ 362.782783] the existing dependency chain (in reverse order) is:
> Oct 22 15:37:59 quad kernel: [ 362.782795]
> Oct 22 15:37:59 quad kernel: [ 362.782797] -> #1 (&buffer->mutex){--..}:
> Oct 22 15:37:59 quad kernel: [ 362.782831] [<c00000000007d190>] .__lock_acquire+0xcf0/0xf60
> Oct 22 15:37:59 quad kernel: [ 362.782910] [<c00000000007d4d0>] .lock_acquire+0xd0/0x11c
> Oct 22 15:37:59 quad kernel: [ 362.782982] [<c0000000003e0ed8>] .mutex_lock_nested+0x150/0x3e8
> Oct 22 15:37:59 quad kernel: [ 362.783061] [<c00000000013a7bc>] .sysfs_read_file+0x58/0x1a0
> Oct 22 15:37:59 quad kernel: [ 362.783137] [<c0000000000dd130>] .vfs_read+0xd8/0x1b0
> Oct 22 15:37:59 quad kernel: [ 362.783211] [<c0000000000dd960>] .sys_read+0x5c/0xac
> Oct 22 15:37:59 quad kernel: [ 362.783285] [<c000000000008014>] .try_name+0x88/0x260
> Oct 22 15:37:59 quad kernel: [ 362.783362] [<c000000000008430>] .name_to_dev_t+0x244/0x2e8
> Oct 22 15:37:59 quad kernel: [ 362.783439] [<c00000000008b33c>] .software_resume+0x7c/0x200
> Oct 22 15:37:59 quad kernel: [ 362.783516] [<c00000000054e8f8>] .kernel_init+0x214/0x3e8
> Oct 22 15:37:59 quad kernel: [ 362.783588] [<c000000000024cdc>] .kernel_thread+0x4c/0x68
> Oct 22 15:37:59 quad kernel: [ 362.783623]
So let's look through the code.
software_resume() does:

	mutex_lock(&pm_mutex);
	[...]
	swsusp_resume_device = name_to_dev_t(resume_file);

which, according to the trace above, takes a buffer mutex. This can be
verified easily; I haven't bothered.
The problem here is that the buffer mutexes are not distinguishable.
johannes
* Re: lockdep report in hibernate code
2007-10-23 10:34 ` Johannes Berg
@ 2007-10-23 21:39 ` Rafael J. Wysocki
2007-10-23 21:56 ` Alan Stern
0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2007-10-23 21:39 UTC (permalink / raw)
To: Johannes Berg; +Cc: linux-pm
On Tuesday, 23 October 2007 12:34, Johannes Berg wrote:
> On Tue, 2007-10-23 at 00:39 +0200, Rafael J. Wysocki wrote:
>
> > > Oct 22 15:37:59 quad kernel: [ 362.782679] [ INFO: possible circular locking dependency detected ]
> > > Oct 22 15:37:59 quad kernel: [ 362.782687] 2.6.23-g55b70a03-dirty #268
> > > Oct 22 15:37:59 quad kernel: [ 362.782696] -------------------------------------------------------
> > > Oct 22 15:37:59 quad kernel: [ 362.782704] pm-hibernate/4231 is trying to acquire lock:
> > > Oct 22 15:37:59 quad kernel: [ 362.782710] (pm_mutex){--..}, at: [<c00000000008ae28>] .disk_store+0x70/0x17c
> > > Oct 22 15:37:59 quad kernel: [ 362.782742]
> > > Oct 22 15:37:59 quad kernel: [ 362.782744] but task is already holding lock:
> > > Oct 22 15:37:59 quad kernel: [ 362.782751] (&buffer->mutex){--..}, at: [<c00000000013a620>] .sysfs_write_file+0x58/0x19c
> > > Oct 22 15:37:59 quad kernel: [ 362.782771]
> > > Oct 22 15:37:59 quad kernel: [ 362.782773] which lock already depends on the new lock.
> >
> > That's strange and almost certainly not true.
>
> Uh, are you saying lockdep got it wrong? Hard to imagine. See, it tells
> you:
>
> > Oct 22 15:37:59 quad kernel: [ 362.782783] the existing dependency chain (in reverse order) is:
> > Oct 22 15:37:59 quad kernel: [ 362.782795]
> > Oct 22 15:37:59 quad kernel: [ 362.782797] -> #1 (&buffer->mutex){--..}:
> > Oct 22 15:37:59 quad kernel: [ 362.782831] [<c00000000007d190>] .__lock_acquire+0xcf0/0xf60
> > Oct 22 15:37:59 quad kernel: [ 362.782910] [<c00000000007d4d0>] .lock_acquire+0xd0/0x11c
> > Oct 22 15:37:59 quad kernel: [ 362.782982] [<c0000000003e0ed8>] .mutex_lock_nested+0x150/0x3e8
> > Oct 22 15:37:59 quad kernel: [ 362.783061] [<c00000000013a7bc>] .sysfs_read_file+0x58/0x1a0
> > Oct 22 15:37:59 quad kernel: [ 362.783137] [<c0000000000dd130>] .vfs_read+0xd8/0x1b0
> > Oct 22 15:37:59 quad kernel: [ 362.783211] [<c0000000000dd960>] .sys_read+0x5c/0xac
> > Oct 22 15:37:59 quad kernel: [ 362.783285] [<c000000000008014>] .try_name+0x88/0x260
> > Oct 22 15:37:59 quad kernel: [ 362.783362] [<c000000000008430>] .name_to_dev_t+0x244/0x2e8
> > Oct 22 15:37:59 quad kernel: [ 362.783439] [<c00000000008b33c>] .software_resume+0x7c/0x200
> > Oct 22 15:37:59 quad kernel: [ 362.783516] [<c00000000054e8f8>] .kernel_init+0x214/0x3e8
> > Oct 22 15:37:59 quad kernel: [ 362.783588] [<c000000000024cdc>] .kernel_thread+0x4c/0x68
> > Oct 22 15:37:59 quad kernel: [ 362.783623]
>
> So let's look through the code.
>
> software_resume() does:
>
> 459 mutex_lock(&pm_mutex);
> [...]
> 465 swsusp_resume_device = name_to_dev_t(resume_file);
>
> which, according to the trace above takes a buffer mutex. This can be
> verified easily, I haven't bothered.
>
> The problem here is that the buffer mutexes are not distinguishable.
I don't quite get the "which lock already depends on the new lock" part.
Well, I have always had problems with understanding what lockdep actually
traces ...
* Re: Re: lockdep report in hibernate code
2007-10-23 21:39 ` Rafael J. Wysocki
@ 2007-10-23 21:56 ` Alan Stern
2007-10-23 22:18 ` Rafael J. Wysocki
0 siblings, 1 reply; 13+ messages in thread
From: Alan Stern @ 2007-10-23 21:56 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Johannes Berg, linux-pm
On Tue, 23 Oct 2007, Rafael J. Wysocki wrote:
> > The problem here is that the buffer mutexes are not distinguishable.
>
> I don't quite get the "which lock already depends on the new lock" part.
>
> Well, I have always had problems with understanding what lockdep actually
> traces ...
The basic idea is simple enough. Lockdep looks for events which seem
to be problematic, such as lock A being acquired while lock B is held
if earlier on somebody acquired B while holding A.
The difficulty lies in the "_seem_ to be" part -- lockdep can't keep
track of each and every individual lock in the system. Instead it
groups them into categories based on the structures they lie in. So if
A and A' are both pm_mutex members but belong to two different
structures, lockdep won't be able to tell them apart without help. If
someone acquires A then B, and someone else acquires B then A', lockdep
will report a violation.
Alan Stern
* Re: Re: lockdep report in hibernate code
2007-10-23 21:56 ` Alan Stern
@ 2007-10-23 22:18 ` Rafael J. Wysocki
2007-10-24 8:40 ` Johannes Berg
0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2007-10-23 22:18 UTC (permalink / raw)
To: Alan Stern; +Cc: Johannes Berg, linux-pm
On Tuesday, 23 October 2007 23:56, Alan Stern wrote:
> On Tue, 23 Oct 2007, Rafael J. Wysocki wrote:
>
> > > The problem here is that the buffer mutexes are not distinguishable.
> >
> > I don't quite get the "which lock already depends on the new lock" part.
> >
> > Well, I have always had problems with understanding what lockdep actually
> > traces ...
>
> The basic idea is simple enough. Lockdep looks for events which seem
> to be problematic, such as lock A being acquired while lock B is held
> if earlier on somebody acquired B while holding A.
>
> The difficulty lies in the "_seem_ to be" part -- lockdep can't keep
> track of each and every individual lock in the system. Instead it
> groups them into categories based on the structures they lie in. So if
> > A and A' are both pm_mutex members but belong to two different
> structures, lockdep won't be able to tell them apart without help. If
> someone acquires A then B, and someone else acquires B then A', lockdep
> will report a violation.
Yes, which is what I think is happening in this particular case. More
precisely, we get pm_mutex while holding a buffer mutex, so lockdep is warning
when we get another buffer mutex afterwards.
Greetings (not sure what to do about that),
Rafael
* Re: Re: lockdep report in hibernate code
2007-10-23 22:18 ` Rafael J. Wysocki
@ 2007-10-24 8:40 ` Johannes Berg
2007-10-24 21:57 ` Greg KH
0 siblings, 1 reply; 13+ messages in thread
From: Johannes Berg @ 2007-10-24 8:40 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: linux-pm
On Wed, 2007-10-24 at 00:18 +0200, Rafael J. Wysocki wrote:
> Yes, which is what I think is happening in this particular case. More
> precisely, we get pm_mutex while holding a buffer mutex, so lockdep is warning
> when we get another buffer mutex afterwards.
Precisely. That's why I copied Greg on the second mail :) It seems that
sysfs already uses nested locks, but that only protects against lockdep
reporting a false positive for nested locks, not this case.
johannes
* Re: Re: lockdep report in hibernate code
2007-10-24 8:40 ` Johannes Berg
@ 2007-10-24 21:57 ` Greg KH
2007-10-25 13:31 ` Johannes Berg
0 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2007-10-24 21:57 UTC (permalink / raw)
To: Johannes Berg; +Cc: linux-pm
On Wed, Oct 24, 2007 at 10:40:20AM +0200, Johannes Berg wrote:
> On Wed, 2007-10-24 at 00:18 +0200, Rafael J. Wysocki wrote:
>
> > Yes, which is what I think is happening in this particular case. More
> > precisely, we get pm_mutex while holding a buffer mutex, so lockdep is warning
> > when we get another buffer mutex afterwards.
>
> Precisely. That's why I copied Greg on the second mail :) It seems that
> sysfs already uses nested locks, but that only protects against lockdep
> reporting a false positive for nested locks, not this case.
Ok, I'm confused, where is the sysfs issue here?
thanks,
greg k-h
* Re: Re: lockdep report in hibernate code
2007-10-24 21:57 ` Greg KH
@ 2007-10-25 13:31 ` Johannes Berg
2007-10-25 17:13 ` Alan Stern
0 siblings, 1 reply; 13+ messages in thread
From: Johannes Berg @ 2007-10-25 13:31 UTC (permalink / raw)
To: Greg KH; +Cc: linux-pm
On Wed, 2007-10-24 at 14:57 -0700, Greg KH wrote:
> On Wed, Oct 24, 2007 at 10:40:20AM +0200, Johannes Berg wrote:
> > On Wed, 2007-10-24 at 00:18 +0200, Rafael J. Wysocki wrote:
> >
> > > Yes, which is what I think is happening in this particular case. More
> > > precisely, we get pm_mutex while holding a buffer mutex, so lockdep is warning
> > > when we get another buffer mutex afterwards.
> >
> > Precisely. That's why I copied Greg on the second mail :) It seems that
> > sysfs already uses nested locks, but that only protects against lockdep
> > reporting a false positive for nested locks, not this case.
>
> Ok, I'm confused, where is the sysfs issue here?
We have two paths here:
(a) sysfs write -> lock buffer -> call power management code
-> lock pm_mutex
(b) boot code -> power management boot -> lock pm_mutex
-> use name_to_dev_t() -> call sysfs -> lock buffer
As you can see, lockdep rightfully complains about a possible deadlock
scenario, although of course (b) happens only once, at boot, at a time
when (a) cannot happen. And now we're wondering how to fix it.
johannes
* Re: Re: lockdep report in hibernate code
2007-10-25 13:31 ` Johannes Berg
@ 2007-10-25 17:13 ` Alan Stern
2007-10-26 10:36 ` Johannes Berg
0 siblings, 1 reply; 13+ messages in thread
From: Alan Stern @ 2007-10-25 17:13 UTC (permalink / raw)
To: Johannes Berg; +Cc: linux-pm
On Thu, 25 Oct 2007, Johannes Berg wrote:
> We have two paths here:
>
> (a) sysfs write -> lock buffer -> call power management code
> -> lock pm_mutex
> (b) boot code -> power management boot -> lock pm_mutex
> -> use name_to_dev_t() -> call sysfs -> lock buffer
>
> As you can see, lockdep rightfully complains about a possible deadlock
> scenario although of course (b) only happens once at boot at a time
> where (a) cannot happen. And now we're wondering how to fix it.
Why not use mutex_lock_nested() for the boot-time pm_mutex lock call?
Alan Stern
* Re: Re: lockdep report in hibernate code
2007-10-25 17:13 ` Alan Stern
@ 2007-10-26 10:36 ` Johannes Berg
2007-10-27 22:29 ` Alan Stern
0 siblings, 1 reply; 13+ messages in thread
From: Johannes Berg @ 2007-10-26 10:36 UTC (permalink / raw)
To: Alan Stern; +Cc: linux-pm
> > We have two paths here:
> >
> > (a) sysfs write -> lock buffer -> call power management code
> > -> lock pm_mutex
> > (b) boot code -> power management boot -> lock pm_mutex
> > -> use name_to_dev_t() -> call sysfs -> lock buffer
> >
> > As you can see, lockdep rightfully complains about a possible deadlock
> > scenario although of course (b) only happens once at boot at a time
> > where (a) cannot happen. And now we're wondering how to fix it.
>
> Why not use mutex_lock_nested() for the boot-time pm_mutex lock call?
Is that going to help, though? I thought lock_nested() is only for
nesting within a class.
johannes
* Re: Re: lockdep report in hibernate code
2007-10-26 10:36 ` Johannes Berg
@ 2007-10-27 22:29 ` Alan Stern
2007-10-28 10:38 ` Johannes Berg
0 siblings, 1 reply; 13+ messages in thread
From: Alan Stern @ 2007-10-27 22:29 UTC (permalink / raw)
To: Johannes Berg; +Cc: linux-pm
On Fri, 26 Oct 2007, Johannes Berg wrote:
> > > We have two paths here:
> > >
> > > (a) sysfs write -> lock buffer -> call power management code
> > > -> lock pm_mutex
> > > (b) boot code -> power management boot -> lock pm_mutex
> > > -> use name_to_dev_t() -> call sysfs -> lock buffer
> > >
> > > As you can see, lockdep rightfully complains about a possible deadlock
> > > scenario although of course (b) only happens once at boot at a time
> > > where (a) cannot happen. And now we're wondering how to fix it.
> >
> > Why not use mutex_lock_nested() for the boot-time pm_mutex lock call?
>
> Is that going to help though? I thought lock_nested() is only for nested
> within a class.
The "nested" part of the name is a little misleading. The function
helps the lockdep core distinguish between locks belonging to the same
class. Normally this is done so that you can nest the locks in the
proper order, but that shouldn't stop you from using it here.
Alan Stern
* Re: Re: lockdep report in hibernate code
2007-10-27 22:29 ` Alan Stern
@ 2007-10-28 10:38 ` Johannes Berg
0 siblings, 0 replies; 13+ messages in thread
From: Johannes Berg @ 2007-10-28 10:38 UTC (permalink / raw)
To: Alan Stern; +Cc: linux-pm
> The "nested" part of the name is a little misleading. The function
> helps the lockdep core distinguish between locks belonging to the same
> class. Normally this is done so that you can nest the locks in the
> proper order, but that shouldn't stop you from using it here.
Oh good point. I'll test the patch below during the week (my powerbook
doesn't have lockdep and I don't have another machine here right now)
but I'm fairly confident it'll fix the issue. I'll submit it after
testing.
johannes
--- linux-2.6.orig/kernel/power/disk.c	2007-10-28 11:34:32.669337294 +0100
+++ linux-2.6/kernel/power/disk.c	2007-10-28 11:37:29.849294324 +0100
@@ -456,7 +456,17 @@ static int software_resume(void)
 	int error;
 	unsigned int flags;
 
-	mutex_lock(&pm_mutex);
+	/*
+	 * name_to_dev_t() below takes a sysfs buffer mutex when sysfs
+	 * is configured into the kernel. Since the regular hibernate
+	 * trigger path is via sysfs which takes a buffer mutex before
+	 * calling hibernate functions (which take pm_mutex) this can
+	 * cause lockdep to complain about a possible ABBA deadlock
+	 * which cannot happen since we're in the boot code here and
+	 * sysfs can't be invoked yet. Therefore, we use a subclass
+	 * here to avoid lockdep complaining.
+	 */
+	mutex_lock_nested(&pm_mutex, 1);
 	if (!swsusp_resume_device) {
 		if (!strlen(resume_file)) {
 			mutex_unlock(&pm_mutex);