* lockdep complaint from ckpt-v15
From: Nathan Lynch @ 2009-05-08 16:59 UTC
  To: containers-qjLDD68F18O7TbgM5vRIOg

I tried checkpointing a container (created with lxc tools) and while I
didn't expect it to succeed, I did get this lockdep report.  Haven't
looked into it yet, anyone else seeing this?

# uname -a
Linux localhost.localdomain 2.6.30-rc3 #1 SMP Fri May 8 10:58:07 CDT 2009 i686 i686 i386 GNU/Linux
# ./ckpt -c 3265 > /tmp/foo

=============================================
[ INFO: possible recursive locking detected ]
2.6.30-rc3 #1
---------------------------------------------
ckpt/3643 is trying to acquire lock:
 (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<c02866ba>] generic_file_aio_write+0x59/0xc2

but task is already holding lock:
 (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<c02b1b38>] pipe_file_checkpoint+0xd7/0x20d

other info that might help us debug this:
1 lock held by ckpt/3643:
 #0:  (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<c02b1b38>] pipe_file_checkpoint+0xd7/0x20d

stack backtrace:
Pid: 3643, comm: ckpt Not tainted 2.6.30-rc3 #1
Call Trace:
 [<c05f3c5a>] ? printk+0x14/0x16
 [<c024db78>] __lock_acquire+0xc3e/0x132a
 [<c024ab2b>] ? lock_release_holdtime+0x1a/0x172
 [<c036143c>] ? avc_has_perm_noaudit+0x376/0x39d
 [<c024e2ed>] lock_acquire+0x89/0xa6
 [<c02866ba>] ? generic_file_aio_write+0x59/0xc2
 [<c05f53be>] mutex_lock_nested+0x4e/0x274
 [<c02866ba>] ? generic_file_aio_write+0x59/0xc2
 [<c02866ba>] ? generic_file_aio_write+0x59/0xc2
 [<c02866ba>] generic_file_aio_write+0x59/0xc2
 [<c02f29e1>] ext3_file_write+0x1f/0x92
 [<c02ab801>] do_sync_write+0xb0/0xee
 [<c023d504>] ? autoremove_wake_function+0x0/0x38
 [<c0362d29>] ? selinux_file_permission+0xa1/0xa5
 [<c035d1a6>] ? security_file_permission+0x14/0x16
 [<c02ab751>] ? do_sync_write+0x0/0xee
 [<c02ac217>] vfs_write+0x8f/0x133
 [<c024c48a>] ? trace_hardirqs_on_caller+0x119/0x13d
 [<c0391d4c>] ckpt_kwrite+0x50/0x94
 [<c03926a8>] ckpt_write_obj+0x44/0x77
 [<c02b1b70>] pipe_file_checkpoint+0x10f/0x20d
 [<c0391d4c>] ? ckpt_kwrite+0x50/0x94
 [<c03963f4>] checkpoint_file+0x1e/0x21
 [<c0392643>] checkpoint_obj+0xd2/0xf3
 [<c0396b5f>] checkpoint_fd_table+0x16a/0x24d
 [<c0394b12>] checkpoint_task+0x1a6/0x364
 [<c0392f9a>] do_checkpoint+0x46d/0x5a7
 [<c0391c44>] sys_checkpoint+0x6c/0x82
 [<c0202af5>] syscall_call+0x7/0xb
c/r: FILE users 4 != count 6

* Re: lockdep complaint from ckpt-v15
From: Oren Laadan @ 2009-05-08 18:50 UTC
  To: Nathan Lynch; +Cc: containers-qjLDD68F18O7TbgM5vRIOg

Hi Nathan,

This was observed before:
https://lists.linux-foundation.org/pipermail/containers/2009-February/015595.html
(and you replied :)

I believe it's a false alarm: the code takes the pipe's inode mutex, then takes
a mutex of the same class (belonging to a different inode) when writing data out
to the output file descriptor.

This could create a deadlock if the user provides the pipe's fd as the output
file; however, I protect against that by explicitly checking that the file of
the pipe isn't the file in ctx->file.
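
A minimal sketch of that kind of guard, for illustration only. The struct
layouts, the function name pipe_checkpoint_allowed() and the -EBUSY return
value are assumptions made for this example, not the actual ckpt-v15 code;
only the idea of comparing the pipe's file against ctx->file comes from the
description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct file and the c/r context. */
struct file {
    int id;
};

struct ckpt_ctx {
    struct file *file;  /* file the checkpoint image is written to */
};

/*
 * Return 0 if pipe_file may be checkpointed into ctx->file, or a negative
 * errno if the pipe being checkpointed is itself the output file (writing
 * to it while holding its inode mutex would self-deadlock).
 */
int pipe_checkpoint_allowed(const struct ckpt_ctx *ctx,
                            const struct file *pipe_file)
{
    if (ctx == NULL || pipe_file == NULL)
        return -EINVAL;
    if (pipe_file == ctx->file)
        return -EBUSY;  /* error code chosen for illustration */
    return 0;
}
```

A check like this would have to run before the pipe's inode mutex is taken,
so that the refused case never reaches the nested write.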

I'd be happy to learn how to tell lockdep to accept this behavior.

Oren.




* Re: lockdep complaint from ckpt-v15
From: Nathan Lynch @ 2009-05-08 20:20 UTC
  To: Oren Laadan; +Cc: containers-qjLDD68F18O7TbgM5vRIOg

Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> writes:
> Hi Nathan,
>
> This was observed before:
> https://lists.linux-foundation.org/pipermail/containers/2009-February/015595.html
> (and you replied :)

Heh, thought it seemed familiar.


> I believe it's a false alarm: the code takes the pipe's inode mutex, then takes
> a mutex of the same class (belonging to a different inode) when writing data out
> to the output file descriptor.
>
> This could create a deadlock if the user provides the pipe's fd as the output
> file; however, I protect against that by explicitly checking that the file of
> the pipe isn't the file in ctx->file.

That will prevent deadlock on that one inode, yes, but I believe lockdep
is pointing to a different problem.  Locks of the same class have to be
acquired in some pre-determined order, and lockdep has to be taught that
order.  In other words, if it's okay to acquire these inode mutexes in
this order, it must be forbidden to acquire them in the reverse order.
Otherwise you have the potential for AB-BA deadlock.  At least, that's
my impression after spending some time with the lockdep and vfs locking
docs... hope I'm right :)
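
To make the AB-BA point concrete, here is a userspace sketch of one common
way to impose a fixed order on two locks of the same class: always take the
one at the lower address first. Everything here (struct toy_inode,
toy_order(), toy_lock_two(), toy_unlock_two()) is hypothetical and uses
pthread mutexes as a stand-in for i_mutex; it is not what the checkpoint
code or the vfs does, only an illustration of the ordering rule.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for an inode and its i_mutex. */
struct toy_inode {
    pthread_mutex_t i_mutex;
};

/* Sort a pair of inodes into the canonical order: lowest address first. */
void toy_order(struct toy_inode **first, struct toy_inode **second)
{
    if ((uintptr_t)*first > (uintptr_t)*second) {
        struct toy_inode *tmp = *first;
        *first = *second;
        *second = tmp;
    }
}

/*
 * Lock two distinct inodes. Callers may pass them in either order; the
 * mutexes are always acquired in canonical order, so two tasks locking
 * the same pair can never each hold one lock while waiting for the other.
 */
void toy_lock_two(struct toy_inode *a, struct toy_inode *b)
{
    assert(a != b);
    toy_order(&a, &b);
    pthread_mutex_lock(&a->i_mutex);
    pthread_mutex_lock(&b->i_mutex);
}

/* Release both mutexes; the unlock order does not matter for deadlock. */
void toy_unlock_two(struct toy_inode *a, struct toy_inode *b)
{
    pthread_mutex_unlock(&a->i_mutex);
    pthread_mutex_unlock(&b->i_mutex);
}
```

With this rule, toy_lock_two(x, y) and toy_lock_two(y, x) take the mutexes
in the same sequence, which is the property the "reverse order must be
forbidden" requirement is after.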

There are vfs operations (rename etc) which need to acquire multiple
inodes' i_mutex in the same class -- the ordering rules are expressed in
include/linux/fs.h:inode_i_mutex_lock_class and with use of
mutex_lock_nested() in e.g. fs/namei.c:lock_rename().  But I don't see
immediately how to apply these mechanisms to this situation.
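
As a rough mental model of why the subclass argument to mutex_lock_nested()
matters, here is a toy checker that tracks held locks by (class, subclass)
rather than by lock instance. It is a heavily simplified illustration and
not lockdep's real data structures or algorithm; struct toy_task,
toy_acquire() and toy_release() are names invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

/* One held lock, identified only by its class and subclass. */
struct toy_held {
    int class_id;
    int subclass;
};

/* Per-task stack of held locks. */
struct toy_task {
    struct toy_held held[TOY_MAX_HELD];
    size_t depth;
};

/*
 * Record an acquisition. Returns false ("possible recursive locking")
 * if the task already holds a lock with the same class and subclass,
 * regardless of whether it is the same lock instance.
 */
bool toy_acquire(struct toy_task *t, int class_id, int subclass)
{
    for (size_t i = 0; i < t->depth; i++) {
        if (t->held[i].class_id == class_id &&
            t->held[i].subclass == subclass)
            return false;
    }
    assert(t->depth < TOY_MAX_HELD);
    t->held[t->depth].class_id = class_id;
    t->held[t->depth].subclass = subclass;
    t->depth++;
    return true;
}

/* Drop the most recently acquired lock, if any. */
void toy_release(struct toy_task *t)
{
    if (t->depth > 0)
        t->depth--;
}
```

In this model, taking two i_mutex-class locks with subclass 0 is flagged
even when they belong to different inodes, which matches the report above;
taking the second one under a distinct subclass is accepted, but that is
only safe if every code path respects the same nesting order.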
