* lockdep complaint from ckpt-v15
@ 2009-05-08 16:59 Nathan Lynch
From: Nathan Lynch @ 2009-05-08 16:59 UTC
To: containers-qjLDD68F18O7TbgM5vRIOg
I tried checkpointing a container (created with lxc tools) and while I
didn't expect it to succeed, I did get this lockdep report. Haven't
looked into it yet, anyone else seeing this?
# uname -a
Linux localhost.localdomain 2.6.30-rc3 #1 SMP Fri May 8 10:58:07 CDT 2009 i686 i686 i386 GNU/Linux
# ./ckpt -c 3265 > /tmp/foo
=============================================
[ INFO: possible recursive locking detected ]
2.6.30-rc3 #1
---------------------------------------------
ckpt/3643 is trying to acquire lock:
(&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<c02866ba>] generic_file_aio_write+0x59/0xc2
but task is already holding lock:
(&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<c02b1b38>] pipe_file_checkpoint+0xd7/0x20d
other info that might help us debug this:
1 lock held by ckpt/3643:
#0: (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<c02b1b38>] pipe_file_checkpoint+0xd7/0x20d
stack backtrace:
Pid: 3643, comm: ckpt Not tainted 2.6.30-rc3 #1
Call Trace:
[<c05f3c5a>] ? printk+0x14/0x16
[<c024db78>] __lock_acquire+0xc3e/0x132a
[<c024ab2b>] ? lock_release_holdtime+0x1a/0x172
[<c036143c>] ? avc_has_perm_noaudit+0x376/0x39d
[<c024e2ed>] lock_acquire+0x89/0xa6
[<c02866ba>] ? generic_file_aio_write+0x59/0xc2
[<c05f53be>] mutex_lock_nested+0x4e/0x274
[<c02866ba>] ? generic_file_aio_write+0x59/0xc2
[<c02866ba>] ? generic_file_aio_write+0x59/0xc2
[<c02866ba>] generic_file_aio_write+0x59/0xc2
[<c02f29e1>] ext3_file_write+0x1f/0x92
[<c02ab801>] do_sync_write+0xb0/0xee
[<c023d504>] ? autoremove_wake_function+0x0/0x38
[<c0362d29>] ? selinux_file_permission+0xa1/0xa5
[<c035d1a6>] ? security_file_permission+0x14/0x16
[<c02ab751>] ? do_sync_write+0x0/0xee
[<c02ac217>] vfs_write+0x8f/0x133
[<c024c48a>] ? trace_hardirqs_on_caller+0x119/0x13d
[<c0391d4c>] ckpt_kwrite+0x50/0x94
[<c03926a8>] ckpt_write_obj+0x44/0x77
[<c02b1b70>] pipe_file_checkpoint+0x10f/0x20d
[<c0391d4c>] ? ckpt_kwrite+0x50/0x94
[<c03963f4>] checkpoint_file+0x1e/0x21
[<c0392643>] checkpoint_obj+0xd2/0xf3
[<c0396b5f>] checkpoint_fd_table+0x16a/0x24d
[<c0394b12>] checkpoint_task+0x1a6/0x364
[<c0392f9a>] do_checkpoint+0x46d/0x5a7
[<c0391c44>] sys_checkpoint+0x6c/0x82
[<c0202af5>] syscall_call+0x7/0xb
c/r: FILE users 4 != count 6
* Re: lockdep complaint from ckpt-v15
@ 2009-05-08 18:50 Oren Laadan
From: Oren Laadan @ 2009-05-08 18:50 UTC
To: Nathan Lynch; +Cc: containers-qjLDD68F18O7TbgM5vRIOg

Hi Nathan,

This was observed before:
https://lists.linux-foundation.org/pipermail/containers/2009-February/015595.html
(and you replied :)

I believe it's a false alarm: the code takes the pipe's inode mutex, then
takes a mutex of the same class (for a different inode) when writing data
out to the output file descriptor.

This could create a deadlock if the user provides the pipe's fd as the output
file; however, I protect against that by explicitly checking that the file of
the pipe isn't the file in ctx->file.

I'd be happy to learn how to tell lockdep to accept this behavior.

Oren.

Nathan Lynch wrote:
> I tried checkpointing a container (created with lxc tools) and while I
> didn't expect it to succeed, I did get this lockdep report. Haven't
> looked into it yet, anyone else seeing this?
>
> # uname -a
> Linux localhost.localdomain 2.6.30-rc3 #1 SMP Fri May 8 10:58:07 CDT 2009 i686 i686 i386 GNU/Linux
> # ./ckpt -c 3265 > /tmp/foo
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.30-rc3 #1
> ---------------------------------------------
> ckpt/3643 is trying to acquire lock:
> (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<c02866ba>] generic_file_aio_write+0x59/0xc2
>
> but task is already holding lock:
> (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<c02b1b38>] pipe_file_checkpoint+0xd7/0x20d
>
> other info that might help us debug this:
> 1 lock held by ckpt/3643:
> #0: (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<c02b1b38>] pipe_file_checkpoint+0xd7/0x20d
>
> stack backtrace:
> Pid: 3643, comm: ckpt Not tainted 2.6.30-rc3 #1
> Call Trace:
> [<c05f3c5a>] ? printk+0x14/0x16
> [<c024db78>] __lock_acquire+0xc3e/0x132a
> [<c024ab2b>] ? lock_release_holdtime+0x1a/0x172
> [<c036143c>] ? avc_has_perm_noaudit+0x376/0x39d
> [<c024e2ed>] lock_acquire+0x89/0xa6
> [<c02866ba>] ? generic_file_aio_write+0x59/0xc2
> [<c05f53be>] mutex_lock_nested+0x4e/0x274
> [<c02866ba>] ? generic_file_aio_write+0x59/0xc2
> [<c02866ba>] ? generic_file_aio_write+0x59/0xc2
> [<c02866ba>] generic_file_aio_write+0x59/0xc2
> [<c02f29e1>] ext3_file_write+0x1f/0x92
> [<c02ab801>] do_sync_write+0xb0/0xee
> [<c023d504>] ? autoremove_wake_function+0x0/0x38
> [<c0362d29>] ? selinux_file_permission+0xa1/0xa5
> [<c035d1a6>] ? security_file_permission+0x14/0x16
> [<c02ab751>] ? do_sync_write+0x0/0xee
> [<c02ac217>] vfs_write+0x8f/0x133
> [<c024c48a>] ? trace_hardirqs_on_caller+0x119/0x13d
> [<c0391d4c>] ckpt_kwrite+0x50/0x94
> [<c03926a8>] ckpt_write_obj+0x44/0x77
> [<c02b1b70>] pipe_file_checkpoint+0x10f/0x20d
> [<c0391d4c>] ? ckpt_kwrite+0x50/0x94
> [<c03963f4>] checkpoint_file+0x1e/0x21
> [<c0392643>] checkpoint_obj+0xd2/0xf3
> [<c0396b5f>] checkpoint_fd_table+0x16a/0x24d
> [<c0394b12>] checkpoint_task+0x1a6/0x364
> [<c0392f9a>] do_checkpoint+0x46d/0x5a7
> [<c0391c44>] sys_checkpoint+0x6c/0x82
> [<c0202af5>] syscall_call+0x7/0xb
> c/r: FILE users 4 != count 6
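
The guard Oren describes (refuse to checkpoint a pipe into itself) can be illustrated from user space. This is only a sketch under stated assumptions: the kernel code compares against ctx->file directly, whereas the hypothetical helper same_inode() below compares the (st_dev, st_ino) pair reported by fstat(), which is the closest user-space analogue. The helper name and its return convention are invented for illustration and are not part of the ckpt patches.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the ckpt patches): report whether two
 * descriptors refer to the same inode.
 * Returns 1 if they do, 0 if they do not, -1 if either fstat() fails.
 */
static int same_inode(int fd_a, int fd_b)
{
    struct stat a, b;

    if (fstat(fd_a, &a) != 0 || fstat(fd_b, &b) != 0)
        return -1;

    /* An inode is identified by its device plus its inode number. */
    return (a.st_dev == b.st_dev && a.st_ino == b.st_ino) ? 1 : 0;
}
```

A caller would check same_inode(pipe_fd, output_fd) before taking any lock and bail out on 1 or -1. Note that this only rules out self-deadlock on a single inode; it says nothing about lock ordering between two different inodes.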
* Re: lockdep complaint from ckpt-v15
@ 2009-05-08 20:20 Nathan Lynch
From: Nathan Lynch @ 2009-05-08 20:20 UTC
To: Oren Laadan; +Cc: containers-qjLDD68F18O7TbgM5vRIOg

Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> writes:

> Hi Nathan,
>
> This was observed before:
> https://lists.linux-foundation.org/pipermail/containers/2009-February/015595.html
> (and you replied :)

Heh, thought it seemed familiar.

> I believe it's a false alarm: the code takes the pipe's inode mutex, then
> takes a mutex of the same class (for a different inode) when writing data
> out to the output file descriptor.
>
> This could create a deadlock if the user provides the pipe's fd as the output
> file; however, I protect against that by explicitly checking that the file of
> the pipe isn't the file in ctx->file.

That will prevent deadlock on that one inode, yes, but I believe lockdep
is pointing to a different problem. Locks of the same class have to be
acquired in some pre-determined order, and lockdep has to be taught that
order. In other words, if it's okay to acquire these inode mutexes in
this order, it must be forbidden to acquire them in the reverse order.
Otherwise you have the potential for AB-BA deadlock. At least, that's
my impression after spending some time with the lockdep and vfs locking
docs... hope I'm right :)

There are vfs operations (rename etc.) which need to acquire multiple
inodes' i_mutex in the same class -- the ordering rules are expressed in
include/linux/fs.h:inode_i_mutex_lock_class and with use of
mutex_lock_nested() in e.g. fs/namei.c:lock_rename(). But I don't see
immediately how to apply these mechanisms to this situation.
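
The ordering rule Nathan describes can be sketched in user space with pthreads. The assumption here is that address order is used as the fixed order; that is one common choice, not what lock_rename() or the checkpoint code does. The helpers lock_pair() and unlock_pair() are hypothetical names. In the kernel the second acquisition would additionally go through mutex_lock_nested() with a subclass, so that lockdep knows the nesting is intentional; that annotation has no pthread equivalent and is not shown.

```c
#include <pthread.h>
#include <stdint.h>

/*
 * Hypothetical helper: take two mutexes of the same "class" in one
 * fixed global order (lowest address first). Every caller that uses
 * this helper agrees on the order, so the AB-BA pattern cannot occur.
 */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if (a == b) {                       /* same lock: take it only once */
        pthread_mutex_lock(a);
        return;
    }
    if ((uintptr_t)a > (uintptr_t)b) {  /* normalise to address order */
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release order does not matter for deadlock avoidance. */
static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    if (a != b)
        pthread_mutex_unlock(b);
}
```

With this in place, lock_pair(&x, &y) and lock_pair(&y, &x) acquire the two mutexes in the same sequence. The difficulty Nathan points at remains: the checkpoint path takes the second i_mutex deep inside vfs_write(), so there is no single call site where such an ordering could be imposed.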