* INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19
@ 2011-08-22 3:41 Justin P. Mattock
2011-08-22 13:09 ` Josh Boyer
0 siblings, 1 reply; 8+ messages in thread
From: Justin P. Mattock @ 2011-08-22 3:41 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org
yikes.. seems the latest Mainline doesn't like rhythmbox or vice versa.
[ 68.476921] =======================================================
[ 68.476926] [ INFO: possible circular locking dependency detected ]
[ 68.476929] 3.1.0-rc2-00190-g3210d19 #7
[ 68.476931] -------------------------------------------------------
[ 68.476934] rhythmbox/1597 is trying to acquire lock:
[ 68.476937] (&sb->s_type->i_mutex_key#8){+.+.+.}, at:
[<ffffffff8119702e>] ext4_evict_inode+0x76/0x33c
[ 68.476950]
[ 68.476950] but task is already holding lock:
[ 68.476953] (&mm->mmap_sem){++++++}, at: [<ffffffff810fcb08>]
sys_munmap+0x3b/0x60
[ 68.476960]
[ 68.476961] which lock already depends on the new lock.
[ 68.476962]
[ 68.476964]
[ 68.476965] the existing dependency chain (in reverse order) is:
[ 68.476968]
[ 68.476968] -> #1 (&mm->mmap_sem){++++++}:
[ 68.476973] [<ffffffff810819d0>] lock_acquire+0x106/0x15b
[ 68.476979] [<ffffffff810f5fa3>] might_fault+0x89/0xac
[ 68.476984] [<ffffffff8113716b>] filldir+0x6f/0xc7
[ 68.476990] [<ffffffff8118df2b>] call_filldir+0x96/0xbd
[ 68.476994] [<ffffffff8118e258>] ext4_readdir+0x1b4/0x515
[ 68.476998] [<ffffffff811373c0>] vfs_readdir+0x7b/0xb1
[ 68.477003] [<ffffffff811374dc>] sys_getdents+0x7e/0xce
[ 68.477007] [<ffffffff814c6042>] system_call_fastpath+0x16/0x1b
[ 68.477008]
[ 68.477008] -> #0 (&sb->s_type->i_mutex_key#8){+.+.+.}:
[ 68.477008] [<ffffffff810811fa>] __lock_acquire+0xa06/0xce3
[ 68.477008] [<ffffffff810819d0>] lock_acquire+0x106/0x15b
[ 68.477008] [<ffffffff814bd955>] __mutex_lock_common+0x61/0x380
[ 68.477008] [<ffffffff814bdd83>] mutex_lock_nested+0x40/0x45
[ 68.477008] [<ffffffff8119702e>] ext4_evict_inode+0x76/0x33c
[ 68.477008] [<ffffffff8113d249>] evict+0x99/0x153
[ 68.477008] [<ffffffff8113d494>] iput+0x191/0x19a
[ 68.477008] [<ffffffff8113a155>] dentry_kill+0x123/0x145
[ 68.477008] [<ffffffff8113a564>] dput+0xf7/0x107
[ 68.477008] [<ffffffff8112970c>] fput+0x1ce/0x1e6
[ 68.477008] [<ffffffff810fb7cf>] remove_vma+0x56/0x87
[ 68.477008] [<ffffffff810fc995>] do_munmap+0x2f2/0x30b
[ 68.477008] [<ffffffff810fcb16>] sys_munmap+0x49/0x60
[ 68.477008] [<ffffffff814c6042>] system_call_fastpath+0x16/0x1b
[ 68.477008]
[ 68.477008] other info that might help us debug this:
[ 68.477008]
[ 68.477008] Possible unsafe locking scenario:
[ 68.477008]
[ 68.477008]        CPU0                    CPU1
[ 68.477008]        ----                    ----
[ 68.477008]   lock(&mm->mmap_sem);
[ 68.477008]                                lock(&sb->s_type->i_mutex_key);
[ 68.477008]                                lock(&mm->mmap_sem);
[ 68.477008]   lock(&sb->s_type->i_mutex_key);
[ 68.477008]
[ 68.477008] *** DEADLOCK ***
[ 68.477008]
[ 68.477008] 1 lock held by rhythmbox/1597:
[ 68.477008] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff810fcb08>]
sys_munmap+0x3b/0x60
[ 68.477008]
[ 68.477008] stack backtrace:
[ 68.477008] Pid: 1597, comm: rhythmbox Not tainted
3.1.0-rc2-00190-g3210d19 #7
[ 68.477008] Call Trace:
[ 68.477008] [<ffffffff814b50da>] print_circular_bug+0x1f8/0x209
[ 68.477008] [<ffffffff810811fa>] __lock_acquire+0xa06/0xce3
[ 68.477008] [<ffffffff8119702e>] ? ext4_evict_inode+0x76/0x33c
[ 68.477008] [<ffffffff810819d0>] lock_acquire+0x106/0x15b
[ 68.477008] [<ffffffff8119702e>] ? ext4_evict_inode+0x76/0x33c
[ 68.477008] [<ffffffff814bd955>] __mutex_lock_common+0x61/0x380
[ 68.477008] [<ffffffff8119702e>] ? ext4_evict_inode+0x76/0x33c
[ 68.477008] [<ffffffff8113d219>] ? evict+0x69/0x153
[ 68.477008] [<ffffffff8119702e>] ? ext4_evict_inode+0x76/0x33c
[ 68.477008] [<ffffffff8113d219>] ? evict+0x69/0x153
[ 68.477008] [<ffffffff81081893>] ? lock_release+0x1a9/0x1e0
[ 68.477008] [<ffffffff814bdd83>] mutex_lock_nested+0x40/0x45
[ 68.477008] [<ffffffff8119702e>] ext4_evict_inode+0x76/0x33c
[ 68.477008] [<ffffffff8113d249>] evict+0x99/0x153
[ 68.477008] [<ffffffff8113d494>] iput+0x191/0x19a
[ 68.477008] [<ffffffff8113a155>] dentry_kill+0x123/0x145
[ 68.477008] [<ffffffff8113a564>] dput+0xf7/0x107
[ 68.477008] [<ffffffff8112970c>] fput+0x1ce/0x1e6
[ 68.477008] [<ffffffff810fb7cf>] remove_vma+0x56/0x87
[ 68.477008] [<ffffffff810fc995>] do_munmap+0x2f2/0x30b
[ 68.477008] [<ffffffff810fcb16>] sys_munmap+0x49/0x60
[ 68.477008] [<ffffffff814c6042>] system_call_fastpath+0x16/0x1b
[ 69.728185] ata1: lost interrupt (Status 0x59)
[ 69.728226] ata1: drained 8 bytes to clear DRQ
[ 69.728240] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
frozen
[ 69.728250] sr 0:0:0:0: CDB: Get event status notification: 4a 01 00
00 10 00 00 00 08 00
[ 69.728288] ata1.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0
pio 16392 in
[ 69.728291] res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask
0x4 (timeout)
[ 69.728299] ata1.00: status: { DRDY }
[ 74.738348] ata1: link is slow to respond, please be patient (ready=0)
[ 79.772091] ata1: device not ready (errno=-16), forcing hardreset
[ 79.772110] ata1: soft resetting link
[ 79.934483] ata1.00: configured for UDMA/66
[ 84.934264] ata1.00: qc timeout (cmd 0xa0)
[ 84.934276] ata1.00: TEST_UNIT_READY failed (err_mask=0x5)
[ 84.934327] ata1: soft resetting link
[ 85.094470] ata1.00: configured for UDMA/66
[ 90.094089] ata1.00: qc timeout (cmd 0xa0)
[ 90.094108] ata1.00: TEST_UNIT_READY failed (err_mask=0x5)
[ 90.094113] ata1.00: limiting speed to UDMA/66:PIO3
[ 90.094153] ata1: soft resetting link
[ 90.274440] ata1.00: configured for UDMA/66
[ 95.274071] ata1.00: qc timeout (cmd 0xa0)
[ 95.274084] ata1.00: TEST_UNIT_READY failed (err_mask=0x5)
[ 95.274091] ata1.00: disabled
[ 95.274157] ata1: soft resetting link
[ 95.425185] ata1: EH complete
full dmesg here:
http://fpaste.org/Hxog/
Justin P. Mattock
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19
2011-08-22 3:41 INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19 Justin P. Mattock
@ 2011-08-22 13:09 ` Josh Boyer
2011-08-22 13:16 ` Al Viro
0 siblings, 1 reply; 8+ messages in thread
From: Josh Boyer @ 2011-08-22 13:09 UTC (permalink / raw)
To: Justin P. Mattock, Alexander Viro, Peter Zijlstra
Cc: linux-kernel@vger.kernel.org

On Sun, Aug 21, 2011 at 11:41 PM, Justin P. Mattock <justinmattock@gmail.com> wrote:
> yikes.. seems the latest Mainline doesn't like rhythmbox or vice versa.
>
> [ 68.476921] =======================================================
> [ 68.476926] [ INFO: possible circular locking dependency detected ]
> [ 68.476929] 3.1.0-rc2-00190-g3210d19 #7
> [ 68.476931] -------------------------------------------------------
> [ 68.476934] rhythmbox/1597 is trying to acquire lock:
> [ 68.476937] (&sb->s_type->i_mutex_key#8){+.+.+.}, at:
> [<ffffffff8119702e>] ext4_evict_inode+0x76/0x33c
> [ 68.476950]
> [ 68.476950] but task is already holding lock:
> [ 68.476953] (&mm->mmap_sem){++++++}, at: [<ffffffff810fcb08>]
> sys_munmap+0x3b/0x60
> [ 68.476960]
> [ 68.476961] which lock already depends on the new lock.
> [ 68.476962]
> [ 68.476964]
> [ 68.476965] the existing dependency chain (in reverse order) is:
> [ 68.476968]
> [ 68.476968] -> #1 (&mm->mmap_sem){++++++}:
> [ 68.476973] [<ffffffff810819d0>] lock_acquire+0x106/0x15b
> [ 68.476979] [<ffffffff810f5fa3>] might_fault+0x89/0xac
> [ 68.476984] [<ffffffff8113716b>] filldir+0x6f/0xc7
> [ 68.476990] [<ffffffff8118df2b>] call_filldir+0x96/0xbd
> [ 68.476994] [<ffffffff8118e258>] ext4_readdir+0x1b4/0x515
> [ 68.476998] [<ffffffff811373c0>] vfs_readdir+0x7b/0xb1
> [ 68.477003] [<ffffffff811374dc>] sys_getdents+0x7e/0xce
> [ 68.477007] [<ffffffff814c6042>] system_call_fastpath+0x16/0x1b
> [ 68.477008]
> [ 68.477008] -> #0 (&sb->s_type->i_mutex_key#8){+.+.+.}:
> [ 68.477008] [<ffffffff810811fa>] __lock_acquire+0xa06/0xce3
> [ 68.477008] [<ffffffff810819d0>] lock_acquire+0x106/0x15b
> [ 68.477008] [<ffffffff814bd955>] __mutex_lock_common+0x61/0x380
> [ 68.477008] [<ffffffff814bdd83>] mutex_lock_nested+0x40/0x45
> [ 68.477008] [<ffffffff8119702e>] ext4_evict_inode+0x76/0x33c
> [ 68.477008] [<ffffffff8113d249>] evict+0x99/0x153
> [ 68.477008] [<ffffffff8113d494>] iput+0x191/0x19a
> [ 68.477008] [<ffffffff8113a155>] dentry_kill+0x123/0x145
> [ 68.477008] [<ffffffff8113a564>] dput+0xf7/0x107
> [ 68.477008] [<ffffffff8112970c>] fput+0x1ce/0x1e6
> [ 68.477008] [<ffffffff810fb7cf>] remove_vma+0x56/0x87
> [ 68.477008] [<ffffffff810fc995>] do_munmap+0x2f2/0x30b
> [ 68.477008] [<ffffffff810fcb16>] sys_munmap+0x49/0x60
> [ 68.477008] [<ffffffff814c6042>] system_call_fastpath+0x16/0x1b
> [ 68.477008]
> [ 68.477008] other info that might help us debug this:
> [ 68.477008]
> [ 68.477008] Possible unsafe locking scenario:
> [ 68.477008]
> [ 68.477008]        CPU0                    CPU1
> [ 68.477008]        ----                    ----
> [ 68.477008]   lock(&mm->mmap_sem);
> [ 68.477008]                                lock(&sb->s_type->i_mutex_key);
> [ 68.477008]                                lock(&mm->mmap_sem);
> [ 68.477008]   lock(&sb->s_type->i_mutex_key);
> [ 68.477008]
> [ 68.477008] *** DEADLOCK ***
> [ 68.477008]
> [ 68.477008] 1 lock held by rhythmbox/1597:
> [ 68.477008] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff810fcb08>]
> sys_munmap+0x3b/0x60
> [ 68.477008]
> [ 68.477008] stack backtrace:
> [ 68.477008] Pid: 1597, comm: rhythmbox Not tainted
> 3.1.0-rc2-00190-g3210d19 #7

We've had a report of this on 3.0.1 as well. Slightly different
scenario and fs, but the locks in question are the same.
https://bugzilla.redhat.com/show_bug.cgi?id=730998

It seems that with CONFIG_PROVE_LOCKING on, might_fault will always
attempt to grab mm->mmap_sem. The common flow here is that getdents
calls filldir, which calls copy_to_user, which is what is calling
might_fault.

Beyond that, I'm a bit over my head at the moment because I don't know
if the VFS is right and we just need some more lockdep annotations or
if there really is a problem.

josh

^ permalink raw reply [flat|nested] 8+ messages in thread
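[Editorial sketch, not part of the original message.] The flow Josh describes can be modelled with a toy lock-order validator. This is a minimal sketch under stated assumptions, not kernel code: the class `ToyLockdep`, its methods, and the short lock-class names `"i_mutex"` and `"mmap_sem"` are hypothetical names chosen for illustration. The point it tries to show is that a `might_fault()`-style annotation can record "mmap_sem may be taken under whatever is held now" without any fault actually happening, so a later munmap path that takes the locks in the opposite order gets flagged.

```python
from collections import defaultdict


class ToyLockdep:
    """Toy validator tracking 'B acquired while A held' edges between lock classes."""

    def __init__(self):
        self.held = []                   # lock classes currently held, oldest first
        self.after = defaultdict(set)    # after[a] = classes acquired while a was held

    def _reaches(self, src, dst):
        # Depth-first search over recorded edges: can we get from src to dst?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.after[node])
        return False

    def note_acquire(self, cls):
        """Record held -> cls edges; return True if doing so closes a cycle."""
        circular = any(self._reaches(cls, h) for h in self.held)
        for h in self.held:
            self.after[h].add(cls)
        return circular

    def acquire(self, cls):
        circular = self.note_acquire(cls)
        self.held.append(cls)
        return circular

    def release(self, cls):
        self.held.remove(cls)

    def might_fault(self):
        # Annotation only: behave as if mmap_sem were taken and dropped here,
        # even though no page fault occurs in this model.
        return self.note_acquire("mmap_sem")


ld = ToyLockdep()

# getdents-like path: i_mutex is held while the user copy is annotated.
ld.acquire("i_mutex")
readdir_report = ld.might_fault()      # records i_mutex -> mmap_sem
ld.release("i_mutex")

# munmap-like path: mmap_sem is held, then i_mutex is requested.
ld.acquire("mmap_sem")
munmap_report = ld.acquire("i_mutex")  # opposite order closes the cycle
```

In this model the first path only records an ordering, and the second path is the one that trips the check, which mirrors the shape of the report above (chain entry #1 from the getdents side, entry #0 from the munmap side).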
* Re: INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19
2011-08-22 13:09 ` Josh Boyer
@ 2011-08-22 13:16 ` Al Viro
2011-08-22 13:27 ` Josh Boyer
2011-08-22 13:27 ` Al Viro
0 siblings, 2 replies; 8+ messages in thread
From: Al Viro @ 2011-08-22 13:16 UTC (permalink / raw)
To: Josh Boyer
Cc: Justin P. Mattock, Peter Zijlstra, linux-kernel@vger.kernel.org

On Mon, Aug 22, 2011 at 09:09:14AM -0400, Josh Boyer wrote:

> We've had a report of this on 3.0.1 as well. Slightly different
> scenario and fs, but the locks in question are the same.
> https://bugzilla.redhat.com/show_bug.cgi?id=730998
>
> It seems that with CONFIG_PROVE_LOCKING on, might_fault will always
> attempt to grab mm->mmap_sem. The common flow here is that getdents
> calls filldir, which calls copy_to_user, which is what is calling
> might_fault.
>
> Beyond that, I'm a bit over my head at the moment because I don't know
> if the VFS is right and we just need some more lockdep annotations or
> if there really is a problem.

Don't grab ->i_mutex in ->evict_inode(). Why are you doing that, anyway?

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19
2011-08-22 13:16 ` Al Viro
@ 2011-08-22 13:27 ` Josh Boyer
2011-08-22 13:27 ` Al Viro
1 sibling, 0 replies; 8+ messages in thread
From: Josh Boyer @ 2011-08-22 13:27 UTC (permalink / raw)
To: Al Viro, Theodore Ts'o
Cc: Justin P. Mattock, Peter Zijlstra, linux-kernel@vger.kernel.org

On Mon, Aug 22, 2011 at 9:16 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Mon, Aug 22, 2011 at 09:09:14AM -0400, Josh Boyer wrote:
>
>> We've had a report of this on 3.0.1 as well. Slightly different
>> scenario and fs, but the locks in question are the same.
>> https://bugzilla.redhat.com/show_bug.cgi?id=730998
>>
>> It seems that with CONFIG_PROVE_LOCKING on, might_fault will always
>> attempt to grab mm->mmap_sem. The common flow here is that getdents
>> calls filldir, which calls copy_to_user, which is what is calling
>> might_fault.
>>
>> Beyond that, I'm a bit over my head at the moment because I don't know
>> if the VFS is right and we just need some more lockdep annotations or
>> if there really is a problem.
>
> Don't grab ->i_mutex in ->evict_inode(). Why are you doing that, anyway?

I've no idea. Let's ask Ted!

josh

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19
2011-08-22 13:16 ` Al Viro
2011-08-22 13:27 ` Josh Boyer
@ 2011-08-22 13:27 ` Al Viro
2011-08-22 13:33 ` Josh Boyer
1 sibling, 1 reply; 8+ messages in thread
From: Al Viro @ 2011-08-22 13:27 UTC (permalink / raw)
To: Josh Boyer
Cc: Justin P. Mattock, Peter Zijlstra, linux-kernel@vger.kernel.org

On Mon, Aug 22, 2011 at 02:16:21PM +0100, Al Viro wrote:
> On Mon, Aug 22, 2011 at 09:09:14AM -0400, Josh Boyer wrote:
>
> > We've had a report of this on 3.0.1 as well. Slightly different
> > scenario and fs, but the locks in question are the same.
> > https://bugzilla.redhat.com/show_bug.cgi?id=730998
> >
> > It seems that with CONFIG_PROVE_LOCKING on, might_fault will always
> > attempt to grab mm->mmap_sem. The common flow here is that getdents
> > calls filldir, which calls copy_to_user, which is what is calling
> > might_fault.
> >
> > Beyond that, I'm a bit over my head at the moment because I don't know
> > if the VFS is right and we just need some more lockdep annotations or
> > if there really is a problem.
>
> Don't grab ->i_mutex in ->evict_inode(). Why are you doing that, anyway?

Note, BTW, that readdir() is a red herring here; there is a much more
relevant reason for that ranking. Namely, write() doing copy_from_user()
when the file we are writing into has i_mutex held by us. That can fault
and in this case we have a non-directory inode. While you can't have
directory mmapped, regular files can be mmapped just fine.

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19
2011-08-22 13:27 ` Al Viro
@ 2011-08-22 13:33 ` Josh Boyer
2011-08-22 13:56 ` Al Viro
0 siblings, 1 reply; 8+ messages in thread
From: Josh Boyer @ 2011-08-22 13:33 UTC (permalink / raw)
To: Al Viro
Cc: Justin P. Mattock, Peter Zijlstra, linux-kernel@vger.kernel.org

On Mon, Aug 22, 2011 at 9:27 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Mon, Aug 22, 2011 at 02:16:21PM +0100, Al Viro wrote:
>> On Mon, Aug 22, 2011 at 09:09:14AM -0400, Josh Boyer wrote:
>>
>> > We've had a report of this on 3.0.1 as well. Slightly different
>> > scenario and fs, but the locks in question are the same.
>> > https://bugzilla.redhat.com/show_bug.cgi?id=730998
>> >
>> > It seems that with CONFIG_PROVE_LOCKING on, might_fault will always
>> > attempt to grab mm->mmap_sem. The common flow here is that getdents
>> > calls filldir, which calls copy_to_user, which is what is calling
>> > might_fault.
>> >
>> > Beyond that, I'm a bit over my head at the moment because I don't know
>> > if the VFS is right and we just need some more lockdep annotations or
>> > if there really is a problem.
>>
>> Don't grab ->i_mutex in ->evict_inode(). Why are you doing that, anyway?
>
> Note, BTW, that readdir() is a red herring here; there is a much more
> relevant reason for that ranking. Namely, write() doing copy_from_user()
> when the file we are writing into has i_mutex held by us. That can fault
> and in this case we have a non-directory inode. While you can't have
> directory mmapped, regular files can be mmapped just fine.

So the lockdep report in the RHBZ (which now that I look at it
probably isn't the same as this report) seems to be doing a readdir
while find is trying to mmap, which is calling into
hugetlbfs_file_mmap and throwing the same deadlock warning. Is that
like the scenario you are describing above?

josh

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19
2011-08-22 13:33 ` Josh Boyer
@ 2011-08-22 13:56 ` Al Viro
2011-08-22 15:24 ` Josh Boyer
0 siblings, 1 reply; 8+ messages in thread
From: Al Viro @ 2011-08-22 13:56 UTC (permalink / raw)
To: Josh Boyer
Cc: Justin P. Mattock, Peter Zijlstra, linux-kernel@vger.kernel.org

On Mon, Aug 22, 2011 at 09:33:34AM -0400, Josh Boyer wrote:

> So the lockdep report in the RHBZ (which now that I look at it
> probably isn't the same as this report) seems to be doing a readdir
> while find is trying to mmap, which is calling into
> hugetlbfs_file_mmap and throwing the same deadlock warning. Is that
> like the scenario you are describing above?

Lockdep records the first trace that leads to locks taken in this
order. readdir() seems to be the first thing to step on i_mutex
and mmap_sem (not too surprisingly, come to think of that - directory
reads happening earlier in the boot than regular file writes).

So when it reports i_mutex taken under mmap_sem, readdir gets mentioned
by lockdep. Often leading to comments along the lines of "but this
inode is not a directory at all; shouldn't we relax the rules for
non-directories?" Nope; the same ordering very much applies to regular
files. With s/readdir/write/.

The bottom line is: don't take i_mutex while holding mmap_sem. Really.

^ permalink raw reply [flat|nested] 8+ messages in thread
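[Editorial sketch, not part of the original message.] Al's point that the report names readdir only because it was the first path to establish the ordering can be illustrated with a few lines of bookkeeping. This is a hedged toy model, not how the kernel stores its dependency data: `OrderLog`, `record`, `explain_inversion`, and the context strings are hypothetical names made up for this example. It assumes one stored context per ordered pair of lock classes, with later contexts for the same pair ignored.

```python
class OrderLog:
    """Keeps, per (outer, inner) lock-class pair, only the first context that created it."""

    def __init__(self):
        self.first_seen = {}

    def record(self, outer, inner, context):
        # setdefault keeps the earliest context; later callers do not overwrite it.
        self.first_seen.setdefault((outer, inner), context)

    def explain_inversion(self, outer, inner):
        """Return the stored context for the opposite ordering (inner before outer), if any."""
        return self.first_seen.get((inner, outer))


log = OrderLog()

# Early in boot: a directory read establishes i_mutex -> mmap_sem first.
log.record("i_mutex", "mmap_sem", "sys_getdents -> filldir -> might_fault")

# Later: a regular-file write establishes the same ordering; nothing new is stored.
log.record("i_mutex", "mmap_sem", "write -> copy_from_user -> might_fault")

# A path that takes i_mutex under mmap_sem is explained with the stored context.
blamed = log.explain_inversion("mmap_sem", "i_mutex")
```

Here `blamed` ends up being the getdents context, even though the write path created exactly the same ordering. That is why seeing readdir in the chain says nothing about whether the inode involved is a directory.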
* Re: INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19
2011-08-22 13:56 ` Al Viro
@ 2011-08-22 15:24 ` Josh Boyer
0 siblings, 0 replies; 8+ messages in thread
From: Josh Boyer @ 2011-08-22 15:24 UTC (permalink / raw)
To: Al Viro
Cc: Justin P. Mattock, Peter Zijlstra, linux-kernel@vger.kernel.org

On Mon, Aug 22, 2011 at 9:56 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Mon, Aug 22, 2011 at 09:33:34AM -0400, Josh Boyer wrote:
>
>> So the lockdep report in the RHBZ (which now that I look at it
>> probably isn't the same as this report) seems to be doing a readdir
>> while find is trying to mmap, which is calling into
>> hugetlbfs_file_mmap and throwing the same deadlock warning. Is that
>> like the scenario you are describing above?
>
> Lockdep records the first trace that leads to locks taken in this
> order. readdir() seems to be the first thing to step on i_mutex
> and mmap_sem (not too surprisingly, come to think of that - directory
> reads happening earlier in the boot than regular file writes).
>
> So when it reports i_mutex taken under mmap_sem, readdir gets mentioned
> by lockdep. Often leading to comments along the lines of "but this
> inode is not a directory at all; shouldn't we relax the rules for
> non-directories?" Nope; the same ordering very much applies to regular
> files. With s/readdir/write/.
>
> The bottom line is: don't take i_mutex while holding mmap_sem. Really.

OK, thanks. It seems this particular hugetlbfs issue was reported a
while ago here:

https://lkml.org/lkml/2011/4/15/272

I'll go poke that thread a bit. That just leaves the ext4 evict case,
which hopefully Ted can answer.

josh

^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-08-22 15:24 UTC | newest]

Thread overview: 8+ messages
2011-08-22  3:41 INFO: possible circular locking dependency detected 3.1.0-rc2-00190-g3210d19 Justin P. Mattock
2011-08-22 13:09 ` Josh Boyer
2011-08-22 13:16   ` Al Viro
2011-08-22 13:27     ` Josh Boyer
2011-08-22 13:27     ` Al Viro
2011-08-22 13:33       ` Josh Boyer
2011-08-22 13:56         ` Al Viro
2011-08-22 15:24           ` Josh Boyer