* page fault deadlock
From: Xiaotian Feng @ 2013-11-28 3:25 UTC
To: Tejun Heo, Andrew Morton, neilb, gregkh; +Cc: linux-kernel
Hi,
After I upgraded to the latest kernel, my system started to hang. It
is reproducible in my VirtualBox guest: each time I mount my
RAID6 partition and then run vi or build a kernel, the whole system
locks up very soon.
After turning on lockdep, I got the following lockdep warning:
[ 27.848462]
[ 27.848471] ======================================================
[ 27.848477] [ INFO: possible circular locking dependency detected ]
[ 27.848484] 3.13.0-rc1+ #1 Tainted: GF W
[ 27.848490] -------------------------------------------------------
[ 27.848496] Xorg/1268 is trying to acquire lock:
[ 27.848501] (&of->mutex){+.+.+.}, at: [<ffffffff8125d58f>] sysfs_bin_mmap+0x4f/0x120
[ 27.848516]
[ 27.848516] but task is already holding lock:
[ 27.848521] (&mm->mmap_sem){++++++}, at: [<ffffffff811875bf>] vm_mmap_pgoff+0x6f/0xc0
[ 27.848534]
[ 27.848534] which lock already depends on the new lock.
[ 27.848534]
[ 27.848541]
[ 27.848541] the existing dependency chain (in reverse order) is:
[ 27.848547]
[ 27.848547] -> #2 (&mm->mmap_sem){++++++}:
[ 27.848556] [<ffffffff810c0510>] lock_acquire+0xb0/0x160
[ 27.848564] [<ffffffff8119177c>] might_fault+0x8c/0xb0
[ 27.848572] [<ffffffff815f4c08>] md_ioctl+0xa78/0x19b0
[ 27.848580] [<ffffffff813915a4>] blkdev_ioctl+0x234/0x840
[ 27.848588] [<ffffffff8121db61>] block_ioctl+0x41/0x50
[ 27.848597] [<ffffffff811f5330>] do_vfs_ioctl+0x300/0x520
[ 27.848605] [<ffffffff811f55d1>] SyS_ioctl+0x81/0xa0
[ 27.848613] [<ffffffff81784e98>] tracesys+0xe1/0xe6
[ 27.848622]
[ 27.848622] -> #1 (&mddev->reconfig_mutex){+.+.+.}:
[ 27.848630] [<ffffffff810c0510>] lock_acquire+0xb0/0x160
[ 27.848637] [<ffffffff81778568>] mutex_lock_interruptible_nested+0x78/0x610
[ 27.848646] [<ffffffff815e9750>] rdev_attr_show+0x40/0x90
[ 27.848654] [<ffffffff8125db2a>] sysfs_seq_show+0xda/0x170
[ 27.848662] [<ffffffff812076f4>] seq_read+0x164/0x3e0
[ 27.848671] [<ffffffff811e1005>] vfs_read+0x95/0x160
[ 27.848680] [<ffffffff811e1b19>] SyS_read+0x49/0xa0
[ 27.848687] [<ffffffff81784e98>] tracesys+0xe1/0xe6
[ 27.848695]
[ 27.848695] -> #0 (&of->mutex){+.+.+.}:
[ 27.848703] [<ffffffff810bfd47>] __lock_acquire+0x1587/0x1ca0
[ 27.848711] [<ffffffff810c0510>] lock_acquire+0xb0/0x160
[ 27.848718] [<ffffffff81778048>] mutex_lock_nested+0x68/0x510
[ 27.848725] [<ffffffff8125d58f>] sysfs_bin_mmap+0x4f/0x120
[ 27.848732] [<ffffffff8119d82d>] mmap_region+0x3ed/0x5d0
[ 27.848741] [<ffffffff8119dd5e>] do_mmap_pgoff+0x34e/0x3d0
[ 27.848748] [<ffffffff811875e0>] vm_mmap_pgoff+0x90/0xc0
[ 27.848755] [<ffffffff8119c2b5>] SyS_mmap_pgoff+0x1d5/0x270
[ 27.848763] [<ffffffff8101ae52>] SyS_mmap+0x22/0x30
[ 27.848771] [<ffffffff81784e98>] tracesys+0xe1/0xe6
[ 27.848778]
[ 27.848778] other info that might help us debug this:
[ 27.848778]
[ 27.848785] Chain exists of:
[ 27.848785] &of->mutex --> &mddev->reconfig_mutex --> &mm->mmap_sem
[ 27.848785]
[ 27.848795] Possible unsafe locking scenario:
[ 27.848795]
[ 27.848800]        CPU0                    CPU1
[ 27.848805]        ----                    ----
[ 27.848810]   lock(&mm->mmap_sem);
[ 27.848817]                                lock(&mddev->reconfig_mutex);
[ 27.848824]                                lock(&mm->mmap_sem);
[ 27.848830]   lock(&of->mutex);
[ 27.848837]
[ 27.848837] *** DEADLOCK ***
[ 27.848837]
[ 27.848844] 1 lock held by Xorg/1268:
[ 27.848849] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff811875bf>] vm_mmap_pgoff+0x6f/0xc0
[ 27.848861]
[ 27.848861] stack backtrace:
[ 27.848868] CPU: 1 PID: 1268 Comm: Xorg Tainted: GF W 3.13.0-rc1+ #1
[ 27.848873] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 27.848879] ffffffff822daa00 ffff8800d0371bc8 ffffffff817725f7 ffffffff822cbdc0
[ 27.848901] ffff8800d0371c08 ffffffff8176d9eb ffff8800d0371c60 ffff880115b42a78
[ 27.848909] 0000000000000000 ffff880115b42a78 ffff880115b422a0 0000000000000001
[ 27.848918] Call Trace:
[ 27.848930] [<ffffffff817725f7>] dump_stack+0x4e/0x7a
[ 27.848942] [<ffffffff8176d9eb>] print_circular_bug+0x1f9/0x208
[ 27.848952] [<ffffffff810bfd47>] __lock_acquire+0x1587/0x1ca0
[ 27.848964] [<ffffffff8101955f>] ? print_context_stack+0x8f/0x100
[ 27.848975] [<ffffffff810c0510>] lock_acquire+0xb0/0x160
[ 27.848986] [<ffffffff8125d58f>] ? sysfs_bin_mmap+0x4f/0x120
[ 27.848996] [<ffffffff8125d58f>] ? sysfs_bin_mmap+0x4f/0x120
[ 27.849007] [<ffffffff81778048>] mutex_lock_nested+0x68/0x510
[ 27.849016] [<ffffffff8125d58f>] ? sysfs_bin_mmap+0x4f/0x120
[ 27.849027] [<ffffffff8176456e>] ? kmemleak_alloc+0x4e/0xb0
[ 27.849038] [<ffffffff8125d58f>] sysfs_bin_mmap+0x4f/0x120
[ 27.849048] [<ffffffff8119d82d>] mmap_region+0x3ed/0x5d0
[ 27.849058] [<ffffffff8119dd5e>] do_mmap_pgoff+0x34e/0x3d0
[ 27.849070] [<ffffffff811875e0>] vm_mmap_pgoff+0x90/0xc0
[ 27.849080] [<ffffffff8119c2b5>] SyS_mmap_pgoff+0x1d5/0x270
[ 27.849092] [<ffffffff81023c55>] ? syscall_trace_enter+0x145/0x270
[ 27.849102] [<ffffffff8101ae52>] SyS_mmap+0x22/0x30
[ 27.849112] [<ffffffff81784e98>] tracesys+0xe1/0xe6
I think it is a real deadlock, and it is caused by commit
3124eb1679b28726 "sysfs: merge regular and bin file handling".
With that commit, sysfs_bin_mmap takes of->mutex.
So assume CPU0 entered sysfs_bin_mmap with mmap_sem already held
(taken in vm_mmap_pgoff) and is trying to get of->mutex.
CPU1 called sysfs_seq_show, acquired of->mutex, and is trying to
get mddev->reconfig_mutex.
CPU2 called md_ioctl, acquired mddev->reconfig_mutex, and
later called copy_from_user, where a page fault tries to get mmap_sem.
Each CPU now waits for a lock held by another one: DEADLOCK. I can't
test the effect of reverting 3124eb16, as it is part of a whole
patchset and many commits were applied on top of it. But I do
believe it's buggy and the root cause of my system hang.
CPU0:                  CPU1:                          CPU2:
lock(&mm->mmap_sem)
                       lock(&of->mutex);
                                                      lock(&mddev->reconfig_mutex)
                                                      lock(&mm->mmap_sem)
                       lock(&mddev->reconfig_mutex)
lock(&of->mutex)
Can we revert commit 3124eb167, or is there a patch that solves this
page fault deadlock? Thanks.
Best Regards
Xiaotian
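The three-way cycle described in the report above can be checked mechanically. What follows is only a minimal sketch in Python, not the kernel's lockdep implementation: the `LockOrderGraph` class and its `acquire` method are hypothetical names introduced for illustration. It records "lock B was taken while holding lock A" as an edge A -> B and reports a cycle when a new edge closes one. The three edges fed in are the ones named by the lockdep chain entries #1, #2 and #0.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical, not the lockdep code).

    An edge A -> B means: some task acquired B while already holding A.
    """

    def __init__(self):
        self.edges = defaultdict(set)

    def _path(self, src, dst, seen):
        """Return the list of locks from src to dst along recorded edges, or None."""
        if src == dst:
            return [src]
        seen.add(src)
        for nxt in sorted(self.edges[src]):
            if nxt not in seen:
                rest = self._path(nxt, dst, seen)
                if rest is not None:
                    return [src] + rest
        return None

    def acquire(self, held, wanted):
        """Record 'wanted taken under held'; return the cycle this closes, or None."""
        # If 'held' is already reachable from 'wanted', the new edge closes a cycle.
        back = self._path(wanted, held, set())
        self.edges[held].add(wanted)
        if back is None:
            return None
        return [held] + back


graph = LockOrderGraph()
# Chain entry #1 in the report: rdev_attr_show takes reconfig_mutex under of->mutex.
first = graph.acquire("of->mutex", "mddev->reconfig_mutex")
# Chain entry #2: md_ioctl may fault (mmap_sem) while holding reconfig_mutex.
second = graph.acquire("mddev->reconfig_mutex", "mm->mmap_sem")
# Chain entry #0: sysfs_bin_mmap takes of->mutex with mmap_sem already held.
cycle = graph.acquire("mm->mmap_sem", "of->mutex")
if cycle:
    print("possible circular locking dependency: " + " --> ".join(cycle))
```

The first two calls add edges without closing a loop. Only the third, the new acquisition in sysfs_bin_mmap, finds that mmap_sem is already reachable from of->mutex. That is why the warning fires on the mmap path even though no task is blocked at that moment.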
* Re: page fault deadlock
From: Greg KH @ 2013-11-28 4:11 UTC
To: Xiaotian Feng; +Cc: Tejun Heo, Andrew Morton, neilb, linux-kernel
On Thu, Nov 28, 2013 at 11:25:32AM +0800, Xiaotian Feng wrote:
> [...]
Can you try linux-next, this should be fixed with a patch in my tree
there, thanks.
greg k-h
* Re: page fault deadlock
From: Xiaotian Feng @ 2013-11-28 4:30 UTC
To: Greg KH; +Cc: Tejun Heo, Andrew Morton, neilb, linux-kernel
On Thu, Nov 28, 2013 at 12:11 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Thu, Nov 28, 2013 at 11:25:32AM +0800, Xiaotian Feng wrote:
>> [...]
>
> Can you try linux-next, this should be fixed with a patch in my tree
> there, thanks.
>
Okay, building now. I'll update when I get the result, thanks.
> greg k-h
* Re: page fault deadlock
From: Xiaotian Feng @ 2013-11-28 7:28 UTC
To: Greg KH; +Cc: Tejun Heo, Andrew Morton, neilb, linux-kernel
On Thu, Nov 28, 2013 at 12:11 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Thu, Nov 28, 2013 at 11:25:32AM +0800, Xiaotian Feng wrote:
>> [...]
>
> Can you try linux-next, this should be fixed with a patch in my tree
> there, thanks.
>
Sorry, it's even worse. My whole system locks up when I try to
mount /dev/md0 :(
> greg k-h
* Re: page fault deadlock
From: Greg KH @ 2013-11-28 19:17 UTC
To: Xiaotian Feng; +Cc: Tejun Heo, Andrew Morton, neilb, linux-kernel
On Thu, Nov 28, 2013 at 03:28:39PM +0800, Xiaotian Feng wrote:
> On Thu, Nov 28, 2013 at 12:11 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> > On Thu, Nov 28, 2013 at 11:25:32AM +0800, Xiaotian Feng wrote:
> >> [...]
> >
> > Can you try linux-next, this should be fixed with a patch in my tree
> > there, thanks.
> >
>
> Sorry, it's even worse. My whole system locks up when I try to
> mount /dev/md0 :(
Ok, that sounds like some other problem. Can you try Linus's tree now,
the sysfs patch is now in it.
greg k-h
* Re: page fault deadlock
2013-11-28 19:17 ` Greg KH
@ 2013-11-29 7:38 ` Xiaotian Feng
0 siblings, 0 replies; 6+ messages in thread
From: Xiaotian Feng @ 2013-11-29 7:38 UTC (permalink / raw)
To: Greg KH; +Cc: Tejun Heo, Andrew Morton, neilb, linux-kernel

On Fri, Nov 29, 2013 at 3:17 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Thu, Nov 28, 2013 at 03:28:39PM +0800, Xiaotian Feng wrote:
>> On Thu, Nov 28, 2013 at 12:11 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
>> > On Thu, Nov 28, 2013 at 11:25:32AM +0800, Xiaotian Feng wrote:
>> >> [...]
>> >>
>> >> Can we revert commit 3124eb167? or any patches to solve this page
>> >> fault deadlock? Thanks.
>> >
>> > Can you try linux-next, this should be fixed with a patch in my tree
>> > there, thanks.
>> >
>>
>> Sorry, It's even worse. My whole system lockup when I'm trying to
>> mount /dev/md0 :(
>
> Ok, that sounds like some other problem.
>
> Can you try Linus's tree now, the sysfs patch is now in it.

Yes, the lockdep warning disappeared and my system doesn't freeze on
file operations on my /dev/md0. Thanks.

>
> greg k-h

^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-11-29 7:39 UTC | newest]

Thread overview: 6+ messages
2013-11-28 3:25 page fault deadlock Xiaotian Feng
2013-11-28 4:11 ` Greg KH
2013-11-28 4:30 ` Xiaotian Feng
2013-11-28 7:28 ` Xiaotian Feng
2013-11-28 19:17 ` Greg KH
2013-11-29 7:38 ` Xiaotian Feng