* lockdep detects possible deadlock scenario
@ 2017-07-20 11:29 Jerry Lee
2017-07-20 12:16 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Jerry Lee @ 2017-07-20 11:29 UTC (permalink / raw)
To: linux-ext4, Theodore Ts'o, jack
Hi,
I hit the following lockdep trace on linux-4.2.8 and can reproduce it
reliably on some of my machines. Although the trace shows up, the file
system works fine and I have not seen any operation get stuck on it.
Does that mean the trace is just a false alarm? Thanks.
BTW, I have seen similar traces on the mailing list before and found
the patch "ext4: add lockdep annotations for i_data_sem"
(daf647d2dd58), which is already included in my kernel.
======================================================
<4>[ 205.633705] [ INFO: possible circular locking dependency detected ]
<4>[ 205.639962] 4.2.8 #3 Tainted: G W O
<4>[ 205.644395] -------------------------------------------------------
<4>[ 205.650650] rm/19302 is trying to acquire lock:
<4>[ 205.655174] (&s->s_dquot.dqio_mutex){+.+...}, at: [<ffffffff81250678>] dquot_commit+0x28/0xc0
<4>[ 205.663835]
<4>[ 205.663835] but task is already holding lock:
<4>[ 205.669659] (&ei->i_data_sem){++++..}, at: [<ffffffff812b2329>] ext4_truncate+0x379/0x680
<4>[ 205.677960]
<4>[ 205.677960] which lock already depends on the new lock.
<4>[ 205.677960]
<4>[ 205.686119]
<4>[ 205.686119] the existing dependency chain (in reverse order) is:
<4>[ 205.693586]
<4>[ 205.693586] -> #1 (&ei->i_data_sem){++++..}:
<4>[ 205.698071] [<ffffffff810cde05>] lock_acquire+0xd5/0x280
<4>[ 205.703995] [<ffffffff81c82727>] down_read+0x47/0x60
<4>[ 205.709573] [<ffffffff812acefb>] ext4_map_blocks+0x48b/0x5f0
<4>[ 205.715845] [<ffffffff812ad653>] ext4_getblk+0x43/0x190
<4>[ 205.721682] [<ffffffff812ad7ae>] ext4_bread+0xe/0xa0
<4>[ 205.727260] [<ffffffff812c225d>] ext4_quota_read+0xcd/0x110
<4>[ 205.733442] [<ffffffff81254687>] read_blk+0x47/0x50
<4>[ 205.738933] [<ffffffff81255452>] find_tree_dqentry+0x42/0x230
<4>[ 205.745290] [<ffffffff812555b4>] find_tree_dqentry+0x1a4/0x230
<4>[ 205.751730] [<ffffffff812555b4>] find_tree_dqentry+0x1a4/0x230
<4>[ 205.758172] [<ffffffff812555b4>] find_tree_dqentry+0x1a4/0x230
<4>[ 205.764612] [<ffffffff81255773>] qtree_read_dquot+0x133/0x260
<4>[ 205.770968] [<ffffffff81253d89>] v2_read_dquot+0x29/0x30
<4>[ 205.776892] [<ffffffff8124f7f6>] dquot_acquire+0xe6/0x130
<4>[ 205.782902] [<ffffffff812c1c9a>] ext4_acquire_dquot+0x6a/0xb0
<4>[ 205.789258] [<ffffffff81251970>] dqget+0x3c0/0x420
<4>[ 205.794662] [<ffffffff81251afd>] __dquot_initialize+0x12d/0x230
<4>[ 205.801187] [<ffffffff81251c0e>] dquot_initialize+0xe/0x10
<4>[ 205.807283] [<ffffffff812d215b>] ext4_fill_super+0x2d9b/0x3150
<4>[ 205.813729] [<ffffffff811e0f00>] mount_bdev+0x180/0x1b0
<4>[ 205.819566] [<ffffffff812c18c0>] ext4_mount+0x10/0x20
<4>[ 205.825230] [<ffffffff811e17a4>] mount_fs+0x14/0xa0
<4>[ 205.830719] [<ffffffff81203736>] vfs_kern_mount+0x66/0x150
<4>[ 205.836815] [<ffffffff812066a5>] do_mount+0x1e5/0xd00
<4>[ 205.842476] [<ffffffff812074b6>] SyS_mount+0x86/0xc0
<4>[ 205.848048] [<ffffffff81c84bd7>] entry_SYSCALL_64_fastpath+0x12/0x6f
<4>[ 205.855010]
<4>[ 205.855010] -> #0 (&s->s_dquot.dqio_mutex){+.+...}:
<4>[ 205.860103] [<ffffffff810ccccc>] __lock_acquire+0x1fdc/0x23a0
<4>[ 205.866464] [<ffffffff810cde05>] lock_acquire+0xd5/0x280
<4>[ 205.872393] [<ffffffff81c80990>] mutex_lock_nested+0x60/0x370
<4>[ 205.878753] [<ffffffff81250678>] dquot_commit+0x28/0xc0
<4>[ 205.884594] [<ffffffff812c1d4e>] ext4_write_dquot+0x6e/0xa0
<4>[ 205.890783] [<ffffffff812c1dbe>] ext4_mark_dquot_dirty+0x3e/0x60
<4>[ 205.897408] [<ffffffff81250857>] __dquot_free_space+0x147/0x310
<4>[ 205.903943] [<ffffffff812ebe0d>] ext4_free_blocks+0x77d/0x1010
<4>[ 205.910390] [<ffffffff812dbb66>] ext4_ext_remove_space+0x8f6/0x16a0
<4>[ 205.917271] [<ffffffff812de49f>] ext4_ext_truncate+0xaf/0xe0
<4>[ 205.923547] [<ffffffff812b23f0>] ext4_truncate+0x440/0x680
<4>[ 205.929651] [<ffffffff812b2a9f>] ext4_evict_inode+0x46f/0x730
<4>[ 205.936017] [<ffffffff811fe473>] evict+0xb3/0x180
<4>[ 205.941337] [<ffffffff811fecc7>] iput+0x187/0x350
<4>[ 205.946662] [<ffffffff811f0243>] do_unlinkat+0x163/0x340
<4>[ 205.952588] [<ffffffff811f0481>] SyS_unlink+0x11/0x20
<4>[ 205.958258] [<ffffffff81c84bd7>] entry_SYSCALL_64_fastpath+0x12/0x6f
<4>[ 205.965226]
<4>[ 205.965226] other info that might help us debug this:
<4>[ 205.965226]
<4>[ 205.973217] Possible unsafe locking scenario:
<4>[ 205.973217]
<4>[ 205.979127]        CPU0                    CPU1
<4>[ 205.983652]        ----                    ----
<4>[ 205.988174]   lock(&ei->i_data_sem);
<4>[ 205.991772]                                lock(&s->s_dquot.dqio_mutex);
<4>[ 205.998487]                                lock(&ei->i_data_sem);
<4>[ 206.004596]   lock(&s->s_dquot.dqio_mutex);
<4>[ 206.008795]
<4>[ 206.008795] *** DEADLOCK ***
<4>[ 206.008795]
<4>[ 206.014708] 5 locks held by rm/19302:
<4>[ 206.018364] #0: (sb_writers#10){.+.+.+}, at: [<ffffffff81204bbf>] mnt_want_write+0x1f/0x50
<4>[ 206.026870] #1: (sb_internal){.+.+..}, at: [<ffffffff812b27a9>] ext4_evict_inode+0x179/0x730
<4>[ 206.035538] #2: (jbd2_handle){+.+...}, at: [<ffffffff81306cc1>] start_this_handle+0x191/0x630
<4>[ 206.044298] #3: (&ei->i_data_sem){++++..}, at: [<ffffffff812b2329>] ext4_truncate+0x379/0x680
<4>[ 206.053052] #4: (dquot_srcu){......}, at: [<ffffffff8125076a>] __dquot_free_space+0x5a/0x310
<4>[ 206.061719]
<4>[ 206.061719] stack backtrace:
<4>[ 206.066072] CPU: 0 PID: 19302 Comm: rm Tainted: G W O 4.2.8 #3
<4>[ 206.072848] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS QC30AR23 08/14/2014
<4>[ 206.082486] ffffffff82effc40 ffff88004cebf7d8 ffffffff81c767eb 0000000000000007
<4>[ 206.089919] ffffffff82effc40 ffff88004cebf828 ffffffff81c739cd ffff880044724e08
<4>[ 206.097364] ffff88004cebf898 ffff88004cebf828 0000000000000005 ffff880044724640
<4>[ 206.104807] Call Trace:
<4>[ 206.107257] [<ffffffff81c767eb>] dump_stack+0x4c/0x65
<4>[ 206.112393] [<ffffffff81c739cd>] print_circular_bug+0x202/0x213
<4>[ 206.118394] [<ffffffff810ccccc>] __lock_acquire+0x1fdc/0x23a0
<4>[ 206.124222] [<ffffffff810cde05>] lock_acquire+0xd5/0x280
<4>[ 206.129616] [<ffffffff81250678>] ? dquot_commit+0x28/0xc0
<4>[ 206.135099] [<ffffffff81c80990>] mutex_lock_nested+0x60/0x370
<4>[ 206.140931] [<ffffffff81250678>] ? dquot_commit+0x28/0xc0
<4>[ 206.146414] [<ffffffff812c1d3a>] ? ext4_write_dquot+0x5a/0xa0
<4>[ 206.152245] [<ffffffff8130784a>] ? jbd2__journal_start+0x1a/0x20
<4>[ 206.158766] [<ffffffff81250678>] dquot_commit+0x28/0xc0
<4>[ 206.164074] [<ffffffff812c1d4e>] ext4_write_dquot+0x6e/0xa0
<4>[ 206.169731] [<ffffffff812c1dbe>] ext4_mark_dquot_dirty+0x3e/0x60
<4>[ 206.175821] [<ffffffff81250857>] __dquot_free_space+0x147/0x310
<4>[ 206.181825] [<ffffffff8125076a>] ? __dquot_free_space+0x5a/0x310
<4>[ 206.187917] [<ffffffff812ebc62>] ? ext4_free_blocks+0x5d2/0x1010
<4>[ 206.194005] [<ffffffff812ebe0d>] ext4_free_blocks+0x77d/0x1010
<4>[ 206.199920] [<ffffffff810ca7e1>] ? mark_held_locks+0x71/0x90
<4>[ 206.205662] [<ffffffff811cadc6>] ? __kmalloc+0xa6/0x5d0
<4>[ 206.210972] [<ffffffff810c8f4d>] ? __lock_is_held+0x4d/0x70
<4>[ 206.216627] [<ffffffff812db2ca>] ? ext4_ext_remove_space+0x5a/0x16a0
<4>[ 206.223061] [<ffffffff812dbb66>] ext4_ext_remove_space+0x8f6/0x16a0
<4>[ 206.229412] [<ffffffff812de49f>] ext4_ext_truncate+0xaf/0xe0
<4>[ 206.235157] [<ffffffff812b23f0>] ext4_truncate+0x440/0x680
<4>[ 206.240723] [<ffffffff812b2a9f>] ext4_evict_inode+0x46f/0x730
<4>[ 206.246551] [<ffffffff811fe473>] evict+0xb3/0x180
<4>[ 206.251339] [<ffffffff811fecc7>] iput+0x187/0x350
<4>[ 206.256129] [<ffffffff811f0243>] do_unlinkat+0x163/0x340
<4>[ 206.261525] [<ffffffff812041c0>] ? mnt_get_count+0x60/0x60
<4>[ 206.267092] [<ffffffff81002044>] ? lockdep_sys_exit_thunk+0x12/0x14
<4>[ 206.273441] [<ffffffff811f0481>] SyS_unlink+0x11/0x20
<4>[ 206.278578] [<ffffffff81c84bd7>] entry_SYSCALL_64_fastpath+0x12/0x6f
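The report above reduces to two orderings between lock classes, one
recorded at mount time and one during unlink. The sketch below is a toy
model of that check in Python, not lockdep's actual implementation:
find_cycle() and the edge list are illustrative names, and the
dqio_mutex -> i_data_sem edge assumes that dquot_acquire() holds
dqio_mutex while the quota file is read, as the "Possible unsafe locking
scenario" table implies.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return a list of lock classes forming a cycle, or None.

    Each edge is (held_class, acquired_class): "acquired while held".
    """
    graph = defaultdict(set)
    for held, acquired in edges:
        graph[held].add(acquired)

    def dfs(node, path, on_path):
        for nxt in sorted(graph[node]):
            if nxt in on_path:
                # Close the loop: slice the path from the repeated class.
                return path[path.index(nxt):] + [nxt]
            found = dfs(nxt, path + [nxt], on_path | {nxt})
            if found:
                return found
        return None

    for start in sorted(graph):
        cycle = dfs(start, [start], {start})
        if cycle:
            return cycle
    return None


# Orderings taken from the two stack traces in the report.
edges = [
    # chain #1 (mount): dquot_acquire() ... ext4_quota_read() -> ext4_map_blocks()
    ("&s->s_dquot.dqio_mutex", "&ei->i_data_sem"),
    # chain #0 (unlink): ext4_truncate() ... dquot_commit()
    ("&ei->i_data_sem", "&s->s_dquot.dqio_mutex"),
]

cycle = find_cycle(edges)
print(" -> ".join(cycle) if cycle else "no cycle")
```

The cycle exists only at lock-class granularity. The model says nothing
about whether the two i_data_sem acquisitions ever involve the same
inode.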
* Re: lockdep detects possible deadlock scenario
2017-07-20 11:29 lockdep detects possible deadlock scenario Jerry Lee
@ 2017-07-20 12:16 ` Jan Kara
2017-07-21 1:41 ` Jerry Lee
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2017-07-20 12:16 UTC (permalink / raw)
To: Jerry Lee; +Cc: linux-ext4, Theodore Ts'o, jack
Hi!
On Thu 20-07-17 19:29:28, Jerry Lee wrote:
> I hit the following lockdep trace on linux-4.2.8 and can reproduce it
> reliably on some of my machines. Although the trace shows up, the file
> system works fine and I have not seen any operation get stuck on it.
> Does that mean the trace is just a false alarm? Thanks.
>
> BTW, I have seen similar traces on the mailing list before and found
> the patch "ext4: add lockdep annotations for i_data_sem"
> (daf647d2dd58), which is already included in my kernel.
I don't think that patch is included in the kernel reporting this trace -
in the trace, the ei->i_data_sem acquired on the quota file (the first
stack trace) does not use the special I_DATA_SEM_QUOTA locking class that
commit daf647d2dd58 introduced, and it should. In either case, this report
is a false positive.
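
A rough way to see why the report can be a false positive, and what the
annotation changes: lockdep orders lock classes, not individual locks.
The Python toy model below is only a sketch. The inode labels are
hypothetical, and the value 2 for I_DATA_SEM_QUOTA is an assumption about
the enum in the patch, used here purely for illustration.

```python
# A lock is modelled as (lock name, owner it belongs to, lockdep subclass).
DQIO = ("dqio_mutex", "superblock", 0)


def observations(quota_subclass):
    """Return the two (held, acquired) pairs seen in the report."""
    quota_sem = ("i_data_sem", "quota file inode", quota_subclass)
    file_sem = ("i_data_sem", "inode being unlinked", 0)
    return [
        (DQIO, quota_sem),  # mount: dquot_acquire() -> ext4_quota_read()
        (file_sem, DQIO),   # unlink: ext4_truncate() -> dquot_commit()
    ]


def by_instance(lock):
    """Identity of the concrete lock: what could really deadlock."""
    return lock[:2]


def by_class(lock):
    """Identity of the lock class: what lockdep tracks."""
    return (lock[0], lock[2])


def has_inversion(obs, key):
    """True if some pair A -> B also appears as B -> A under this key."""
    edges = {(key(held), key(acquired)) for held, acquired in obs}
    return any((b, a) in edges for a, b in edges)


print("no annotation, per class:   ", has_inversion(observations(0), by_class))
print("no annotation, per instance:", has_inversion(observations(0), by_instance))
print("quota subclass, per class:  ", has_inversion(observations(2), by_class))
```

Only the first case reports an inversion: without a separate subclass,
the quota file's i_data_sem and a regular file's i_data_sem collapse into
one class, even though they are different locks.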
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: lockdep detects possible deadlock scenario
2017-07-20 12:16 ` Jan Kara
@ 2017-07-21 1:41 ` Jerry Lee
2017-07-24 8:35 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Jerry Lee @ 2017-07-21 1:41 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-ext4, Theodore Ts'o, jack
Hi Jan,
On 20 July 2017 at 20:16, Jan Kara <jack@suse.cz> wrote:
> Hi!
>
> On Thu 20-07-17 19:29:28, Jerry Lee wrote:
>> I hit the following lockdep trace on linux-4.2.8 and can reproduce it
>> reliably on some of my machines. Although the trace shows up, the file
>> system works fine and I have not seen any operation get stuck on it.
>> Does that mean the trace is just a false alarm? Thanks.
>>
>> BTW, I have seen similar traces on the mailing list before and found
>> the patch "ext4: add lockdep annotations for i_data_sem"
>> (daf647d2dd58), which is already included in my kernel.
>
> I don't think that patch is included in the kernel reporting this trace -
> in the trace, the ei->i_data_sem acquired on the quota file (the first
> stack trace) does not use the special I_DATA_SEM_QUOTA locking class that
> commit daf647d2dd58 introduced, and it should. In either case, this report
> is a false positive.
>
> Honza
You are right that the patch is not included in the mainline linux-4.2.8
kernel. I'm sorry that I didn't describe my setup clearly in my previous
post. Before I sent the mail, I found the patch and back-ported it to my
kernel to get rid of the possible false positive. But even with the
patch, I still get the trace. Does that mean I missed some other patches
when back-porting this one to my kernel?
Thanks for your quick reply.
* Re: lockdep detects possible deadlock scenario
2017-07-21 1:41 ` Jerry Lee
@ 2017-07-24 8:35 ` Jan Kara
2017-07-25 8:24 ` Jerry Lee
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2017-07-24 8:35 UTC (permalink / raw)
To: Jerry Lee; +Cc: Jan Kara, linux-ext4, Theodore Ts'o, jack
On Fri 21-07-17 09:41:54, Jerry Lee wrote:
> On 20 July 2017 at 20:16, Jan Kara <jack@suse.cz> wrote:
> > Hi!
> >
> > On Thu 20-07-17 19:29:28, Jerry Lee wrote:
> >> I hit the following lockdep trace on linux-4.2.8 and can reproduce it
> >> reliably on some of my machines. Although the trace shows up, the file
> >> system works fine and I have not seen any operation get stuck on it.
> >> Does that mean the trace is just a false alarm? Thanks.
> >>
> >> BTW, I have seen similar traces on the mailing list before and found
> >> the patch "ext4: add lockdep annotations for i_data_sem"
> >> (daf647d2dd58), which is already included in my kernel.
> >
> > I don't think that patch is included in the kernel reporting this trace -
> > in the trace, the ei->i_data_sem acquired on the quota file (the first
> > stack trace) does not use the special I_DATA_SEM_QUOTA locking class that
> > commit daf647d2dd58 introduced, and it should. In either case, this report
> > is a false positive.
> >
> > Honza
>
> You are right that the patch is not included in the mainline linux-4.2.8
> kernel. I'm sorry that I didn't describe my setup clearly in my previous
> post. Before I sent the mail, I found the patch and back-ported it to my
> kernel to get rid of the possible false positive. But even with the
> patch, I still get the trace. Does that mean I missed some other patches
> when back-porting this one to my kernel?
Well, I'm not sure. Was it the same trace? There's a follow-up fix for
commit daf647d2dd58 - commit 964edf66bf9ab "ext4: clear lockdep subtype
for quota files on quota off" - so you may be missing that one.
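
One way to compare two reports is to look at the lock class names in the
"-> #N (...)" chain headers. As far as I know, lockdep appends "/N" to a
class name when a non-zero subclass is in use, so a chain entry for an
annotated quota file would carry such a suffix. The Python sketch below
extracts the names; chain_classes() is an illustrative helper, and the
"annotated" line is a hypothetical example, not output from any real
kernel.

```python
import re

# Matches chain headers such as "-> #1 (&ei->i_data_sem){++++..}:"
CHAIN_RE = re.compile(r"-> #(\d+) \((.+)\)\{[^}]*\}:")


def chain_classes(trace):
    """Return [(chain position, class name, subclass)] per chain header."""
    result = []
    for line in trace.splitlines():
        match = CHAIN_RE.search(line)
        if not match:
            continue
        # Assumes the class name itself contains no "/".
        name, _, sub = match.group(2).partition("/")
        result.append((int(match.group(1)), name, int(sub) if sub.isdigit() else 0))
    return result


# Chain headers copied from the report in this thread.
report = """\
<4>[ 205.693586] -> #1 (&ei->i_data_sem){++++..}:
<4>[ 205.855010] -> #0 (&s->s_dquot.dqio_mutex){+.+...}:
"""

# Hypothetical header showing what a subclass suffix would look like.
annotated = "-> #1 (&ei->i_data_sem/2){++++..}:"

for position, name, subclass in chain_classes(report):
    print(position, name, "subclass", subclass)
```

In the posted report both entries come back with subclass 0, which
matches the observation that the quota file's i_data_sem did not use a
separate class there.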
Honza
> Thanks for your quick reply.
>
> >
> >>
> >> ======================================================
> >> <4>[ 205.633705] [ INFO: possible circular locking dependency detected ]
> >> <4>[ 205.639962] 4.2.8 #3 Tainted: G W O
> >> <4>[ 205.644395] -------------------------------------------------------
> >> <4>[ 205.650650] rm/19302 is trying to acquire lock:
> >> <4>[ 205.655174] (&s->s_dquot.dqio_mutex){+.+...}, at:
> >> [<ffffffff81250678>] dquot_commit+0x28/0xc0
> >> <4>[ 205.663835]
> >> <4>[ 205.663835] but task is already holding lock:
> >> <4>[ 205.669659] (&ei->i_data_sem){++++..}, at: [<ffffffff812b2329>]
> >> ext4_truncate+0x379/0x680
> >> <4>[ 205.677960]
> >> <4>[ 205.677960] which lock already depends on the new lock.
> >> <4>[ 205.677960]
> >> <4>[ 205.686119]
> >> <4>[ 205.686119] the existing dependency chain (in reverse order) is:
> >> <4>[ 205.693586]
> >> <4>[ 205.693586] -> #1 (&ei->i_data_sem){++++..}:
> >> <4>[ 205.698071] [<ffffffff810cde05>] lock_acquire+0xd5/0x280
> >> <4>[ 205.703995] [<ffffffff81c82727>] down_read+0x47/0x60
> >> <4>[ 205.709573] [<ffffffff812acefb>] ext4_map_blocks+0x48b/0x5f0
> >> <4>[ 205.715845] [<ffffffff812ad653>] ext4_getblk+0x43/0x190
> >> <4>[ 205.721682] [<ffffffff812ad7ae>] ext4_bread+0xe/0xa0
> >> <4>[ 205.727260] [<ffffffff812c225d>] ext4_quota_read+0xcd/0x110
> >> <4>[ 205.733442] [<ffffffff81254687>] read_blk+0x47/0x50
> >> <4>[ 205.738933] [<ffffffff81255452>] find_tree_dqentry+0x42/0x230
> >> <4>[ 205.745290] [<ffffffff812555b4>] find_tree_dqentry+0x1a4/0x230
> >> <4>[ 205.751730] [<ffffffff812555b4>] find_tree_dqentry+0x1a4/0x230
> >> <4>[ 205.758172] [<ffffffff812555b4>] find_tree_dqentry+0x1a4/0x230
> >> <4>[ 205.764612] [<ffffffff81255773>] qtree_read_dquot+0x133/0x260
> >> <4>[ 205.770968] [<ffffffff81253d89>] v2_read_dquot+0x29/0x30
> >> <4>[ 205.776892] [<ffffffff8124f7f6>] dquot_acquire+0xe6/0x130
> >> <4>[ 205.782902] [<ffffffff812c1c9a>] ext4_acquire_dquot+0x6a/0xb0
> >> <4>[ 205.789258] [<ffffffff81251970>] dqget+0x3c0/0x420
> >> <4>[ 205.794662] [<ffffffff81251afd>] __dquot_initialize+0x12d/0x230
> >> <4>[ 205.801187] [<ffffffff81251c0e>] dquot_initialize+0xe/0x10
> >> <4>[ 205.807283] [<ffffffff812d215b>] ext4_fill_super+0x2d9b/0x3150
> >> <4>[ 205.813729] [<ffffffff811e0f00>] mount_bdev+0x180/0x1b0
> >> <4>[ 205.819566] [<ffffffff812c18c0>] ext4_mount+0x10/0x20
> >> <4>[ 205.825230] [<ffffffff811e17a4>] mount_fs+0x14/0xa0
> >> <4>[ 205.830719] [<ffffffff81203736>] vfs_kern_mount+0x66/0x150
> >> <4>[ 205.836815] [<ffffffff812066a5>] do_mount+0x1e5/0xd00
> >> <4>[ 205.842476] [<ffffffff812074b6>] SyS_mount+0x86/0xc0
> >> <4>[ 205.848048] [<ffffffff81c84bd7>]
> >> entry_SYSCALL_64_fastpath+0x12/0x6f
> >> <4>[ 205.855010]
> >> <4>[ 205.855010] -> #0 (&s->s_dquot.dqio_mutex){+.+...}:
> >> <4>[ 205.860103] [<ffffffff810ccccc>] __lock_acquire+0x1fdc/0x23a0
> >> <4>[ 205.866464] [<ffffffff810cde05>] lock_acquire+0xd5/0x280
> >> <4>[ 205.872393] [<ffffffff81c80990>] mutex_lock_nested+0x60/0x370
> >> <4>[ 205.878753] [<ffffffff81250678>] dquot_commit+0x28/0xc0
> >> <4>[ 205.884594] [<ffffffff812c1d4e>] ext4_write_dquot+0x6e/0xa0
> >> <4>[ 205.890783] [<ffffffff812c1dbe>] ext4_mark_dquot_dirty+0x3e/0x60
> >> <4>[ 205.897408] [<ffffffff81250857>] __dquot_free_space+0x147/0x310
> >> <4>[ 205.903943] [<ffffffff812ebe0d>] ext4_free_blocks+0x77d/0x1010
> >> <4>[ 205.910390] [<ffffffff812dbb66>] ext4_ext_remove_space+0x8f6/0x16a0
> >> <4>[ 205.917271] [<ffffffff812de49f>] ext4_ext_truncate+0xaf/0xe0
> >> <4>[ 205.923547] [<ffffffff812b23f0>] ext4_truncate+0x440/0x680
> >> <4>[ 205.929651] [<ffffffff812b2a9f>] ext4_evict_inode+0x46f/0x730
> >> <4>[ 205.936017] [<ffffffff811fe473>] evict+0xb3/0x180
> >> <4>[ 205.941337] [<ffffffff811fecc7>] iput+0x187/0x350
> >> <4>[ 205.946662] [<ffffffff811f0243>] do_unlinkat+0x163/0x340
> >> <4>[ 205.952588] [<ffffffff811f0481>] SyS_unlink+0x11/0x20
> >> <4>[ 205.958258] [<ffffffff81c84bd7>]
> >> entry_SYSCALL_64_fastpath+0x12/0x6f
> >> <4>[ 205.965226]
> >> <4>[ 205.965226] other info that might help us debug this:
> >> <4>[ 205.965226]
> >> <4>[ 205.973217] Possible unsafe locking scenario:
> >> <4>[ 205.973217]
> >> <4>[ 205.979127] CPU0 CPU1
> >> <4>[ 205.983652] ---- ----
> >> <4>[ 205.988174] lock(&ei->i_data_sem);
> >> <4>[ 205.991772] lock(&s->s_dquot.dqio_mutex);
> >> <4>[ 205.998487] lock(&ei->i_data_sem);
> >> <4>[ 206.004596] lock(&s->s_dquot.dqio_mutex);
> >> <4>[ 206.008795]
> >> <4>[ 206.008795] *** DEADLOCK ***
> >> <4>[ 206.008795]
> >> <4>[ 206.014708] 5 locks held by rm/19302:
> >> <4>[ 206.018364] #0: (sb_writers#10){.+.+.+}, at:
> >> [<ffffffff81204bbf>] mnt_want_write+0x1f/0x50
> >> <4>[ 206.026870] #1: (sb_internal){.+.+..}, at:
> >> [<ffffffff812b27a9>] ext4_evict_inode+0x179/0x730
> >> <4>[ 206.035538] #2: (jbd2_handle){+.+...}, at:
> >> [<ffffffff81306cc1>] start_this_handle+0x191/0x630
> >> <4>[ 206.044298] #3: (&ei->i_data_sem){++++..}, at:
> >> [<ffffffff812b2329>] ext4_truncate+0x379/0x680
> >> <4>[ 206.053052] #4: (dquot_srcu){......}, at: [<ffffffff8125076a>]
> >> __dquot_free_space+0x5a/0x310
> >> <4>[ 206.061719]
> >> <4>[ 206.061719] stack backtrace:
> >> <4>[ 206.066072] CPU: 0 PID: 19302 Comm: rm Tainted: G W O 4.2.8 #3
> >> <4>[ 206.072848] Hardware name: To be filled by O.E.M. To be filled
> >> by O.E.M./MAHOBAY, BIOS QC30AR23 08/14/2014
> >> <4>[ 206.082486] ffffffff82effc40 ffff88004cebf7d8 ffffffff81c767eb
> >> 0000000000000007
> >> <4>[ 206.089919] ffffffff82effc40 ffff88004cebf828 ffffffff81c739cd
> >> ffff880044724e08
> >> <4>[ 206.097364] ffff88004cebf898 ffff88004cebf828 0000000000000005
> >> ffff880044724640
> >> <4>[ 206.104807] Call Trace:
> >> <4>[ 206.107257] [<ffffffff81c767eb>] dump_stack+0x4c/0x65
> >> <4>[ 206.112393] [<ffffffff81c739cd>] print_circular_bug+0x202/0x213
> >> <4>[ 206.118394] [<ffffffff810ccccc>] __lock_acquire+0x1fdc/0x23a0
> >> <4>[ 206.124222] [<ffffffff810cde05>] lock_acquire+0xd5/0x280
> >> <4>[ 206.129616] [<ffffffff81250678>] ? dquot_commit+0x28/0xc0
> >> <4>[ 206.135099] [<ffffffff81c80990>] mutex_lock_nested+0x60/0x370
> >> <4>[ 206.140931] [<ffffffff81250678>] ? dquot_commit+0x28/0xc0
> >> <4>[ 206.146414] [<ffffffff812c1d3a>] ? ext4_write_dquot+0x5a/0xa0
> >> <4>[ 206.152245] [<ffffffff8130784a>] ? jbd2__journal_start+0x1a/0x20
> >> <4>[ 206.158766] [<ffffffff81250678>] dquot_commit+0x28/0xc0
> >> <4>[ 206.164074] [<ffffffff812c1d4e>] ext4_write_dquot+0x6e/0xa0
> >> <4>[ 206.169731] [<ffffffff812c1dbe>] ext4_mark_dquot_dirty+0x3e/0x60
> >> <4>[ 206.175821] [<ffffffff81250857>] __dquot_free_space+0x147/0x310
> >> <4>[ 206.181825] [<ffffffff8125076a>] ? __dquot_free_space+0x5a/0x310
> >> <4>[ 206.187917] [<ffffffff812ebc62>] ? ext4_free_blocks+0x5d2/0x1010
> >> <4>[ 206.194005] [<ffffffff812ebe0d>] ext4_free_blocks+0x77d/0x1010
> >> <4>[ 206.199920] [<ffffffff810ca7e1>] ? mark_held_locks+0x71/0x90
> >> <4>[ 206.205662] [<ffffffff811cadc6>] ? __kmalloc+0xa6/0x5d0
> >> <4>[ 206.210972] [<ffffffff810c8f4d>] ? __lock_is_held+0x4d/0x70
> >> <4>[ 206.216627] [<ffffffff812db2ca>] ? ext4_ext_remove_space+0x5a/0x16a0
> >> <4>[ 206.223061] [<ffffffff812dbb66>] ext4_ext_remove_space+0x8f6/0x16a0
> >> <4>[ 206.229412] [<ffffffff812de49f>] ext4_ext_truncate+0xaf/0xe0
> >> <4>[ 206.235157] [<ffffffff812b23f0>] ext4_truncate+0x440/0x680
> >> <4>[ 206.240723] [<ffffffff812b2a9f>] ext4_evict_inode+0x46f/0x730
> >> <4>[ 206.246551] [<ffffffff811fe473>] evict+0xb3/0x180
> >> <4>[ 206.251339] [<ffffffff811fecc7>] iput+0x187/0x350
> >> <4>[ 206.256129] [<ffffffff811f0243>] do_unlinkat+0x163/0x340
> >> <4>[ 206.261525] [<ffffffff812041c0>] ? mnt_get_count+0x60/0x60
> >> <4>[ 206.267092] [<ffffffff81002044>] ? lockdep_sys_exit_thunk+0x12/0x14
> >> <4>[ 206.273441] [<ffffffff811f0481>] SyS_unlink+0x11/0x20
> >> <4>[ 206.278578] [<ffffffff81c84bd7>] entry_SYSCALL_64_fastpath+0x12/0x6f
> >>
> > --
> > Jan Kara <jack@suse.com>
> > SUSE Labs, CR
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: lockdep detects possible deadlock scenario
2017-07-24 8:35 ` Jan Kara
@ 2017-07-25 8:24 ` Jerry Lee
0 siblings, 0 replies; 5+ messages in thread
From: Jerry Lee @ 2017-07-25 8:24 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-ext4, Theodore Ts'o, jack
Hi Jan,
On 24 July 2017 at 16:35, Jan Kara <jack@suse.cz> wrote:
> On Fri 21-07-17 09:41:54, Jerry Lee wrote:
>> On 20 July 2017 at 20:16, Jan Kara <jack@suse.cz> wrote:
>> > Hi!
>> >
>> > On Thu 20-07-17 19:29:28, Jerry Lee wrote:
>> >> I hit the following lockdep trace on linux-4.2.8 and I could steadily
>> >> re-produce it on some of my machine. Although the trace shows up, the
>> >> file system works quite well without seeing any operations being stuck
>> >> on it. Does it mean that the trace is just a false alarm? Thanks.
>> >>
>> >> BTW, I've saw some similar traces previously in the mailing list and
>> >> found that the patch, "ext4: add lockdep annotations for i_data_sem
>> >> (daf647d2dd58)", which is already included in my kernel.
>> >
>> > I don't think that patch is included in the kernel reporting this trace -
>> > from the trace ei->i_data_sem obtained on quota file (the first stack
>> > trace) did not use the special I_DATA_SEM_QUOTA locking class which commit
>> > daf647d2dd58 introduced and it should have... In either case this report
>> > is a false positive.
>> >
>> > Honza
>>
>> You are right that the patch is not included in the linux-4.2.8 kernel
>> on the mainstream. I'm sorry that I didn't clearly describe my setup
>> in previous post. Before I sent the mail, I found the patch and
>> back-ported it to my kernel to get rid of possible false positive.
>> But, with the patch, I still got the trace. Does it mean that I miss
>> some other patches when directly back-porting the patch on my kernel?
>
> Well, I'm not sure. Was it the same trace? There's followup fix for commit
> daf647d2dd58 - commit 964edf66bf9ab "ext4: clear lockdep subtype for quota
> files on quota off" so you may miss that one.
>
> Honza
Hmm, it was the same trace. I had already noticed commit 964edf66bf9ab
"ext4: clear lockdep subtype for quota files on quota off" and tried it
on my kernel with the modification below. Unfortunately, the same trace
still occurred. Anyway, I will spend some time figuring out the issue in
my environment. Thanks for your suggestions and help :-)
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5887,6 +5887,7 @@ static int ext4_quota_off(struct super_block *sb, int type)
 {
 	struct inode *inode = sb_dqopt(sb)->files[type];
 	handle_t *handle;
+	int err;
 
 	/* Force all delayed allocation blocks to be allocated.
 	 * Caller already holds s_umount sem */
@@ -5896,6 +5897,10 @@ static int ext4_quota_off(struct super_block *sb, int type)
 	if (!inode)
 		goto out;
 
+	err = dquot_quota_off(sb, type);
+	if (err)
+		goto out_restore;
+
 	/* Update modification times of quota files when userspace can
 	 * start looking at them */
 	handle = ext4_journal_start(inode, EXT4_HT_QUOTA, 1);
@@ -5905,6 +5910,9 @@ static int ext4_quota_off(struct super_block *sb, int type)
 	ext4_mark_inode_dirty(handle, inode);
 	ext4_journal_stop(handle);
 
+out_restore:
+	lockdep_set_quota_inode(inode, I_DATA_SEM_NORMAL);
+	return err;
 out:
 	return dquot_quota_off(sb, type);
 }
end of thread, newest message 2017-07-25 8:24 UTC
Thread overview: 5+ messages
2017-07-20 11:29 lockdep detects possible deadlock scenario Jerry Lee
2017-07-20 12:16 ` Jan Kara
2017-07-21 1:41 ` Jerry Lee
2017-07-24 8:35 ` Jan Kara
2017-07-25 8:24 ` Jerry Lee