* stalls with latest btrfs merge into 3.0-rc2
From: Jim Schutt @ 2011-06-13 21:07 UTC
To: linux-btrfs; +Cc: ceph-devel
Hi,
On a system under a heavy write load from multiple ceph OSDs,
I'm running into the following hung tasks where btrfs is implicated.
I'm running commit 3c25fa740e2 from Linus' tree merged with
commit cb9b41c92fa from git://ceph.newdream.net/git/ceph-client.git.
Let me know what else I can do to help sort this out.
[ 961.318047] INFO: task kworker/1:2:2346 blocked for more than 120 seconds.
[ 961.324993] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 961.332891] ffff8802253dbcd0 0000000000000046 ffff8802253dbcb0 ffff88021c138000
[ 961.340398] ffff880222c3dac0 ffff880222c3dac0 00000000000108c0 ffff88021c138000
[ 961.347893] ffff880222c3de80 0000000000000001 ffff8802253dbd00 ffffffff813b0f69
[ 961.355384] Call Trace:
[ 961.357838] [<ffffffff813b0f69>] schedule+0x164/0x19e
[ 961.363041] [<ffffffffa066069c>] btrfs_start_ordered_extent+0xa8/0xc4 [btrfs]
[ 961.370268] [<ffffffff81066544>] ? list_del_init+0x21/0x21
[ 961.376075] [<ffffffffa0660f93>] btrfs_wait_ordered_extents+0xd8/0x143 [btrfs]
[ 961.383387] [<ffffffffa0647065>] btrfs_commit_transaction+0x20b/0x5a4 [btrfs]
[ 961.390642] [<ffffffff81066544>] ? list_del_init+0x21/0x21
[ 961.396284] [<ffffffffa064741d>] do_async_commit+0x1f/0x2c [btrfs]
[ 961.402638] [<ffffffff81060bba>] process_one_work+0x124/0x1e0
[ 961.408478] [<ffffffffa06473fe>] ? btrfs_commit_transaction+0x5a4/0x5a4 [btrfs]
[ 961.415917] [<ffffffff81062219>] ? destroy_workqueue+0x161/0x161
[ 961.422155] [<ffffffff810622a8>] worker_thread+0x8f/0x124
[ 961.427642] [<ffffffff8106603f>] kthread+0x72/0x7a
[ 961.432636] [<ffffffff813b9e94>] kernel_thread_helper+0x4/0x10
[ 961.438561] [<ffffffff813b272d>] ? retint_restore_args+0xe/0xe
[ 961.444530] [<ffffffff81065fcd>] ? kthread_bind+0x53/0x53
[ 961.450100] [<ffffffff813b9e90>] ? gs_change+0xb/0xb
[ 961.455188] INFO: task btrfs-transacti:7653 blocked for more than 120 seconds.
[ 961.462506] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 961.477512] ffff88021c24fd90 0000000000000046 ffff88021c24fd70 ffff880190a65ac0
[ 961.491735] ffff8802246cad60 ffff8802246cad60 00000000000108c0 ffff880190a65ac0
[ 961.505853] ffff8802246cb120 0000000000000001 ffff88021c24fdc0 ffffffff813b0f69
[ 961.521106] Call Trace:
[ 961.526052] [<ffffffff813b0f69>] schedule+0x164/0x19e
[ 961.531910] [<ffffffffa06460b4>] wait_current_trans+0xb8/0xec [btrfs]
[ 961.544101] [<ffffffff81066544>] ? list_del_init+0x21/0x21
[ 961.550394] [<ffffffffa0645cd2>] ? spin_lock+0xe/0x10 [btrfs]
[ 961.556471] [<ffffffffa0647970>] start_transaction+0xd1/0x206 [btrfs]
[ 961.563023] [<ffffffffa0643ec7>] ? btree_readpage_end_io_hook+0x192/0x192 [btrfs]
[ 961.570597] [<ffffffffa0647ae8>] btrfs_join_transaction+0x15/0x17 [btrfs]
[ 961.577609] [<ffffffffa064401b>] transaction_kthread+0x154/0x22d [btrfs]
[ 961.584465] [<ffffffff81036a33>] ? need_resched+0x23/0x2d
[ 961.589967] [<ffffffff8106603f>] kthread+0x72/0x7a
[ 961.594843] [<ffffffff813b9e94>] kernel_thread_helper+0x4/0x10
[ 961.600777] [<ffffffff813b272d>] ? retint_restore_args+0xe/0xe
[ 961.606702] [<ffffffff81065fcd>] ? kthread_bind+0x53/0x53
[ 961.612206] [<ffffffff813b9e90>] ? gs_change+0xb/0xb
[ 961.617322] INFO: task cosd:16719 blocked for more than 120 seconds.
[ 961.623702] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 961.631542] ffff8801bf163cb8 0000000000000086 ffff8801bf163c98 ffff880226db16b0
[ 961.639072] ffff8801df4f5ac0 ffff8801df4f5ac0 00000000000108c0 ffff880226db16b0
[ 961.646516] ffff8801df4f5e80 0000000000000001 ffff8801bf163ce8 ffffffff813b0f69
[ 961.653955] Call Trace:
[ 961.656483] [<ffffffff813b0f69>] schedule+0x164/0x19e
[ 961.661725] [<ffffffffa0645dbd>] wait_current_trans_commit_start_and_unblock+0xa9/0xce [btrfs]
[ 961.670461] [<ffffffff81066544>] ? list_del_init+0x21/0x21
[ 961.676028] [<ffffffff81061512>] ? queue_work+0x1f/0x21
[ 961.681436] [<ffffffffa0647bbd>] btrfs_commit_transaction_async+0xd3/0x115 [btrfs]
[ 961.689118] [<ffffffffa066d8b3>] create_snapshot+0xe5/0x177 [btrfs]
[ 961.695546] [<ffffffffa0671559>] btrfs_mksubvol+0xfa/0x167 [btrfs]
[ 961.702416] [<ffffffffa06716c5>] btrfs_ioctl_snap_create_transid+0xff/0x121 [btrfs]
[ 961.710454] [<ffffffffa067176f>] btrfs_ioctl_snap_create_v2+0x88/0xea [btrfs]
[ 961.718000] [<ffffffffa0671a40>] btrfs_ioctl+0x208/0x358 [btrfs]
[ 961.724088] [<ffffffff8112221a>] vfs_ioctl+0x1d/0x34
[ 961.729159] [<ffffffff81122895>] do_vfs_ioctl+0x171/0x17a
[ 961.734732] [<ffffffff811154bb>] ? fget_light+0x69/0x81
[ 961.740057] [<ffffffff811228fa>] sys_ioctl+0x5c/0x7c
[ 961.745333] [<ffffffffa0534af5>] ? jbd_free_handle+0x1b/0x1d [jbd]
[ 961.752028] [<ffffffff813b8dab>] system_call_fastpath+0x16/0x1b
[ 961.758429] INFO: task cosd:16720 blocked for more than 120 seconds.
[ 961.765416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 961.776419] ffff8801bf123bb8 0000000000000086 ffff8801bf123b98 ffff8801664bad60
[ 961.792604] ffff8801df4f2d60 ffff8801df4f2d60 00000000000108c0 ffff8801664bad60
[ 961.814250] ffff8801df4f3120 0000000000000000 ffff8801bf123be8 ffffffff813b0f69
[ 961.825356] Call Trace:
[ 961.830932] [<ffffffff813b0f69>] schedule+0x164/0x19e
[ 961.837403] [<ffffffffa06460b4>] wait_current_trans+0xb8/0xec [btrfs]
[ 961.844413] [<ffffffff81066544>] ? list_del_init+0x21/0x21
[ 961.850203] [<ffffffff81109fe6>] ? kmem_cache_alloc+0xad/0xb9
[ 961.856255] [<ffffffffa0647970>] start_transaction+0xd1/0x206 [btrfs]
[ 961.863351] [<ffffffffa0647c12>] btrfs_start_transaction+0x13/0x15 [btrfs]
[ 961.870549] [<ffffffffa064dc24>] btrfs_create+0x3b/0x197 [btrfs]
[ 961.876889] [<ffffffff8111edd8>] vfs_create+0x72/0x92
[ 961.882115] [<ffffffff8111f024>] do_last+0x22c/0x40b
[ 961.887195] [<ffffffff8111fd02>] path_openat+0xc0/0x2ef
[ 961.892581] [<ffffffff81121011>] do_filp_open+0x3d/0x87
[ 961.897888] [<ffffffff811e070c>] ? strncpy_from_user+0x43/0x4d
[ 961.903880] [<ffffffff8111e78d>] ? getname_flags+0x2e/0x80
[ 961.909463] [<ffffffff8111e711>] ? do_getname+0x14b/0x173
[ 961.915472] [<ffffffff8111e74f>] ? audit_getname+0x16/0x26
[ 961.921259] [<ffffffff8112a743>] ? spin_lock+0xe/0x10
[ 961.926480] [<ffffffff811122d1>] do_sys_open+0x69/0xae
[ 961.931759] [<ffffffff81112349>] sys_open+0x20/0x22
[ 961.936730] [<ffffffff813b8dab>] system_call_fastpath+0x16/0x1b
[ 961.942878] INFO: task cosd:16721 blocked for more than 120 seconds.
[ 961.949231] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 961.957398] ffff8801502dfb38 0000000000000086 ffff8801502dfb18 ffff8801dd19c410
[ 961.965782] ffff880150314410 ffff880150314410 00000000000108c0 ffff8801dd19c410
[ 961.973784] ffff8801503147d0 0000000000400000 ffff8801502dfb68 ffffffff813b0f69
[ 961.981702] Call Trace:
[ 961.984239] [<ffffffff813b0f69>] schedule+0x164/0x19e
[ 961.989445] [<ffffffffa06460b4>] wait_current_trans+0xb8/0xec [btrfs]
[ 961.995993] [<ffffffff81066544>] ? list_del_init+0x21/0x21
[ 962.001781] [<ffffffffa0647692>] btrfs_throttle+0x1f/0x21 [btrfs]
[ 962.008040] [<ffffffffa0655672>] __btrfs_buffered_write+0x29e/0x2dd [btrfs]
[ 962.015217] [<ffffffffa06559df>] btrfs_file_aio_write+0x203/0x2df [btrfs]
[ 962.022152] [<ffffffffa06557dc>] ? __btrfs_direct_write+0x12b/0x12b [btrfs]
[ 962.029240] [<ffffffff81113a17>] do_sync_readv_writev+0xc1/0x109
[ 962.035448] [<ffffffff8112c187>] ? mnt_dec_count+0x11/0x13
[ 962.041016] [<ffffffff81036a33>] ? need_resched+0x23/0x2d
[ 962.046796] [<ffffffff81036a4b>] ? should_resched+0xe/0x2f
[ 962.057128] [<ffffffff811986c3>] ? security_file_permission+0x2c/0x31
[ 962.068078] [<ffffffff81113d21>] ? rw_verify_area+0xac/0xdb
[ 962.073949] [<ffffffff811140a6>] do_readv_writev+0xbf/0x127
[ 962.079741] [<ffffffff8111e78d>] ? getname_flags+0x2e/0x80
[ 962.096720] [<ffffffff81106005>] ? compound_head+0x12/0x20
[ 962.103380] [<ffffffff81036a33>] ? need_resched+0x23/0x2d
[ 962.109497] [<ffffffff81036a4b>] ? should_resched+0xe/0x2f
[ 962.119919] [<ffffffff813b117f>] ? _cond_resched+0xe/0x22
[ 962.131022] [<ffffffff813b1895>] ? mutex_lock+0x16/0x34
[ 962.137370] [<ffffffff81114151>] vfs_writev+0x43/0x4e
[ 962.142531] [<ffffffff81114729>] sys_writev+0x4c/0x71
[ 962.148853] [<ffffffff813b8dab>] system_call_fastpath+0x16/0x1b
[ 962.155452] INFO: task btrfs-endio-wri:23430 blocked for more than 120 seconds.
[ 962.162885] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 962.170846] ffff8801dd943c20 0000000000000046 ffff8801dd943c00 ffff8801df4f5ac0
[ 962.178326] ffff880182060000 ffff880182060000 00000000000108c0 ffff8801df4f5ac0
[ 962.185821] ffff8801820603c0 0000000000000001 ffff8801dd943c50 ffffffff813b0f69
[ 962.193386] Call Trace:
[ 962.195837] [<ffffffff813b0f69>] schedule+0x164/0x19e
[ 962.201044] [<ffffffffa06460b4>] wait_current_trans+0xb8/0xec [btrfs]
[ 962.208020] [<ffffffff81066544>] ? list_del_init+0x21/0x21
[ 962.213751] [<ffffffffa0645cd2>] ? spin_lock+0xe/0x10 [btrfs]
[ 962.219713] [<ffffffffa0647970>] start_transaction+0xd1/0x206 [btrfs]
[ 962.226247] [<ffffffffa0647ae8>] btrfs_join_transaction+0x15/0x17 [btrfs]
[ 962.233127] [<ffffffffa064ec90>] btrfs_finish_ordered_io+0x135/0x2c1 [btrfs]
[ 962.240308] [<ffffffffa064ee59>] btrfs_writepage_end_io_hook+0x3d/0x45 [btrfs]
[ 962.247648] [<ffffffffa0665e20>] end_bio_extent_writepage+0xaa/0x15a [btrfs]
[ 962.255291] [<ffffffff8113d194>] bio_endio+0x2d/0x2f
[ 962.260613] [<ffffffffa0642c2c>] end_workqueue_fn+0x72/0x7b [btrfs]
[ 962.267510] [<ffffffffa066cd4f>] worker_loop+0x60/0x1f6 [btrfs]
[ 962.273543] [<ffffffff81033dad>] ? calc_load_account_idle+0xe/0x1d
[ 962.279841] [<ffffffffa066ccef>] ? run_ordered_completions+0x99/0x99 [btrfs]
[ 962.286968] [<ffffffff8106603f>] kthread+0x72/0x7a
[ 962.291965] [<ffffffff813b9e94>] kernel_thread_helper+0x4/0x10
[ 962.297888] [<ffffffff813b272d>] ? retint_restore_args+0xe/0xe
[ 962.303801] [<ffffffff81065fcd>] ? kthread_bind+0x53/0x53
[ 962.309278] [<ffffffff813b9e90>] ? gs_change+0xb/0xb
-- Jim
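For anyone triaging a report like the one above, here is a minimal sketch (not part of the original report) that lists each hung task with the first reliable frame other than schedule(). The function name `summarize_hung_tasks` and the `SAMPLE` excerpt are illustrative assumptions; the regular expressions only assume the log layout shown in this message.

```python
import re

# Hung-task header, e.g. "INFO: task cosd:16719 blocked for more than 120 seconds."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
# One backtrace frame; a leading "? " marks a frame the unwinder could not verify.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_hung_tasks(log_text):
    """Return (task, pid, waiting_in) for each hung-task report in log_text.

    waiting_in is the first reliable frame other than schedule(), or None
    if the report contains no such frame. (Hypothetical helper.)
    """
    reports = []
    for line in log_text.splitlines():
        header = TASK_RE.search(line)
        if header:
            reports.append([header.group(1), int(header.group(2)), None])
            continue
        frame = FRAME_RE.search(line)
        if frame and reports and reports[-1][2] is None:
            unreliable, func = frame.group(1), frame.group(2)
            if not unreliable and func != "schedule":
                reports[-1][2] = func
    return [tuple(r) for r in reports]


# Short excerpt copied from the traces in this message.
SAMPLE = """\
[ 961.318047] INFO: task kworker/1:2:2346 blocked for more than 120 seconds.
[ 961.355384] Call Trace:
[ 961.357838] [<ffffffff813b0f69>] schedule+0x164/0x19e
[ 961.363041] [<ffffffffa066069c>] btrfs_start_ordered_extent+0xa8/0xc4 [btrfs]
[ 961.617322] INFO: task cosd:16719 blocked for more than 120 seconds.
[ 961.653955] Call Trace:
[ 961.656483] [<ffffffff813b0f69>] schedule+0x164/0x19e
[ 961.661725] [<ffffffffa0645dbd>] wait_current_trans_commit_start_and_unblock+0xa9/0xce [btrfs]
[ 961.670461] [<ffffffff81066544>] ? list_del_init+0x21/0x21
"""

for task, pid, func in summarize_hung_tasks(SAMPLE):
    print(f"{task} (pid {pid}) waiting in {func}")
```

Run over the full report, this kind of summary makes it easy to see that four of the six blocked tasks are sitting in wait_current_trans, while the kworker is in btrfs_start_ordered_extent and one cosd thread is in wait_current_trans_commit_start_and_unblock. It says nothing about the cause; it only condenses the traces.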
* Re: stalls with latest btrfs merge into 3.0-rc2
From: Chris Samuel @ 2011-06-13 23:51 UTC
To: Jim Schutt; +Cc: linux-btrfs, ceph-devel
Hi Jim,
On 14/06/11 07:07, Jim Schutt wrote:
> I'm running into the following hung tasks where btrfs is implicated.
3.0-rc3 has only just come out, and it has had a fair number of
btrfs patches merged (much to Linus's annoyance), so it is
likely worth having another shot with that.
cheers!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
* Re: stalls with latest btrfs merge into 3.0-rc2
From: Josef Bacik @ 2011-06-14 0:47 UTC
To: Jim Schutt; +Cc: linux-btrfs, ceph-devel
On 06/13/2011 05:07 PM, Jim Schutt wrote:
> Hi,
>
> On a system under a heavy write load from multiple ceph OSDs,
> I'm running into the following hung tasks where btrfs is implicated.
> I'm running commit 3c25fa740e2 from Linus' tree merged with
> commit cb9b41c92fa from git://ceph.newdream.net/git/ceph-client.git.
>
> Let me know what else I can do to help sort this out.
>
Ugh, that's not good. I'll look at it tomorrow and send you something to
test. Thanks,
Josef
* Re: stalls with latest btrfs merge into 3.0-rc2
From: Josef Bacik @ 2011-06-14 1:01 UTC
To: Jim Schutt; +Cc: linux-btrfs, ceph-devel
[-- Attachment #1: Type: text/plain, Size: 449 bytes --]
On 06/13/2011 05:07 PM, Jim Schutt wrote:
> Hi,
>
> On a system under a heavy write load from multiple ceph OSDs,
> I'm running into the following hung tasks where btrfs is implicated.
> I'm running commit 3c25fa740e2 from Linus' tree merged with
> commit cb9b41c92fa from git://ceph.newdream.net/git/ceph-client.git.
>
Please try this patch and verify it fixes the problem. If it does I'll
make it less crappy and send it along. Thanks,
Josef
[-- Attachment #2: test-fix.patch --]
[-- Type: text/plain, Size: 818 bytes --]
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 7a9f517..532139e 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1236,12 +1236,16 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
schedule_timeout(1);
finish_wait(&cur_trans->writer_wait, &wait);
- spin_lock(&root->fs_info->trans_lock);
- root->fs_info->trans_no_join = 1;
- spin_unlock(&root->fs_info->trans_lock);
} while (atomic_read(&cur_trans->num_writers) > 1 ||
(should_grow && cur_trans->num_joined != joined));
+ spin_lock(&root->fs_info->trans_lock);
+ root->fs_info->trans_no_join = 1;
+ spin_unlock(&root->fs_info->trans_lock);
+
+ while (atomic_read(&cur_trans->num_writers) > 1)
+ schedule_timeout(1);
+
ret = create_pending_snapshots(trans, root->fs_info);
BUG_ON(ret);
* Re: stalls with latest btrfs merge into 3.0-rc2
From: Jim Schutt @ 2011-06-14 15:13 UTC
To: Josef Bacik; +Cc: linux-btrfs, ceph-devel
Josef Bacik wrote:
> On 06/13/2011 05:07 PM, Jim Schutt wrote:
>> Hi,
>>
>> On a system under a heavy write load from multiple ceph OSDs,
>> I'm running into the following hung tasks where btrfs is implicated.
>> I'm running commit 3c25fa740e2 from Linus' tree merged with
>> commit cb9b41c92fa from git://ceph.newdream.net/git/ceph-client.git.
>>
>
> Please try this patch and verify it fixes the problem. If it does I'll
> make it less crappy and send it along. Thanks,
I saw no stalls with your patch applied after 30 minutes of writing,
whereas without it the stalls would trigger after a few minutes.
Let me know if you get a new version you'd like me to test, otherwise
Tested-by: Jim Schutt <jaschut@sandia.gov>
Thanks -- Jim
>
> Josef
>