public inbox for linux-xfs@vger.kernel.org
* nginx blocked task when XFS partition is very full
@ 2011-09-02 12:23 Sean Purdy
  2011-09-02 13:02 ` Stan Hoeppner
  0 siblings, 1 reply; 2+ messages in thread
From: Sean Purdy @ 2011-09-02 12:23 UTC (permalink / raw)
  To: xfs

Hi,


This is on 2.6.32-32-server from Ubuntu Lucid 10.04 LTS

We have nginx serving files from four disks on a machine.  We're using MogileFS as a distributed file system.  We've come across an issue where, when the filesystem is very full and nginx tries to write to it, we get "task blocked for more than 120 seconds" messages from the kernel, nginx hangs, and we can no longer access the fs.  When a file is uploaded via nginx, nginx writes to a .tmp directory on the disk before moving the file to its destination.  nginx can't be killed with kill -9; the machine needs a reboot.
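The upload path described above is the usual write-to-temp-then-rename pattern.  A minimal shell sketch (directory and file names here are hypothetical stand-ins for the real disk layout):

```shell
# Hypothetical sketch of the write-to-.tmp-then-rename upload path.
set -e
disk=$(mktemp -d)                                 # stand-in for one of the data disks
mkdir -p "$disk/.tmp" "$disk/0"
printf 'uploaded data' > "$disk/.tmp/upload.$$"   # step 1: write into .tmp on the same fs
mv "$disk/.tmp/upload.$$" "$disk/0/file0001.fid"  # step 2: rename to final destination
cat "$disk/0/file0001.fid"                        # prints: uploaded data
```

The traces below show both steps stalling: the data write itself (xfs_write, reached via sys_pwrite64) and the creation of a new file (xfs_create, reached via sys_open).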

Has this sort of issue been fixed since 2.6.32?


Thanks,

Sean


Here's the gory details:

[340381.743426] INFO: task flush-8:48:506 blocked for more than 120 seconds.
[340381.743802] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[340381.750570] flush-8:48    D 0000000000000000     0   506      2 0x00000000
[340381.750576]  ffff880126ea59b0 0000000000000046 0000000000015f00 0000000000015f00
[340381.750581]  ffff880126c5c8d0 ffff880126ea5fd8 0000000000015f00 ffff880126c5c500
[340381.750585]  0000000000015f00 ffff880126ea5fd8 0000000000015f00 ffff880126c5c8d0
[340381.750589] Call Trace:
[340381.750599]  [<ffffffff810f6410>] ? sync_page+0x0/0x50
[340381.750607]  [<ffffffff81558013>] io_schedule+0x73/0xc0
[340381.750610]  [<ffffffff810f644d>] sync_page+0x3d/0x50
[340381.750615]  [<ffffffff8155850a>] __wait_on_bit_lock+0x5a/0xc0
[340381.750618]  [<ffffffff810f63e7>] __lock_page+0x67/0x70
[340381.750624]  [<ffffffff810860d0>] ? wake_bit_function+0x0/0x40
[340381.750629]  [<ffffffff81100a15>] ? pagevec_lookup_tag+0x25/0x40
[340381.750632]  [<ffffffff810ffa87>] write_cache_pages+0x2f7/0x3e0
[340381.750637]  [<ffffffff810fe7d0>] ? __writepage+0x0/0x40
[340381.750641]  [<ffffffff810ffb94>] generic_writepages+0x24/0x30
[340381.750664]  [<ffffffffa0237afd>] xfs_vm_writepages+0x5d/0x80 [xfs]
[340381.750668]  [<ffffffff810ffbc1>] do_writepages+0x21/0x40
[340381.750673]  [<ffffffff81168db6>] writeback_single_inode+0xf6/0x3d0
[340381.750677]  [<ffffffff811694e5>] writeback_sb_inodes+0x195/0x280
[340381.750680]  [<ffffffff81169d00>] writeback_inodes_wb+0xa0/0x1b0
[340381.750684]  [<ffffffff8116a04b>] wb_writeback+0x23b/0x2a0
[340381.750688]  [<ffffffff81077bec>] ? lock_timer_base+0x3c/0x70
[340381.750692]  [<ffffffff8116a22c>] wb_do_writeback+0x17c/0x190
[340381.750696]  [<ffffffff81077d00>] ? process_timeout+0x0/0x10
[340381.750700]  [<ffffffff8116a293>] bdi_writeback_task+0x53/0xf0
[340381.750704]  [<ffffffff81111636>] bdi_start_fn+0x86/0x100
[340381.750708]  [<ffffffff811115b0>] ? bdi_start_fn+0x0/0x100
[340381.750711]  [<ffffffff81085d16>] kthread+0x96/0xa0
[340381.750716]  [<ffffffff810141ea>] child_rip+0xa/0x20
[340381.750719]  [<ffffffff81085c80>] ? kthread+0x0/0xa0
[340381.750723]  [<ffffffff810141e0>] ? child_rip+0x0/0x20

[340381.750732] INFO: task nginx:3581 blocked for more than 120 seconds.
[340381.755178] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[340381.765161] nginx         D 0000000000000000     0  3581   3580 0x00000000
[340381.765166]  ffff880128a27d38 0000000000000082 0000000000015f00 0000000000015f00
[340381.765170]  ffff88012742b1d0 ffff880128a27fd8 0000000000015f00 ffff88012742ae00
[340381.765174]  0000000000015f00 ffff880128a27fd8 0000000000015f00 ffff88012742b1d0
[340381.765178] Call Trace:
[340381.765184]  [<ffffffff81558d47>] __mutex_lock_slowpath+0x107/0x190
[340381.765189]  [<ffffffff811517c1>] ? path_put+0x31/0x40
[340381.765193]  [<ffffffff81558743>] mutex_lock+0x23/0x50
[340381.765197]  [<ffffffff81155139>] do_filp_open+0x3d9/0xba0
[340381.765203]  [<ffffffff810397f9>] ? default_spin_lock_flags+0x9/0x10
[340381.765209]  [<ffffffff81160aca>] ? alloc_fd+0x10a/0x150
[340381.765214]  [<ffffffff81144789>] do_sys_open+0x69/0x170
[340381.765217]  [<ffffffff811448d0>] sys_open+0x20/0x30
[340381.765223]  [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b

... a couple of similar ones for nginx:3582, nginx:3583, nginx:3584, followed by ...

[340381.838307] INFO: task nginx:3585 blocked for more than 120 seconds.
[340381.849329] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[340381.872639] nginx         D 0000000000000000     0  3585   3580 0x00000000
[340381.872643]  ffff88012b3f9098 0000000000000086 0000000000015f00 0000000000015f00
[340381.872648]  ffff88012a3c03d0 ffff88012b3f9fd8 0000000000015f00 ffff88012a3c0000
[340381.872652]  0000000000015f00 ffff88012b3f9fd8 0000000000015f00 ffff88012a3c03d0
[340381.872656] Call Trace:
[340381.872660]  [<ffffffff8155837d>] schedule_timeout+0x22d/0x300
[340381.872676]  [<ffffffffa01fea12>] ? xfs_btree_is_lastrec+0x52/0x70 [xfs]
[340381.872691]  [<ffffffffa0203d04>] ? xfs_btree_insrec+0x2e4/0x5a0 [xfs]
[340381.872695]  [<ffffffff815592ae>] __down+0x7e/0xc0
[340381.872699]  [<ffffffff8108b021>] down+0x41/0x50
[340381.872714]  [<ffffffffa023b1f3>] xfs_buf_lock+0x23/0x60 [xfs]
[340381.872732]  [<ffffffffa023b375>] _xfs_buf_find+0x145/0x240 [xfs]
[340381.872749]  [<ffffffffa023b4d0>] xfs_buf_get_flags+0x60/0x170 [xfs]
[340381.872768]  [<ffffffffa02307eb>] xfs_trans_get_buf+0xeb/0x140 [xfs]
[340381.872786]  [<ffffffffa01ff74e>] xfs_btree_get_bufs+0x3e/0x40 [xfs]
[340381.872801]  [<ffffffffa01ec380>] xfs_alloc_fix_freelist+0x110/0x480 [xfs]
[340381.872821]  [<ffffffffa022ae5f>] ? xfs_icsb_disable_counter+0x1f/0x150 [xfs]
[340381.872829]  [<ffffffff812c1d90>] ? __bitmap_weight+0x50/0xb0
[340381.872847]  [<ffffffffa022afcb>] ? xfs_icsb_balance_counter_locked+0x3b/0xd0 [xfs]
[340381.872864]  [<ffffffffa01ec958>] xfs_alloc_vextent+0x178/0x490 [xfs]
[340381.872880]  [<ffffffffa01fa856>] xfs_bmap_btalloc+0x176/0x9f0 [xfs]
[340381.872897]  [<ffffffffa01fbfb1>] xfs_bmap_alloc+0x21/0x40 [xfs]
[340381.872914]  [<ffffffffa01fcb6f>] xfs_bmapi+0xb9f/0x1290 [xfs]
[340381.872933]  [<ffffffffa0217ba8>] ? xfs_iext_get_ext+0x38/0x80 [xfs]
[340381.872952]  [<ffffffffa021e055>] xfs_iomap_write_allocate+0x1c5/0x3c0 [xfs]
[340381.872972]  [<ffffffffa021edab>] xfs_iomap+0x2ab/0x2e0 [xfs]
[340381.872989]  [<ffffffffa023745d>] xfs_map_blocks+0x2d/0x40 [xfs]
[340381.873007]  [<ffffffffa023886a>] xfs_page_state_convert+0x3da/0x720 [xfs]
[340381.873025]  [<ffffffffa0238d0a>] xfs_vm_writepage+0x7a/0x130 [xfs]
[340381.873034]  [<ffffffff8110f91e>] ? __dec_zone_page_state+0x2e/0x30
[340381.873041]  [<ffffffff810fe7e7>] __writepage+0x17/0x40
[340381.873046]  [<ffffffff810ff967>] write_cache_pages+0x1d7/0x3e0
[340381.873053]  [<ffffffff810fe7d0>] ? __writepage+0x0/0x40
[340381.873059]  [<ffffffff810ffb94>] generic_writepages+0x24/0x30
[340381.873076]  [<ffffffffa0237afd>] xfs_vm_writepages+0x5d/0x80 [xfs]
[340381.873083]  [<ffffffff810ffbc1>] do_writepages+0x21/0x40
[340381.873089]  [<ffffffff810f6beb>] __filemap_fdatawrite_range+0x5b/0x60
[340381.873096]  [<ffffffff810f6f1f>] filemap_fdatawrite+0x1f/0x30
[340381.873113]  [<ffffffffa023bf79>] xfs_flush_pages+0xa9/0xc0 [xfs]
[340381.873130]  [<ffffffffa0240448>] xfs_write+0x838/0x9f0 [xfs]
[340381.873137]  [<ffffffff81077bec>] ? lock_timer_base+0x3c/0x70
[340381.873154]  [<ffffffffa023bde2>] xfs_file_aio_write+0x62/0x70 [xfs]
[340381.873162]  [<ffffffff8114679a>] do_sync_write+0xfa/0x140
[340381.873169]  [<ffffffff81086090>] ? autoremove_wake_function+0x0/0x40
[340381.873177]  [<ffffffff812552b6>] ? security_file_permission+0x16/0x20
[340381.873184]  [<ffffffff81146a98>] vfs_write+0xb8/0x1a0
[340381.873190]  [<ffffffff81147402>] sys_pwrite64+0x82/0xa0
[340381.873197]  [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b

[340381.912512] INFO: task nginx:3587 blocked for more than 120 seconds.
[340381.926794] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[340381.956631] nginx         D 0000000000000000     0  3587   3580 0x00000000
[340381.956635]  ffff88012a28da38 0000000000000086 0000000000015f00 0000000000015f00
[340381.956639]  ffff88012a3c31d0 ffff88012a28dfd8 0000000000015f00 ffff88012a3c2e00
[340381.956643]  0000000000015f00 ffff88012a28dfd8 0000000000015f00 ffff88012a3c31d0
[340381.956647] Call Trace:
[340381.956652]  [<ffffffff8155837d>] schedule_timeout+0x22d/0x300
[340381.956656]  [<ffffffff8105f022>] ? enqueue_entity+0x132/0x1b0
[340381.956660]  [<ffffffff8105f0fb>] ? enqueue_task_fair+0x5b/0xa0
[340381.956664]  [<ffffffff815574f6>] wait_for_common+0xd6/0x180
[340381.956668]  [<ffffffff8105df00>] ? default_wake_function+0x0/0x20
[340381.956671]  [<ffffffff8105df55>] ? wake_up_process+0x15/0x20
[340381.956675]  [<ffffffff8155765d>] wait_for_completion+0x1d/0x20
[340381.956690]  [<ffffffffa0243a27>] xfs_flush_inodes+0x67/0x90 [xfs]
[340381.956706]  [<ffffffffa0233348>] xfs_create+0x328/0x5f0 [xfs]
[340381.956725]  [<ffffffffa022f749>] ? xfs_trans_unlocked_item+0x39/0x60 [xfs]
[340381.956742]  [<ffffffffa023eb97>] xfs_vn_mknod+0xa7/0x1c0 [xfs]
[340381.956759]  [<ffffffffa023ece0>] xfs_vn_create+0x10/0x20 [xfs]
[340381.956766]  [<ffffffff81151f34>] vfs_create+0xb4/0xe0
[340381.956772]  [<ffffffff81152024>] __open_namei_create+0xc4/0x110
[340381.956779]  [<ffffffff811557cb>] do_filp_open+0xa6b/0xba0
[340381.956786]  [<ffffffff810397f9>] ? default_spin_lock_flags+0x9/0x10
[340381.956793]  [<ffffffff81160aca>] ? alloc_fd+0x10a/0x150
[340381.956799]  [<ffffffff81144789>] do_sys_open+0x69/0x170
[340381.956805]  [<ffffffff811448d0>] sys_open+0x20/0x30
[340381.956811]  [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: nginx blocked task when XFS partition is very full
  2011-09-02 12:23 nginx blocked task when XFS partition is very full Sean Purdy
@ 2011-09-02 13:02 ` Stan Hoeppner
  0 siblings, 0 replies; 2+ messages in thread
From: Stan Hoeppner @ 2011-09-02 13:02 UTC (permalink / raw)
  To: xfs

On 9/2/2011 7:23 AM, Sean Purdy wrote:
> Hi,
>
>
> This is on 2.6.32-32-server from Ubuntu Lucid 10.04 LTS
>
> We have nginx serving files from four disks on a machine.  We're using MogileFS as a distributed file system.  We've come across an issue where, when the filesystem is very full and nginx tries to write to it, we get "task blocked for more than 120 seconds" messages from the kernel, nginx hangs, and we can no longer access the fs.  When a file is uploaded via nginx, nginx writes to a .tmp directory on the disk before moving the file to its destination.  nginx can't be killed with kill -9; the machine needs a reboot.
>
> Has this sort of issue been fixed since 2.6.32?

It would help to see df output for space and inodes of the XFS 
filesystem in question.
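For reference, the two df views being requested look like this; "/" below is a placeholder for the actual XFS mount point:

```shell
# Block usage and inode usage of the filesystem in question.
# "/" is a stand-in; substitute the affected XFS mount point.
df -h /    # space: size, used, available
df -i /    # inodes: total, used, free
```

A filesystem can be out of inodes while df still shows free blocks, which is why both views are useful.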

-- 
Stan
