From: Dave Chinner <david@fromorbit.com>
To: "Guk-Bong, Kwon" <gbkwon@gmail.com>
Cc: xfs@oss.sgi.com
Subject: Re: xfs hang when filesystem filled
Date: Thu, 9 Aug 2012 08:23:05 +1000
Message-ID: <20120808222305.GV2877@dastard>
In-Reply-To: <CAJ-WH_z+6HhooORtMKeO2=kURyiWMcxk_5eUbeBSz3cAXgRRfQ@mail.gmail.com>
On Tue, Aug 07, 2012 at 02:54:48PM +0900, Guk-Bong, Kwon wrote:
> HI all
>
> I tested xfs over nfs using bonnie++
>
> xfs and nfs hang when xfs filesystem filled
>
> What's the problem?
The filesystem appears to be blocked in writeback, getting ENOSPC
errors when they shouldn't occur.
> see below
> --------------------------------
>
> 1. nfs server
>
> a. uname -a
> - Linux nfs_server 2.6.32.58 #1 SMP Thu Mar 22 13:33:34 KST 2012 x86_64
> Intel(R) Xeon(R) CPU E5606 @ 2.13GHz GenuineIntel GNU/Linux
Old kernel. Upgrade.
> ================================================================================
> /test 0.0.0.0/0.0.0.0(rw,async,wdelay,hide,nocrossmnt,insecure,no_root_squash,no_all_squash,no_subtree_check,secure_locks,acl,fsid=1342087477,anonuid=65534,anongid=65534)
> ================================================================================
You're using the async export option, which means the server/client
write throttling mechanisms built into the NFS protocol are not
active. That lets clients swamp the server with dirty data without
backing off when the server is overloaded, and leads to -data loss-
when the server fails.
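As a sketch of the fix (assuming the same export; option list trimmed
for brevity, adjust to your environment), replacing async with sync in
/etc/exports makes the server commit writes to stable storage before
replying, which restores the protocol's throttling:

```
# /etc/exports -- hypothetical example: "sync" instead of "async"
/test 0.0.0.0/0.0.0.0(rw,sync,wdelay,no_root_squash,no_subtree_check)
```

Re-export with `exportfs -ra` for the change to take effect.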
IOWs, you're massively overcommitting allocation from lots of
threads, which means you are probably depleting the free space pool,
and that leads to -data loss- and potentially deadlocks. If this is
what your production systems do, then a) increase the reserve pool,
and b) fix your production systems not to do this.
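A minimal sketch of (a), assuming xfs_io is available on the server --
the reserved block pool can be inspected and resized at runtime via
the resblks command (the 65536 figure is an arbitrary example, not a
recommendation; size it for your workload):

```shell
# Query the current reserved block pool on the exported filesystem
xfs_io -x -c "resblks" /test

# Enlarge the reservation to 65536 filesystem blocks so metadata
# allocation near ENOSPC has headroom and doesn't deadlock
xfs_io -x -c "resblks 65536" /test
```

Note this setting is not persistent across a remount, so it would need
to be reapplied from a boot script.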
> Aug 2 18:17:58 anystor1 kernel: Call Trace:
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811738ce>] ? xfs_btree_is_lastrec+0x4e/0x60
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8135edad>] ? schedule_timeout+0x1ed/0x250
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8135fcd1>] ? __down+0x61/0xa0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff810572d6>] ? down+0x46/0x50
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811af6a4>] ? _xfs_buf_find+0x134/0x220
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811af7fe>] ? xfs_buf_get_flags+0x6e/0x190
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811a525e>] ? xfs_trans_get_buf+0x10e/0x160
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff81161954>] ? xfs_alloc_fix_freelist+0x144/0x450
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8119e597>] ? xfs_icsb_disable_counter+0x17/0x160
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8116d2f2>] ? xfs_bmap_add_extent_delay_real+0x8d2/0x11a0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811a4b83>] ? xfs_trans_log_buf+0x63/0xa0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8119e731>] ? xfs_icsb_balance_counter_locked+0x31/0xf0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff81161ed1>] ? xfs_alloc_vextent+0x1b1/0x4c0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8116e946>] ? xfs_bmap_btalloc+0x596/0xa70
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8117125a>] ? xfs_bmapi+0x9fa/0x1230
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811965f6>] ? xlog_state_release_iclog+0x56/0xe0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811a3a0f>] ? xfs_trans_reserve+0x9f/0x210
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff81192d0e>] ? xfs_iomap_write_allocate+0x24e/0x3d0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811c29c0>] ? elv_insert+0xf0/0x260
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8119396b>] ? xfs_iomap+0x2cb/0x300
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811aba05>] ? xfs_map_blocks+0x25/0x30
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811acb64>] ? xfs_page_state_convert+0x414/0x6d0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811ad137>] ? xfs_vm_writepage+0x77/0x130
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8107c8ca>] ? __writepage+0xa/0x40
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8107d0af>] ? write_cache_pages+0x1df/0x3d0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8107c8c0>] ? __writepage+0x0/0x40
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff81076f4c>] ? __filemap_fdatawrite_range+0x4c/0x60
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811da3a1>] ? radix_tree_gang_lookup+0x71/0xf0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b029d>] ? xfs_flush_pages+0xad/0xc0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b795a>] ? xfs_sync_inode_data+0xca/0xf0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b7aa0>] ? xfs_inode_ag_walk+0x80/0x140
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b7890>] ? xfs_sync_inode_data+0x0/0xf0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b7be8>] ? xfs_inode_ag_iterator+0x88/0xd0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b7890>] ? xfs_sync_inode_data+0x0/0xf0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff8135ed1d>] ? schedule_timeout+0x15d/0x250
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b7f40>] ? xfs_sync_data+0x30/0x60
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b7f8e>] ? xfs_flush_inodes_work+0x1e/0x50
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b726c>] ? xfssyncd+0x13c/0x1d0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff811b7130>] ? xfssyncd+0x0/0x1d0
> Aug 2 18:17:58 anystor1 kernel: [<ffffffff810529d6>] ? kthread+0x96/0xb0
There's your problem - writeback of data is blocked waiting on a
metadata buffer, and everything else is blocked behind it. Upgrade
your kernel.
In summary, you are doing something silly on a very old kernel and
you broke it. As a prize, you get to keep all the broken pieces.....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
Thread overview: 8+ messages
2012-08-07 5:54 xfs hang when filesystem filled Guk-Bong, Kwon
2012-08-07 9:05 ` Burbidge, Simon A
2012-08-08 7:55 ` Guk-Bong, Kwon
2012-08-08 10:08 ` Dave Howorth
2012-08-08 10:28 ` Burbidge, Simon A
2012-08-07 10:18 ` Michael Monnerie
2012-08-07 13:04 ` Stan Hoeppner
2012-08-08 22:23 ` Dave Chinner [this message]