From: Theodore Ts'o <tytso@mit.edu>
To: stable@kernel.org
Cc: Ext4 Developers List <linux-ext4@vger.kernel.org>,
Eric Sandeen <sandeen@redhat.com>,
"Theodore Ts'o" <tytso@mit.edu>
Subject: [PATCH v2.6.32.y 39/53] ext4: don't scan/accumulate more pages than mballoc will allocate
Date: Sun, 30 May 2010 22:49:52 -0400
Message-ID: <1275274206-3900-39-git-send-email-tytso@mit.edu>
In-Reply-To: <1275274206-3900-1-git-send-email-tytso@mit.edu>
From: Eric Sandeen <sandeen@redhat.com>
commit c445e3e0a5c2804524dec6e55f66d63f6bc5bc3e upstream (as of v2.6.34-git13)
A bug was reported on RHEL5: after a 10G dd on a 12G box, the
subsequent sync was very, very slow.
At issue was the loop in write_cache_pages scanning all the way
to the end of the 10G file, even though the subsequent call
to mpage_da_submit_io would only actually write a smallish amount; then
we went back to the write_cache_pages loop ... wasting tons of time
in calling __mpage_da_writepage for thousands of pages we would
just revisit (many times) later.
Upstream it's not such a big issue for sys_sync because we get
to the loop with a much smaller nr_to_write, which limits the loop.
However, in talking with Aneesh, he realized that fsync upstream still
gets here with a very large nr_to_write and we face the same problem.
This patch makes mpage_add_bh_to_extent stop the loop after we've
accumulated 2048 pages, by setting mpd->io_done = 1; which ultimately
causes the write_cache_pages loop to break.
Repeating the test with a dirty_ratio of 80 (to leave something for
fsync to do), I don't see huge IO performance gains, but the reduction
in cpu usage is striking: 80% usage with stock, and 2% with the
below patch. Instrumenting the loop in write_cache_pages clearly
shows that we are wasting time here.
Eventually we need to change mpage_da_map_pages() to also submit its I/O
to the block layer, subsuming mpage_da_submit_io(), and then change it
to call ext4_get_blocks() multiple times.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
fs/ext4/inode.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index fc06fcd..ba7549f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2361,6 +2361,15 @@ static void mpage_add_bh_to_extent(struct mpage_da_data *mpd,
 	sector_t next;
 	int nrblocks = mpd->b_size >> mpd->inode->i_blkbits;
 
+	/*
+	 * XXX Don't go larger than mballoc is willing to allocate
+	 * This is a stopgap solution. We eventually need to fold
+	 * mpage_da_submit_io() into this function and then call
+	 * ext4_get_blocks() multiple times in a loop
+	 */
+	if (nrblocks >= 8*1024*1024/mpd->inode->i_sb->s_blocksize)
+		goto flush_it;
+
 	/* check if thereserved journal credits might overflow */
 	if (!(EXT4_I(mpd->inode)->i_flags & EXT4_EXTENTS_FL)) {
 		if (nrblocks >= EXT4_MAX_TRANS_DATA) {
--
1.6.6.1.1.g974db.dirty