linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Theodore Ts'o" <tytso@mit.edu>
To: torvalds@linuxfoundation.org
Cc: Linux Kernel Developers List <linux-kernel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: [GIT PULL] Ext3 latency fixes
Date: Wed, 08 Apr 2009 19:40:05 -0400	[thread overview]
Message-ID: <E1LrhNF-0002zd-BG@closure.thunk.org> (raw)

Hi Linus,

Please pull from: 

git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git ext3-latency-fixes

One of these patches fixes a performance regression caused by a64c8610,
which unplugged the write queue after every page write.  Now that Jens
added WRITE_SYNC_PLUG.the patch causes us to use it instead of
WRITE_SYNC, to avoid the implicit unplugging.  These patches also seem
to further improbve ext3 latency, especially during the "sync" command
in Linus's write-big-file-and-sync workload.

In addition, I have a patch from Jan that avoids starting a transaction
unnecessarily for the data=writepage case.

						- Ted

Jan Kara (1):
      ext3: Try to avoid starting a transaction in writepage for data=writepage

Theodore Ts'o (1):
      block_write_full_page: switch synchronous writes to use WRITE_SYNC_PLUG

 fs/buffer.c     |   13 ++++++++++++-
 fs/ext3/inode.c |   23 ++++++++++++++++++-----
 2 files changed, 30 insertions(+), 6 deletions(-)


commit 430db323fae7665da721768949ade6304811c648
Author: Jan Kara <jack@suse.cz>
Date:   Tue Apr 7 18:25:01 2009 -0400

    ext3: Try to avoid starting a transaction in writepage for data=writepage
    
    This does the same as commit 9e80d407736161d9b8b0c5a0d44f786e44c322ea
    (avoid starting a transaction when no block allocation is needed)
    but for data=writeback mode of ext3. We also cleanup the data=ordered
    case a bit to stick to coding style...
    
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 466a332..fcfa243 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1521,12 +1521,16 @@ static int ext3_ordered_writepage(struct page *page,
 	if (!page_has_buffers(page)) {
 		create_empty_buffers(page, inode->i_sb->s_blocksize,
 				(1 << BH_Dirty)|(1 << BH_Uptodate));
-	} else if (!walk_page_buffers(NULL, page_buffers(page), 0, PAGE_CACHE_SIZE, NULL, buffer_unmapped)) {
-		/* Provide NULL instead of get_block so that we catch bugs if buffers weren't really mapped */
-		return block_write_full_page(page, NULL, wbc);
+		page_bufs = page_buffers(page);
+	} else {
+		page_bufs = page_buffers(page);
+		if (!walk_page_buffers(NULL, page_bufs, 0, PAGE_CACHE_SIZE,
+				       NULL, buffer_unmapped)) {
+			/* Provide NULL get_block() to catch bugs if buffers
+			 * weren't really mapped */
+			return block_write_full_page(page, NULL, wbc);
+		}
 	}
-	page_bufs = page_buffers(page);
-
 	handle = ext3_journal_start(inode, ext3_writepage_trans_blocks(inode));
 
 	if (IS_ERR(handle)) {
@@ -1581,6 +1585,15 @@ static int ext3_writeback_writepage(struct page *page,
 	if (ext3_journal_current_handle())
 		goto out_fail;
 
+	if (page_has_buffers(page)) {
+		if (!walk_page_buffers(NULL, page_buffers(page), 0,
+				      PAGE_CACHE_SIZE, NULL, buffer_unmapped)) {
+			/* Provide NULL get_block() to catch bugs if buffers
+			 * weren't really mapped */
+			return block_write_full_page(page, NULL, wbc);
+		}
+	}
+
 	handle = ext3_journal_start(inode, ext3_writepage_trans_blocks(inode));
 	if (IS_ERR(handle)) {
 		ret = PTR_ERR(handle);

commit 6e34eeddf7deec1444bbddab533f03f520d8458c
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Tue Apr 7 18:12:43 2009 -0400

    block_write_full_page: switch synchronous writes to use WRITE_SYNC_PLUG
    
    Now that we have a distinction between WRITE_SYNC and WRITE_SYNC_PLUG,
    use WRITE_SYNC_PLUG in __block_write_full_page() to avoid unplugging
    the block device I/O queue between each page that gets flushed out.
    
    Otherwise, when we run sync() or fsync() and we need to write out a
    large number of pages, the block device queue will get unplugged
    between for every page that is flushed out, which will be a pretty
    serious performance regression caused by commit a64c8610.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

diff --git a/fs/buffer.c b/fs/buffer.c
index 6e35762..13edf7a 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1596,6 +1596,16 @@ EXPORT_SYMBOL(unmap_underlying_metadata);
  * locked buffer.   This only can happen if someone has written the buffer
  * directly, with submit_bh().  At the address_space level PageWriteback
  * prevents this contention from occurring.
+ *
+ * If block_write_full_page() is called with wbc->sync_mode ==
+ * WB_SYNC_ALL, the writes are posted using WRITE_SYNC_PLUG; this
+ * causes the writes to be flagged as synchronous writes, but the
+ * block device queue will NOT be unplugged, since usually many pages
+ * will be pushed to the out before the higher-level caller actually
+ * waits for the writes to be completed.  The various wait functions,
+ * such as wait_on_writeback_range() will ultimately call sync_page()
+ * which will ultimately call blk_run_backing_dev(), which will end up
+ * unplugging the device queue.
  */
 static int __block_write_full_page(struct inode *inode, struct page *page,
 			get_block_t *get_block, struct writeback_control *wbc)
@@ -1606,7 +1616,8 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
 	struct buffer_head *bh, *head;
 	const unsigned blocksize = 1 << inode->i_blkbits;
 	int nr_underway = 0;
-	int write_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
+	int write_op = (wbc->sync_mode == WB_SYNC_ALL ?
+			WRITE_SYNC_PLUG : WRITE);
 
 	BUG_ON(!PageLocked(page));
 

             reply	other threads:[~2009-04-08 23:40 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-04-08 23:40 Theodore Ts'o [this message]
2009-04-09 15:49 ` [GIT PULL] Ext3 latency fixes Linus Torvalds
2009-04-09 16:23   ` Chris Mason
2009-04-09 17:49     ` Jan Kara
2009-04-09 18:10       ` Chris Mason
2009-04-09 19:04         ` Jan Kara
2009-04-14  2:29           ` [RFC] ext3 data=guarded was " Chris Mason
2009-04-09 17:36   ` Jan Kara
  -- strict thread matches above, loose matches on Subject: below --
2009-04-03  7:01 Theodore Ts'o
2009-04-03 18:24 ` Linus Torvalds
2009-04-03 18:47   ` Jens Axboe
2009-04-03 19:13     ` Theodore Tso
2009-04-03 21:01     ` Chris Mason
2009-04-03 19:02   ` Linus Torvalds
2009-04-03 20:41     ` Linus Torvalds
2009-04-04 13:57       ` Theodore Tso
2009-04-04 15:16         ` Jens Axboe
2009-04-04 15:57           ` Linus Torvalds
2009-04-04 16:06             ` Linus Torvalds
2009-04-04 17:36               ` Jens Axboe
2009-04-04 17:34             ` Jens Axboe
2009-04-04 17:44               ` Linus Torvalds
2009-04-04 18:00                 ` Trenton D. Adams
2009-04-04 18:01                 ` Jens Axboe
2009-04-04 18:10                   ` Linus Torvalds
2009-04-04 23:22                   ` Theodore Tso
2009-04-04 23:33                     ` Arjan van de Ven
2009-04-05  0:10                       ` Theodore Tso
2009-04-05 15:05                         ` Arjan van de Ven
2009-04-05 17:01                         ` Linus Torvalds
2009-04-05 17:15                           ` Mark Lord
2009-04-05 20:57                             ` Jeff Garzik
2009-04-05 23:48                               ` Arjan van de Ven
2009-04-06  2:32                                 ` Mark Lord
2009-04-06  5:47                                 ` Jeff Garzik
2009-04-07 18:18                                   ` Linus Torvalds
2009-04-07 18:22                                     ` Linus Torvalds
2009-04-06  8:13                             ` Jens Axboe
2009-04-05 18:56                           ` Arjan van de Ven
2009-04-05 19:34                             ` Linus Torvalds
2009-04-05 20:06                               ` Arjan van de Ven
2009-04-06  6:25                               ` Jens Axboe
2009-04-06  6:05                           ` Theodore Tso
2009-04-06  6:23                           ` Jens Axboe
2009-04-06  8:16                       ` Jens Axboe
2009-04-06 14:48                         ` Linus Torvalds
2009-04-06 15:09                           ` Jens Axboe
2009-04-06  6:15                     ` Jens Axboe
2009-04-04 20:18               ` Ingo Molnar
2009-04-06 21:50                 ` Lennart Sorensen
2009-04-07 13:31                   ` Mark Lord
2009-04-07 14:48                     ` Lennart Sorensen
2009-04-07 19:21                       ` Mark Lord
2009-04-07 19:57                         ` Lennart Sorensen
2009-04-04 20:56               ` Arjan van de Ven
2009-04-06  7:06                 ` Jens Axboe
2009-04-07 15:39             ` Indan Zupancic
2009-04-04 19:18           ` Theodore Tso
2009-04-06  8:12             ` Jens Axboe
2009-04-04 22:13         ` Linus Torvalds
2009-04-04 22:19           ` Linus Torvalds
2009-04-05  0:20           ` Theodore Tso
2009-04-03 19:54   ` Theodore Tso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E1LrhNF-0002zd-BG@closure.thunk.org \
    --to=tytso@mit.edu \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).