From: Wu Fengguang <fengguang.wu@intel.com>
To: Theodore Tso <tytso@mit.edu>, Jens Axboe <jens.axboe@oracle.com>,
Christoph Hellwig <hch@infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"chris.mason@oracle.com" <chris.mason@oracle.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"jack@suse.cz" <jack@suse.cz>
Subject: Re: [PATCH 0/7] Per-bdi writeback flusher threads v20
Date: Sat, 19 Sep 2009 23:03:51 +0800
Message-ID: <20090919150351.GA19880@localhost>
In-Reply-To: <20090919042607.GA19752@localhost>

On Sat, Sep 19, 2009 at 12:26:07PM +0800, Wu Fengguang wrote:
> On Sat, Sep 19, 2009 at 12:00:51PM +0800, Wu Fengguang wrote:
> > On Sat, Sep 19, 2009 at 11:58:35AM +0800, Wu Fengguang wrote:
> > > On Sat, Sep 19, 2009 at 01:52:52AM +0800, Theodore Tso wrote:
> > > > On Fri, Sep 11, 2009 at 10:39:29PM +0800, Wu Fengguang wrote:
> > > > >
> > > > > That would be good. Sorry for the late work. I'll allocate some time
> > > > > in mid next week to help review and benchmark recent writeback works,
> > > > > and hope to get things done in this merge window.
> > > >
> > > > Did you have a chance to get more work done on your writeback
> > > > patches?
> > >
> > > Sorry for the delay, I'm now testing the patches with commands
> > >
> > > cp /dev/zero /mnt/test/zero0 &
> > > dd if=/dev/zero of=/mnt/test/zero1 &
> > >
> > > and the attached debug patch.
> > >
> > > One problem I found with ext3/4 is, redirty_tail() is called repeatedly
> > > in the traces, which could slow down the inode writeback significantly.
> >
> > FYI, it's this redirty_tail() called in writeback_single_inode():
> >
> > /*
> > * Someone redirtied the inode while were writing back
> > * the pages.
> > */
> > redirty_tail(inode);
>
> Hmm, this looks like an old-fashioned problem that got blown up by the
> 128MB MAX_WRITEBACK_PAGES.
>
> The inode was redirtied by the busy cp/dd processes. Now it takes much
> more time to sync 128MB, so that a heavy dirtier can easily redirty
> the inode in that time window.
>
> A single invocation of redirty_tail() could hold up the writeback of
> the current inode for up to 30 seconds.
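
To put the 128MB figure quoted above in perspective, here is a rough
back-of-the-envelope sketch in C. The 4KB page size and the 64MB/s
sustained write rate are assumptions chosen for illustration, not
measurements from this test, and the helper names are hypothetical.

```c
#include <assert.h>

/*
 * Number of pages in one writeback chunk of chunk_bytes,
 * assuming page_size-byte pages (4096 is an assumption here).
 */
unsigned long chunk_pages(unsigned long chunk_bytes, unsigned long page_size)
{
	assert(page_size != 0);
	return chunk_bytes / page_size;
}

/*
 * Seconds needed to write chunk_bytes at a sustained rate of
 * bytes_per_sec. The rate is a made-up input, not a measured one.
 */
double chunk_seconds(unsigned long chunk_bytes, double bytes_per_sec)
{
	assert(bytes_per_sec > 0.0);
	return (double)chunk_bytes / bytes_per_sec;
}
```

With those assumed numbers, chunk_pages(128UL << 20, 4096) gives 32768
pages and chunk_seconds(128UL << 20, 64.0 * 1024 * 1024) gives 2.0
seconds, a window in which a busy cp or dd can easily redirty the inode.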

It seems that this patch helps. However, I'm afraid it's too late to
risk merging this kind of patch now.

Thanks,
Fengguang
---
writeback: don't delay redirtied inode by a fast dirtier

The large 128MB MAX_WRITEBACK_PAGES greatly increases the chance
for an inode to be redirtied by a fast dirtier during writeback.
We used to call redirty_tail() in this case, which could delay inode
writeback for up to 30s. This is now unacceptable even for a simple
dd.

But still delay these cases:
- only inode metadata is dirtied (by the fs)
- the writeback_index wrapped around
  (to protect against fast dirtiers that do repeated overwrites)
CC: Jan Kara <jack@suse.cz>
CC: Theodore Ts'o <tytso@mit.edu>
CC: Dave Chinner <david@fromorbit.com>
CC: Jens Axboe <jens.axboe@oracle.com>
CC: Chris Mason <chris.mason@oracle.com>
CC: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
fs/fs-writeback.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
--- linux.orig/fs/fs-writeback.c 2009-09-19 18:09:50.000000000 +0800
+++ linux/fs/fs-writeback.c 2009-09-19 19:00:18.000000000 +0800
@@ -466,6 +466,7 @@ writeback_single_inode(struct inode *ino
long last_file_written;
long nr_to_write;
unsigned dirty;
+ pgoff_t writeback_index;
int ret;
if (!atomic_read(&inode->i_count))
@@ -508,6 +509,7 @@ writeback_single_inode(struct inode *ino
last_file_written = wbc->last_file_written;
wbc->nr_to_write -= last_file_written;
nr_to_write = wbc->nr_to_write;
+ writeback_index = mapping->writeback_index;
ret = do_writepages(mapping, wbc);
@@ -534,10 +536,15 @@ writeback_single_inode(struct inode *ino
spin_lock(&inode_lock);
inode->i_state &= ~I_SYNC;
if (!(inode->i_state & (I_FREEING | I_CLEAR))) {
- if (inode->i_state & I_DIRTY) {
+ if (inode->i_state & I_DIRTY_PAGES) {
/*
- * Someone redirtied the inode while were writing back
- * the pages.
+ * More pages get dirtied by a fast dirtier.
+ */
+ goto select_queue;
+ } else if (inode->i_state & I_DIRTY) {
+ /*
+ * At least XFS will redirty the inode during the
+ * writeback (delalloc) and on io completion (isize).
*/
redirty_tail(inode);
} else if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
@@ -546,8 +553,10 @@ writeback_single_inode(struct inode *ino
* sometimes bales out without doing anything.
*/
inode->i_state |= I_DIRTY_PAGES;
+select_queue:
if (wbc->encountered_congestion ||
- wbc->nr_to_write <= 0) {
+ wbc->nr_to_write <= 0 ||
+ writeback_index < mapping->writeback_index) {
/*
* if slice used up, queue for next round;
* otherwise continue this inode after return
@@ -556,6 +565,7 @@ writeback_single_inode(struct inode *ino
} else {
/*
* somehow blocked: retry later
+ * also protect against busy rewrites.
*/
redirty_tail(inode);
}
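
For readers following the control flow, the queue selection that the
patch above performs after do_writepages() can be sketched as
standalone C. This is a simplified userspace model, not kernel code:
the struct, enum, and function names are hypothetical, the flag values
are illustrative, and only the three branches visible in the diff are
modeled. The diff does not show which call handles the "queue for next
round" case, so it appears here only as a label.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the inode dirty bits (values are made up). */
enum {
	SKETCH_DIRTY_SYNC     = 1 << 0,	/* inode metadata dirty */
	SKETCH_DIRTY_DATASYNC = 1 << 1,	/* data-related metadata dirty */
	SKETCH_DIRTY_PAGES    = 1 << 2,	/* dirty pages present */
	SKETCH_DIRTY = SKETCH_DIRTY_SYNC | SKETCH_DIRTY_DATASYNC |
		       SKETCH_DIRTY_PAGES,
};

enum queue_choice {
	QUEUE_NEXT_ROUND,	/* keep working on this inode soon */
	QUEUE_REDIRTY_TAIL,	/* delay: back to the tail of the dirty list */
	QUEUE_NOT_MODELED,	/* branches outside the hunks shown */
};

/* Snapshot of what the patch inspects after do_writepages() returns. */
struct wb_result {
	unsigned int state;		/* dirty bits still set on the inode */
	bool mapping_tagged_dirty;	/* page cache still has dirty tags */
	bool encountered_congestion;
	long nr_to_write;		/* remaining slice */
	unsigned long index_before;	/* writeback_index before the call */
	unsigned long index_after;	/* writeback_index after the call */
};

enum queue_choice choose_queue(const struct wb_result *r)
{
	assert(r != 0);

	if (!(r->state & SKETCH_DIRTY_PAGES)) {
		/* Only metadata was redirtied (by the fs): delay as before. */
		if (r->state & SKETCH_DIRTY)
			return QUEUE_REDIRTY_TAIL;
		/* Clean inode with no dirty tags: handled elsewhere. */
		if (!r->mapping_tagged_dirty)
			return QUEUE_NOT_MODELED;
	}

	/* select_queue: pages are dirty, either redirtied or left over. */
	if (r->encountered_congestion ||
	    r->nr_to_write <= 0 ||
	    r->index_before < r->index_after)
		return QUEUE_NEXT_ROUND;

	/* Blocked, or writeback_index did not advance (busy rewrites). */
	return QUEUE_REDIRTY_TAIL;
}
```

In this model, an inode with freshly dirtied pages whose writeback_index
advanced is queued for the next round instead of being delayed, while a
metadata-only redirty or a non-advancing index still ends up in the
redirty_tail() case.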