From: Wu Fengguang <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>, Mel Gorman <mel@csn.ul.ie>,
Dave Chinner <david@fromorbit.com>,
Christoph Hellwig <hch@infradead.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 03/15] writeback: introduce writeback_control.inodes_cleaned
Date: Wed, 8 Jun 2011 08:10:49 +0800
Message-ID: <20110608001048.GC19547@localhost>
In-Reply-To: <20110607160313.86fb31df.akpm@linux-foundation.org>
On Wed, Jun 08, 2011 at 07:03:13AM +0800, Andrew Morton wrote:
> On Wed, 08 Jun 2011 05:32:39 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > The flusher works on dirty inodes in batches, and may quit prematurely
> > if the batch of inodes happen to be metadata-only dirtied: in this case
> > wbc->nr_to_write won't be decreased at all, which stands for "no pages
> > written" but also mis-interpreted as "no progress".
> >
> > So introduce writeback_control.inodes_cleaned to count the inodes get
> > cleaned. A non-zero value means there are some progress on writeback,
> > in which case more writeback can be tried.
>
> Yes, that makes sense. I had a workload which demonstrated/exploited
> this nine years ago but I never got around to fixing it, never told
> anyone and nobody noticed ;)
Good to know that :)
> > + long inodes_cleaned; /* # of inodes cleaned */
>
> nanonit: I'd call this inodes_written, because they may not actually be
> clean.
Yeah, at least for ext4.
I have done the rename, and also added a note that it is not the exact
number of inodes written.
Thanks,
Fengguang
---
Subject: writeback: introduce writeback_control.inodes_written
Date: Wed Jul 21 22:50:57 CST 2010
The flusher works on dirty inodes in batches, and may quit prematurely
if a batch of inodes happens to be metadata-only dirtied: in this case
wbc->nr_to_write won't be decreased at all, which means "no pages
written" but is also misinterpreted as "no progress".
So introduce writeback_control.inodes_written to count the inodes that
get cleaned from the VFS point of view. A non-zero value means there is
some progress on writeback, in which case more writeback can be tried.
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
fs/fs-writeback.c | 4 ++++
include/linux/writeback.h | 1 +
2 files changed, 5 insertions(+)
About v1: the initial version counted successful ->write_inode()
calls. However, that led to busy loops for sync() over NFS, because NFS
ridiculously returns 0 (success) while at the same time redirtying the
inode. The NFS case can be trivially fixed, but there may be more
hidden bugs in other filesystems.
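To illustrate the v1 problem, here is a minimal userspace sketch, not
kernel code. The names v1_inode, redirtying_write_inode() and
passes_until_bail() are made up for this example, and the loop is only
assumed to resemble the flusher's retry logic. It compares counting
successful write calls against counting only inodes that end up clean.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an inode: only tracks a dirty flag. */
struct v1_inode {
	bool dirty;
};

/*
 * Hypothetical ->write_inode() that misbehaves the way the note
 * describes: it reports success (0) but redirties the inode.
 */
static int redirtying_write_inode(struct v1_inode *inode)
{
	inode->dirty = true;
	return 0;
}

/*
 * Retry writeback for as long as a pass reports progress, giving up
 * after max_passes so that the sketch itself cannot spin forever.
 * Returns the number of passes performed.
 */
static long passes_until_bail(struct v1_inode *inode, bool count_successes,
			      long max_passes)
{
	long passes = 0;

	while (passes < max_passes) {
		long progress = 0;

		passes++;
		if (inode->dirty) {
			int ret;

			/* The dirty state is cleared before the write. */
			inode->dirty = false;
			ret = redirtying_write_inode(inode);

			if (count_successes) {
				/* v1 idea: a 0 return counts as progress. */
				if (ret == 0)
					progress++;
			} else {
				/* Count only inodes that end up clean. */
				if (!inode->dirty)
					progress++;
			}
		}
		if (!progress)
			break;	/* no progress, bail */
	}
	return passes;
}
```

Under these assumptions, calling passes_until_bail() with
count_successes set runs until the max_passes cap is hit, while the
clean-inode variant bails after the first pass.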
--- linux-next.orig/fs/fs-writeback.c 2011-05-24 11:17:16.000000000 +0800
+++ linux-next/fs/fs-writeback.c 2011-05-24 11:17:16.000000000 +0800
@@ -464,6 +464,7 @@ writeback_single_inode(struct inode *ino
* No need to add it back to the LRU.
*/
list_del_init(&inode->i_wb_list);
+ wbc->inodes_written++;
}
}
inode_sync_complete(inode);
@@ -725,6 +726,7 @@ static long wb_writeback(struct bdi_writ
wbc.more_io = 0;
wbc.nr_to_write = write_chunk;
wbc.pages_skipped = 0;
+ wbc.inodes_written = 0;
trace_wbc_writeback_start(&wbc, wb->bdi);
if (work->sb)
@@ -741,6 +743,8 @@ static long wb_writeback(struct bdi_writ
*/
if (wbc.nr_to_write <= 0)
continue;
+ if (wbc.inodes_written)
+ continue;
/*
* Didn't write everything and we don't have more IO, bail
*/
--- linux-next.orig/include/linux/writeback.h 2011-05-24 11:17:14.000000000 +0800
+++ linux-next/include/linux/writeback.h 2011-05-24 11:17:16.000000000 +0800
@@ -34,6 +34,7 @@ struct writeback_control {
long nr_to_write; /* Write this many pages, and decrement
this for each page written */
long pages_skipped; /* Pages which were not written */
+ long inodes_written; /* # of inodes written (at least) */
/*
* For a_ops->writepages(): is start or end are non-zero then this is
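
For readers who want to see the effect of the new check outside the
kernel, the following is a minimal userspace sketch under stated
assumptions. wbc_sketch, inode_sketch, write_batch() and flush_all()
are hypothetical names, and the page accounting is simplified (an inode
writes all of its pages even if that exceeds the budget). It only
models the idea that a batch of metadata-only inodes leaves nr_to_write
untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct writeback_control. */
struct wbc_sketch {
	long nr_to_write;	/* page budget, decremented per page written */
	long inodes_written;	/* # of inodes written in this pass */
};

/* Hypothetical dirty inode; dirty_pages == 0 means metadata-only dirty. */
struct inode_sketch {
	long dirty_pages;
	bool dirty;
};

/* Write up to 'batch' dirty inodes and account for them in wbc. */
static void write_batch(struct inode_sketch *inodes, size_t n, size_t batch,
			struct wbc_sketch *wbc)
{
	size_t done = 0;

	for (size_t i = 0; i < n && done < batch; i++) {
		if (!inodes[i].dirty)
			continue;
		wbc->nr_to_write -= inodes[i].dirty_pages;
		inodes[i].dirty_pages = 0;
		inodes[i].dirty = false;
		wbc->inodes_written++;
		done++;
	}
}

/*
 * Keep flushing batches while a pass makes progress.
 * Returns the number of inodes still dirty when the loop bails.
 */
static size_t flush_all(struct inode_sketch *inodes, size_t n, size_t batch,
			long write_chunk, bool use_inodes_written)
{
	size_t still_dirty = 0;

	assert(batch > 0 && write_chunk > 0);

	for (;;) {
		struct wbc_sketch wbc = {
			.nr_to_write = write_chunk,
			.inodes_written = 0,
		};

		write_batch(inodes, n, batch, &wbc);

		/* Pages were written: clearly progress, try again. */
		if (wbc.nr_to_write < write_chunk)
			continue;
		/* No pages, but some inodes were written: also progress. */
		if (use_inodes_written && wbc.inodes_written)
			continue;
		break;	/* no progress, bail */
	}

	for (size_t i = 0; i < n; i++)
		if (inodes[i].dirty)
			still_dirty++;
	return still_dirty;
}
```

With four metadata-only inodes queued ahead of two inodes that have
dirty pages, and a batch size of 2, the sketch without the
inodes_written check bails after the first batch and leaves four inodes
dirty. With the check it keeps going until nothing is left dirty.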