From: Fengguang Wu <wfg@mail.ustc.edu.cn>
To: Andrew Morton <akpm@osdl.org>
Cc: Ken Chen <kenchen@google.com>, Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH 1/6] writeback: fix time ordering of the per superblock inode lists 8
Date: Sun, 19 Aug 2007 14:53:50 +0800 [thread overview]
Message-ID: <387507684.32621@ustc.edu.cn> (raw)
Message-ID: <20070819071444.499427465@mail.ustc.edu.cn> (raw)
In-Reply-To: 20070819065349.160284305@mail.ustc.edu.cn
[-- Attachment #1: inode-dirty-time-ordering-fix.patch --]
[-- Type: text/plain, Size: 5497 bytes --]
Streamline the management of dirty inode lists and fix time ordering bugs.
The writeback logic used to moving not-yet-expired dirty inodes from s_dirty to
s_io, *only to* move them back. The move-inodes-back-and-forth thing is a mess,
which is eliminated by this patch.
The new scheme is:
- s_dirty acts as a time ordered io delaying queue;
- s_io/s_more_io together acts as an io dispatching queue.
On kupdate writeback, we pull some inodes from s_dirty to s_io at the start of
every full scan of s_io. Otherwise(i.e. for sync/throttle/background
writeback), we always pull from s_dirty on each run(a partial scan).
Note that the line
list_splice_init(&sb->s_more_io, &sb->s_io);
is moved to queue_io() to leave s_io empty. Otherwise a big dirtied file will
sit in s_io for a long time, preventing new expired inodes to get in.
Cc: Ken Chen <kenchen@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
---
fs/fs-writeback.c | 60 +++++++++++++++++++++++++++-----------------
1 file changed, 38 insertions(+), 22 deletions(-)
--- linux-2.6.23-rc2-mm2.orig/fs/fs-writeback.c
+++ linux-2.6.23-rc2-mm2/fs/fs-writeback.c
@@ -118,7 +118,7 @@ void __mark_inode_dirty(struct inode *in
goto out;
/*
- * If the inode was already on s_dirty or s_io, don't
+ * If the inode was already on s_dirty/s_io/s_more_io, don't
* reposition it (that would break s_dirty time-ordering).
*/
if (!was_dirty) {
@@ -172,6 +172,33 @@ static void requeue_io(struct inode *ino
}
/*
+ * Move expired dirty inodes from @delaying_queue to @dispatch_queue.
+ */
+static void move_expired_inodes(struct list_head *delaying_queue,
+ struct list_head *dispatch_queue,
+ unsigned long *older_than_this)
+{
+ while (!list_empty(delaying_queue)) {
+ struct inode *inode = list_entry(delaying_queue->prev,
+ struct inode, i_list);
+ if (older_than_this &&
+ time_after(inode->dirtied_when, *older_than_this))
+ break;
+ list_move(&inode->i_list, dispatch_queue);
+ }
+}
+
+/*
+ * Queue all expired dirty inodes for io, eldest first.
+ */
+static void queue_io(struct super_block *sb,
+ unsigned long *older_than_this)
+{
+ list_splice_init(&sb->s_more_io, sb->s_io.prev);
+ move_expired_inodes(&sb->s_dirty, &sb->s_io, older_than_this);
+}
+
+/*
* Write a single inode's dirty pages and inode data out to disk.
* If `wait' is set, wait on the writeout.
*
@@ -221,7 +248,7 @@ __sync_single_inode(struct inode *inode,
/*
* We didn't write back all the pages. nfs_writepages()
* sometimes bales out without doing anything. Redirty
- * the inode. It is moved from s_io onto s_dirty.
+ * the inode; Move it from s_io onto s_more_io/s_dirty.
*/
/*
* akpm: if the caller was the kupdate function we put
@@ -234,10 +261,9 @@ __sync_single_inode(struct inode *inode,
*/
if (wbc->for_kupdate) {
/*
- * For the kupdate function we leave the inode
- * at the head of sb_dirty so it will get more
- * writeout as soon as the queue becomes
- * uncongested.
+ * For the kupdate function we move the inode
+ * to s_more_io so it will get more writeout as
+ * soon as the queue becomes uncongested.
*/
inode->i_state |= I_DIRTY_PAGES;
requeue_io(inode);
@@ -295,10 +321,10 @@ __writeback_single_inode(struct inode *i
/*
* We're skipping this inode because it's locked, and we're not
- * doing writeback-for-data-integrity. Move it to the head of
- * s_dirty so that writeback can proceed with the other inodes
- * on s_io. We'll have another go at writing back this inode
- * when the s_dirty iodes get moved back onto s_io.
+ * doing writeback-for-data-integrity. Move it to s_more_io so
+ * that writeback can proceed with the other inodes on s_io.
+ * We'll have another go at writing back this inode when we
+ * completed a full scan of s_io.
*/
requeue_io(inode);
@@ -365,7 +391,7 @@ sync_sb_inodes(struct super_block *sb, s
const unsigned long start = jiffies; /* livelock avoidance */
if (!wbc->for_kupdate || list_empty(&sb->s_io))
- list_splice_init(&sb->s_dirty, &sb->s_io);
+ queue_io(sb, wbc->older_than_this);
while (!list_empty(&sb->s_io)) {
struct inode *inode = list_entry(sb->s_io.prev,
@@ -410,13 +436,6 @@ sync_sb_inodes(struct super_block *sb, s
if (time_after(inode->dirtied_when, start))
break;
- /* Was this inode dirtied too recently? */
- if (wbc->older_than_this && time_after(inode->dirtied_when,
- *wbc->older_than_this)) {
- list_splice_init(&sb->s_io, sb->s_dirty.prev);
- break;
- }
-
/* Is another pdflush already flushing this queue? */
if (current_is_pdflush() && !writeback_acquire(bdi))
break;
@@ -446,9 +465,6 @@ sync_sb_inodes(struct super_block *sb, s
break;
}
- if (list_empty(&sb->s_io))
- list_splice_init(&sb->s_more_io, &sb->s_io);
-
return; /* Leave any unwritten inodes on s_io */
}
@@ -458,7 +474,7 @@ sync_sb_inodes(struct super_block *sb, s
* Note:
* We don't need to grab a reference to superblock here. If it has non-empty
* ->s_dirty it's hadn't been killed yet and kill_super() won't proceed
- * past sync_inodes_sb() until both the ->s_dirty and ->s_io lists are
+ * past sync_inodes_sb() until the ->s_dirty/s_io/s_more_io lists are all
* empty. Since __sync_single_inode() regains inode_lock before it finally moves
* inode from superblock lists we are OK.
*
--
next prev parent reply other threads:[~2007-08-19 7:16 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-08-19 6:53 [PATCH 0/6] dirty inode lists time delay/ordering fixes Fengguang Wu
2007-08-19 6:53 ` Fengguang Wu
2007-08-19 6:53 ` Fengguang Wu [this message]
2007-08-19 6:53 ` [PATCH 1/6] writeback: fix time ordering of the per superblock inode lists 8 Fengguang Wu
2007-08-19 6:53 ` [PATCH 2/6] writeback: fix ntfs with sb_has_dirty_inodes() Fengguang Wu
2007-08-19 6:53 ` Fengguang Wu
2007-08-19 6:53 ` [PATCH 3/6] writeback: remove pages_skipped accounting in __block_write_full_page() Fengguang Wu
2007-08-19 6:53 ` Fengguang Wu
2007-08-19 6:53 ` [PATCH 4/6] writeback: introduce writeback_control.more_io to indicate more io Fengguang Wu
2007-08-19 6:53 ` Fengguang Wu
2007-08-19 6:53 ` [PATCH 5/6] check dirty inode list Fengguang Wu
2007-08-19 6:53 ` Fengguang Wu
2007-08-19 6:53 ` [PATCH 6/6] prevent time-ordering warnings Fengguang Wu
2007-08-19 6:53 ` Fengguang Wu
-- strict thread matches above, loose matches on Subject: below --
2007-08-12 9:11 [PATCH 0/6] writeback time order/delay fixes take 3 Fengguang Wu
2007-08-12 9:11 ` [PATCH 1/6] writeback: fix time ordering of the per superblock inode lists 8 Fengguang Wu
2007-08-12 9:11 ` Fengguang Wu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=387507684.32621@ustc.edu.cn \
--to=wfg@mail.ustc.edu.cn \
--cc=akpm@linux-foundation.org \
--cc=akpm@osdl.org \
--cc=kenchen@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.