From: Wu Fengguang <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: jack@suse.cz
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>,
LKML <linux-kernel@vger.kernel.org>,
<linux-fsdevel@vger.kernel.org>
Subject: [RFC][PATCH 4/7] writeback: ensure large files are written in MAX_WRITEBACK_PAGES chunks
Date: Wed, 09 Sep 2009 22:51:45 +0800
Message-ID: <20090909150600.727523225@intel.com>
In-Reply-To: 20090909145141.293229693@intel.com
[-- Attachment #1: writeback-track-large-file.patch --]
[-- Type: text/plain, Size: 4903 bytes --]
Remember the number of pages written for the current file between
successive writeback_single_inode() invocations, and adjust
wbc->nr_to_write accordingly, so that writeback of the file continues
until MAX_WRITEBACK_PAGES pages have been written for this single file.

This ensures that large files are written in large, MAX_WRITEBACK_PAGES
sized chunks. It works best for the kernel sync threads, which repeatedly
call into writeback_single_inode() with the same wbc. balance_dirty_pages(),
which normally restarts with a fresh wbc, may never accumulate enough
last_file_written to move on from the current large file, and hence may
starve other (small) files. Fortunately, balance_dirty_pages() writeback
is normally interleaved with background writeback, which takes care of
rotating the files under writeback. So this is not a big problem.
Cc: Dave Chinner <david@fromorbit.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
fs/fs-writeback.c | 41 +++++++++++++++++++++++++++---------
include/linux/writeback.h | 12 ++++++++++
2 files changed, 43 insertions(+), 10 deletions(-)
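
To make the accounting described in the changelog easier to follow, here is
a small user-space sketch of it. It is an illustration only, not kernel
code: struct wbc_sim, sim_writepages(), sim_single_inode() and
sim_keep_on_b_io() are hypothetical names, and sim_writepages() is an
assumed stand-in for do_writepages() that simply writes as many pages as
the remaining budget allows. The bookkeeping and the requeue decision
follow the hunks in this patch.

```c
#include <assert.h>

#define MAX_WRITEBACK_PAGES 1024

/* Hypothetical, reduced model of the writeback_control fields involved. */
struct wbc_sim {
	long nr_to_write;		/* remaining page budget */
	unsigned long last_file;	/* inode number of last written file */
	long last_file_written;		/* pages written so far for that file */
};

/* Assumed stand-in for do_writepages(): write min(dirty, budget) pages. */
static long sim_writepages(struct wbc_sim *wbc, long *dirty)
{
	long n;

	assert(*dirty >= 0);
	n = *dirty < wbc->nr_to_write ? *dirty : wbc->nr_to_write;
	if (n < 0)		/* budget already exhausted */
		n = 0;
	*dirty -= n;
	wbc->nr_to_write -= n;
	return n;
}

/* Same bookkeeping as the writeback_single_inode() hunks in this patch. */
static long sim_single_inode(struct wbc_sim *wbc, unsigned long ino,
			     long *dirty)
{
	long last_file_written;
	long nr_to_write;
	long written;

	/* Pages already written for this file count against its chunk. */
	if (wbc->last_file != ino)
		last_file_written = 0;
	else
		last_file_written = wbc->last_file_written;
	wbc->nr_to_write -= last_file_written;
	nr_to_write = wbc->nr_to_write;

	sim_writepages(wbc, dirty);
	written = nr_to_write - wbc->nr_to_write;

	if (wbc->last_file != ino) {
		wbc->last_file = ino;
		wbc->last_file_written = written;
	} else
		wbc->last_file_written += written;

	/* Hand the borrowed budget back for the files that follow. */
	wbc->nr_to_write += last_file_written;
	return written;
}

/*
 * Decision made by requeue_partial_io(): returns 1 if the inode stays on
 * b_io so the next pass continues with it, 0 if it is rotated away via
 * requeue_io().
 */
static int sim_keep_on_b_io(const struct wbc_sim *wbc)
{
	return wbc->last_file_written != 0 &&
	       wbc->last_file_written < MAX_WRITEBACK_PAGES;
}
```

In this model, take a 700 page file followed by a 5000 page file, with the
caller refilling nr_to_write to MAX_WRITEBACK_PAGES before each pass over
the same structure. The first pass writes 700 pages of the small file and
324 pages of the large one, and the large file stays on b_io. The second
pass writes 700 more pages of the large file, which completes its 1024 page
chunk, so the file is rotated away and 324 pages of budget remain for other
files. A caller that starts from a zeroed structure on every call, with a
budget below MAX_WRITEBACK_PAGES, never reaches the threshold in this
model, which is the balance_dirty_pages() case mentioned in the changelog.
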
--- linux.orig/fs/fs-writeback.c 2009-09-09 21:50:53.000000000 +0800
+++ linux/fs/fs-writeback.c 2009-09-09 21:51:04.000000000 +0800
@@ -271,6 +271,19 @@ static void requeue_io(struct inode *ino
list_move(&inode->i_list, &wb->b_more_io);
}
+/*
+ * continue io on this inode on next writeback if
+ * it has not accumulated large enough writeback io chunk
+ */
+static void requeue_partial_io(struct writeback_control *wbc, struct inode *inode)
+{
+ if (wbc->last_file_written == 0 ||
+ wbc->last_file_written >= MAX_WRITEBACK_PAGES)
+ return requeue_io(inode);
+
+ list_move_tail(&inode->i_list, &inode_to_bdi(inode)->wb.b_io);
+}
+
static void inode_sync_complete(struct inode *inode)
{
/*
@@ -365,6 +378,8 @@ writeback_single_inode(struct inode *ino
{
struct address_space *mapping = inode->i_mapping;
int wait = wbc->sync_mode == WB_SYNC_ALL;
+ long last_file_written;
+ long nr_to_write;
unsigned dirty;
int ret;
@@ -402,8 +417,21 @@ writeback_single_inode(struct inode *ino
spin_unlock(&inode_lock);
+ if (wbc->last_file != inode->i_ino)
+ last_file_written = 0;
+ else
+ last_file_written = wbc->last_file_written;
+ wbc->nr_to_write -= last_file_written;
+ nr_to_write = wbc->nr_to_write;
+
ret = do_writepages(mapping, wbc);
+ if (wbc->last_file != inode->i_ino) {
+ wbc->last_file = inode->i_ino;
+ wbc->last_file_written = nr_to_write - wbc->nr_to_write;
+ } else
+ wbc->last_file_written += nr_to_write - wbc->nr_to_write;
+
/* Don't write the inode if only I_DIRTY_PAGES was set */
if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
int err = write_inode(inode, wait);
@@ -436,7 +464,7 @@ writeback_single_inode(struct inode *ino
/*
* slice used up: queue for next turn
*/
- requeue_io(inode);
+ requeue_partial_io(wbc, inode);
} else {
/*
* somehow blocked: retry later
@@ -456,6 +484,8 @@ writeback_single_inode(struct inode *ino
}
}
inode_sync_complete(inode);
+ wbc->nr_to_write += last_file_written;
+
return ret;
}
@@ -612,15 +642,6 @@ void writeback_inodes_wbc(struct writeba
writeback_inodes_wb(&bdi->wb, wbc);
}
-/*
- * The maximum number of pages to writeout in a single bdi flush/kupdate
- * operation. We do this so we don't hold I_SYNC against an inode for
- * enormous amounts of time, which would block a userspace task which has
- * been forced to throttle against that inode. Also, the code reevaluates
- * the dirty each time it has written this many pages.
- */
-#define MAX_WRITEBACK_PAGES 1024
-
static inline bool over_bground_thresh(void)
{
unsigned long background_thresh, dirty_thresh;
--- linux.orig/include/linux/writeback.h 2009-09-09 21:50:53.000000000 +0800
+++ linux/include/linux/writeback.h 2009-09-09 21:51:22.000000000 +0800
@@ -14,6 +14,16 @@ extern struct list_head inode_in_use;
extern struct list_head inode_unused;
/*
+ * The maximum number of pages to writeout in a single bdi flush/kupdate
+ * operation. We do this so we don't hold I_SYNC against an inode for
+ * enormous amounts of time, which would block a userspace task which has
+ * been forced to throttle against that inode. Also, the code reevaluates
+ * the dirty each time it has written this many pages.
+ */
+#define MAX_WRITEBACK_PAGES 1024
+
+
+/*
* fs/fs-writeback.c
*/
enum writeback_sync_modes {
@@ -36,6 +46,8 @@ struct writeback_control {
older than this */
long nr_to_write; /* Write this many pages, and decrement
this for each page written */
+ unsigned long last_file; /* Inode number of last written file */
+ long last_file_written; /* Total pages written for last file */
long pages_skipped; /* Pages which were not written */
/*
--