From: Dave Chinner <david@fromorbit.com>
To: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: [PATCH 3/4] writeback: pay attention to wbc->nr_to_write in write_cache_pages
Date: Tue, 20 Apr 2010 12:41:53 +1000 [thread overview]
Message-ID: <1271731314-5893-4-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1271731314-5893-1-git-send-email-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
If a filesystem writes more than one page in ->writepage, write_cache_pages
fails to notice this and continues to attempt writeback when wbc->nr_to_write
has gone negative - this trace was captured from XFS:
wbc_writeback_start: towrt=1024
wbc_writepage: towrt=1024
wbc_writepage: towrt=0
wbc_writepage: towrt=-1
wbc_writepage: towrt=-5
wbc_writepage: towrt=-21
wbc_writepage: towrt=-85
This has adverse effects on filesystem writeback behaviour. write_cache_pages()
needs to terminate after a certain number of pages are written, not after a
certain number of calls to ->writepage are made. Make it observe the current
value of wbc->nr_to_write and treat a value of <= 0 as though it is a either a
termination condition or a trigger to reset to MAX_WRITEḆACK_PAGES for data
integrity syncs.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/fs-writeback.c | 9 ---------
include/linux/writeback.h | 9 +++++++++
include/trace/events/writeback.h | 1 +
mm/page-writeback.c | 20 +++++++++++++++++++-
4 files changed, 29 insertions(+), 10 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 5214b61..d8271d5 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -675,15 +675,6 @@ void writeback_inodes_wbc(struct writeback_control *wbc)
writeback_inodes_wb(&bdi->wb, wbc);
}
-/*
- * The maximum number of pages to writeout in a single bdi flush/kupdate
- * operation. We do this so we don't hold I_SYNC against an inode for
- * enormous amounts of time, which would block a userspace task which has
- * been forced to throttle against that inode. Also, the code reevaluates
- * the dirty each time it has written this many pages.
- */
-#define MAX_WRITEBACK_PAGES 1024
-
static inline bool over_bground_thresh(void)
{
unsigned long background_thresh, dirty_thresh;
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index b2d615f..8533a0f 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -14,6 +14,15 @@ extern struct list_head inode_in_use;
extern struct list_head inode_unused;
/*
+ * The maximum number of pages to writeout in a single bdi flush/kupdate
+ * operation. We do this so we don't hold I_SYNC against an inode for
+ * enormous amounts of time, which would block a userspace task which has
+ * been forced to throttle against that inode. Also, the code reevaluates
+ * the dirty each time it has written this many pages.
+ */
+#define MAX_WRITEBACK_PAGES 1024
+
+/*
* fs/fs-writeback.c
*/
enum writeback_sync_modes {
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 02f34a5..3bcbd83 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -241,6 +241,7 @@ DEFINE_WBC_EVENT(wbc_writeback_wait);
DEFINE_WBC_EVENT(wbc_balance_dirty_start);
DEFINE_WBC_EVENT(wbc_balance_dirty_written);
DEFINE_WBC_EVENT(wbc_balance_dirty_wait);
+DEFINE_WBC_EVENT(wbc_writepage);
#endif /* _TRACE_WRITEBACK_H */
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index d45f59e..e22af84 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -917,6 +917,7 @@ continue_unlock:
if (!clear_page_dirty_for_io(page))
goto continue_unlock;
+ trace_wbc_writepage(wbc);
ret = (*writepage)(page, wbc, data);
if (unlikely(ret)) {
if (ret == AOP_WRITEPAGE_ACTIVATE) {
@@ -935,7 +936,7 @@ continue_unlock:
done = 1;
break;
}
- }
+ }
if (nr_to_write > 0) {
nr_to_write--;
@@ -955,6 +956,23 @@ continue_unlock:
break;
}
}
+
+ /*
+ * Some filesystems will write multiple pages in
+ * ->writepage, so wbc->nr_to_write can change much,
+ * much faster than nr_to_write. Check this as an exit
+ * condition, or if we are doing a data integrity sync,
+ * reset the wbc to MAX_WRITEBACK_PAGES so that such
+ * filesystems can do optimal writeout here.
+ */
+ if (wbc->nr_to_write <= 0) {
+ if (wbc->sync_mode == WB_SYNC_NONE) {
+ done = 1;
+ nr_to_write = 0;
+ break;
+ }
+ wbc->nr_to_write = MAX_WRITEBACK_PAGES;
+ }
}
pagevec_release(&pvec);
cond_resched();
--
1.6.5
next prev parent reply other threads:[~2010-04-20 2:41 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-20 2:41 [PATCH 0/4] writeback: tracing and wbc->nr_to_write fixes Dave Chinner
2010-04-20 2:41 ` [PATCH 1/4] writeback: initial tracing support Dave Chinner
2010-05-21 15:06 ` Christoph Hellwig
2010-04-20 2:41 ` [PATCH 2/4] writeback: Add tracing to balance_dirty_pages Dave Chinner
2010-04-20 2:41 ` Dave Chinner [this message]
2010-04-22 19:07 ` [PATCH 3/4] writeback: pay attention to wbc->nr_to_write in write_cache_pages Jan Kara
2010-04-25 3:33 ` tytso
2010-04-26 1:49 ` Dave Chinner
2010-04-26 2:43 ` tytso
2010-04-26 2:45 ` tytso
2010-04-27 3:30 ` Dave Chinner
2010-04-29 21:39 ` Andrew Morton
2010-04-30 6:01 ` Aneesh Kumar K. V
2010-04-30 19:43 ` Andrew Morton
2010-05-01 19:47 ` tytso
2010-04-20 2:41 ` [PATCH 4/4] xfs: remove nr_to_write writeback windup Dave Chinner
2010-04-22 19:09 ` Jan Kara
2010-04-26 0:46 ` Dave Chinner
2010-04-20 3:40 ` [PATCH 5/4] writeback: limit write_cache_pages integrity scanning to current EOF Dave Chinner
2010-04-20 23:28 ` Jamie Lokier
2010-04-20 23:31 ` Dave Chinner
2010-04-22 19:13 ` Jan Kara
2010-04-20 12:02 ` [PATCH 0/4] writeback: tracing and wbc->nr_to_write fixes Richard Kennedy
2010-04-20 23:29 ` Dave Chinner
2010-05-21 15:05 ` Christoph Hellwig
2010-05-22 0:09 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1271731314-5893-4-git-send-email-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).