From: Wu Fengguang <fengguang.wu@intel.com>
To: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Cc: Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
Christoph Hellwig <hch@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Wu Fengguang <fengguang.wu@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: [PATCH 4/7] writeback: introduce max-pause and pass-good dirty limits
Date: Sun, 19 Jun 2011 23:01:12 +0800
Message-ID: <20110619150510.367141119@intel.com>
In-Reply-To: 20110619150108.691351746@intel.com
The max-pause limit helps to keep the sleep time inside
balance_dirty_pages() within 200ms. The 200ms max sleep means a per-task
rate limit of 8 pages/200ms = 160KB/s, which is normally enough to stop
dirtiers from continuing to push the dirty pages high, unless there is
a sufficiently large number of slow dirtiers (e.g. 500 tasks doing 160KB/s
will still sum up to 80MB/s, reaching the write bandwidth of a slow disk).
The pass-good limit helps to let go of the good bdi's in the presence of
a blocked bdi (e.g. NFS server not responding) or a slow USB disk which for
some reason builds up a large number of initial dirty pages that refuse
to go away anytime soon.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
include/linux/writeback.h | 21 +++++++++++++++++++++
mm/page-writeback.c | 13 +++++++++++++
2 files changed, 34 insertions(+)
--- linux-next.orig/include/linux/writeback.h 2011-06-19 22:59:29.000000000 +0800
+++ linux-next/include/linux/writeback.h 2011-06-19 22:59:47.000000000 +0800
@@ -7,6 +7,27 @@
#include <linux/sched.h>
#include <linux/fs.h>
+/*
+ * Tasks in the 1/16 region above the global dirty limit get maximum pauses:
+ *
+ * (limit, limit + limit/DIRTY_MAXPAUSE)
+ *
+ * In the 1/16 region above the max-pause region, dirty-exceeded bdi's will be
+ * put into loops:
+ *
+ * (limit + limit/DIRTY_MAXPAUSE, limit + limit/DIRTY_PASSGOOD)
+ *
+ * Further beyond, all dirtier tasks will enter a loop waiting (possibly for a
+ * long time) for the dirty pages to drop, unless they wrote enough pages.
+ *
+ * The global dirty threshold is normally equal to the global dirty limit,
+ * except when the system suddenly allocates a lot of anonymous memory and
+ * knocks down the global dirty threshold quickly, in which case the global
+ * dirty limit will follow down slowly to prevent livelocking all dirtier tasks.
+ */
+#define DIRTY_MAXPAUSE 16
+#define DIRTY_PASSGOOD 8
+
struct backing_dev_info;
/*
--- linux-next.orig/mm/page-writeback.c 2011-06-19 22:59:29.000000000 +0800
+++ linux-next/mm/page-writeback.c 2011-06-19 22:59:47.000000000 +0800
@@ -399,6 +399,11 @@ unsigned long determine_dirtyable_memory
return x + 1; /* Ensure that we never return 0 */
}
+static unsigned long hard_dirty_limit(unsigned long thresh)
+{
+ return max(thresh, global_dirty_limit);
+}
+
/*
* global_dirty_limits - background-writeback and dirty-throttling thresholds
*
@@ -704,6 +709,14 @@ static void balance_dirty_pages(struct a
__set_current_state(TASK_UNINTERRUPTIBLE);
io_schedule_timeout(pause);
+ dirty_thresh = hard_dirty_limit(dirty_thresh);
+ if (nr_dirty < dirty_thresh + dirty_thresh / DIRTY_MAXPAUSE &&
+ jiffies - start_time > MAX_PAUSE)
+ break;
+ if (nr_dirty < dirty_thresh + dirty_thresh / DIRTY_PASSGOOD &&
+ bdi_dirty < bdi_thresh)
+ break;
+
/*
* Increase the delay for each loop, up to our previous
* default of taking a 100ms nap.