From: Wu Fengguang <fengguang.wu@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@infradead.org>,
Dave Chinner <david@fromorbit.com>, Jan Kara <jack@suse.cz>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
Linux Memory Management List <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/6] writeback: avoid unnecessary calculation of bdi dirty thresholds
Date: Thu, 5 Aug 2010 00:41:59 +0800
Message-ID: <20100804164159.GA22189@localhost>
In-Reply-To: <1280847822.1923.597.camel@laptop>
On Tue, Aug 03, 2010 at 11:03:42PM +0800, Peter Zijlstra wrote:
> On Sun, 2010-07-11 at 10:06 +0800, Wu Fengguang wrote:
> > plain text document attachment (writeback-less-bdi-calc.patch)
> > Split get_dirty_limits() into global_dirty_limits()+bdi_dirty_limit(),
> > so that the latter can be avoided when under global dirty background
> > threshold (which is the normal state for most systems).
>
> The patch looks OK, although, esp with the comments proposed in the
> follow-up email, bdi_dirty_limit() gets a bit confusing wrt how and
> what the limit is.
>
> Maybe it's clearer to not call task_dirty_limit() from bdi_dirty_limit();
> that way the comment can focus on the device write request completion
> proportion thing.
Done, thanks.
> > +unsigned long bdi_dirty_limit(struct backing_dev_info *bdi,
> > + unsigned long dirty)
> > +{
> > + u64 bdi_dirty;
> > + long numerator, denominator;
> >
> > + /*
> > + * Calculate this BDI's share of the dirty ratio.
> > + */
> > + bdi_writeout_fraction(bdi, &numerator, &denominator);
> >
> > + bdi_dirty = (dirty * (100 - bdi_min_ratio)) / 100;
> > + bdi_dirty *= numerator;
> > + do_div(bdi_dirty, denominator);
> >
> > + bdi_dirty += (dirty * bdi->min_ratio) / 100;
> > + if (bdi_dirty > (dirty * bdi->max_ratio) / 100)
> > + bdi_dirty = dirty * bdi->max_ratio / 100;
> > +
> + return bdi_dirty;
> > }
>
> And then add the call to task_dirty_limit() here:
Done. I did not add a task_dirty_limit() call after the bdi_dirty_limit()
call in bdi_debug_stats_show() -- it looks unnecessary there.
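For readers following the arithmetic in the quoted bdi_dirty_limit(), here is a minimal user-space sketch of the same calculation. It is not part of the patch. The names bdi_model, bdi_dirty_limit_model() and sum_min_ratio are hypothetical: the struct stands in for the fields of struct backing_dev_info used above, its numerator/denominator stand in for the values bdi_writeout_fraction() would return, and sum_min_ratio plays the role of bdi_min_ratio.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of struct backing_dev_info used here. */
struct bdi_model {
	unsigned int min_ratio;	/* percent reserved for this device */
	unsigned int max_ratio;	/* percent ceiling for this device */
	long numerator;		/* this device's share of recent writeout... */
	long denominator;	/* ...out of writeout on all devices */
};

/*
 * Model of the quoted calculation: split the unreserved part of the
 * dirty threshold in proportion to writeout completions, add the
 * device's own reservation, then clamp to its ceiling.
 * sum_min_ratio is assumed to be the sum of min_ratio over all devices.
 */
static unsigned long bdi_dirty_limit_model(const struct bdi_model *bdi,
					   unsigned int sum_min_ratio,
					   unsigned long dirty)
{
	uint64_t bdi_dirty;

	/* Preconditions of this model, not checks taken from the patch. */
	assert(bdi->denominator > 0 && bdi->numerator >= 0);
	assert(sum_min_ratio <= 100);

	bdi_dirty = ((uint64_t)dirty * (100 - sum_min_ratio)) / 100;
	bdi_dirty = bdi_dirty * (uint64_t)bdi->numerator /
		    (uint64_t)bdi->denominator;

	bdi_dirty += ((uint64_t)dirty * bdi->min_ratio) / 100;
	if (bdi_dirty > ((uint64_t)dirty * bdi->max_ratio) / 100)
		bdi_dirty = (uint64_t)dirty * bdi->max_ratio / 100;

	return (unsigned long)bdi_dirty;
}
```

With a global threshold of 1000 pages and no reservations, a device that completed 3/4 of recent writeout gets 750 pages in this model; with max_ratio set to 50 the same device is clamped to 500.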
> > +++ linux-next/mm/backing-dev.c 2010-07-11 08:53:44.000000000 +0800
> > @@ -83,7 +83,8 @@ static int bdi_debug_stats_show(struct s
> > nr_more_io++;
> > spin_unlock(&inode_lock);
> >
> > - get_dirty_limits(&background_thresh, &dirty_thresh, &bdi_thresh, bdi);
> > + global_dirty_limits(&background_thresh, &dirty_thresh);
> > + bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
> + bdi_thresh = task_dirty_limit(current, bdi_thresh);
>
> And add a comment to task_dirty_limit() as well, explaining its reason
> for existence (protecting light/slow dirtying tasks from heavier/fast
> ones).
Comments updated as below. Any suggestions/corrections?
Thanks,
Fengguang
Subject: writeback: add comment to the dirty limits functions
From: Wu Fengguang <fengguang.wu@intel.com>
Date: Thu Jul 15 09:54:25 CST 2010
Document global_dirty_limits(), bdi_dirty_limit() and task_dirty_limit().
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/page-writeback.c | 31 ++++++++++++++++++++++++++++---
1 file changed, 28 insertions(+), 3 deletions(-)
--- linux-next.orig/mm/page-writeback.c 2010-08-03 23:14:19.000000000 +0800
+++ linux-next/mm/page-writeback.c 2010-08-05 00:37:17.000000000 +0800
@@ -261,11 +261,18 @@ static inline void task_dirties_fraction
}
/*
- * scale the dirty limit
+ * task_dirty_limit - scale down dirty throttling threshold for one task
*
* task specific dirty limit:
*
* dirty -= (dirty/8) * p_{t}
+ *
+ * To protect light/slow dirtying tasks from heavier/fast ones, we start
+ * throttling individual tasks before reaching the bdi dirty limit.
+ * Relatively low thresholds will be allocated to heavy dirtiers. So when
+ * dirty pages grow large, heavy dirtiers will be throttled first, which will
+ * effectively curb the growth of dirty pages. Light dirtiers with high enough
+ * dirty threshold may never get throttled.
*/
static unsigned long task_dirty_limit(struct task_struct *tsk,
unsigned long bdi_dirty)
@@ -390,6 +397,15 @@ unsigned long determine_dirtyable_memory
return x + 1; /* Ensure that we never return 0 */
}
+/**
+ * global_dirty_limits - background-writeback and dirty-throttling thresholds
+ *
+ * Calculate the dirty thresholds based on sysctl parameters
+ * - vm.dirty_background_ratio or vm.dirty_background_bytes
+ * - vm.dirty_ratio or vm.dirty_bytes
+ * The dirty limits will be lifted by 1/4 for PF_LESS_THROTTLE (i.e. nfsd) and
+ * real-time tasks.
+ */
void global_dirty_limits(unsigned long *pbackground, unsigned long *pdirty)
{
unsigned long background;
@@ -424,8 +440,17 @@ void global_dirty_limits(unsigned long *
*pdirty = dirty;
}
-unsigned long bdi_dirty_limit(struct backing_dev_info *bdi,
- unsigned long dirty)
+/**
+ * bdi_dirty_limit - @bdi's share of dirty throttling threshold
+ *
+ * Allocate high/low dirty limits to fast/slow devices, in order to prevent
+ * - starving fast devices
+ * - piling up dirty pages (that will take a long time to sync) on slow devices
+ *
+ * The bdi's share of the dirty limit adapts to its throughput and is
+ * bounded by the bdi->min_ratio and/or bdi->max_ratio parameters, if set.
+ */
+unsigned long bdi_dirty_limit(struct backing_dev_info *bdi, unsigned long dirty)
{
u64 bdi_dirty;
long numerator, denominator;
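
As an illustration of the task_dirty_limit() comment in the patch above (not part of the patch itself), the formula dirty -= (dirty/8) * p_{t} can be modelled in user space as follows. The function name task_dirty_limit_model() is hypothetical, and numerator/denominator are assumed to express p_{t}, the task's fraction of recent dirtying, the way task_dirties_fraction() would report it. Only the formula from the comment is modelled; any further clamping the kernel function may apply is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of "dirty -= (dirty/8) * p_{t}": the heavier a task's share of
 * recent dirtying (numerator/denominator), the lower its threshold,
 * down to 7/8 of the bdi threshold for a task doing all the dirtying.
 */
static unsigned long task_dirty_limit_model(unsigned long bdi_dirty,
					    long numerator, long denominator)
{
	uint64_t inv = bdi_dirty >> 3;	/* dirty/8 */

	/* Preconditions of this model: p_{t} is a fraction in [0, 1]. */
	assert(denominator > 0 && numerator >= 0 && numerator <= denominator);

	inv = inv * (uint64_t)numerator / (uint64_t)denominator;

	return bdi_dirty - (unsigned long)inv;
}
```

For a bdi threshold of 800 pages, a task responsible for none of the recent dirtying keeps the full 800, one responsible for half gets 750, and one responsible for all of it gets 700. This is the sense in which heavy dirtiers hit their threshold first while light dirtiers may never be throttled.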