linux-fsdevel.vger.kernel.org archive mirror
From: Wu Fengguang <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <matthew@wil.cx>, Jan Kara <jack@suse.cz>,
	LKML <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Theodore Ts'o <tytso@mit.edu>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH] writeback: permit through good bdi even when global dirty exceeded
Date: Fri, 2 Dec 2011 18:28:44 +0800
Message-ID: <20111202102844.GA2918@localhost>
In-Reply-To: <20111202101606.GA1158@localhost>

On Fri, Dec 02, 2011 at 06:16:06PM +0800, Wu Fengguang wrote:
> On Fri, Dec 02, 2011 at 04:29:50PM +0800, Wu Fengguang wrote:
> > On Fri, Dec 02, 2011 at 03:03:59PM +0800, Andrew Morton wrote:
> > > On Fri, 2 Dec 2011 14:36:03 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > 
> > > > --- linux-next.orig/mm/page-writeback.c	2011-12-02 10:16:21.000000000 +0800
> > > > +++ linux-next/mm/page-writeback.c	2011-12-02 14:28:44.000000000 +0800
> > > > @@ -1182,6 +1182,14 @@ pause:
> > > >  		if (task_ratelimit)
> > > >  			break;
> > > >  
> > > > +		/*
> > > > +		 * In the case of an unresponding NFS server and the NFS dirty
> > > > +		 * pages exceeds dirty_thresh, give the other good bdi's a pipe
> > > > +		 * to go through, so that tasks on them still remain responsive.
> > > > +		 */
> > > > +		if (bdi_dirty < 8)
> > > > +			break;
> > > 
> > > What happens if the local disk has nine dirty pages?
> > 
> > The 9 dirty pages will be cleaned by the flusher (likely in one shot),
> > so after a while the dirtier task can dirty 8 pages more. This
> > consumer-producer work flow can keep going on as long as the magic
> > number chosen is >= 1.
> > 
> > > Also: please, no more magic numbers.  We have too many in there already.
> > 
> > Good point. Let's add some comment on the number chosen?
> 
> I did a dd test to the local disk (when w/ a stalled NFS mount) and
> find that it always idle for several seconds before making a little
> progress. It can be confirmed from the trace that the bdi_dirty
> remains 8 even when the flusher has done its work.
> 
> So the number is lifted to bdi_stat_error to cover the errors in
> bdi_dirty. Here goes the updated patch.

The new trace now shows bdi_dirty=0 in the _majority_ of lines, but in
fact it's some small non-zero value. In this case the max pause time
should really be set to the smallest non-zero value, to avoid IO queue
underrun and improve throughput.

So here comes one more fix.

--- linux-next.orig/mm/page-writeback.c	2011-12-02 18:20:14.000000000 +0800
+++ linux-next/mm/page-writeback.c	2011-12-02 18:20:27.000000000 +0800
@@ -989,18 +989,17 @@ static unsigned long bdi_max_pause(struc
 
 	/*
 	 * Limit pause time for small memory systems. If sleeping for too long
 	 * time, a small pool of dirty/writeback pages may go empty and disk go
 	 * idle.
 	 *
 	 * 8 serves as the safety ratio.
 	 */
-	if (bdi_dirty)
-		t = min(t, bdi_dirty * HZ / (8 * bw + 1));
+	t = min(t, bdi_dirty * HZ / (8 * bw + 1));
 
 	/*
 	 * The pause time will be settled within range (max_pause/4, max_pause).
 	 * Apply a minimal value of 4 to get a non-zero max_pause/4.
 	 */
 	return clamp_val(t, 4, MAX_PAUSE);
 }
 

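To make the effect of dropping the "if (bdi_dirty)" test concrete, here
is a minimal user-space sketch of the tail of bdi_max_pause() before and
after the change. It is an illustration under assumptions, not the
kernel code: HZ=1000 and MAX_PAUSE=HZ/5 are assumed values, the starting
value of t is passed in as a parameter rather than computed from the
bandwidth, and the names max_pause_old(), max_pause_new(), min_ul() and
clamp_ul() are made up for this example.

```c
/* User-space sketch only. HZ and MAX_PAUSE are assumed values. */
#define HZ		1000UL
#define MAX_PAUSE	(HZ / 5)

/* Hypothetical stand-ins for the kernel's min() and clamp_val(). */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long clamp_ul(unsigned long v, unsigned long lo,
			      unsigned long hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Before the patch: when bdi_dirty reads as 0 the dirty-page based limit
 * is skipped, so t keeps whatever larger value it started with.
 * t is in jiffies, bdi_dirty in pages, bw in pages per second.
 */
static unsigned long max_pause_old(unsigned long t, unsigned long bdi_dirty,
				   unsigned long bw)
{
	if (bdi_dirty)
		t = min_ul(t, bdi_dirty * HZ / (8 * bw + 1));

	return clamp_ul(t, 4, MAX_PAUSE);
}

/*
 * After the patch: the limit always applies, so bdi_dirty == 0 drives t
 * to 0 and the clamp then raises it to the minimum of 4 jiffies.
 */
static unsigned long max_pause_new(unsigned long t, unsigned long bdi_dirty,
				   unsigned long bw)
{
	t = min_ul(t, bdi_dirty * HZ / (8 * bw + 1));

	return clamp_ul(t, 4, MAX_PAUSE);
}
```

With t=20 and bdi_dirty=0, max_pause_old() returns 20 while
max_pause_new() returns 4, which is the "smallest non-zero value"
behaviour described above. For a bdi with plenty of dirty pages relative
to its bandwidth the two versions return the same result.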