linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mm-commits@vger.kernel.org" <mm-commits@vger.kernel.org>,
	"richard@rsk.demon.co.uk" <richard@rsk.demon.co.uk>,
	"chris.mason@oracle.com" <chris.mason@oracle.com>,
	"jens.axboe@oracle.com" <jens.axboe@oracle.com>,
	"mbligh@mbligh.org" <mbligh@mbligh.org>,
	"miklos@szeredi.hu" <miklos@szeredi.hu>,
	linux-fsdevel@vger.kernel.org
Subject: Re: + mm-balance_dirty_pages-reduce-calls-to-global_page_state-to-reduce-c ache-references.patch added to -mm tree
Date: Sat, 22 Aug 2009 20:11:41 +0200	[thread overview]
Message-ID: <1250964701.7538.101.camel@twins> (raw)
In-Reply-To: <20090822025150.GB7798@localhost>

On Sat, 2009-08-22 at 10:51 +0800, Wu Fengguang wrote:

> > +++ a/mm/page-writeback.c

> > @@ -465,7 +439,6 @@ get_dirty_limits(unsigned long *pbackgro
> >  			bdi_dirty = dirty * bdi->max_ratio / 100;
> >  
> >  		*pbdi_dirty = bdi_dirty;
> >  		task_dirty_limit(current, pbdi_dirty);
> >  	}
> >  }
> > @@ -499,45 +472,12 @@ static void balance_dirty_pages(struct a
> >  		};
> >  
> >  		get_dirty_limits(&background_thresh, &dirty_thresh,
> > +				 &bdi_thresh, bdi);
> >  
> >  		nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
> > +			global_page_state(NR_UNSTABLE_NFS);
> > +		nr_writeback = global_page_state(NR_WRITEBACK) +
> > +			global_page_state(NR_WRITEBACK_TEMP);
> >  
> >  		/*
> >  		 * In order to avoid the stacked BDI deadlock we need
> > @@ -557,16 +497,48 @@ static void balance_dirty_pages(struct a
> >  			bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
> >  		}
> >  
> 
> > +		/* always throttle if over threshold */
> > +		if (nr_reclaimable + nr_writeback < dirty_thresh) {
> 
> That 'if' is a big behavior change. It effectively blocks every one
> and canceled Peter's proportional throttling work: the less a process
> dirtied, the less it should be throttled.

Hmm, I think you're right, I had not considered that, thanks for
catching that.

> I'd propose to remove the above 'if' and liberate the following three 'if's.

That might work, but it looses the total dirty_thresh constraint. The
sum of per-bdi dirties _should_ not be larger than that, but I'm not
sure it won't ever be.

The clip code Richard removed ensured that, and I think I wrote that out
of more than sheer paranoia, but I'm not sure anymore :/

I'll go over the numeric stuff again to see where it could go wrong.

If we can bound the error (I'm suspecting it was some numerical error
bound) we should be good and can indeed do this.

> > +
> > +			if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
> > +				break;
> > +
> > +			/*
> > +			 * Throttle it only when the background writeback cannot
> > +			 * catch-up. This avoids (excessively) small writeouts
> > +			 * when the bdi limits are ramping up.
> > +			 */
> > +			if (nr_reclaimable + nr_writeback <
> > +			    (background_thresh + dirty_thresh) / 2)
> > +				break;
> > +
> > +			/* done enough? */
> > +			if (pages_written >= write_chunk)
> > +				break;
> > +		}
> > +		if (!bdi->dirty_exceeded)
> > +			bdi->dirty_exceeded = 1;
> >  
> > +		/* Note: nr_reclaimable denotes nr_dirty + nr_unstable.
> > +		 * Unstable writes are a feature of certain networked
> > +		 * filesystems (i.e. NFS) in which data may have been
> > +		 * written to the server's write cache, but has not yet
> > +		 * been flushed to permanent storage.
> > +		 * Only move pages to writeback if this bdi is over its
> > +		 * threshold otherwise wait until the disk writes catch
> > +		 * up.
> > +		 */
> > +		if (bdi_nr_reclaimable > bdi_thresh) {
> > +			writeback_inodes(&wbc);
> > +			pages_written += write_chunk - wbc.nr_to_write;
> 
> > +			if (wbc.nr_to_write == 0)
> > +				continue;
> 
> What's the purpose of the above 2 lines?

I think I should slow down, I seem to have totally missed these two
lines when I read the patch :/

> > +		}
> >  		congestion_wait(BLK_RW_ASYNC, HZ/10);
> >  	}
> >  
> >  	if (bdi_nr_reclaimable + bdi_nr_writeback < bdi_thresh &&
> > +	    bdi->dirty_exceeded)
> >  		bdi->dirty_exceeded = 0;
> >  
> >  	if (writeback_in_progress(bdi))
> > @@ -580,10 +552,8 @@ static void balance_dirty_pages(struct a
> >  	 * In normal mode, we start background writeout at the lower
> >  	 * background_thresh, to keep the amount of dirty memory low.
> >  	 */
> > +	if ((laptop_mode && pages_written) || (!laptop_mode &&
> > +	     (nr_reclaimable > background_thresh)))
> >  		bdi_start_writeback(bdi, NULL, 0, WB_SYNC_NONE);
> >  }



  reply	other threads:[~2009-08-22 18:12 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <200908212250.n7LMox3g029154@imap1.linux-foundation.org>
2009-08-22  2:51 ` + mm-balance_dirty_pages-reduce-calls-to-global_page_state-to-reduce-c ache-references.patch added to -mm tree Wu Fengguang
2009-08-22 18:11   ` Peter Zijlstra [this message]
2009-08-23  1:32     ` Wu Fengguang
2009-08-23  5:31       ` Peter Zijlstra
2009-08-23  7:27         ` Wu Fengguang
2009-08-23  7:45           ` Peter Zijlstra
2009-09-02  8:31     ` Peter Zijlstra
2009-09-02  9:57       ` Wu Fengguang
2009-09-02 10:45         ` Peter Zijlstra
2009-09-02 13:53           ` Richard Kennedy
2009-09-03  2:22             ` Wu Fengguang
2009-09-03  3:09               ` Wu Fengguang
2009-09-03  9:48               ` Richard Kennedy
2009-09-03 11:05                 ` Wu Fengguang
2009-09-03 12:26                   ` Richard Kennedy
2009-09-03  4:53           ` Wu Fengguang
2009-08-23  9:33   ` Richard Kennedy
2009-08-23 13:00     ` Wu Fengguang
2009-08-23 13:46       ` Richard Kennedy
2009-08-24  1:41         ` Wu Fengguang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1250964701.7538.101.camel@twins \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=chris.mason@oracle.com \
    --cc=fengguang.wu@intel.com \
    --cc=jens.axboe@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=mbligh@mbligh.org \
    --cc=miklos@szeredi.hu \
    --cc=mm-commits@vger.kernel.org \
    --cc=richard@rsk.demon.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).