From: Wu Fengguang <fengguang.wu@intel.com>
To: Theodore Tso <tytso@mit.edu>,
	Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Chris Mason <chris.mason@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"Li, Shaohua" <shaohua.li@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"richard@rsk.demon.co.uk" <richard@rsk.demon.co.uk>,
	"jens.axboe@oracle.com" <jens.axboe@oracle.com>
Subject: Re: regression in page writeback
Date: Sat, 3 Oct 2009 14:10:44 +0800
Message-ID: <20091003061044.GA3791@localhost>
In-Reply-To: <20091002172620.GB8161@mit.edu>

On Sat, Oct 03, 2009 at 01:26:20AM +0800, Theodore Ts'o wrote:
> On Fri, Oct 02, 2009 at 04:19:53PM +0800, Wu Fengguang wrote:
> > > > The big writes, if they are contiguous, could take 1-2 seconds
> > > > on a very slow, ancient laptop disk, and that will hold up any kind of
> > > > small synchronous activities --- such as either a disk read or a firefox-
> > > > triggered fsync().
> > > 
> > > Yes, that's a problem. The SYNC/ASYNC elevator queues can help here.
> 
> The SYNC/ASYNC queues will partially help, up to whatever the
> largest I/O that can be issued as a single chunk is, times the queue
> depth for those disks that support NCQ.
> 
> > > There's still the problem of IO submission time != IO completion time,
> > > due to fluctuations of randomness and more. However that's a general
> > > and unavoidable problem.  Both the wbc.timeout scheme and the
> > > "wbc.nr_to_write based on estimated throughput" scheme are based on
> > > _past_ requests and it's simply impossible to have a 100% accurate
> > > scheme. In principle, wbc.timeout will only be inferior at IO startup
> > > time. In the steady state of 100% full queue, it is actually estimating
> > > the IO throughput implicitly :)
> > 
> > Another difference between wbc.timeout and adaptive wbc.nr_to_write
> > is that when many _read_ requests or fsyncs arrive, these SYNC rw
> > requests will significantly lower the ASYNC writeback throughput, if
> > it's not completely stalled. So with timeout, the inode will be
> > aborted with few pages written; with nr_to_write, the inode will have
> > a good number of pages written, at the cost of taking a long time.
> > 
> > IMHO the nr_to_write behavior seems more efficient. What do you think?
> 
> I agree, adaptively changing nr_to_write seems like the right thing to

I'd like to estimate the writeback throughput in bdi_writeback_wakeup(),
where the queue is not starved and the estimate would reflect the max
device capability (unless there are busy reads, in which case we need
a lower nr_to_write anyway).
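
To make the idea concrete, here is a minimal user-space sketch (not kernel
code) of turning a measured throughput into a chunk size. Everything in it
is an assumption for illustration: the names wb_estimate,
wb_estimate_update() and wb_nr_to_write(), the 1/8 smoothing weight, and
the notion of a target time slice per chunk. None of it comes from a patch
in this thread.

```c
#include <stdint.h>

/* Hypothetical per-device estimator; not a real kernel structure. */
struct wb_estimate {
	uint64_t pages_per_sec;	/* smoothed throughput, 0 = no sample yet */
};

/* Fold one sample (pages written over elapsed_ms) into the estimate. */
static void wb_estimate_update(struct wb_estimate *e,
			       uint64_t pages, uint64_t elapsed_ms)
{
	uint64_t sample;

	if (elapsed_ms == 0)
		return;			/* interval too short to measure */

	sample = pages * 1000 / elapsed_ms;
	if (e->pages_per_sec == 0)
		e->pages_per_sec = sample;	/* first sample seeds the estimate */
	else
		/* moving average, weight 1/8 on the new sample (assumed) */
		e->pages_per_sec = (e->pages_per_sec * 7 + sample) / 8;
}

/* Pages to write in one chunk so that it takes roughly slice_ms. */
static uint64_t wb_nr_to_write(const struct wb_estimate *e,
			       uint64_t slice_ms, uint64_t min_pages)
{
	uint64_t n = e->pages_per_sec * slice_ms / 1000;

	return n < min_pages ? min_pages : n;
}
```

Because the samples are taken from completed writeback, a device kept busy
by reads yields lower samples and therefore a smaller chunk, which is the
"lower nr_to_write anyway" case above.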

> do.  For bonus points, we could also monitor how often synchronous I/O
> operations are happening, and allow nr_to_write to go up by some amount if
> there aren't many synchronous operations happening at the moment.  So
> that might be another opportunity to do auto-tuning, although this
> might be a heuristic that needs to be configurable for certain
> specialized workloads.  For many other workloads, it should be
> possible to detect a regular pattern of reads and/or synchronous writes,
> and if so, use a lower nr_to_write than if there aren't many
> synchronous I/O operations happening on that particular block device.

It's not easy to get an up-to-date picture of SYNC read/write busyness.
However, it is possible to "feel" it through the progress of ASYNC
writes.

- set up a per-file timeout=3*HZ
- check this in write_cache_pages:

        if (half nr_to_write pages written && timeout)
                break;

In this way we back off to nr_to_write/2 if the writeback is blocked
by some busy READs.
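
Spelled out a bit more, the check could look like the following user-space
sketch. The struct wb_pass and the helpers wb_timed_out() and
wb_should_stop() are hypothetical names made up here; only the "half of
nr_to_write written and timeout expired" condition comes from the
pseudocode above. The deadline comparison uses the same wraparound-safe
idiom as the kernel's time_after_eq().

```c
#include <stdbool.h>

/* Hypothetical state for one writeback pass over a file. */
struct wb_pass {
	long nr_to_write;	/* page budget for this pass */
	long written;		/* pages written so far */
	unsigned long deadline;	/* tick at which the per-file timeout expires */
};

/* Wraparound-safe "now is at or past deadline". */
static bool wb_timed_out(unsigned long now, unsigned long deadline)
{
	return (long)(now - deadline) >= 0;
}

/* Decide whether the page-writing loop should break out. */
static bool wb_should_stop(const struct wb_pass *p, unsigned long now)
{
	if (p->written >= p->nr_to_write)
		return true;	/* full budget used up */

	/* back off: at least half the budget written and timeout expired */
	return p->written >= p->nr_to_write / 2 &&
	       wb_timed_out(now, p->deadline);
}
```

With an unloaded disk the full budget is normally written before the
deadline, so the second condition only fires when something else (such as
busy READs) has slowed the ASYNC writes down.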

I'd choose to implement this advanced feature some time later :)

Thanks,
Fengguang
