From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Chris Mason <chris.mason@oracle.com>,
	Theodore Tso <tytso@mit.edu>, Jens Axboe <jens.axboe@oracle.com>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: [PATCH 0/7] Per-bdi writeback flusher threads v20
Date: Tue, 22 Sep 2009 21:33:58 +0800
Message-ID: <20090922133358.GC7675@localhost>
In-Reply-To: <20090922113055.GA13887@duck.suse.cz>

On Tue, Sep 22, 2009 at 07:30:55PM +0800, Jan Kara wrote:
> On Tue 22-09-09 18:13:35, Wu Fengguang wrote:
> > Yes, a more general solution would help. I'd like to propose one which
> > works the other way round. In brief,
> > (1) the VFS gives a large enough per-file writeback quota to btrfs;
> > (2) btrfs tells the VFS "here is a (seek) boundary, stop voluntarily",
> >     before exhausting the quota and being forcibly stopped.
> > 
> > There will be two limits (the second one is new):
> > 
> > - total nr to write in one wb_writeback invocation
> > - _max_ nr to write per file (before switching to sync the next inode)
> > 
> > The per-invocation limit is useful for balance_dirty_pages().
> > The per-file number can be accumulated across successive wb_writeback
> > invocations and thus can be much larger (eg. 128MB) than the legacy
> > per-invocation number. 
>   Actually, it doesn't make much sense to have a per-file limit in number
> of pages. I've been playing with an idea that we could have a per-file
> *time* quota. That would have the advantage that if a file generates random
> IO, we wouldn't block on it for longer than when it generates linear
> IO.

Heh, FYI recently I tried per-file submission time quota:

        http://lkml.org/lkml/2009/9/10/54

Though I didn't take randomness of IO into account, which definitely
deserves some attention.

>   I imagine that in ->writepage we would subtract from the given time quota
> in wbc the time it takes to write the current page. It would need some
> context in wbc so that it is able to tell whether the IO is linear or random
> to properly account for some seek penalty, but generally it seems to be
> doable...

Yeah, maybe page segments that are distant enough could be treated as "seeks".
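
To make the "distant segments count as seeks" idea concrete, here is a
minimal user-space sketch in Python, not kernel code. The class name, the
distance threshold, and both cost constants are assumptions invented for
illustration; none of them appear in the patches or in this thread. Times
are integer microseconds so the arithmetic stays exact.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical constants, chosen only to make the example readable.
SEEK_DISTANCE_PAGES = 256   # a gap larger than this is treated as a seek
SEEK_PENALTY_US = 8000      # assumed cost of one seek
PAGE_WRITE_US = 50          # assumed cost of writing one page


@dataclass
class TimeQuota:
    """Per-file time quota with the context needed to detect seeks."""
    remaining_us: int
    last_index: Optional[int] = None   # page index of the previous write

    def charge(self, page_index: int) -> bool:
        """Charge one page write; return True while quota remains."""
        cost = PAGE_WRITE_US
        if (self.last_index is not None
                and abs(page_index - self.last_index) > SEEK_DISTANCE_PAGES):
            cost += SEEK_PENALTY_US    # distant page: add the seek penalty
        self.remaining_us -= cost
        self.last_index = page_index
        return self.remaining_us > 0


def pages_written(indices: Iterable[int], quota_us: int) -> int:
    """Write pages in the given order until the time quota runs out."""
    quota = TimeQuota(remaining_us=quota_us)
    written = 0
    for index in indices:
        written += 1
        if not quota.charge(index):
            break
    return written
```

With these assumed constants, a 10 ms quota covers 200 sequential pages
(pages_written(range(1000), 10_000)) but only 3 pages that are 1000 pages
apart, which is the behaviour a time quota is meant to give for random IO.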

>   Filesystems implementing ->writepages can then decide whether they
> have enough time quota to seek to the next extent and write it out or
> whether they should rather yield to other inodes...

Yeah, it's possible. The VFS provides one or more quotas and
file systems decide when to yield.
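
A rough sketch of that division of labour, again as a user-space Python
model rather than kernel code. The field names nr_to_write and would_seek
follow the names used in this discussion, but the WritebackControl class,
the writepages() function, and the representation of a file as a list of
extent lengths are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WritebackControl:
    nr_to_write: int          # per-file page quota handed down by the VFS
    would_seek: bool = False  # set by the filesystem when it yields early


def writepages(extents: List[int], wbc: WritebackControl) -> int:
    """Write dirty extents (lengths in pages, in file order).

    The filesystem yields at an extent boundary when the remaining quota
    cannot cover the next extent, instead of being stopped part way
    through it. Returns the number of pages written.
    """
    written = 0
    for length in extents:
        if length > wbc.nr_to_write and written > 0:
            # Next extent does not fit: stop voluntarily at the boundary.
            wbc.would_seek = True
            break
        # First extent is always attempted, even if it exceeds the quota.
        n = min(length, wbc.nr_to_write)
        written += n
        wbc.nr_to_write -= n
        if wbc.nr_to_write == 0:
            break
    return written
```

In this model a caller would check wbc.would_seek after the call and move
on to the next inode; an oversized first extent is still written up to the
quota, so a file never stalls without making progress.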

Thanks,
Fengguang

> > The file system will only see the per-file numbers. The "max" means
> > that if btrfs finds the current page to be the last page in the extent,
> > it could indicate this fact to the VFS by setting wbc->would_seek=1. The
> > VFS will then switch to writing the next inode.
> > 
> > The benefit of yielding early and voluntarily is that it reduces the
> > chance of being forcibly stopped half way through an extent. The next
> > time the VFS returns to sync this inode, it will again be granted the
> > full 128MB quota, which should be enough to cover a big fresh extent.
> 
> 								Honza
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
