From: Chris Mason <chris.mason@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <jens.axboe@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
npiggin@suse.de
Subject: Re: [PATCH 2/7] writeback: switch to per-bdi threads for flushing data
Date: Tue, 17 Mar 2009 09:21:14 -0400
Message-ID: <1237296074.31273.19.camel@think.oraclecorp.com>
In-Reply-To: <20090316233835.GM26138@disturbed>
On Tue, 2009-03-17 at 10:38 +1100, Dave Chinner wrote:
> On Mon, Mar 16, 2009 at 08:33:21AM +0100, Jens Axboe wrote:
> > On Mon, Mar 16 2009, Dave Chinner wrote:
> > > On Fri, Mar 13, 2009 at 11:54:46AM +0100, Jens Axboe wrote:
> > > > On Thu, Mar 12 2009, Andrew Morton wrote:
> > > > > On Thu, 12 Mar 2009 15:33:43 +0100 Jens Axboe <jens.axboe@oracle.com> wrote:
> > > > > Bear in mind that the XFS guys found that one thread per fs had
> > > > > insufficient CPU power to keep up with fast devices.
> > > >
> > > > Yes, I definitely want to experiment with > 1 thread per device in the
> > > > near future.
> > >
> > > The question here is how to do this efficiently. Even if XFS is
> > > operating on a single device, it is not optimal just to throw
> > > multiple threads at the bdi. Ideally we want a thread per region
> > > (allocation group) of the filesystem as each allocation group has
> > > > its own inode cache (radix tree) to traverse. These traversals can
> > > be done completely in parallel and won't contend either at the
> > > traversal level or in the IO hardware....
> > >
> > > > i.e. what I'd like to see is the ability for any new flushing
> > > > mechanism to offload responsibility for tracking,
> > > > traversing and flushing dirty inodes to the filesystem.
> > > Filesystems that don't do such things could use a generic
> > > bdi-based implementation.
> > >
> > > FWIW, we also want to avoid the current pattern of flushing
> > > data, then the inode, then data, then the inode, ....
> > > By offloading into the filesystem, this writeback ordering can
> > > be done as efficiently as possible for each given filesystem.
> > > > XFS already has all the hooks to be able to do this
> > > effectively....
> > >
> > > I know that Christoph was doing some work towards this end;
> > > perhaps he can throw his 2c worth in here...
> >
> > This is very useful feedback, thanks Dave. So on the filesystem vs bdi
> > side, XFS could register a bdi per allocation group.
>
> How do multiple bdis on a single block device interact?
The main difference is that dirty page tracking for balance_dirty_pages
and friends is done per-bdi. So, you'll end up with uneven memory
pressure on AGs that don't have much dirty data, but hopefully that's a
good thing.
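
A rough toy model of what per-bdi accounting means here. This is not
kernel code: the names Bdi, writeout_share, bdi_limit and must_throttle
are made up for illustration, and the "limit = global limit * share of
recent writeout" policy is a simplifying assumption, not a description
of what balance_dirty_pages actually does.

```python
from dataclasses import dataclass


@dataclass
class Bdi:
    """Hypothetical stand-in for a backing device with its own dirty counter."""
    name: str
    dirty_pages: int = 0
    # Assumed policy input: fraction of recent writeout done by this bdi.
    writeout_share: float = 0.0


def bdi_limit(bdi: Bdi, global_limit: int) -> int:
    """Slice of the global dirty limit this bdi is allowed to hold (assumed policy)."""
    return int(global_limit * bdi.writeout_share)


def must_throttle(bdi: Bdi, global_limit: int) -> bool:
    """A writer is throttled only against the bdi it is dirtying."""
    return bdi.dirty_pages > bdi_limit(bdi, global_limit)


if __name__ == "__main__":
    GLOBAL_LIMIT = 1000  # pages, arbitrary

    # One bdi for the whole device: all dirty pages share one counter,
    # so a writer touching the quiet region is throttled too.
    whole_device = Bdi("sda", dirty_pages=900 + 150, writeout_share=1.0)
    print(whole_device.name, must_throttle(whole_device, GLOBAL_LIMIT))

    # One bdi per allocation group: the busy AG is throttled,
    # the AG with little dirty data is not.
    ag0 = Bdi("ag0", dirty_pages=900, writeout_share=0.5)
    ag1 = Bdi("ag1", dirty_pages=150, writeout_share=0.5)
    for ag in (ag0, ag1):
        print(ag.name, must_throttle(ag, GLOBAL_LIMIT))
```

With these made-up numbers the single-bdi case throttles every writer,
while the per-AG case only throttles writers to ag0.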
-chris