linux-ext4.vger.kernel.org archive mirror
From: David Chinner <dgc@sgi.com>
To: Alex Tomas <alex@clusterfs.com>
Cc: Jeff Garzik <jeff@garzik.org>,
	ext4 development <linux-ext4@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [RFC] basic delayed allocation in VFS
Date: Fri, 27 Jul 2007 15:07:14 +1000
Message-ID: <20070727050714.GS12413810@sgi.com>
In-Reply-To: <46A8A294.2070106@clusterfs.com>

[please don't top post!]

On Thu, Jul 26, 2007 at 05:33:08PM +0400, Alex Tomas wrote:
> Jeff Garzik wrote:
> >The XFS one is proven and the work was already completed.
> >
> >What were the specific technical issues that made it unsuitable for ext4?
> >
> >I would rather not reinvent the wheel, particularly if the reinvention 
> >is less capable than the existing work.
>
> It duplicates fs/mpage.c in bio building and introduces new generic API
> (iomap, map_blocks_t, etc).

Using a new API for new functionality is a bad thing?

> In contrast, my trivial implementation re-use
> existing code in fs/mpage.c, doesn't introduce new API and I tend to think
> provides quite the same functionality. I can be wrong, of course ...

No, it doesn't provide the same functionality.

Firstly, XFS attaches a different I/O completion to delalloc writes
to allow us to update the file size when the write is beyond the
current on-disk EOF. This code cannot do that as all it does is
allocate blocks and present "normal looking" buffers to the generic
code path.
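
To illustrate the point, here is a minimal userspace sketch of a
per-I/O completion callback that extends the on-disk size only once
the data has been written. This is not XFS code: toy_inode, toy_io
and the toy_end_io_* handlers are hypothetical names made up for the
example, and the clamping to the in-memory size is an assumption
about what such a handler would reasonably do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy inode: the in-memory size can run ahead of the size recorded on disk. */
struct toy_inode {
	uint64_t isize;     /* in-memory file size */
	uint64_t disk_size; /* file size recorded in the on-disk inode */
};

struct toy_io;
typedef void (*toy_end_io_t)(struct toy_io *io);

/* Toy I/O descriptor carrying its own completion callback (hypothetical). */
struct toy_io {
	struct toy_inode *inode;
	uint64_t offset;
	uint64_t len;
	toy_end_io_t end_io;
};

/* Generic completion: knows nothing about the on-disk size. */
void toy_end_io_generic(struct toy_io *io)
{
	(void)io;
}

/* Delalloc completion: extend the on-disk size once the data is stable. */
void toy_end_io_delalloc(struct toy_io *io)
{
	uint64_t end = io->offset + io->len;

	if (end > io->inode->isize)
		end = io->inode->isize;	/* never move past the in-memory EOF */
	if (end > io->inode->disk_size)
		io->inode->disk_size = end;
}

/* Called when the (simulated) write has reached the disk. */
void toy_complete(struct toy_io *io)
{
	assert(io != NULL && io->end_io != NULL);
	io->end_io(io);
}
```

With the generic handler the on-disk size never moves, which is the
limitation described above; only an I/O that carries the delalloc
handler can extend it.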

Secondly, apart from delalloc, XFS cannot use the generic code paths
for writeback because unwritten extent conversion also requires
custom I/O completion handlers. Given that __mpage_writepage() only
calls ->writepage when it is confused, XFS simply cannot use this
API.
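
A rough userspace sketch of why unwritten extents need their own
completion step: the extent must stay "unwritten" (reads return
zeroes) until the data I/O has finished, and only then be converted.
The toy_extent type and the function names are hypothetical and
model a single block only; this is not the XFS implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_state { TOY_UNWRITTEN, TOY_WRITTEN };

/* Toy extent covering one block of backing storage (hypothetical). */
struct toy_extent {
	enum toy_state state;
	uint8_t block;	/* current contents of the block on disk */
};

/* Reads through an unwritten extent must see zeroes, not the block contents. */
uint8_t toy_read(const struct toy_extent *ext)
{
	return ext->state == TOY_UNWRITTEN ? 0 : ext->block;
}

/* Data I/O: puts bytes on disk but does not touch the extent state. */
void toy_submit_write(struct toy_extent *ext, uint8_t data)
{
	ext->block = data;
}

/* Custom completion: convert the extent only after the data I/O is done. */
void toy_end_io_unwritten(struct toy_extent *ext)
{
	assert(ext != NULL);
	ext->state = TOY_WRITTEN;
}
```

If nothing runs the conversion at completion time, the extent stays
unwritten and the freshly written data is never visible to readers,
so a writeback path with no completion hook cannot drive this.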

Also, looking at the way mpage_da_map_blocks() is done - if we have
a 128MB delalloc extent, ext4 will allocate all of it in one go,
right? What happens if we then crash after only writing a few
megabytes of that extent? Stale data exposure? XFS can allocate
multiple gigabytes in a single get_blocks call, so even if ext4 can't
do this, it's a problem for XFS.....
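
The concern can be put in numbers with a small userspace simulation.
It assumes, purely for illustration, that the allocation of the whole
extent becomes durable before the data does and that the blocks are
not marked unwritten; whether ext4 behaves that way is exactly the
question being asked. The 4KiB block size and all toy_* names are
assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_STALE 0xAA	/* whatever the blocks held before allocation */
#define TOY_FRESH 0x42	/* data written by the current file */

/* Fill the simulated disk (one byte per block) with old contents. */
void toy_disk_init(uint8_t *disk, size_t nblocks)
{
	memset(disk, TOY_STALE, nblocks);
}

/*
 * Allocate extent_blocks in one go, write only the first written_blocks,
 * then "crash". Returns how many blocks mapped into the file still hold
 * stale contents after the crash.
 */
size_t toy_stale_blocks_after_crash(uint8_t *disk, size_t extent_blocks,
				    size_t written_blocks)
{
	size_t i, stale = 0;

	assert(written_blocks <= extent_blocks);

	/* Data I/O that completed before the crash. */
	for (i = 0; i < written_blocks; i++)
		disk[i] = TOY_FRESH;

	/* The whole extent is mapped, so every block in it is readable. */
	for (i = 0; i < extent_blocks; i++)
		if (disk[i] == TOY_STALE)
			stale++;

	return stale;
}
```

For a 128MB extent of 4KiB blocks (32768 blocks) with 4MB (1024
blocks) written before the crash, the function reports 31744 blocks
of old contents readable through the file.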

So without the ability to attach specific I/O completions to bios
or support for unwritten extents directly in __mpage_writepage,
there is no way XFS can use this "generic" delayed allocation code.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


