linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steven Whitehouse <steve@chygwyn.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>, Nick Piggin <npiggin@suse.de>,
	Josef Bacik <josef@redhat.com>,
	linux-fsdevel@vger.kernel.org, chris.mason@oracle.com,
	hch@infradead.org, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC] new ->perform_write fop
Date: Fri, 21 May 2010 10:05:10 +0100	[thread overview]
Message-ID: <1274432710.2578.11.camel@localhost> (raw)
In-Reply-To: <20100520230524.GU8120@dastard>

Hi,

On Fri, 2010-05-21 at 09:05 +1000, Dave Chinner wrote:
> On Thu, May 20, 2010 at 10:12:32PM +0200, Jan Kara wrote:
> > On Thu 20-05-10 09:50:54, Dave Chinner wrote:
> > > On Wed, May 19, 2010 at 01:09:12AM +1000, Nick Piggin wrote:
> > > > On Tue, May 18, 2010 at 10:27:14PM +1000, Dave Chinner wrote:
> > > > 
> > > > Is it really going to be a problem to implement block hole punching
> > > > in ext4 and gfs2?
> > > 
[snip]
> >   b) E.g. ext4 can do even without hole punching. It can allocate extent
> >      as 'unwritten' and when something during the write fails, it just
> >      leaves the extent allocated and the 'unwritten' flag makes sure that
> >      any read will see zeros. I suppose that other filesystems that care
> >      about multipage writes are able to do similar things (e.g. btrfs can
> >      do the same as far as I remember, I'm not sure about gfs2).
> 
> Allocating multipage writes as unwritten extents turns off delayed
> allocation and hence we'd lose all the benefits that this gives...

It should be possible to implement hole punching in GFS2 I think. The
main issue is locking order of resource groups. We have on our todo list
a rewrite of the truncate/delete code which is currently used to
deallocate data blocks and metadata tree blocks. The current algorithm
is a rather inefficient recursive scanning of the tree which is done
multiple times depending on the tree height.

Adapting that to punch holes should be possible without too much effort
if we need to do that. We do need to allow for the possibility that such
a deallocation might have to be split into multiple transactions
depending on the amount of metadata involved (for large files, this
could be larger than the size of the log for example). Currently the
code will split up truncates into multiple transactions which allows the
deallocation to be restartable from any transaction boundary.

GFS2 does not have any way to mark unwritten extents, so we cannot do
delayed allocation or implement an efficient fallocate. We can do better
performance-wise than just dd'ing zeros to a file for fallocate, but
we'll never be able to match a fs that can mark extents unwritten in
performance terms,

Steve.

  reply	other threads:[~2010-05-21  9:05 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-12 21:24 [RFC] new ->perform_write fop Josef Bacik
2010-05-13  1:39 ` Josef Bacik
2010-05-13 15:36   ` Christoph Hellwig
2010-05-14  1:00   ` Dave Chinner
2010-05-14  3:30     ` Josef Bacik
2010-05-14  5:50       ` Nick Piggin
2010-05-14  7:20         ` Dave Chinner
2010-05-14  7:33           ` Nick Piggin
2010-05-14  6:41       ` Dave Chinner
2010-05-14  7:22         ` Nick Piggin
2010-05-14  8:38           ` Dave Chinner
2010-05-14 13:33             ` Chris Mason
2010-05-18  6:36             ` Nick Piggin
2010-05-18  8:05               ` Dave Chinner
2010-05-18 10:43                 ` Nick Piggin
2010-05-18 12:27                   ` Dave Chinner
2010-05-18 15:09                     ` Nick Piggin
2010-05-19 23:50                       ` Dave Chinner
2010-05-20  6:48                         ` Nick Piggin
2010-05-20 20:12                         ` Jan Kara
2010-05-20 23:05                           ` Dave Chinner
2010-05-21  9:05                             ` Steven Whitehouse [this message]
2010-05-21 13:50                             ` Josef Bacik
2010-05-21 14:23                               ` Nick Piggin
2010-05-21 15:19                                 ` Josef Bacik
2010-05-24  3:29                                   ` Nick Piggin
2010-05-22  0:31                               ` Dave Chinner
2010-05-21 18:58                             ` Jan Kara
2010-05-22  0:27                               ` Dave Chinner
2010-05-24  9:20                                 ` Jan Kara
2010-05-24  9:33                                   ` Nick Piggin
2010-06-05 15:05                                   ` tytso
2010-06-06  7:59                                     ` Nick Piggin
2010-05-21 15:15           ` Christoph Hellwig
2010-05-22  2:31             ` Nick Piggin
2010-05-22  8:37               ` Dave Chinner
2010-05-24  3:09                 ` Nick Piggin
2010-05-24  5:53                   ` Dave Chinner
2010-05-24  6:55                     ` Nick Piggin
2010-05-24 10:21                       ` Dave Chinner
2010-06-01  6:27                         ` Nick Piggin
2010-05-24 18:40                       ` Joel Becker
2010-05-17 23:35         ` Jan Kara
2010-05-18  1:21           ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1274432710.2578.11.camel@localhost \
    --to=steve@chygwyn.com \
    --cc=akpm@linux-foundation.org \
    --cc=chris.mason@oracle.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=josef@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=npiggin@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).