From: Jan Kara <jack@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>, Nick Piggin <npiggin@suse.de>,
Josef Bacik <josef@redhat.com>,
linux-fsdevel@vger.kernel.org, chris.mason@oracle.com,
hch@infradead.org, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC] new ->perform_write fop
Date: Fri, 21 May 2010 20:58:46 +0200 [thread overview]
Message-ID: <20100521185846.GD10149@quack.suse.cz> (raw)
In-Reply-To: <20100520230524.GU8120@dastard>
On Fri 21-05-10 09:05:24, Dave Chinner wrote:
> On Thu, May 20, 2010 at 10:12:32PM +0200, Jan Kara wrote:
> > Hmm, I was thinking about it and I see two options how to get out
> > of problems:
> > a) Filesystems which are not able to handle hole punching will allow
> > multipage writes only after EOF (which can be easily undone by
> > truncate in case of failure). That should actually cover lots of
> > cases we are interested in (I don't expect multipage writes to holes
> > to be a common case).
>
> multipage writes to holes is a relatively common operation in the
> HPC space that XFS is designed for (e.g. calculations on huge sparse
> matrices), so I'm not really fond of this idea....
Well, XFS could still handle them because it is able to do hole punching
but I get your point.
> > b) E.g. ext4 can do even without hole punching. It can allocate extent
> > as 'unwritten' and when something during the write fails, it just
> > leaves the extent allocated and the 'unwritten' flag makes sure that
> > any read will see zeros. I suppose that other filesystems that care
> > about multipage writes are able to do similar things (e.g. btrfs can
> > do the same as far as I remember, I'm not sure about gfs2).
>
> Allocating multipage writes as unwritten extents turns off delayed
> allocation and hence we'd lose all the benefits that this gives...
Ah, sorry. That was a short-circuit in my brain. But when we do delayed
I don't see why we should actually do any hole punching... The write needs
to:
a) reserve enough blocks for the write - I don't know about other
filesystems but for ext4 this means just incrementing a counter.
b) copy data page by page
c) release part of reservation (i.e. decrement counter) if we actually
copied less than we originally thought.
Am I missing something?
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
next prev parent reply other threads:[~2010-05-21 18:58 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-05-12 21:24 [RFC] new ->perform_write fop Josef Bacik
2010-05-13 1:39 ` Josef Bacik
2010-05-13 15:36 ` Christoph Hellwig
2010-05-14 1:00 ` Dave Chinner
2010-05-14 3:30 ` Josef Bacik
2010-05-14 5:50 ` Nick Piggin
2010-05-14 7:20 ` Dave Chinner
2010-05-14 7:33 ` Nick Piggin
2010-05-14 6:41 ` Dave Chinner
2010-05-14 7:22 ` Nick Piggin
2010-05-14 8:38 ` Dave Chinner
2010-05-14 13:33 ` Chris Mason
2010-05-18 6:36 ` Nick Piggin
2010-05-18 8:05 ` Dave Chinner
2010-05-18 10:43 ` Nick Piggin
2010-05-18 10:43 ` Nick Piggin
2010-05-18 12:27 ` Dave Chinner
2010-05-18 12:27 ` Dave Chinner
2010-05-18 15:09 ` Nick Piggin
2010-05-19 23:50 ` Dave Chinner
2010-05-20 6:48 ` Nick Piggin
2010-05-20 20:12 ` Jan Kara
2010-05-20 23:05 ` Dave Chinner
2010-05-21 9:05 ` Steven Whitehouse
2010-05-21 13:50 ` Josef Bacik
2010-05-21 13:50 ` Josef Bacik
2010-05-21 14:23 ` Nick Piggin
2010-05-21 15:19 ` Josef Bacik
2010-05-24 3:29 ` Nick Piggin
2010-05-22 0:31 ` Dave Chinner
2010-05-21 18:58 ` Jan Kara [this message]
2010-05-22 0:27 ` Dave Chinner
2010-05-24 9:20 ` Jan Kara
2010-05-24 9:33 ` Nick Piggin
2010-06-05 15:05 ` tytso
2010-06-06 7:59 ` Nick Piggin
2010-06-06 7:59 ` Nick Piggin
2010-05-21 15:15 ` Christoph Hellwig
2010-05-22 2:31 ` Nick Piggin
2010-05-22 8:37 ` Dave Chinner
2010-05-24 3:09 ` Nick Piggin
2010-05-24 5:53 ` Dave Chinner
2010-05-24 6:55 ` Nick Piggin
2010-05-24 10:21 ` Dave Chinner
2010-06-01 6:27 ` Nick Piggin
2010-05-24 18:40 ` Joel Becker
2010-05-17 23:35 ` Jan Kara
2010-05-18 1:21 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100521185846.GD10149@quack.suse.cz \
--to=jack@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=chris.mason@oracle.com \
--cc=david@fromorbit.com \
--cc=hch@infradead.org \
--cc=josef@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=npiggin@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.