From: Mike Waychison <mikew@google.com>
To: Andreas Dilger <adilger@clusterfs.com>
Cc: Theodore Tso <tytso@mit.edu>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sreenivasa Busam <sreenivasac@google.com>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: fallocate support for bitmap-based files
Date: Fri, 29 Jun 2007 18:26:30 -0400
Message-ID: <46858716.2050901@google.com>
In-Reply-To: <20070629214615.GB5026@schatzie.adilger.int>

Andreas Dilger wrote:
> On Jun 29, 2007  16:55 -0400, Theodore Tso wrote:
> 
>>What's the eventual goal of this work?  Would it be for mainline use,
>>or just something that would be used internally at Google?  I'm not
>>particularly enthused about supporting two ways of doing fallocate();
>>one for ext4 and one for bitmap-based files in ext2/3/4.  Is the
>>benefit really worth it?
>>
>>What I would suggest, which would make this much easier, is to make this
>>an incompatible extension (which, as you point out, is needed for
>>security reasons anyway) and then steal the high bit from the block
>>number field to indicate whether or not the block has been
>>initialized.  That way you don't end up having to seek to a potentially
>>distant part of the disk to check out the bitmap.  Also, you don't
>>have to worry about how to recover if the "block initialized bitmap"
>>inode gets smashed.  
>>
>>The downside is that it reduces the maximum size of the filesystem
>>supported by ext2 by a factor of two.  But, there are at least two
>>patch series floating about that promise to allow filesystem block
>>sizes > PAGE_SIZE, which would allow you to recover the maximum
>>size supported by the filesystem.
> 
> 
> I don't think ext2 is safe for > 8TB filesystems anyways, so this
> isn't a huge loss.

This is in reference to the idea of overloading the high bit, and not 
related to the >PAGE_SIZE blocks, correct?
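
For concreteness, a minimal sketch of the high-bit idea quoted above, 
assuming 32-bit on-disk block numbers.  The macro and function names are 
hypothetical and are not taken from any ext2/3/4 patch; this only 
illustrates the encoding and the resulting size limit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not from any ext2/3/4 patch. */
#define BLK_UNINIT_FLAG 0x80000000u /* high bit: allocated, not yet initialized */
#define BLK_NUMBER_MASK 0x7fffffffu /* low 31 bits: physical block number */

/* Extract the physical block number from a raw on-disk entry. */
static inline uint32_t blk_number(uint32_t raw)
{
	return raw & BLK_NUMBER_MASK;
}

/* Non-zero if the entry refers to a preallocated, uninitialized block. */
static inline int blk_is_uninit(uint32_t raw)
{
	return (raw & BLK_UNINIT_FLAG) != 0;
}

/* Tag a block number as uninitialized; it must fit in 31 bits. */
static inline uint32_t blk_mark_uninit(uint32_t blk)
{
	assert(blk <= BLK_NUMBER_MASK);
	return blk | BLK_UNINIT_FLAG;
}

/* Clear the flag once the block has been written. */
static inline uint32_t blk_mark_init(uint32_t raw)
{
	return raw & BLK_NUMBER_MASK;
}

/* Largest filesystem addressable with `bits` of block number. */
static inline uint64_t max_fs_bytes(unsigned bits, uint64_t block_size)
{
	return ((uint64_t)1 << bits) * block_size;
}
```

With 4 KiB blocks, max_fs_bytes(32, 4096) is 16 TiB and 
max_fs_bytes(31, 4096) is 8 TiB, which is the factor-of-two loss and the 
8TB figure mentioned above.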

> 
> The other possibility is, assuming Google likes ext2 because they
> don't care about e2fsck, is to patch ext4 to not use any
> journaling (i.e. make all of the ext4_journal*() wrappers be
> no-ops).  That way they would get extents, mballoc and other speedups.
> 

We do care about the e2fsck problem, though the cost/benefit of e2fsck 
times/memory problems vs the overhead of journalling doesn't weigh in 
journalling's favour for a lot of our per-spindle-latency bound 
applications.  These apps manage to get pretty good disk locality 
guarantees and the journal overheads can induce undesired head movement.

ext4 does look very promising, though I'm not certain it's ready for our 
consumption.

What are people's thoughts on providing a non-journal mode for ext3?  We 
could benefit from several of the additions to ext3 that aren't available 
in ext2, and disabling journalling there sounds much more feasible for us 
than trying to backport each ext3 component to ext2.

Mike Waychison

> That said, what is the reason for not using ext3?  Presumably performance
> (which is greatly improved in ext4) or is there something else?
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.
> 
