linux-ext4.vger.kernel.org archive mirror
From: tytso@mit.edu
To: Kailas Joshi <kailas.joshi@gmail.com>
Cc: linux-ext4@vger.kernel.org, Jan Kara <jack@suse.cz>,
	Jiaying Zhang <jiayingz@google.com>
Subject: Re: Help on Implementation of EXT3 type Ordered Mode in EXT4
Date: Thu, 11 Feb 2010 14:56:24 -0500
Message-ID: <20100211195624.GM739@thunk.org>
In-Reply-To: <38f6fb7d1002102332v3482ef49xb2afd5931c5eb2ad@mail.gmail.com>

On Thu, Feb 11, 2010 at 01:02:15PM +0530, Kailas Joshi wrote:
> 
> We are assessing the use of copy-on-write technique to provide data
> level consistency in EXT3/EXT4. We have implemented this in EXT3 by
> using the Ordered mode of operation. Benchmark results for IOZone and
> Postmark are quite good. We could get the consistency equivalent to
> Journal mode with the overhead almost same as Ordered mode. However,
> there are a few cases (for example, file rewrite) where performance of
> Journal mode is better than our technique. We think that in EXT4, with
> the support for delayed block allocation and extents, these problems
> can be removed.

Ah, sorry, I misread your initial post; I thought you were trying to
reimplement the proposed ext4 mode data=guarded.

I've mostly given up on trying to get alloc_on_commit to work, for two
reasons.

The first is that one of the reasons why you might be closing the
transaction is that there's not enough space left in the journal.  But
if you are going to do a large number of data allocations at commit
time, there's no guarantee that there will be space in the journal for
all of the metadata blocks that might have to be modified in order to
make the block allocations.

The second problem with this scheme is a performance problem; while
you are handling delayed allocation blocks, you have to do this while
the journal is still locked, using magic handles that are allowed to
be created while the journal is locked.  That adds all sorts of
complexity, and that seems to be what you are thinking about doing.
The problem though is that while this is going on, all other file
system activity has to be blocked.  So this will cause all sorts of
processes to become suspended waiting for all of the allocation
activity to complete, which may require bitmap allocation blocks to be
read in from disk, etc.

The trade-off for all of these problems is that it allows you to delay
the block allocation for only 5 seconds.  The question is, is this
worth it, compared with simply mounting the file system with
nodelalloc?  It may be that all of this complexity doesn't produce
enough of a performance gain over simply using nodelalloc.
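
To make the nodelalloc alternative concrete, here is a minimal sketch
that reports which ext4 file systems are mounted with delayed
allocation turned off.  It assumes the usual /proc/mounts field layout
(device, mount point, type, options) and that the option appears
literally as "nodelalloc" in the options field; the function name is
made up for illustration.

```python
def ext4_mounts_without_delalloc(mounts_text):
    """Return the mount points of ext4 file systems whose mount options
    include nodelalloc, given text in /proc/mounts format."""
    mount_points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # Options are comma-separated; match the whole option name only.
        if fs_type == "ext4" and "nodelalloc" in options.split(","):
            mount_points.append(mount_point)
    return mount_points
```

On a running system the input would be the contents of /proc/mounts.
A file system would typically get the option either from the command
line (mount -o nodelalloc) or from the options column of /etc/fstab.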

So maybe the solution for certain distributions that are catering to
the "inexperienced user" / "users who like to use unstable video
drivers" market is to mount with nodelalloc by default, and tell them
that if they want the performance improvements of delayed allocation,
they need to lobby to get the applications fixed.  

(After all, these problems are going to be around no matter whether
people use XFS or btrfs; most modern file systems are going to use
delayed allocation, so sooner or later the broken applications really
need to get fixed.  The defiant user's cry, "well, if you don't fix
this I'll switch to xfs/btrfs!" isn't going to help in this case....)
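
This message does not spell out what "fixed" means for an application.
The pattern usually recommended for programs that replace a file in
place is to write a temporary file, fsync it, and only then rename it
over the original.  A minimal sketch of that pattern follows, assuming
a POSIX system; the function name and the ".tmp" suffix are invented
for illustration.

```python
import os


def replace_file_durably(path, data):
    """Replace the file at `path` with `data` (bytes) so that a crash
    leaves either the complete old contents or the complete new ones."""
    tmp_path = path + ".tmp"  # hypothetical naming convention
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's buffer to the kernel
        os.fsync(f.fileno())  # force the data out before the rename
    os.replace(tmp_path, path)  # atomic rename over the old file

    # Sync the containing directory so the rename itself is on disk.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The fsync before the rename is the step that matters with delayed
allocation: without it, the rename can reach the journal before the
new data blocks have been allocated and written, so a crash can leave
a zero-length file under the final name.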

     	  	    		      - Ted


