From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Subject: Re: The meaning of data=ordered as it relates to delayed allocation
Date: Mon, 19 Jan 2009 10:13:45 +0530
Message-ID: <20090119044345.GB9482@skywalker>
In-Reply-To: <E1LOiN8-0001Cj-91@closure.thunk.org>
On Sun, Jan 18, 2009 at 07:52:10PM -0500, Theodore Ts'o wrote:
>
> An Ubuntu user recently complained about a large number of recently
> updated files which were zero-length after a crash. I started looking
> more closely at that, and it's because we have an interesting
> interpretation of data=ordered. It applies for blocks which are already
> allocated, but not for blocks which haven't been allocated yet. This
> can be surprising for users; and indeed, for many workloads where you
> aren't using berk_db or some other database, all of the files written will
> be newly created files (or files which are getting rewritten after
> opening with O_TRUNC), so there won't be any difference between
> data=writeback and data=ordered.
The meaning of data=ordered is to ensure that we don't update the inode's
i_size without first writing the data blocks within i_size. So even with
delayed allocation, if we have an i_size update (this happens when we
allocate blocks), we write the data blocks first.

With that interpretation, having a zero-block file after a crash is fine.
But we should not find the files corrupted (i.e., files with wrong contents).
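
As a rough illustration of that interpretation, here is a toy model in
Python. It is not ext4 code: the class ToyFile and its methods are made up
for this sketch, and it assumes a file is just a list of blocks plus a
committed size. It only shows the ordering rule, namely that data blocks
reach disk before the i_size update covering them is committed.

```python
class ToyFile:
    """Toy model of one file under data=ordered with delayed allocation.

    Hypothetical illustration only; none of these names exist in the kernel.
    """

    def __init__(self):
        self.pending = []      # delalloc data that lives only in the page cache
        self.disk_blocks = []  # data blocks that have reached disk
        self.disk_size = 0     # i_size as last committed, counted in blocks

    def write(self, block):
        # Delayed allocation: no block is allocated, committed i_size is untouched.
        self.pending.append(block)

    def allocate_and_commit(self, crash_after_data=False):
        # Ordered mode: the data blocks are written first ...
        self.disk_blocks.extend(self.pending)
        self.pending = []
        if crash_after_data:
            return  # simulated crash before the metadata commit
        # ... and only then is the new i_size committed.
        self.disk_size = len(self.disk_blocks)

    def crash_view(self):
        # After a crash, only blocks within the committed i_size are visible.
        return self.disk_blocks[:self.disk_size]
```

In this model a crash before allocation, or between the data write and the
commit, leaves an empty file; no sequence of calls yields a committed size
that covers blocks whose data was never written.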
>
> So I wonder if we should either:
>
> (a) make data=ordered force block allocation and writeback --- which
> should just be a matter of disabling the
> redirty_page_for_writepage() code path in ext4_da_writepage()
We can't do that because we cannot do block allocation there. So we need
to redirty any page that has unmapped buffer_heads.
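
The decision described here can be sketched as a toy function. This is a
Python stand-in, not the real ext4_da_writepage(); the function name
da_writepage_decision and the representation of a page as a list of
booleans (one per buffer_head, True meaning mapped to a disk block) are
assumptions made only for illustration.

```python
def da_writepage_decision(page_buffers):
    """Return "redirty" or "submit" for one page (toy stand-in, not ext4 code).

    page_buffers holds one bool per buffer_head: True if the buffer is
    already mapped to a disk block, False if it is still delayed/unmapped.
    """
    if not all(page_buffers):
        # At least one buffer has no block yet. Allocation cannot happen in
        # this path, so the page stays dirty for a later writeback pass.
        return "redirty"
    # Every buffer is mapped, so the page can be written out as is.
    return "submit"
```

For example, da_writepage_decision([True, False]) keeps the page dirty,
while a fully mapped page is submitted.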
>
> (b) add a new mount option, call it data=delalloc-ordered which is (a)
>
> (c) change the default mount option to be data=writeback
This won't guarantee that i_size/metadata get updated ONLY after data blocks
are written.
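
To show the difference, here is a small toy comparison in Python. It is not
kernel code: the function crash_contents and the "STALE" placeholder are
invented for this sketch. It assumes a new file whose commit has two steps
(data write and metadata update), and that the crash happens after the
first step. For writeback it picks the unlucky order, which that mode
permits but does not require.

```python
def crash_contents(mode, new_data, stale="STALE"):
    """Contents visible after a crash between the two commit steps (toy model)."""
    if mode == "ordered":
        # Data blocks are written before the metadata that covers them.
        steps = ["data", "metadata"]
    elif mode == "writeback":
        # No ordering is enforced; this models metadata reaching disk first.
        steps = ["metadata", "data"]
    else:
        raise ValueError("unknown mode: " + mode)

    done = steps[:1]  # the crash happens after the first step only
    size = len(new_data) if "metadata" in done else 0
    blocks = list(new_data) if "data" in done else [stale] * len(new_data)
    # Only blocks within the committed size are visible after the crash.
    return blocks[:size]
```

With ordered the file comes back empty; with writeback, in this unlucky
order, the size covers blocks that still hold whatever was on disk before.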
>
> (d) Do (b) and make it the default
>
> (e) Keep things the way they are
>
> Thoughts, comments? My personal favorite is (b). This allows users
> who want something that works functionally much more like ext3 to get
> that, while giving us the current speed advantages of a more aggressive
> delayed allocation.
>
> - Ted