From: Andreas Dilger <adilger@sun.com>
To: Theodore Tso <tytso@mit.edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
linux-ext4 <linux-ext4@vger.kernel.org>
Subject: Re: e2fsprogs and blocks outside i_size
Date: Mon, 21 Jul 2008 17:32:38 -0600
Message-ID: <20080721233238.GF15203@webber.adilger.int>
In-Reply-To: <20080721123400.GA28839@mit.edu>
On Jul 21, 2008 08:34 -0400, Theodore Ts'o wrote:
> Well, as I said originally, we have four choices, only two of which are
> tenable:
>
> 1) Don't change i_size and leave e2fsck confused about whether i_size
> is confused or not; the next time e2fsck runs it can either fix it and
> change i_size, confusing applications that depend on i_size, or not
> fix it and in the case of a corrupted i_size, leave valid data
> inaccessible or do the hack to which Andreas reacted, "Yuck", and
> which Aneesh quoted and I assume agrees. (i.e., checking the data
> blocks to see if they are non-zero, and electing to risk confusing
> the application in the case where they are non-zero). This is the
> current case.
>
> 2) Change i_size and always confuse applications that depend on i_size
> carrying some semantic meaning.
>
> 3) Don't aggressively zero-out (as it presents us with these two
> untenable options) and try to split the extent instead. If the block
> allocation fails, return ENOSPC.
>
> 4) #3, except if the block allocation fails, try to steal a block that
> had been previously preallocated for some other logical block in that
> inode.

5) Add a flag to the inode which means "blocks beyond i_size" (set if
   fallocate() is called with "KEEP_SIZE" and the allocation is actually
   beyond i_size, not just filling a hole) so that e2fsck won't "fix"
   the size, but allows the extent to be uninitialized. The flag is
   cleared (by kernel and/or e2fsck) if the size is extended to the
   last block.

To avoid consuming our precious inode flags, we might consider re-using
EXT3_DIRSYNC_FL or EXT3_TOPDIR_FL for this purpose, since they
definitely only have meaning for directories. I guess the question
is whether we would need this for directories, but I don't think so, as
we could always just add empty directory blocks (at the expense of
having to scan them later).

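A minimal sketch of how option 5 might behave, purely for illustration.
Every name below (struct sketch_inode, SKETCH_EOFBLOCKS_FL, the helper
functions) is hypothetical and not taken from the kernel or e2fsprogs,
and the flag value is a placeholder. It assumes a fixed 4096-byte block
size and tracks only the highest allocated logical block:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE   4096u
/* Hypothetical "blocks beyond i_size" bit. Placeholder value only; which
 * bit to use (e.g. a re-used directory-only flag) is left open here. */
#define SKETCH_EOFBLOCKS_FL 0x00010000u

struct sketch_inode {
    uint64_t i_size;       /* file size in bytes */
    uint64_t alloc_blocks; /* highest allocated logical block + 1 */
    uint32_t i_flags;
    bool     is_dir;       /* the bit keeps its old meaning on directories */
};

static uint64_t size_in_blocks(uint64_t bytes)
{
    return (bytes + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE;
}

/* Clear the flag once i_size reaches the last allocated block. */
static void sketch_update_flag(struct sketch_inode *ino)
{
    if (size_in_blocks(ino->i_size) >= ino->alloc_blocks)
        ino->i_flags &= ~SKETCH_EOFBLOCKS_FL;
}

/* Preallocate [offset, offset + len); keep_size models "KEEP_SIZE". */
static void sketch_fallocate(struct sketch_inode *ino, uint64_t offset,
                             uint64_t len, bool keep_size)
{
    uint64_t end = offset + len;
    uint64_t end_blocks = size_in_blocks(end);

    if (end_blocks > ino->alloc_blocks)
        ino->alloc_blocks = end_blocks;

    if (keep_size) {
        /* Flag only a real allocation past i_size, not hole filling. */
        if (end_blocks > size_in_blocks(ino->i_size))
            ino->i_flags |= SKETCH_EOFBLOCKS_FL;
    } else if (end > ino->i_size) {
        ino->i_size = end;
    }
    sketch_update_flag(ino);
}

/* Grow i_size (e.g. after a write) and drop the flag if it caught up. */
static void sketch_extend_size(struct sketch_inode *ino, uint64_t new_size)
{
    if (new_size > ino->i_size)
        ino->i_size = new_size;
    sketch_update_flag(ino);
}

/* The bit means "blocks beyond i_size" only on non-directories. */
static bool sketch_has_eofblocks(const struct sketch_inode *ino)
{
    return !ino->is_dir && (ino->i_flags & SKETCH_EOFBLOCKS_FL) != 0;
}

/* What a checker could do: blocks past i_size are suspect only when
 * the flag does not explain them. */
static bool sketch_fsck_size_suspect(const struct sketch_inode *ino)
{
    bool beyond = ino->alloc_blocks > size_in_blocks(ino->i_size);
    return beyond && !sketch_has_eofblocks(ino);
}
```

With this model, a KEEP_SIZE preallocation past i_size leaves i_size
alone and sets the flag, so the checker has no reason to touch the size;
once the size grows to cover the last block the flag goes away again.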
> The one other thing I would note is that at least for non-root users,
> the reserved blocks will help save us most of the time, except for
> when users explicitly set the reserved blocks down to zero.
Would the index block be allocated from the reserved space though?
This is also a good idea, but I'm not sure if that is what happens.
I guess the "allocate index block" code path needs to check for
"(uid == s_reserved_uid || is_metadata)"?
Cheers, Andreas
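
For the reserved-space question above, the check could look roughly like
the sketch below. It models only the "(uid == s_reserved_uid ||
is_metadata)" condition from this mail; struct sketch_sb and
sketch_has_free_blocks() are hypothetical names, not the actual
allocator code, and any other rules the real allocator applies are
deliberately left out:

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_sb {
    uint64_t free_blocks;     /* total free blocks, reserve included */
    uint64_t reserved_blocks; /* blocks held back from ordinary users */
    uint32_t s_reserved_uid;  /* uid allowed to dip into the reserve */
};

/* Return true if nblocks may be allocated for this caller. Metadata
 * (such as an extent index block) is allowed to use the reserve, so that
 * splitting an extent does not fail for an ordinary user on a full fs. */
static bool sketch_has_free_blocks(const struct sketch_sb *sb, uint32_t uid,
                                   bool is_metadata, uint64_t nblocks)
{
    uint64_t avail = sb->free_blocks;
    bool may_use_reserve = (uid == sb->s_reserved_uid) || is_metadata;

    if (!may_use_reserve) {
        if (avail <= sb->reserved_blocks)
            return false;
        avail -= sb->reserved_blocks;
    }
    return nblocks <= avail;
}
```

If the reserved blocks are set down to zero, the two cases collapse into
the same check, which matches the caveat in the quoted text.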
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.