From: Jiaying Zhang <jiayingz@google.com>
To: Andreas Dilger <adilger@sun.com>
Cc: Theodore Tso <tytso@mit.edu>, Frank Mayhar <fmayhar@google.com>,
Eric Sandeen <sandeen@redhat.com>,
Curt Wohlgemuth <curtw@google.com>,
ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: Question on fallocate/ftruncate sequence
Date: Wed, 2 Sep 2009 22:32:07 -0700
Message-ID: <5df78e1d0909022232g591ad7bxe6ca9f44a5be4ab6@mail.gmail.com>
In-Reply-To: <5df78e1d0909022220m1152b313o92f6cb7cc8858298@mail.gmail.com>
On Wed, Sep 2, 2009 at 10:20 PM, Jiaying Zhang<jiayingz@google.com> wrote:
> On Wed, Sep 2, 2009 at 1:41 AM, Andreas Dilger<adilger@sun.com> wrote:
>> On Aug 31, 2009 16:33 -0700, Jiaying Zhang wrote:
>>> > EXT4_KEEPSIZE_FL should only be cleared if there were writes to
>>> > the end of the fallocated space. In that regard, I think the name
>>> > of this flag should be changed to something like "EXT4_EOFBLOCKS_FL"
>>> > to indicate that blocks are allocated beyond the end of file (i_size).
>>>
>>> Thanks for catching this! I changed the patch to only clear the flag
>>> when the new_size is larger than i_size and changed the flag name
>>> as you suggested. It would be nice if we only clear the flag when we
>>> write beyond the fallocated space, but this seems hard to detect
>>> because we no longer have the allocated size once that keepsize
>>> fallocate call returns.
>>
>> The problem is that if e2fsck depends on the EXT4_EOFBLOCKS_FL set
>> for fallocate-beyond-EOF then it is worse to clear it than to leave
>> it set. At worst, leaving the flag set results in too many truncates
>> on the file. Clearing the flag when not correct may result in user
>> visible data corruption if the file size is extended...
>>
>>> Here is the new patch:
>>>
>>> --- .pc/fallocate_keepsizse.patch/fs/ext4/extents.c 2009-08-31 12:08:10.000000000 -0700
>>> +++ fs/ext4/extents.c 2009-08-31 15:51:13.000000000 -0700
>>> @@ -3091,11 +3091,19 @@ static void ext4_falloc_update_inode(str
>>> * the file size.
>>> */
>>> if (!(mode & FALLOC_FL_KEEP_SIZE)) {
>>> + if (new_size > i_size_read(inode)) {
>>> i_size_write(inode, new_size);
>>> + inode->i_flags &= ~EXT4_EOFBLOCKS_FL;
>>
>> This again isn't quite correct, since the EOFBLOCKS_FL shouldn't
>> be cleared unless new_size is beyond the allocated size. The
>> allocation code itself might be a better place to clear this,
>> since it knows whether there were new blocks being added beyond
>> the current max allocated block.
>
> We were thinking of clearing this flag when we need to allocate new
> blocks, but I was not sure how to get the current max allocated
> block, mostly because I have only just started working on the ext4
> code. After digging into the ext4 allocation code today, I think we
> can put the check-and-clear in ext4_ext_get_blocks:
>
> @@ -2968,6 +2968,14 @@ int ext4_ext_get_blocks(handle_t *handle
>  	newex.ee_len = cpu_to_le16(ar.len);
>  	if (create == EXT4_CREATE_UNINITIALIZED_EXT) /* Mark uninitialized */
>  		ext4_ext_mark_uninitialized(&newex);
> +
> +	if (unlikely(inode->i_flags & EXT4_EOFBLOCKS_FL)) {
> +		BUG_ON(!eh->eh_entries);
> +		last_ex = EXT_LAST_EXTENT(eh);
> +		if (iblock + max_blocks > le32_to_cpu(last_ex->ee_block)
> +		    + ext4_ext_get_actual_len(last_ex))
> +			inode->i_flags &= ~EXT4_EOFBLOCKS_FL;
> +	}
>  	err = ext4_ext_insert_extent(handle, inode, path, &newex);
>  	if (err) {
>  		/* free data blocks we just allocated */
>
> Again, I have only just started looking at this part of the code, so
> please let me know if I am heading in the right direction.
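The comparison in the hunk above can be tried out in userspace. What follows is only a sketch of the arithmetic, not kernel code: struct last_extent, DEMO_EOFBLOCKS_FL (with an arbitrary bit value) and both function names are hypothetical stand-ins for EXT_LAST_EXTENT(eh), EXT4_EOFBLOCKS_FL and the check inside ext4_ext_get_blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit, chosen only for this demo. */
#define DEMO_EOFBLOCKS_FL 0x1u

/* Hypothetical stand-in for the last extent of a file. */
struct last_extent {
	uint32_t start;	/* first logical block covered (like ee_block) */
	uint16_t len;	/* number of blocks covered (actual length) */
};

/*
 * True when the request [iblock, iblock + max_blocks) ends past the
 * last block covered by the last extent. The sums are done in 64 bits
 * so that they cannot wrap around.
 */
static bool request_extends_past_last_extent(uint32_t iblock,
					      uint32_t max_blocks,
					      const struct last_extent *last)
{
	assert(last != NULL);	/* the hunk uses BUG_ON(!eh->eh_entries) */

	uint64_t request_end = (uint64_t)iblock + max_blocks;
	uint64_t allocated_end = (uint64_t)last->start + last->len;

	return request_end > allocated_end;
}

/* Returns flags with the demo bit cleared only when the request grows the file's allocation. */
static uint32_t maybe_clear_eofblocks(uint32_t flags, uint32_t iblock,
				      uint32_t max_blocks,
				      const struct last_extent *last)
{
	if ((flags & DEMO_EOFBLOCKS_FL) &&
	    request_extends_past_last_extent(iblock, max_blocks, last))
		flags &= ~DEMO_EOFBLOCKS_FL;
	return flags;
}
```

With a last extent covering blocks 100..149, a request for blocks 120..129 leaves the bit set, while a request for blocks 140..159 clears it. A request that ends exactly at block 149 also leaves it set, because the comparison is strict.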
>
> Another thing I am not sure about is whether we can allocate a non-data
> block, such as one for extended attributes, beyond the current max block
> without changing the i_size. In that case, clearing the EOFBLOCKS flag
> would be wrong.
>
>>> #define FS_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */
>>
>> It probably isn't a bad idea to make this flag user-visible, since it
>> would allow scanning for files that have excess space reserved (e.g.
>> if the filesystem is getting full). Making it user-settable (i.e.
>> clearable) would essentially mean truncating the file to i_size without
>> updating the timestamps so that the reserved space is discarded. I
>> don't think there is any value in allowing a user to turn this flag on
>> for a file.
>
> So to make it user-settable, we need to add handling in ext4_ioctl
> that calls vmtruncate when the flag is to be cleared. But how can we get
> the right size to truncate to in that case? Can we just set it to the
> max initialized block scaled by the block size? But that may also
> truncate the blocks reserved without the KEEP_SIZE flag.
Never mind, that is a stupid question. We can just truncate to the
current i_size.
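As a rough userspace analogue of "truncate to the current i_size", the sketch below reads the file size with fstat() and passes it straight back to ftruncate(). The helper name truncate_to_current_size is hypothetical. This is not the ext4_ioctl path discussed above: whether blocks beyond EOF are actually released depends on the kernel and filesystem, and unlike the flag-clearing ioctl Andreas describes, ftruncate() may update the file timestamps.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Truncate the file behind fd to the size it already has.
 * Returns 0 on success and -1 on failure (after printing the error).
 */
static int truncate_to_current_size(int fd)
{
	struct stat st;

	assert(fd >= 0);	/* caller must pass an open descriptor */

	if (fstat(fd, &st) != 0) {
		perror("fstat");
		return -1;
	}

	/* st_size is the userspace view of i_size. */
	if (ftruncate(fd, st.st_size) != 0) {
		perror("ftruncate");
		return -1;
	}
	return 0;
}
```

The visible file size is unchanged by this call; any effect would show up only in the allocated block count (st_blocks), and only on a filesystem that discards preallocated space on such a truncate.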
Jiaying
>
> Jiaying
>
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Sr. Staff Engineer, Lustre Group
>> Sun Microsystems of Canada, Inc.
>>
>>
>