linux-ext4.vger.kernel.org archive mirror
From: Jiaying Zhang <jiayingz@google.com>
To: Andreas Dilger <adilger@sun.com>
Cc: Theodore Tso <tytso@mit.edu>, Frank Mayhar <fmayhar@google.com>,
	Eric Sandeen <sandeen@redhat.com>,
	Curt Wohlgemuth <curtw@google.com>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: Question on fallocate/ftruncate sequence
Date: Mon, 31 Aug 2009 16:33:32 -0700
Message-ID: <5df78e1d0908311633k1f16a096t701e0cdab54b174c@mail.gmail.com>
In-Reply-To: <20090831215612.GG4197@webber.adilger.int>

On Mon, Aug 31, 2009 at 2:56 PM, Andreas Dilger<adilger@sun.com> wrote:
> On Aug 31, 2009  12:40 -0700, Jiaying Zhang wrote:
>> > It's better to define the flag as EXT4_KEEPSIZE_FL, and to use it as
>> > EXT4_KEEPSIZE_FL, but make a note of that bitfield position as being
>> > reserved in include/linux/fs.h.
>>
>> Here is the modified patch based on your suggestions. I stick with the
>> KEEPSIZE_FL approach that I think can allow us to handle the special
>> truncation accordingly during fsck. Other file systems can also re-use
>> this flag when they want to support fallocate with KEEP_SIZE. As you
>> suggested, I moved the EXT4_KEEPSIZE_FL checking to ext4_setattr
>> that now calls vmtruncate if the KEEPSIZE flag is set in the i_flag.
>> Please let me know what you think about this proposed patch.
>>
>> --- .pc/fallocate_keepsizse.patch/fs/ext4/extents.c   2009-08-31 12:08:10.000000000 -0700
>> +++ fs/ext4/extents.c 2009-08-31 12:12:16.000000000 -0700
>> @@ -3095,7 +3095,13 @@ static void ext4_falloc_update_inode(str
>>                       i_size_write(inode, new_size);
>>               if (new_size > EXT4_I(inode)->i_disksize)
>>                       ext4_update_i_disksize(inode, new_size);
>> +             inode->i_flags &= ~EXT4_KEEPSIZE_FL;
>
> Note that fallocate can be called multiple times for a file.  The
> EXT4_KEEPSIZE_FL should only be cleared if there were writes to
> the end of the fallocated space.  In that regard, I think the name
> of this flag should be changed to something like "EXT4_EOFBLOCKS_FL"
> to indicate that blocks are allocated beyond the end of file (i_size).

Thanks for catching this! I changed the patch to clear the flag only
when new_size is larger than i_size, and renamed the flag as you
suggested. It would be nicer to clear the flag only when we write
beyond the fallocated space, but that seems hard to detect because
we no longer have the allocated size once the KEEP_SIZE fallocate
call returns.
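
To make the intended flag transitions easier to follow, here is a small
userspace sketch of the logic, not kernel code. The struct and the model_*
and MODEL_* names are hypothetical stand-ins introduced only for
illustration; the flag value is the one proposed in this patch, and the
second helper mirrors the size test the patch adds to ext4_setattr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_KEEP_SIZE    0x01u        /* stands in for FALLOC_FL_KEEP_SIZE */
#define MODEL_EOFBLOCKS_FL 0x00200000u  /* bit value proposed in this patch */

/* Hypothetical stand-in for the few inode fields the patch touches. */
struct model_inode {
	uint64_t i_size;   /* logical file size */
	uint32_t i_flags;  /* flag word holding the EOFBLOCKS bit */
};

/* Mirrors the branch structure of the patched ext4_falloc_update_inode(). */
static void model_falloc_update(struct model_inode *inode, unsigned int mode,
				uint64_t new_size)
{
	assert(inode != NULL);
	if (!(mode & MODEL_KEEP_SIZE)) {
		/* Size-extending fallocate: grow i_size and drop the flag. */
		if (new_size > inode->i_size) {
			inode->i_size = new_size;
			inode->i_flags &= ~MODEL_EOFBLOCKS_FL;
		}
	} else if (new_size > inode->i_size) {
		/* KEEP_SIZE fallocate past EOF: remember the extra blocks. */
		inode->i_flags |= MODEL_EOFBLOCKS_FL;
	}
}

/* Mirrors the condition under which the patched ext4_setattr() truncates. */
static bool model_should_truncate(const struct model_inode *inode,
				  uint64_t new_size)
{
	assert(inode != NULL);
	return new_size < inode->i_size ||
	       (inode->i_flags & MODEL_EOFBLOCKS_FL) != 0;
}
```

In this model, a KEEP_SIZE call to 8192 on a 4096-byte file sets the flag,
so a later truncate to 4096 still proceeds. The model also shows the
limitation mentioned above: any size-extending call clears the flag, even
if new_size is still below the end of an earlier preallocation.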

>
>>       } else {
>> +             /*
>> +              * Mark that we allocate beyond EOF so the subsequent truncate
>> +              * can proceed even if the new size is the same as i_size.
>> +              */
>> +             inode->i_flags |= EXT4_KEEPSIZE_FL;
>
> Similarly, this should only be done in case the fallocate is actually
> beyond i_size.  While that is the most common case, it isn't necessarily
> ALWAYS going to be true (e.g. if multiple threads are calling fallocate()
> on a single file, or if a program always calls fallocate() on a file
> without first checking if the file size is large enough).

Also fixed.

>
>> +++ include/linux/fs.h        2009-08-31 12:12:16.000000000 -0700
>>  #define FS_DIRECTIO_FL                       0x00100000 /* Use direct i/o */
>
>
>> +++ fs/ext4/ext4.h    2009-08-31 12:12:16.000000000 -0700
>>  #define EXT4_EXT_MIGRATE             0x00100000 /* Inode is migrating */
>
> Should we redefine EXT4_EXT_MIGRATE not to conflict with FS_DIRECTIO_FL?
> I don't think much, if any, use has been made of this flag, and I can
> imagine a major headache in the future if this isn't changed now.
>
> Also, EXT4_EXT_MIGRATE doesn't necessarily belong in the i_flags space,
> since it is only used in-memory rather than on-disk as all of the others
> are.

I will leave this out of my patch since it seems to belong in a more general
cleanup, and I don't know much about the EXT4_EXT_MIGRATE flag :).
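
For reference, the overlap Andreas describes can be checked mechanically.
This is only an illustrative userspace sketch: the model_flag_conflicts()
helper and the MODEL_* names are hypothetical, and the values are the ones
quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as quoted in this thread. */
#define MODEL_FS_DIRECTIO_FL   0x00100000u
#define MODEL_EXT4_EXT_MIGRATE 0x00100000u
#define MODEL_EOFBLOCKS_FL     0x00200000u

/*
 * Returns the bits of `candidate` that are already claimed by any flag
 * in `taken`; a zero result means the candidate bit position is free.
 */
static uint32_t model_flag_conflicts(uint32_t candidate,
				     const uint32_t *taken, size_t n)
{
	uint32_t used = 0;

	assert(taken != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		used |= taken[i];
	return candidate & used;
}
```

Checked against a table that contains FS_DIRECTIO_FL, the EXT4_EXT_MIGRATE
value reports a conflict on bit 0x00100000, while the proposed EOFBLOCKS
value of 0x00200000 does not.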

Here is the new patch:

--- .pc/fallocate_keepsizse.patch/fs/ext4/extents.c	2009-08-31 12:08:10.000000000 -0700
+++ fs/ext4/extents.c	2009-08-31 15:51:13.000000000 -0700
@@ -3091,11 +3091,19 @@ static void ext4_falloc_update_inode(str
 	 * the file size.
 	 */
 	if (!(mode & FALLOC_FL_KEEP_SIZE)) {
-		if (new_size > i_size_read(inode))
+		if (new_size > i_size_read(inode)) {
 			i_size_write(inode, new_size);
+			inode->i_flags &= ~EXT4_EOFBLOCKS_FL;
+		}
 		if (new_size > EXT4_I(inode)->i_disksize)
 			ext4_update_i_disksize(inode, new_size);
 	} else {
+		/*
+		 * Mark that we allocate beyond EOF so the subsequent truncate
+		 * can proceed even if the new size is the same as i_size.
+		 */
+		if (new_size > i_size_read(inode))
+			inode->i_flags |= EXT4_EOFBLOCKS_FL;
 	}
 }

--- .pc/fallocate_keepsizse.patch/fs/ext4/inode.c	2009-08-31 12:08:10.000000000 -0700
+++ fs/ext4/inode.c	2009-08-31 15:50:56.000000000 -0700
@@ -3973,6 +3973,8 @@ void ext4_truncate(struct inode *inode)
 	if (!ext4_can_truncate(inode))
 		return;

+	inode->i_flags &= ~EXT4_EOFBLOCKS_FL;
+
 	if (inode->i_size == 0 && !test_opt(inode->i_sb, NO_AUTO_DA_ALLOC))
 		ei->i_state |= EXT4_STATE_DA_ALLOC_CLOSE;

@@ -4807,7 +4809,9 @@ int ext4_setattr(struct dentry *dentry,
 	}

 	if (S_ISREG(inode->i_mode) &&
-	    attr->ia_valid & ATTR_SIZE && attr->ia_size < inode->i_size) {
+	    attr->ia_valid & ATTR_SIZE &&
+	    (attr->ia_size < inode->i_size ||
+	     (inode->i_flags & EXT4_EOFBLOCKS_FL))) {
 		handle_t *handle;

 		handle = ext4_journal_start(inode, 3);
@@ -4838,6 +4842,11 @@ int ext4_setattr(struct dentry *dentry,
 				goto err_out;
 			}
 		}
+		if ((inode->i_flags & EXT4_EOFBLOCKS_FL)) {
+			rc = vmtruncate(inode, attr->ia_size);
+			if (rc)
+				goto err_out;
+		}
 	}

 	rc = inode_setattr(inode, attr);
--- .pc/fallocate_keepsizse.patch/include/linux/fs.h	2009-08-31 12:08:10.000000000 -0700
+++ include/linux/fs.h	2009-08-31 16:21:44.000000000 -0700
@@ -343,6 +343,7 @@ struct inodes_stat_t {
 #define FS_TOPDIR_FL			0x00020000 /* Top of directory hierarchies*/
 #define FS_EXTENT_FL			0x00080000 /* Extents */
 #define FS_DIRECTIO_FL			0x00100000 /* Use direct i/o */
+#define FS_EOFBLOCKS_FL			0x00200000 /* Blocks allocated beyond EOF */
 #define FS_RESERVED_FL			0x80000000 /* reserved for ext2 lib */

 #define FS_FL_USER_VISIBLE		0x0003DFFF /* User visible flags */
--- .pc/fallocate_keepsizse.patch/fs/ext4/ext4.h	2009-08-31 12:08:10.000000000 -0700
+++ fs/ext4/ext4.h	2009-08-31 15:52:34.000000000 -0700
@@ -235,6 +235,7 @@ struct flex_groups {
 #define EXT4_HUGE_FILE_FL               0x00040000 /* Set to each huge file */
 #define EXT4_EXTENTS_FL			0x00080000 /* Inode uses extents */
 #define EXT4_EXT_MIGRATE		0x00100000 /* Inode is migrating */
+#define EXT4_EOFBLOCKS_FL		0x00200000 /* Blocks allocated beyond EOF (bit reserved in fs.h) */
 #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */

 #define EXT4_FL_USER_VISIBLE		0x000BDFFF /* User visible flags */

Jiaying

>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>
