[PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention

linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention
@ 2009-09-22 17:42 Jan Kara
  2009-09-22 17:48 ` Jan Kara
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2009-09-22 17:42 UTC (permalink / raw)
  To: LKML; +Cc: npiggin, viro, linux-ext4, hch, Andrew Morton, Jan Kara, tytso

CC: linux-ext4@vger.kernel.org
CC: tytso@mit.edu
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/file.c  |    2 +-
 fs/ext4/inode.c |  166 ++++++++++++++++++++++++++++++++----------------------
 2 files changed, 99 insertions(+), 69 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 3f1873f..22f49d7 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -198,7 +198,7 @@ const struct file_operations ext4_file_operations = {
 };
 
 const struct inode_operations ext4_file_inode_operations = {
-	.truncate	= ext4_truncate,
+	.new_truncate	= 1,
 	.setattr	= ext4_setattr,
 	.getattr	= ext4_getattr,
 #ifdef CONFIG_EXT4_FS_XATTR
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 58492ab..be25874 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4682,28 +4686,97 @@ int ext4_write_inode(struct inode *inode, int wait)
 }
 
 /*
- * ext4_setattr()
+ * ext4_setsize()
+ *
+ * This is a helper for ext4_setattr(). It sets i_size, truncates page cache
+ * and truncates inode blocks if they are over i_size.
  *
- * Called from notify_change.
+ * We take care of updating i_disksize and adding inode to the orphan list.
+ * That makes sure that we can guarantee that any commit will leave the blocks
+ * being truncated in an unused state on disk.  (On recovery, the inode will
+ * get truncated and the blocks will be freed, so we have a strong guarantee
+ * that no future commit will leave these blocks visible to the user.)
  *
- * We want to trap VFS attempts to truncate the file as soon as
- * possible.  In particular, we want to make sure that when the VFS
- * shrinks i_size, we put the inode on the orphan list and modify
- * i_disksize immediately, so that during the subsequent flushing of
- * dirty pages and freeing of disk blocks, we can guarantee that any
- * commit will leave the blocks being flushed in an unused state on
- * disk.  (On recovery, the inode will get truncated and the blocks will
- * be freed, so we have a strong guarantee that no future commit will
- * leave these blocks visible to the user.)
+ * Another thing we have to assure is that if we are in ordered mode and inode
+ * is still attached to the committing transaction, we must we start writeout
+ * of all the dirty pages which are being truncated.  This way we are sure that
+ * all the data written in the previous transaction are already on disk
+ * (truncate waits for pages under writeback).
+ */
+static int ext4_setsize(struct inode *inode, loff_t newsize)
+{
+	int error = 0, rc;
+	loff_t oldsize = inode->i_size;
+	handle_t *handle;
+
+	error = inode_newsize_ok(inode, newsize);
+	if (error)
+		goto out;
+	/* VFS should have checked these and return error... */
+	WARN_ON(!S_ISREG(inode->i_mode) || IS_APPEND(inode) ||
+		IS_IMMUTABLE(inode));
+
+	if (newsize < oldsize) {
+		handle = ext4_journal_start(inode, 3);
+		if (IS_ERR(handle)) {
+			error = PTR_ERR(handle);
+			goto err_out;
+		}
+
+		error = ext4_orphan_add(handle, inode);
+		EXT4_I(inode)->i_disksize = newsize;
+		rc = ext4_mark_inode_dirty(handle, inode);
+		if (!error)
+			error = rc;
+		ext4_journal_stop(handle);
+
+		if (ext4_should_order_data(inode)) {
+			error = ext4_begin_ordered_truncate(inode, newsize);
+			if (error) {
+				/* Do as much error cleanup as possible */
+				handle = ext4_journal_start(inode, 3);
+				if (IS_ERR(handle)) {
+					ext4_orphan_del(NULL, inode);
+					goto err_out;
+				}
+				ext4_orphan_del(handle, inode);
+				ext4_journal_stop(handle);
+				goto err_out;
+			}
+		}
+	} else if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL)) {
+		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+
+		if (newsize > sbi->s_bitmap_maxbytes) {
+			error = -EFBIG;
+			goto out;
+		}
+	}
+
+	i_size_write(inode, newsize);
+	truncate_pagecache(inode, oldsize, newsize);
+	ext4_truncate(inode);
+
+	/*
+	 * If we failed to get a transaction handle at all, we need to clean up
+         * the in-core orphan list manually.
+	 */
+	if (inode->i_nlink)
+		ext4_orphan_del(NULL, inode);
+err_out:
+	ext4_std_error(inode->i_sb, error);
+out:
+	return error;
+}
+
+
+/*
+ * ext4_setattr()
  *
- * Another thing we have to assure is that if we are in ordered mode
- * and inode is still attached to the committing transaction, we must
- * we start writeout of all the dirty pages which are being truncated.
- * This way we are sure that all the data written in the previous
- * transaction are already on disk (truncate waits for pages under
- * writeback).
+ * Handle special things ext4 needs for changing owner of the file, changing
+ * ACLs, or truncating file.
  *
- * Called with inode->i_mutex down.
+ * Called from notify_change with inode->i_mutex down.
  */
 int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 {
@@ -4743,61 +4816,18 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 	}
 
 	if (attr->ia_valid & ATTR_SIZE) {
-		if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL)) {
-			struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
-
-			if (attr->ia_size > sbi->s_bitmap_maxbytes) {
-				error = -EFBIG;
-				goto err_out;
-			}
-		}
-	}
-
-	if (S_ISREG(inode->i_mode) &&
-	    attr->ia_valid & ATTR_SIZE && attr->ia_size < inode->i_size) {
-		handle_t *handle;
-
-		handle = ext4_journal_start(inode, 3);
-		if (IS_ERR(handle)) {
-			error = PTR_ERR(handle);
-			goto err_out;
-		}
-
-		error = ext4_orphan_add(handle, inode);
-		EXT4_I(inode)->i_disksize = attr->ia_size;
-		rc = ext4_mark_inode_dirty(handle, inode);
-		if (!error)
-			error = rc;
-		ext4_journal_stop(handle);
-
-		if (ext4_should_order_data(inode)) {
-			error = ext4_begin_ordered_truncate(inode,
-							    attr->ia_size);
-			if (error) {
-				/* Do as much error cleanup as possible */
-				handle = ext4_journal_start(inode, 3);
-				if (IS_ERR(handle)) {
-					ext4_orphan_del(NULL, inode);
-					goto err_out;
-				}
-				ext4_orphan_del(handle, inode);
-				ext4_journal_stop(handle);
-				goto err_out;
-			}
-		}
+		error = ext4_setsize(inode, attr->ia_size);
+		if (error)
+			return error;
 	}
 
-	rc = inode_setattr(inode, attr);

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention
  2009-09-22 17:42 [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention Jan Kara
@ 2009-09-22 17:48 ` Jan Kara
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2009-09-22 17:48 UTC (permalink / raw)
  To: LKML; +Cc: npiggin, viro, linux-ext4, hch, Andrew Morton, Jan Kara, tytso

  Oops, sorry for the subject. It should have been just [PATCH] instead
of [PATCH 5/7]...

On Tue 22-09-09 19:42:05, Jan Kara wrote:
> CC: linux-ext4@vger.kernel.org
> CC: tytso@mit.edu
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ext4/file.c  |    2 +-
>  fs/ext4/inode.c |  166 ++++++++++++++++++++++++++++++++----------------------
>  2 files changed, 99 insertions(+), 69 deletions(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 3f1873f..22f49d7 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -198,7 +198,7 @@ const struct file_operations ext4_file_operations = {
>  };
>  
>  const struct inode_operations ext4_file_inode_operations = {
> -	.truncate	= ext4_truncate,
> +	.new_truncate	= 1,
>  	.setattr	= ext4_setattr,
>  	.getattr	= ext4_getattr,
>  #ifdef CONFIG_EXT4_FS_XATTR
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 58492ab..be25874 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4682,28 +4686,97 @@ int ext4_write_inode(struct inode *inode, int wait)
>  }
>  
>  /*
> - * ext4_setattr()
> + * ext4_setsize()
> + *
> + * This is a helper for ext4_setattr(). It sets i_size, truncates page cache
> + * and truncates inode blocks if they are over i_size.
>   *
> - * Called from notify_change.
> + * We take care of updating i_disksize and adding inode to the orphan list.
> + * That makes sure that we can guarantee that any commit will leave the blocks
> + * being truncated in an unused state on disk.  (On recovery, the inode will
> + * get truncated and the blocks will be freed, so we have a strong guarantee
> + * that no future commit will leave these blocks visible to the user.)
>   *
> - * We want to trap VFS attempts to truncate the file as soon as
> - * possible.  In particular, we want to make sure that when the VFS
> - * shrinks i_size, we put the inode on the orphan list and modify
> - * i_disksize immediately, so that during the subsequent flushing of
> - * dirty pages and freeing of disk blocks, we can guarantee that any
> - * commit will leave the blocks being flushed in an unused state on
> - * disk.  (On recovery, the inode will get truncated and the blocks will
> - * be freed, so we have a strong guarantee that no future commit will
> - * leave these blocks visible to the user.)
> + * Another thing we have to assure is that if we are in ordered mode and inode
> + * is still attached to the committing transaction, we must we start writeout
> + * of all the dirty pages which are being truncated.  This way we are sure that
> + * all the data written in the previous transaction are already on disk
> + * (truncate waits for pages under writeback).
> + */
> +static int ext4_setsize(struct inode *inode, loff_t newsize)
> +{
> +	int error = 0, rc;
> +	loff_t oldsize = inode->i_size;
> +	handle_t *handle;
> +
> +	error = inode_newsize_ok(inode, newsize);
> +	if (error)
> +		goto out;
> +	/* VFS should have checked these and return error... */
> +	WARN_ON(!S_ISREG(inode->i_mode) || IS_APPEND(inode) ||
> +		IS_IMMUTABLE(inode));
> +
> +	if (newsize < oldsize) {
> +		handle = ext4_journal_start(inode, 3);
> +		if (IS_ERR(handle)) {
> +			error = PTR_ERR(handle);
> +			goto err_out;
> +		}
> +
> +		error = ext4_orphan_add(handle, inode);
> +		EXT4_I(inode)->i_disksize = newsize;
> +		rc = ext4_mark_inode_dirty(handle, inode);
> +		if (!error)
> +			error = rc;
> +		ext4_journal_stop(handle);
> +
> +		if (ext4_should_order_data(inode)) {
> +			error = ext4_begin_ordered_truncate(inode, newsize);
> +			if (error) {
> +				/* Do as much error cleanup as possible */
> +				handle = ext4_journal_start(inode, 3);
> +				if (IS_ERR(handle)) {
> +					ext4_orphan_del(NULL, inode);
> +					goto err_out;
> +				}
> +				ext4_orphan_del(handle, inode);
> +				ext4_journal_stop(handle);
> +				goto err_out;
> +			}
> +		}
> +	} else if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL)) {
> +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> +
> +		if (newsize > sbi->s_bitmap_maxbytes) {
> +			error = -EFBIG;
> +			goto out;
> +		}
> +	}
> +
> +	i_size_write(inode, newsize);
> +	truncate_pagecache(inode, oldsize, newsize);
> +	ext4_truncate(inode);
> +
> +	/*
> +	 * If we failed to get a transaction handle at all, we need to clean up
> +         * the in-core orphan list manually.
> +	 */
> +	if (inode->i_nlink)
> +		ext4_orphan_del(NULL, inode);
> +err_out:
> +	ext4_std_error(inode->i_sb, error);
> +out:
> +	return error;
> +}
> +
> +
> +/*
> + * ext4_setattr()
>   *
> - * Another thing we have to assure is that if we are in ordered mode
> - * and inode is still attached to the committing transaction, we must
> - * we start writeout of all the dirty pages which are being truncated.
> - * This way we are sure that all the data written in the previous
> - * transaction are already on disk (truncate waits for pages under
> - * writeback).
> + * Handle special things ext4 needs for changing owner of the file, changing
> + * ACLs, or truncating file.
>   *
> - * Called with inode->i_mutex down.
> + * Called from notify_change with inode->i_mutex down.
>   */
>  int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>  {
> @@ -4743,61 +4816,18 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>  	}
>  
>  	if (attr->ia_valid & ATTR_SIZE) {
> -		if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL)) {
> -			struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> -
> -			if (attr->ia_size > sbi->s_bitmap_maxbytes) {
> -				error = -EFBIG;
> -				goto err_out;
> -			}
> -		}
> -	}
> -
> -	if (S_ISREG(inode->i_mode) &&
> -	    attr->ia_valid & ATTR_SIZE && attr->ia_size < inode->i_size) {
> -		handle_t *handle;
> -
> -		handle = ext4_journal_start(inode, 3);
> -		if (IS_ERR(handle)) {
> -			error = PTR_ERR(handle);
> -			goto err_out;
> -		}
> -
> -		error = ext4_orphan_add(handle, inode);
> -		EXT4_I(inode)->i_disksize = attr->ia_size;
> -		rc = ext4_mark_inode_dirty(handle, inode);
> -		if (!error)
> -			error = rc;
> -		ext4_journal_stop(handle);
> -
> -		if (ext4_should_order_data(inode)) {
> -			error = ext4_begin_ordered_truncate(inode,
> -							    attr->ia_size);
> -			if (error) {
> -				/* Do as much error cleanup as possible */
> -				handle = ext4_journal_start(inode, 3);
> -				if (IS_ERR(handle)) {
> -					ext4_orphan_del(NULL, inode);
> -					goto err_out;
> -				}
> -				ext4_orphan_del(handle, inode);
> -				ext4_journal_stop(handle);
> -				goto err_out;
> -			}
> -		}
> +		error = ext4_setsize(inode, attr->ia_size);
> +		if (error)
> +			return error;
>  	}
>  
> -	rc = inode_setattr(inode, attr);
> -
> -	/* If inode_setattr's call to ext4_truncate failed to get a
> -	 * transaction handle at all, we need to clean up the in-core
> -	 * orphan list manually. */
> -	if (inode->i_nlink)
> -		ext4_orphan_del(NULL, inode);
> +	generic_setattr(inode, attr);
>  
> -	if (!rc && (ia_valid & ATTR_MODE))
> +	if (ia_valid & ATTR_MODE)
>  		rc = ext4_acl_chmod(inode);
>  
> +	/* Mark inode dirty due to changes done by generic_setattr() */
> +	mark_inode_dirty(inode);
>  err_out:
>  	ext4_std_error(inode->i_sb, error);
>  	if (!error)
> -- 
> 1.6.0.2
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC] [PATCH 0/7] Improve VFS to handle better mmaps when blocksize < pagesize (v3)
@ 2009-09-17 15:21 Jan Kara
  2009-09-17 15:21 ` [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention Jan Kara
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2009-09-17 15:21 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: LKML, linux-ext4, linux-mm, npiggin

  Hi,

  here is my next attempt to solve a problems arising with mmaped writes when
blocksize < pagesize. To recall what's the problem:

We'd like to use page_mkwrite() to allocate blocks under a page which is
becoming writeably mmapped in some process address space. This allows a
filesystem to return a page fault if there is not enough space available, user
exceeds quota or similar problem happens, rather than silently discarding data
later when writepage is called.

On filesystems where blocksize < pagesize the situation is complicated though.
Think for example that blocksize = 1024, pagesize = 4096 and a process does:
  ftruncate(fd, 0);
  pwrite(fd, buf, 1024, 0);
  map = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, fd, 0);
  map[0] = 'a';  ----> page_mkwrite() for index 0 is called
  ftruncate(fd, 10000); /* or even pwrite(fd, buf, 1, 10000) */
  fsync(fd); ----> writepage() for index 0 is called

At the moment page_mkwrite() is called, filesystem can allocate only one block
for the page because i_size == 1024. Otherwise it would create blocks beyond
i_size which is generally undesirable. But later at writepage() time, we would
like to have blocks allocated for the whole page (and in principle we have to
allocate them because user could have filled the page with data after the
second ftruncate()).
---

  The patches depend on Nick's truncate calling convention rewrite. The first
three patches in the patchset are just cleanups. The series converts ext4 and
ext2 filesystems just to give an idea how conversion of a filesystem will
look like.
  A few notes to the changes the main patch (patch number 4) does:
1) zeroing of tail of the last block now does not happen in writepage (which is
racy anyway as Nick pointed out) and foo_truncate_page but rather when i_size
is going to be extended.
2) writeback path does not care about i_size anymore, it uses buffer flags
instead. An exception is a nobh case where we have to use i_size. Thus
filesystems not using nobh code can update i_size in write_end without holding
page_lock.  Filesystems using nobh code still have to update i_size under the
page_lock since otherwise __mpage_writepage could come early, write just part
of the page, and clear all dirty bits, thus causing a data loss.
3) converted filesystems have to make sure that the buffers with valid data
to write are either mapped or delay before they call block_write_full_page.
The idea is that they should use page_mkwrite() to setup buffers.

  Both ext2 and ext4 have survived some beating with fsx-linux so they should
be at least moderately safe to use :). Any comments?
									Honza

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention
  2009-09-17 15:21 [RFC] [PATCH 0/7] Improve VFS to handle better mmaps when blocksize < pagesize (v3) Jan Kara
@ 2009-09-17 15:21 ` Jan Kara
  2009-09-22 14:36   ` Al Viro
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2009-09-17 15:21 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: LKML, linux-ext4, linux-mm, npiggin, Jan Kara, tytso

CC: linux-ext4@vger.kernel.org
CC: tytso@mit.edu
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/file.c  |    2 +-
 fs/ext4/inode.c |  166 ++++++++++++++++++++++++++++++++----------------------
 2 files changed, 99 insertions(+), 69 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 3f1873f..22f49d7 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -198,7 +198,7 @@ const struct file_operations ext4_file_operations = {
 };
 
 const struct inode_operations ext4_file_inode_operations = {
-	.truncate	= ext4_truncate,
+	.new_truncate	= 1,
 	.setattr	= ext4_setattr,
 	.getattr	= ext4_getattr,
 #ifdef CONFIG_EXT4_FS_XATTR
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 58492ab..be25874 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3354,6 +3354,7 @@ static int ext4_journalled_set_page_dirty(struct page *page)
 }
 
 static const struct address_space_operations ext4_ordered_aops = {
+	.new_writepage		= 1,
 	.readpage		= ext4_readpage,
 	.readpages		= ext4_readpages,
 	.writepage		= ext4_writepage,
@@ -3369,6 +3370,7 @@ static const struct address_space_operations ext4_ordered_aops = {
 };
 
 static const struct address_space_operations ext4_writeback_aops = {
+	.new_writepage		= 1,
 	.readpage		= ext4_readpage,
 	.readpages		= ext4_readpages,
 	.writepage		= ext4_writepage,
@@ -3384,6 +3386,7 @@ static const struct address_space_operations ext4_writeback_aops = {
 };
 
 static const struct address_space_operations ext4_journalled_aops = {
+	.new_writepage		= 1,
 	.readpage		= ext4_readpage,
 	.readpages		= ext4_readpages,
 	.writepage		= ext4_writepage,
@@ -3398,6 +3401,7 @@ static const struct address_space_operations ext4_journalled_aops = {
 };
 
 static const struct address_space_operations ext4_da_aops = {
+	.new_writepage		= 1,
 	.readpage		= ext4_readpage,
 	.readpages		= ext4_readpages,
 	.writepage		= ext4_writepage,
@@ -4682,28 +4686,97 @@ int ext4_write_inode(struct inode *inode, int wait)
 }
 
 /*
- * ext4_setattr()
+ * ext4_setsize()
+ *
+ * This is a helper for ext4_setattr(). It sets i_size, truncates page cache
+ * and truncates inode blocks if they are over i_size.
  *
- * Called from notify_change.
+ * We take care of updating i_disksize and adding inode to the orphan list.
+ * That makes sure that we can guarantee that any commit will leave the blocks
+ * being truncated in an unused state on disk.  (On recovery, the inode will
+ * get truncated and the blocks will be freed, so we have a strong guarantee
+ * that no future commit will leave these blocks visible to the user.)
  *
- * We want to trap VFS attempts to truncate the file as soon as
- * possible.  In particular, we want to make sure that when the VFS
- * shrinks i_size, we put the inode on the orphan list and modify
- * i_disksize immediately, so that during the subsequent flushing of
- * dirty pages and freeing of disk blocks, we can guarantee that any
- * commit will leave the blocks being flushed in an unused state on
- * disk.  (On recovery, the inode will get truncated and the blocks will
- * be freed, so we have a strong guarantee that no future commit will
- * leave these blocks visible to the user.)
+ * Another thing we have to assure is that if we are in ordered mode and inode
+ * is still attached to the committing transaction, we must we start writeout
+ * of all the dirty pages which are being truncated.  This way we are sure that
+ * all the data written in the previous transaction are already on disk
+ * (truncate waits for pages under writeback).
+ */
+static int ext4_setsize(struct inode *inode, loff_t newsize)
+{
+	int error = 0, rc;
+	loff_t oldsize = inode->i_size;
+	handle_t *handle;
+
+	error = inode_newsize_ok(inode, newsize);
+	if (error)
+		goto out;
+	/* VFS should have checked these and return error... */
+	WARN_ON(!S_ISREG(inode->i_mode) || IS_APPEND(inode) ||
+		IS_IMMUTABLE(inode));
+
+	if (newsize < oldsize) {
+		handle = ext4_journal_start(inode, 3);
+		if (IS_ERR(handle)) {
+			error = PTR_ERR(handle);
+			goto err_out;
+		}
+
+		error = ext4_orphan_add(handle, inode);
+		EXT4_I(inode)->i_disksize = newsize;
+		rc = ext4_mark_inode_dirty(handle, inode);
+		if (!error)
+			error = rc;
+		ext4_journal_stop(handle);
+
+		if (ext4_should_order_data(inode)) {
+			error = ext4_begin_ordered_truncate(inode, newsize);
+			if (error) {
+				/* Do as much error cleanup as possible */
+				handle = ext4_journal_start(inode, 3);
+				if (IS_ERR(handle)) {
+					ext4_orphan_del(NULL, inode);
+					goto err_out;
+				}
+				ext4_orphan_del(handle, inode);
+				ext4_journal_stop(handle);
+				goto err_out;
+			}
+		}
+	} else if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL)) {
+		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+
+		if (newsize > sbi->s_bitmap_maxbytes) {
+			error = -EFBIG;
+			goto out;
+		}
+	}
+
+	i_size_write(inode, newsize);
+	truncate_pagecache(inode, oldsize, newsize);
+	ext4_truncate(inode);
+
+	/*
+	 * If we failed to get a transaction handle at all, we need to clean up
+         * the in-core orphan list manually.
+	 */
+	if (inode->i_nlink)
+		ext4_orphan_del(NULL, inode);
+err_out:
+	ext4_std_error(inode->i_sb, error);
+out:
+	return error;
+}
+
+
+/*
+ * ext4_setattr()
  *
- * Another thing we have to assure is that if we are in ordered mode
- * and inode is still attached to the committing transaction, we must
- * we start writeout of all the dirty pages which are being truncated.
- * This way we are sure that all the data written in the previous
- * transaction are already on disk (truncate waits for pages under
- * writeback).
+ * Handle special things ext4 needs for changing owner of the file, changing
+ * ACLs, or truncating file.
  *
- * Called with inode->i_mutex down.
+ * Called from notify_change with inode->i_mutex down.
  */
 int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 {
@@ -4743,61 +4816,18 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 	}
 
 	if (attr->ia_valid & ATTR_SIZE) {
-		if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL)) {
-			struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
-
-			if (attr->ia_size > sbi->s_bitmap_maxbytes) {
-				error = -EFBIG;
-				goto err_out;
-			}
-		}
-	}
-
-	if (S_ISREG(inode->i_mode) &&
-	    attr->ia_valid & ATTR_SIZE && attr->ia_size < inode->i_size) {
-		handle_t *handle;
-
-		handle = ext4_journal_start(inode, 3);
-		if (IS_ERR(handle)) {
-			error = PTR_ERR(handle);
-			goto err_out;
-		}
-
-		error = ext4_orphan_add(handle, inode);
-		EXT4_I(inode)->i_disksize = attr->ia_size;
-		rc = ext4_mark_inode_dirty(handle, inode);
-		if (!error)
-			error = rc;
-		ext4_journal_stop(handle);
-
-		if (ext4_should_order_data(inode)) {
-			error = ext4_begin_ordered_truncate(inode,
-							    attr->ia_size);
-			if (error) {
-				/* Do as much error cleanup as possible */
-				handle = ext4_journal_start(inode, 3);
-				if (IS_ERR(handle)) {
-					ext4_orphan_del(NULL, inode);
-					goto err_out;
-				}
-				ext4_orphan_del(handle, inode);
-				ext4_journal_stop(handle);
-				goto err_out;
-			}
-		}
+		error = ext4_setsize(inode, attr->ia_size);
+		if (error)
+			return error;
 	}
 
-	rc = inode_setattr(inode, attr);
-
-	/* If inode_setattr's call to ext4_truncate failed to get a
-	 * transaction handle at all, we need to clean up the in-core
-	 * orphan list manually. */
-	if (inode->i_nlink)
-		ext4_orphan_del(NULL, inode);
+	generic_setattr(inode, attr);
 
-	if (!rc && (ia_valid & ATTR_MODE))
+	if (ia_valid & ATTR_MODE)
 		rc = ext4_acl_chmod(inode);
 
+	/* Mark inode dirty due to changes done by generic_setattr() */
+	mark_inode_dirty(inode);
 err_out:
 	ext4_std_error(inode->i_sb, error);
 	if (!error)
-- 
1.6.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention
  2009-09-17 15:21 ` [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention Jan Kara
@ 2009-09-22 14:36   ` Al Viro
  2009-09-22 17:16     ` Jan Kara
  0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2009-09-22 14:36 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, LKML, linux-ext4, linux-mm, npiggin, tytso

On Thu, Sep 17, 2009 at 05:21:45PM +0200, Jan Kara wrote:
> CC: linux-ext4@vger.kernel.org
> CC: tytso@mit.edu
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ext4/file.c  |    2 +-
>  fs/ext4/inode.c |  166 ++++++++++++++++++++++++++++++++----------------------
>  2 files changed, 99 insertions(+), 69 deletions(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 3f1873f..22f49d7 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -198,7 +198,7 @@ const struct file_operations ext4_file_operations = {
>  };
>  
>  const struct inode_operations ext4_file_inode_operations = {
> -	.truncate	= ext4_truncate,
> +	.new_truncate	= 1,
>  	.setattr	= ext4_setattr,
>  	.getattr	= ext4_getattr,
>  #ifdef CONFIG_EXT4_FS_XATTR
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 58492ab..be25874 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3354,6 +3354,7 @@ static int ext4_journalled_set_page_dirty(struct page *page)
>  }
>  
>  static const struct address_space_operations ext4_ordered_aops = {
> +	.new_writepage		= 1,

No.  We already have one half-finished series here; mixing it with another
one is not going to happen.  Such flags are tolerable only as bisectability
helpers.  They *must* disappear by the end of series.  Before it can be
submitted for merge.

In effect, you are mixing truncate switchover with your writepage one.
Please, split and reorder.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention
  2009-09-22 14:36   ` Al Viro
@ 2009-09-22 17:16     ` Jan Kara
  2009-09-22 17:23       ` Al Viro
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2009-09-22 17:16 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, LKML, linux-ext4, linux-mm, npiggin, tytso

On Tue 22-09-09 15:36:04, Al Viro wrote:
> On Thu, Sep 17, 2009 at 05:21:45PM +0200, Jan Kara wrote:
> > CC: linux-ext4@vger.kernel.org
> > CC: tytso@mit.edu
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/ext4/file.c  |    2 +-
> >  fs/ext4/inode.c |  166 ++++++++++++++++++++++++++++++++----------------------
> >  2 files changed, 99 insertions(+), 69 deletions(-)
> > 
> > diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> > index 3f1873f..22f49d7 100644
> > --- a/fs/ext4/file.c
> > +++ b/fs/ext4/file.c
> > @@ -198,7 +198,7 @@ const struct file_operations ext4_file_operations = {
> >  };
> >  
> >  const struct inode_operations ext4_file_inode_operations = {
> > -	.truncate	= ext4_truncate,
> > +	.new_truncate	= 1,
> >  	.setattr	= ext4_setattr,
> >  	.getattr	= ext4_getattr,
> >  #ifdef CONFIG_EXT4_FS_XATTR
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 58492ab..be25874 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -3354,6 +3354,7 @@ static int ext4_journalled_set_page_dirty(struct page *page)
> >  }
> >  
> >  static const struct address_space_operations ext4_ordered_aops = {
> > +	.new_writepage		= 1,
> 
> No.  We already have one half-finished series here; mixing it with another
> one is not going to happen.  Such flags are tolerable only as bisectability
> helpers.  They *must* disappear by the end of series.  Before it can be
> submitted for merge.
> 
> In effect, you are mixing truncate switchover with your writepage one.
> Please, split and reorder.
  Well, this wasn't meant as a final version of those patches. It was
meant as a request for comment whether it makes sence to fix the problem
how I propose to fix it. If we agree on that, I'll go and convert the rest
of filesystems so that we can remove .new_writepage hack. By that time I
hope that new truncate sequence patches will be merged so that dependency
should go away as well...

									Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention
  2009-09-22 17:16     ` Jan Kara
@ 2009-09-22 17:23       ` Al Viro
  2009-09-22 17:37         ` Jan Kara
  0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2009-09-22 17:23 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, LKML, linux-ext4, linux-mm, npiggin, tytso

On Tue, Sep 22, 2009 at 07:16:04PM +0200, Jan Kara wrote:
> > No.  We already have one half-finished series here; mixing it with another
> > one is not going to happen.  Such flags are tolerable only as bisectability
> > helpers.  They *must* disappear by the end of series.  Before it can be
> > submitted for merge.
> > 
> > In effect, you are mixing truncate switchover with your writepage one.
> > Please, split and reorder.

>   Well, this wasn't meant as a final version of those patches. It was
> meant as a request for comment whether it makes sence to fix the problem
> how I propose to fix it. If we agree on that, I'll go and convert the rest
> of filesystems so that we can remove .new_writepage hack. By that time I
> hope that new truncate sequence patches will be merged so that dependency
> should go away as well...

Could you carve just the ext4 part of truncate series out of that and
post it separately?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention
  2009-09-22 17:23       ` Al Viro
@ 2009-09-22 17:37         ` Jan Kara
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2009-09-22 17:37 UTC (permalink / raw)
  To: Al Viro; +Cc: Jan Kara, linux-fsdevel, LKML, linux-ext4, linux-mm, npiggin,
	tytso

On Tue 22-09-09 18:23:47, Al Viro wrote:
> On Tue, Sep 22, 2009 at 07:16:04PM +0200, Jan Kara wrote:
> > > No.  We already have one half-finished series here; mixing it with another
> > > one is not going to happen.  Such flags are tolerable only as bisectability
> > > helpers.  They *must* disappear by the end of series.  Before it can be
> > > submitted for merge.
> > > 
> > > In effect, you are mixing truncate switchover with your writepage one.
> > > Please, split and reorder.
> 
> >   Well, this wasn't meant as a final version of those patches. It was
> > meant as a request for comment whether it makes sence to fix the problem
> > how I propose to fix it. If we agree on that, I'll go and convert the rest
> > of filesystems so that we can remove .new_writepage hack. By that time I
> > hope that new truncate sequence patches will be merged so that dependency
> > should go away as well...
> 
> Could you carve just the ext4 part of truncate series out of that and
> post it separately?
  Definitely. Actually, I've sent that patch to Nick in private but you're
right I should have posted it to the list as well. Will do it in a moment.
Thanks for a reminder.

									Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-09-22 17:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-09-22 17:42 [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention Jan Kara
2009-09-22 17:48 ` Jan Kara
  -- strict thread matches above, loose matches on Subject: below --
2009-09-17 15:21 [RFC] [PATCH 0/7] Improve VFS to handle better mmaps when blocksize < pagesize (v3) Jan Kara
2009-09-17 15:21 ` [PATCH 5/7] ext4: Convert filesystem to the new truncate calling convention Jan Kara
2009-09-22 14:36   ` Al Viro
2009-09-22 17:16     ` Jan Kara
2009-09-22 17:23       ` Al Viro
2009-09-22 17:37         ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).