linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ext4: truncate block allocated on a failed ext4_write_begin
@ 2008-08-28 17:27 Aneesh Kumar K.V
  2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V
  0 siblings, 1 reply; 7+ messages in thread
From: Aneesh Kumar K.V @ 2008-08-28 17:27 UTC (permalink / raw)
  To: cmm, tytso, sandeen; +Cc: linux-ext4, Aneesh Kumar K.V

For blocksize < pagesize we need to remove blocks that
got allocte in block_wirte_begin if we fail with ENOSPC
for later blocks. block_write_begin internally does
this if it allocated page locally. This make sure
we don't have blocks outisde inode.i_size during
ENOSPC

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/ext4/inode.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 473d888..a8c6bc8 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1394,6 +1394,13 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
  		unlock_page(page);
 		ext4_journal_stop(handle);
  		page_cache_release(page);
+		/*
+		 * block_write_begin may have instantiated a few blocks
+		 * outside i_size.  Trim these off again. Don't need
+		 * i_size_read because we hold i_mutex.
+		 */
+		if (pos + len > inode->i_size)
+			vmtruncate(inode, inode->i_size);
 	}
 
 	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
-- 
1.6.0.1.90.g27a6e


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH] ext3: truncate block allocated on a failed ext3_write_begin
  2008-08-28 17:27 [PATCH] ext4: truncate block allocated on a failed ext4_write_begin Aneesh Kumar K.V
@ 2008-08-28 17:27 ` Aneesh Kumar K.V
  2008-08-28 17:27   ` [PATCH] ext4: Properly update i_disksize Aneesh Kumar K.V
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Aneesh Kumar K.V @ 2008-08-28 17:27 UTC (permalink / raw)
  To: cmm, tytso, sandeen; +Cc: linux-ext4, Aneesh Kumar K.V

For blocksize < pagesize we need to remove blocks that
got allocte in block_wirte_begin if we fail with ENOSPC
for later blocks. block_write_begin internally does
this if it allocated page locally. This make sure
we don't have blocks outisde inode.i_size during
ENOSPC

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/ext3/inode.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index ebfec4d..f8424ad 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1186,6 +1186,13 @@ static int ext3_write_begin(struct file *file, struct address_space *mapping,
 		ext3_journal_stop(handle);
 		unlock_page(page);
 		page_cache_release(page);
+		/*
+		 * block_write_begin may have instantiated a few blocks
+		 * outside i_size.  Trim these off again. Don't need
+		 * i_size_read because we hold i_mutex.
+		 */
+		if (pos + len > inode->i_size)
+			vmtruncate(inode, inode->i_size);
 	}
 	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
 		goto retry;
-- 
1.6.0.1.90.g27a6e


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH] ext4: Properly update i_disksize.
  2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V
@ 2008-08-28 17:27   ` Aneesh Kumar K.V
  2008-08-28 17:56   ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Eric Sandeen
  2008-08-29  8:33   ` Dmitri Monakhov
  2 siblings, 0 replies; 7+ messages in thread
From: Aneesh Kumar K.V @ 2008-08-28 17:27 UTC (permalink / raw)
  To: cmm, tytso, sandeen; +Cc: linux-ext4, Aneesh Kumar K.V

With delayed allocation we use i_data_sem to update
i_disksize. We need to update i_disksize only
if the new size specified is greater than the
current value and we need to make sure we don't
race with other i_disksize update. With delayed
allocation we will switch to nondelalloc write_begin
if we are low on free blocks. That means nondelalloc
write_begin also need to use the same locking.

We also need to check and update i_disksize even
if the new size is less that inode.i_size because
of delayed allocation.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/ext4/ext4.h    |   11 +++++++++
 fs/ext4/extents.c |    9 ++++---
 fs/ext4/inode.c   |   61 ++++++++++++++++++++++++++++++++--------------------
 3 files changed, 53 insertions(+), 28 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 13c69ed..bc856e3 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1307,6 +1307,17 @@ do {								\
 #define EXT4_FREEBLOCKS_WATERMARK 0
 #endif
 
+static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize)
+{
+	/*
+	 * XXX: replace with spinlock if seen contended -bzzz
+	 */
+	down_write(&EXT4_I(inode)->i_data_sem);
+	if (newsize > EXT4_I(inode)->i_disksize)
+		EXT4_I(inode)->i_disksize = newsize;
+	up_write(&EXT4_I(inode)->i_data_sem);
+	return ;
+}
 
 /*
  * Inodes and files operations
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 268e96d..7def792 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3018,10 +3018,11 @@ static void ext4_falloc_update_inode(struct inode *inode,
 	 * Update only when preallocation was requested beyond
 	 * the file size.
 	 */
-	if (!(mode & FALLOC_FL_KEEP_SIZE) &&
-				new_size > i_size_read(inode)) {
-		i_size_write(inode, new_size);
-		EXT4_I(inode)->i_disksize = new_size;
+	if (!(mode & FALLOC_FL_KEEP_SIZE)) {
+		if (new_size > i_size_read(inode))
+			i_size_write(inode, new_size);
+		if (new_size > EXT4_I(inode)->i_disksize)
+			ext4_update_i_disksize(inode, new_size);
 	}
 
 }
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a8c6bc8..605ae24 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1437,16 +1437,18 @@ static int ext4_ordered_write_end(struct file *file,
 	ret = ext4_jbd2_file_inode(handle, inode);
 
 	if (ret == 0) {
-		/*
-		 * generic_write_end() will run mark_inode_dirty() if i_size
-		 * changes.  So let's piggyback the i_disksize mark_inode_dirty
-		 * into that.
-		 */
 		loff_t new_i_size;
 
 		new_i_size = pos + copied;
-		if (new_i_size > EXT4_I(inode)->i_disksize)
-			EXT4_I(inode)->i_disksize = new_i_size;
+		if (new_i_size > EXT4_I(inode)->i_disksize) {
+			ext4_update_i_disksize(inode, new_i_size);
+			/* We need to mark inode dirty even if
+			 * new_i_size is less that inode->i_size
+			 * bu greater than i_disksize.(hint delalloc)
+			 */
+			ext4_mark_inode_dirty(handle, inode);
+		}
+
 		ret2 = generic_write_end(file, mapping, pos, len, copied,
 							page, fsdata);
 		copied = ret2;
@@ -1471,8 +1473,14 @@ static int ext4_writeback_write_end(struct file *file,
 	loff_t new_i_size;
 
 	new_i_size = pos + copied;
-	if (new_i_size > EXT4_I(inode)->i_disksize)
-		EXT4_I(inode)->i_disksize = new_i_size;
+	if (new_i_size > EXT4_I(inode)->i_disksize) {
+		ext4_update_i_disksize(inode, new_i_size);
+		/* We need to mark inode dirty even if
+		 * new_i_size is less that inode->i_size
+		 * bu greater than i_disksize.(hint delalloc)
+		 */
+		ext4_mark_inode_dirty(handle, inode);
+	}
 
 	ret2 = generic_write_end(file, mapping, pos, len, copied,
 							page, fsdata);
@@ -1497,6 +1505,7 @@ static int ext4_journalled_write_end(struct file *file,
 	int ret = 0, ret2;
 	int partial = 0;
 	unsigned from, to;
+	loff_t new_i_size;
 
 	from = pos & (PAGE_CACHE_SIZE - 1);
 	to = from + len;
@@ -1511,11 +1520,12 @@ static int ext4_journalled_write_end(struct file *file,
 				to, &partial, write_end_fn);
 	if (!partial)
 		SetPageUptodate(page);
-	if (pos+copied > inode->i_size)
+	new_i_size = pos + copied;
+	if (new_i_size > inode->i_size)
 		i_size_write(inode, pos+copied);
 	EXT4_I(inode)->i_state |= EXT4_STATE_JDATA;
-	if (inode->i_size > EXT4_I(inode)->i_disksize) {
-		EXT4_I(inode)->i_disksize = inode->i_size;
+	if (new_i_size > EXT4_I(inode)->i_disksize) {
+		ext4_update_i_disksize(inode, new_i_size);
 		ret2 = ext4_mark_inode_dirty(handle, inode);
 		if (!ret)
 			ret = ret2;
@@ -2227,18 +2237,9 @@ static int ext4_da_get_block_write(struct inode *inode, sector_t iblock,
 		if (disksize > i_size_read(inode))
 			disksize = i_size_read(inode);
 		if (disksize > EXT4_I(inode)->i_disksize) {
-			/*
-			 * XXX: replace with spinlock if seen contended -bzzz
-			 */
-			down_write(&EXT4_I(inode)->i_data_sem);
-			if (disksize > EXT4_I(inode)->i_disksize)
-				EXT4_I(inode)->i_disksize = disksize;
-			up_write(&EXT4_I(inode)->i_data_sem);
-
-			if (EXT4_I(inode)->i_disksize == disksize) {
-				ret = ext4_mark_inode_dirty(handle, inode);
-				return ret;
-			}
+			ext4_update_i_disksize(inode, disksize);
+			ret = ext4_mark_inode_dirty(handle, inode);
+			return ret;
 		}
 		ret = 0;
 	}
@@ -2567,6 +2568,13 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 		unlock_page(page);
 		ext4_journal_stop(handle);
 		page_cache_release(page);
+		/*
+		 * block_write_begin may have instantiated a few blocks
+		 * outside i_size.  Trim these off again. Don't need
+		 * i_size_read because we hold i_mutex.
+		 */
+		if (pos + len > inode->i_size)
+			vmtruncate(inode, inode->i_size);
 	}
 
 	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
@@ -2647,6 +2655,11 @@ static int ext4_da_write_end(struct file *file,
 				EXT4_I(inode)->i_disksize = new_i_size;
 			}
 			up_write(&EXT4_I(inode)->i_data_sem);
+			/* We need to mark inode dirty even if
+			 * new_i_size is less that inode->i_size
+			 * bu greater than i_disksize.(hint delalloc)
+			 */
+			ext4_mark_inode_dirty(handle, inode);
 		}
 	}
 	ret2 = generic_write_end(file, mapping, pos, len, copied,
-- 
1.6.0.1.90.g27a6e


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] ext3: truncate block allocated on a failed ext3_write_begin
  2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V
  2008-08-28 17:27   ` [PATCH] ext4: Properly update i_disksize Aneesh Kumar K.V
@ 2008-08-28 17:56   ` Eric Sandeen
  2008-08-29 14:44     ` Aneesh Kumar K.V
  2008-08-29  8:33   ` Dmitri Monakhov
  2 siblings, 1 reply; 7+ messages in thread
From: Eric Sandeen @ 2008-08-28 17:56 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: cmm, tytso, linux-ext4

Aneesh Kumar K.V wrote:
> For blocksize < pagesize we need to remove blocks that
> got allocte in block_wirte_begin if we fail with ENOSPC
> for later blocks. block_write_begin internally does
> this if it allocated page locally. This make sure
> we don't have blocks outisde inode.i_size during
> ENOSPC

I think this is good; here's an easy testcase:

# dd if=/dev/zero of=fsfile bs=1k count=2048
# mkfs.ext3 -F fsfile
# mkdir mnt
# mount -o loop fsfile mnt/
# dd if=/dev/zero of=mnt/1kfile bs=1k count=1
# cd mnt/
# dd if=/dev/zero of=bigfile bs=4k
# cd ..
# umount mnt
# e2fsck -f fsfile
e2fsck 1.39 (29-May-2006)
Pass 1: Checking inodes, blocks, and sizes
Inode 13, i_size is 974848, should be 976896.  Fix<y>?
...

can you test with that, unless you already have a testcase you've used?

Assuming it fixes it (it should) you can add:

Acked-by: Eric Sandeen <sandeen@redhat.com>

-Eric

> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/ext3/inode.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
> index ebfec4d..f8424ad 100644
> --- a/fs/ext3/inode.c
> +++ b/fs/ext3/inode.c
> @@ -1186,6 +1186,13 @@ static int ext3_write_begin(struct file *file, struct address_space *mapping,
>  		ext3_journal_stop(handle);
>  		unlock_page(page);
>  		page_cache_release(page);
> +		/*
> +		 * block_write_begin may have instantiated a few blocks
> +		 * outside i_size.  Trim these off again. Don't need
> +		 * i_size_read because we hold i_mutex.
> +		 */
> +		if (pos + len > inode->i_size)
> +			vmtruncate(inode, inode->i_size);
>  	}
>  	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
>  		goto retry;


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ext3: truncate block allocated on a failed ext3_write_begin
  2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V
  2008-08-28 17:27   ` [PATCH] ext4: Properly update i_disksize Aneesh Kumar K.V
  2008-08-28 17:56   ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Eric Sandeen
@ 2008-08-29  8:33   ` Dmitri Monakhov
  2008-08-29  9:51     ` Aneesh Kumar K.V
  2 siblings, 1 reply; 7+ messages in thread
From: Dmitri Monakhov @ 2008-08-29  8:33 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: cmm, tytso, sandeen, linux-ext4

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> For blocksize < pagesize we need to remove blocks that
> got allocte in block_wirte_begin if we fail with ENOSPC
> for later blocks. block_write_begin internally does
> this if it allocated page locally. This make sure
> we don't have blocks outisde inode.i_size during
> ENOSPC
BTW why this check was moved from generic_XXX_write to fs speciffic code?
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/ext3/inode.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
> index ebfec4d..f8424ad 100644
> --- a/fs/ext3/inode.c
> +++ b/fs/ext3/inode.c
> @@ -1186,6 +1186,13 @@ static int ext3_write_begin(struct file *file, struct address_space *mapping,
>  		ext3_journal_stop(handle);
>  		unlock_page(page);
>  		page_cache_release(page);
> +		/*
> +		 * block_write_begin may have instantiated a few blocks
> +		 * outside i_size.  Trim these off again. Don't need
> +		 * i_size_read because we hold i_mutex.
> +		 */
> +		if (pos + len > inode->i_size)
> +			vmtruncate(inode, inode->i_size);
>  	}
>  	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
>  		goto retry;
> -- 
> 1.6.0.1.90.g27a6e
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ext3: truncate block allocated on a failed ext3_write_begin
  2008-08-29  8:33   ` Dmitri Monakhov
@ 2008-08-29  9:51     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 7+ messages in thread
From: Aneesh Kumar K.V @ 2008-08-29  9:51 UTC (permalink / raw)
  To: Dmitri Monakhov; +Cc: cmm, tytso, sandeen, linux-ext4

On Fri, Aug 29, 2008 at 12:33:24PM +0400, Dmitri Monakhov wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> 
> > For blocksize < pagesize we need to remove blocks that
> > got allocte in block_wirte_begin if we fail with ENOSPC
> > for later blocks. block_write_begin internally does
> > this if it allocated page locally. This make sure
> > we don't have blocks outisde inode.i_size during
> > ENOSPC
> BTW why this check was moved from generic_XXX_write to fs speciffic code?


Not quite sure what you mean by that ? block_write_begin generic
code already does the vmtruncate if it had allocated page locally.
ext3/4 allocate/grab_cache_page  in write_begin and pass the page
pointer to block_write_begin. That implies block_write_begin
won't do the truncate. So we have to do it in filesystem write_begin.
also we have to do it after unlocking the page.


> >
> > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > ---
> >  fs/ext3/inode.c |    7 +++++++
> >  1 files changed, 7 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
> > index ebfec4d..f8424ad 100644
> > --- a/fs/ext3/inode.c
> > +++ b/fs/ext3/inode.c
> > @@ -1186,6 +1186,13 @@ static int ext3_write_begin(struct file *file, struct address_space *mapping,
> >  		ext3_journal_stop(handle);
> >  		unlock_page(page);
> >  		page_cache_release(page);
> > +		/*
> > +		 * block_write_begin may have instantiated a few blocks
> > +		 * outside i_size.  Trim these off again. Don't need
> > +		 * i_size_read because we hold i_mutex.
> > +		 */
> > +		if (pos + len > inode->i_size)
> > +			vmtruncate(inode, inode->i_size);
> >  	}
> >  	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
> >  		goto retry;
> > -- 
> > 1.6.0.1.90.g27a6e
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ext3: truncate block allocated on a failed ext3_write_begin
  2008-08-28 17:56   ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Eric Sandeen
@ 2008-08-29 14:44     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 7+ messages in thread
From: Aneesh Kumar K.V @ 2008-08-29 14:44 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: cmm, tytso, linux-ext4

On Thu, Aug 28, 2008 at 12:56:28PM -0500, Eric Sandeen wrote:
> Aneesh Kumar K.V wrote:
> > For blocksize < pagesize we need to remove blocks that
> > got allocte in block_wirte_begin if we fail with ENOSPC
> > for later blocks. block_write_begin internally does
> > this if it allocated page locally. This make sure
> > we don't have blocks outisde inode.i_size during
> > ENOSPC
> 
> I think this is good; here's an easy testcase:
> 
> # dd if=/dev/zero of=fsfile bs=1k count=2048
> # mkfs.ext3 -F fsfile
> # mkdir mnt
> # mount -o loop fsfile mnt/
> # dd if=/dev/zero of=mnt/1kfile bs=1k count=1
> # cd mnt/
> # dd if=/dev/zero of=bigfile bs=4k
> # cd ..
> # umount mnt
> # e2fsck -f fsfile
> e2fsck 1.39 (29-May-2006)
> Pass 1: Checking inodes, blocks, and sizes
> Inode 13, i_size is 974848, should be 976896.  Fix<y>?
> ...
> 
> can you test with that, unless you already have a testcase you've used?

I tested the above and it works fine. The problem appeared during
fsstress run and I used fsstress to debug and fix.


> 
> Assuming it fixes it (it should) you can add:
> 
> Acked-by: Eric Sandeen <sandeen@redhat.com>
> 

-aneesh

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-08-29 14:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-28 17:27 [PATCH] ext4: truncate block allocated on a failed ext4_write_begin Aneesh Kumar K.V
2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V
2008-08-28 17:27   ` [PATCH] ext4: Properly update i_disksize Aneesh Kumar K.V
2008-08-28 17:56   ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Eric Sandeen
2008-08-29 14:44     ` Aneesh Kumar K.V
2008-08-29  8:33   ` Dmitri Monakhov
2008-08-29  9:51     ` Aneesh Kumar K.V

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).