* [PATCH] ext4: truncate block allocated on a failed ext4_write_begin @ 2008-08-28 17:27 Aneesh Kumar K.V 2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V 0 siblings, 1 reply; 7+ messages in thread From: Aneesh Kumar K.V @ 2008-08-28 17:27 UTC (permalink / raw) To: cmm, tytso, sandeen; +Cc: linux-ext4, Aneesh Kumar K.V For blocksize < pagesize we need to remove blocks that got allocte in block_wirte_begin if we fail with ENOSPC for later blocks. block_write_begin internally does this if it allocated page locally. This make sure we don't have blocks outisde inode.i_size during ENOSPC Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> --- fs/ext4/inode.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 473d888..a8c6bc8 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1394,6 +1394,13 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping, unlock_page(page); ext4_journal_stop(handle); page_cache_release(page); + /* + * block_write_begin may have instantiated a few blocks + * outside i_size. Trim these off again. Don't need + * i_size_read because we hold i_mutex. + */ + if (pos + len > inode->i_size) + vmtruncate(inode, inode->i_size); } if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) -- 1.6.0.1.90.g27a6e ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH] ext3: truncate block allocated on a failed ext3_write_begin 2008-08-28 17:27 [PATCH] ext4: truncate block allocated on a failed ext4_write_begin Aneesh Kumar K.V @ 2008-08-28 17:27 ` Aneesh Kumar K.V 2008-08-28 17:27 ` [PATCH] ext4: Properly update i_disksize Aneesh Kumar K.V ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Aneesh Kumar K.V @ 2008-08-28 17:27 UTC (permalink / raw) To: cmm, tytso, sandeen; +Cc: linux-ext4, Aneesh Kumar K.V For blocksize < pagesize we need to remove blocks that got allocte in block_wirte_begin if we fail with ENOSPC for later blocks. block_write_begin internally does this if it allocated page locally. This make sure we don't have blocks outisde inode.i_size during ENOSPC Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> --- fs/ext3/inode.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c index ebfec4d..f8424ad 100644 --- a/fs/ext3/inode.c +++ b/fs/ext3/inode.c @@ -1186,6 +1186,13 @@ static int ext3_write_begin(struct file *file, struct address_space *mapping, ext3_journal_stop(handle); unlock_page(page); page_cache_release(page); + /* + * block_write_begin may have instantiated a few blocks + * outside i_size. Trim these off again. Don't need + * i_size_read because we hold i_mutex. + */ + if (pos + len > inode->i_size) + vmtruncate(inode, inode->i_size); } if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries)) goto retry; -- 1.6.0.1.90.g27a6e ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH] ext4: Properly update i_disksize. 2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V @ 2008-08-28 17:27 ` Aneesh Kumar K.V 2008-08-28 17:56 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Eric Sandeen 2008-08-29 8:33 ` Dmitri Monakhov 2 siblings, 0 replies; 7+ messages in thread From: Aneesh Kumar K.V @ 2008-08-28 17:27 UTC (permalink / raw) To: cmm, tytso, sandeen; +Cc: linux-ext4, Aneesh Kumar K.V With delayed allocation we use i_data_sem to update i_disksize. We need to update i_disksize only if the new size specified is greater than the current value and we need to make sure we don't race with other i_disksize update. With delayed allocation we will switch to nondelalloc write_begin if we are low on free blocks. That means nondelalloc write_begin also need to use the same locking. We also need to check and update i_disksize even if the new size is less that inode.i_size because of delayed allocation. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> --- fs/ext4/ext4.h | 11 +++++++++ fs/ext4/extents.c | 9 ++++--- fs/ext4/inode.c | 61 ++++++++++++++++++++++++++++++++-------------------- 3 files changed, 53 insertions(+), 28 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 13c69ed..bc856e3 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1307,6 +1307,17 @@ do { \ #define EXT4_FREEBLOCKS_WATERMARK 0 #endif +static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize) +{ + /* + * XXX: replace with spinlock if seen contended -bzzz + */ + down_write(&EXT4_I(inode)->i_data_sem); + if (newsize > EXT4_I(inode)->i_disksize) + EXT4_I(inode)->i_disksize = newsize; + up_write(&EXT4_I(inode)->i_data_sem); + return ; +} /* * Inodes and files operations diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 268e96d..7def792 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3018,10 +3018,11 @@ static void ext4_falloc_update_inode(struct inode *inode, * Update only when preallocation was requested beyond * the file size. */ - if (!(mode & FALLOC_FL_KEEP_SIZE) && - new_size > i_size_read(inode)) { - i_size_write(inode, new_size); - EXT4_I(inode)->i_disksize = new_size; + if (!(mode & FALLOC_FL_KEEP_SIZE)) { + if (new_size > i_size_read(inode)) + i_size_write(inode, new_size); + if (new_size > EXT4_I(inode)->i_disksize) + ext4_update_i_disksize(inode, new_size); } } diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index a8c6bc8..605ae24 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1437,16 +1437,18 @@ static int ext4_ordered_write_end(struct file *file, ret = ext4_jbd2_file_inode(handle, inode); if (ret == 0) { - /* - * generic_write_end() will run mark_inode_dirty() if i_size - * changes. So let's piggyback the i_disksize mark_inode_dirty - * into that. - */ loff_t new_i_size; new_i_size = pos + copied; - if (new_i_size > EXT4_I(inode)->i_disksize) - EXT4_I(inode)->i_disksize = new_i_size; + if (new_i_size > EXT4_I(inode)->i_disksize) { + ext4_update_i_disksize(inode, new_i_size); + /* We need to mark inode dirty even if + * new_i_size is less that inode->i_size + * bu greater than i_disksize.(hint delalloc) + */ + ext4_mark_inode_dirty(handle, inode); + } + ret2 = generic_write_end(file, mapping, pos, len, copied, page, fsdata); copied = ret2; @@ -1471,8 +1473,14 @@ static int ext4_writeback_write_end(struct file *file, loff_t new_i_size; new_i_size = pos + copied; - if (new_i_size > EXT4_I(inode)->i_disksize) - EXT4_I(inode)->i_disksize = new_i_size; + if (new_i_size > EXT4_I(inode)->i_disksize) { + ext4_update_i_disksize(inode, new_i_size); + /* We need to mark inode dirty even if + * new_i_size is less that inode->i_size + * bu greater than i_disksize.(hint delalloc) + */ + ext4_mark_inode_dirty(handle, inode); + } ret2 = generic_write_end(file, mapping, pos, len, copied, page, fsdata); @@ -1497,6 +1505,7 @@ static int ext4_journalled_write_end(struct file *file, int ret = 0, ret2; int partial = 0; unsigned from, to; + loff_t new_i_size; from = pos & (PAGE_CACHE_SIZE - 1); to = from + len; @@ -1511,11 +1520,12 @@ static int ext4_journalled_write_end(struct file *file, to, &partial, write_end_fn); if (!partial) SetPageUptodate(page); - if (pos+copied > inode->i_size) + new_i_size = pos + copied; + if (new_i_size > inode->i_size) i_size_write(inode, pos+copied); EXT4_I(inode)->i_state |= EXT4_STATE_JDATA; - if (inode->i_size > EXT4_I(inode)->i_disksize) { - EXT4_I(inode)->i_disksize = inode->i_size; + if (new_i_size > EXT4_I(inode)->i_disksize) { + ext4_update_i_disksize(inode, new_i_size); ret2 = ext4_mark_inode_dirty(handle, inode); if (!ret) ret = ret2; @@ -2227,18 +2237,9 @@ static int ext4_da_get_block_write(struct inode *inode, sector_t iblock, if (disksize > i_size_read(inode)) disksize = i_size_read(inode); if (disksize > EXT4_I(inode)->i_disksize) { - /* - * XXX: replace with spinlock if seen contended -bzzz - */ - down_write(&EXT4_I(inode)->i_data_sem); - if (disksize > EXT4_I(inode)->i_disksize) - EXT4_I(inode)->i_disksize = disksize; - up_write(&EXT4_I(inode)->i_data_sem); - - if (EXT4_I(inode)->i_disksize == disksize) { - ret = ext4_mark_inode_dirty(handle, inode); - return ret; - } + ext4_update_i_disksize(inode, disksize); + ret = ext4_mark_inode_dirty(handle, inode); + return ret; } ret = 0; } @@ -2567,6 +2568,13 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping, unlock_page(page); ext4_journal_stop(handle); page_cache_release(page); + /* + * block_write_begin may have instantiated a few blocks + * outside i_size. Trim these off again. Don't need + * i_size_read because we hold i_mutex. + */ + if (pos + len > inode->i_size) + vmtruncate(inode, inode->i_size); } if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) @@ -2647,6 +2655,11 @@ static int ext4_da_write_end(struct file *file, EXT4_I(inode)->i_disksize = new_i_size; } up_write(&EXT4_I(inode)->i_data_sem); + /* We need to mark inode dirty even if + * new_i_size is less that inode->i_size + * bu greater than i_disksize.(hint delalloc) + */ + ext4_mark_inode_dirty(handle, inode); } } ret2 = generic_write_end(file, mapping, pos, len, copied, -- 1.6.0.1.90.g27a6e ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] ext3: truncate block allocated on a failed ext3_write_begin 2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V 2008-08-28 17:27 ` [PATCH] ext4: Properly update i_disksize Aneesh Kumar K.V @ 2008-08-28 17:56 ` Eric Sandeen 2008-08-29 14:44 ` Aneesh Kumar K.V 2008-08-29 8:33 ` Dmitri Monakhov 2 siblings, 1 reply; 7+ messages in thread From: Eric Sandeen @ 2008-08-28 17:56 UTC (permalink / raw) To: Aneesh Kumar K.V; +Cc: cmm, tytso, linux-ext4 Aneesh Kumar K.V wrote: > For blocksize < pagesize we need to remove blocks that > got allocte in block_wirte_begin if we fail with ENOSPC > for later blocks. block_write_begin internally does > this if it allocated page locally. This make sure > we don't have blocks outisde inode.i_size during > ENOSPC I think this is good; here's an easy testcase: # dd if=/dev/zero of=fsfile bs=1k count=2048 # mkfs.ext3 -F fsfile # mkdir mnt # mount -o loop fsfile mnt/ # dd if=/dev/zero of=mnt/1kfile bs=1k count=1 # cd mnt/ # dd if=/dev/zero of=bigfile bs=4k # cd .. # umount mnt # e2fsck -f fsfile e2fsck 1.39 (29-May-2006) Pass 1: Checking inodes, blocks, and sizes Inode 13, i_size is 974848, should be 976896. Fix<y>? ... can you test with that, unless you already have a testcase you've used? Assuming it fixes it (it should) you can add: Acked-by: Eric Sandeen <sandeen@redhat.com> -Eric > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > --- > fs/ext3/inode.c | 7 +++++++ > 1 files changed, 7 insertions(+), 0 deletions(-) > > diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c > index ebfec4d..f8424ad 100644 > --- a/fs/ext3/inode.c > +++ b/fs/ext3/inode.c > @@ -1186,6 +1186,13 @@ static int ext3_write_begin(struct file *file, struct address_space *mapping, > ext3_journal_stop(handle); > unlock_page(page); > page_cache_release(page); > + /* > + * block_write_begin may have instantiated a few blocks > + * outside i_size. Trim these off again. Don't need > + * i_size_read because we hold i_mutex. > + */ > + if (pos + len > inode->i_size) > + vmtruncate(inode, inode->i_size); > } > if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries)) > goto retry; ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] ext3: truncate block allocated on a failed ext3_write_begin 2008-08-28 17:56 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Eric Sandeen @ 2008-08-29 14:44 ` Aneesh Kumar K.V 0 siblings, 0 replies; 7+ messages in thread From: Aneesh Kumar K.V @ 2008-08-29 14:44 UTC (permalink / raw) To: Eric Sandeen; +Cc: cmm, tytso, linux-ext4 On Thu, Aug 28, 2008 at 12:56:28PM -0500, Eric Sandeen wrote: > Aneesh Kumar K.V wrote: > > For blocksize < pagesize we need to remove blocks that > > got allocte in block_wirte_begin if we fail with ENOSPC > > for later blocks. block_write_begin internally does > > this if it allocated page locally. This make sure > > we don't have blocks outisde inode.i_size during > > ENOSPC > > I think this is good; here's an easy testcase: > > # dd if=/dev/zero of=fsfile bs=1k count=2048 > # mkfs.ext3 -F fsfile > # mkdir mnt > # mount -o loop fsfile mnt/ > # dd if=/dev/zero of=mnt/1kfile bs=1k count=1 > # cd mnt/ > # dd if=/dev/zero of=bigfile bs=4k > # cd .. > # umount mnt > # e2fsck -f fsfile > e2fsck 1.39 (29-May-2006) > Pass 1: Checking inodes, blocks, and sizes > Inode 13, i_size is 974848, should be 976896. Fix<y>? > ... > > can you test with that, unless you already have a testcase you've used? I tested the above and it works fine. The problem appeared during fsstress run and I used fsstress to debug and fix. > > Assuming it fixes it (it should) you can add: > > Acked-by: Eric Sandeen <sandeen@redhat.com> > -aneesh ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] ext3: truncate block allocated on a failed ext3_write_begin 2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V 2008-08-28 17:27 ` [PATCH] ext4: Properly update i_disksize Aneesh Kumar K.V 2008-08-28 17:56 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Eric Sandeen @ 2008-08-29 8:33 ` Dmitri Monakhov 2008-08-29 9:51 ` Aneesh Kumar K.V 2 siblings, 1 reply; 7+ messages in thread From: Dmitri Monakhov @ 2008-08-29 8:33 UTC (permalink / raw) To: Aneesh Kumar K.V; +Cc: cmm, tytso, sandeen, linux-ext4 "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes: > For blocksize < pagesize we need to remove blocks that > got allocte in block_wirte_begin if we fail with ENOSPC > for later blocks. block_write_begin internally does > this if it allocated page locally. This make sure > we don't have blocks outisde inode.i_size during > ENOSPC BTW why this check was moved from generic_XXX_write to fs speciffic code? > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > --- > fs/ext3/inode.c | 7 +++++++ > 1 files changed, 7 insertions(+), 0 deletions(-) > > diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c > index ebfec4d..f8424ad 100644 > --- a/fs/ext3/inode.c > +++ b/fs/ext3/inode.c > @@ -1186,6 +1186,13 @@ static int ext3_write_begin(struct file *file, struct address_space *mapping, > ext3_journal_stop(handle); > unlock_page(page); > page_cache_release(page); > + /* > + * block_write_begin may have instantiated a few blocks > + * outside i_size. Trim these off again. Don't need > + * i_size_read because we hold i_mutex. > + */ > + if (pos + len > inode->i_size) > + vmtruncate(inode, inode->i_size); > } > if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries)) > goto retry; > -- > 1.6.0.1.90.g27a6e > > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] ext3: truncate block allocated on a failed ext3_write_begin 2008-08-29 8:33 ` Dmitri Monakhov @ 2008-08-29 9:51 ` Aneesh Kumar K.V 0 siblings, 0 replies; 7+ messages in thread From: Aneesh Kumar K.V @ 2008-08-29 9:51 UTC (permalink / raw) To: Dmitri Monakhov; +Cc: cmm, tytso, sandeen, linux-ext4 On Fri, Aug 29, 2008 at 12:33:24PM +0400, Dmitri Monakhov wrote: > "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes: > > > For blocksize < pagesize we need to remove blocks that > > got allocte in block_wirte_begin if we fail with ENOSPC > > for later blocks. block_write_begin internally does > > this if it allocated page locally. This make sure > > we don't have blocks outisde inode.i_size during > > ENOSPC > BTW why this check was moved from generic_XXX_write to fs speciffic code? Not quite sure what you mean by that ? block_write_begin generic code already does the vmtruncate if it had allocated page locally. ext3/4 allocate/grab_cache_page in write_begin and pass the page pointer to block_write_begin. That implies block_write_begin won't do the truncate. So we have to do it in filesystem write_begin. also we have to do it after unlocking the page. > > > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > > --- > > fs/ext3/inode.c | 7 +++++++ > > 1 files changed, 7 insertions(+), 0 deletions(-) > > > > diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c > > index ebfec4d..f8424ad 100644 > > --- a/fs/ext3/inode.c > > +++ b/fs/ext3/inode.c > > @@ -1186,6 +1186,13 @@ static int ext3_write_begin(struct file *file, struct address_space *mapping, > > ext3_journal_stop(handle); > > unlock_page(page); > > page_cache_release(page); > > + /* > > + * block_write_begin may have instantiated a few blocks > > + * outside i_size. Trim these off again. Don't need > > + * i_size_read because we hold i_mutex. > > + */ > > + if (pos + len > inode->i_size) > > + vmtruncate(inode, inode->i_size); > > } > > if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries)) > > goto retry; > > -- > > 1.6.0.1.90.g27a6e > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-08-29 14:47 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-08-28 17:27 [PATCH] ext4: truncate block allocated on a failed ext4_write_begin Aneesh Kumar K.V 2008-08-28 17:27 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Aneesh Kumar K.V 2008-08-28 17:27 ` [PATCH] ext4: Properly update i_disksize Aneesh Kumar K.V 2008-08-28 17:56 ` [PATCH] ext3: truncate block allocated on a failed ext3_write_begin Eric Sandeen 2008-08-29 14:44 ` Aneesh Kumar K.V 2008-08-29 8:33 ` Dmitri Monakhov 2008-08-29 9:51 ` Aneesh Kumar K.V
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox