From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Eric Sandeen <sandeen@redhat.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>, linux-ext4@vger.kernel.org
Subject: Re: fsstress-induced corruption reproduced
Date: Wed, 6 Jan 2010 14:19:18 +0530 [thread overview]
Message-ID: <20100106084918.GA3475@skywalker.linux.vnet.ibm.com> (raw)
In-Reply-To: <4B43CD55.1050904@redhat.com>
> Maybe something like this works:
>
> Index: linux-2.6/fs/ext4/inode.c
> ===================================================================
> --- linux-2.6.orig/fs/ext4/inode.c
> +++ linux-2.6/fs/ext4/inode.c
> @@ -1203,6 +1203,7 @@ int ext4_get_blocks(handle_t *handle, st
> int flags)
> {
> int retval;
> + int was_unwritten;
>
> clear_buffer_mapped(bh);
> clear_buffer_unwritten(bh);
> @@ -1253,9 +1254,13 @@ int ext4_get_blocks(handle_t *handle, st
> * part of the uninitialized extent to be an initialized
> * extent. This is because we need to avoid the combination
> * of BH_Unwritten and BH_Mapped flags being simultaneously
> - * set on the buffer_head.
> + * set on the buffer_head. However, if it was unwritten we
> + * don't want to update reserved space later.
> */
> - clear_buffer_unwritten(bh);
> + if (buffer_unwritten(bh)) {
> + was_unwritten = 1;
> + clear_buffer_unwritten(bh);
> + }
That won't work because we can do the fallocate after we did the ext4_ext_get_block
with create = 0 and before we do down_write below. So we would still have was_unwritten = 0
>
> /*
> * New blocks allocate and/or writing to uninitialized extent
> @@ -1301,7 +1306,8 @@ int ext4_get_blocks(handle_t *handle, st
> * Update reserved blocks/metadata blocks after successful
> * block allocation which had been deferred till now.
> */
> - if ((retval > 0) && (flags & EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE))
> + if ((retval > 0) && !was_unwritten &&
> + (flags & EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE))
> ext4_da_update_reserve_space(inode, retval);
>
> up_write((&EXT4_I(inode)->i_data_sem));
>
> but that might leave the previous reservations hanging around from
> prior to the fallocate ...
>
What commit d21cd8f163ac44b15c465aab7306db931c606908 did was to move
quota claim to ext4_get_block function. Earlier we didn't do a
quota claim if we happened to write to fallocate area because
ext4_ext_handle_uninitialized_extents didn't call ext4_mb_mark_diskspace_used
So that should explain why we are seeing this problem with
d21cd8f163ac44b15c465aab7306db931c606908.
How about the patch below ?
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index af7b626..b98de17 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1443,6 +1443,8 @@ extern int ext4_block_truncate_page(handle_t *handle,
extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
extern qsize_t *ext4_get_reserved_space(struct inode *inode);
extern int flush_aio_dio_completed_IO(struct inode *inode);
+extern void ext4_da_update_reserve_space(struct inode *inode,
+ int used, int quota_claim);
/* ioctl.c */
extern long ext4_ioctl(struct file *, unsigned int, unsigned long);
extern long ext4_compat_ioctl(struct file *, unsigned int, unsigned long);
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 7d7b74e..3b6ff72 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3132,7 +3132,19 @@ out:
unmap_underlying_metadata_blocks(inode->i_sb->s_bdev,
newblock + max_blocks,
allocated - max_blocks);
+ allocated = max_blocks;
}
+
+ /*
+ * If we have done fallocate with the offset that is already
+ * delayed allocated, we would have block reservation
+ * and quota reservation done in the delayed write path.
+ * But fallocate would have already updated quota and block
+ * count for this offset. So cancel these reservation
+ */
+ if (flags & EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE)
+ ext4_da_update_reserve_space(inode, allocated, 0);
+
map_out:
set_buffer_mapped(bh_result);
out1:
@@ -3368,9 +3380,18 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
/* previous routine could use block we allocated */
newblock = ext_pblock(&newex);
allocated = ext4_ext_get_actual_len(&newex);
+ if (allocated > max_blocks)
+ allocated = max_blocks;
set_buffer_new(bh_result);
/*
+ * Update reserved blocks/metadata blocks after successful
+ * block allocation which had been deferred till now.
+ */
+ if (flags & EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE)
+ ext4_da_update_reserve_space(inode, allocated, 1);
+
+ /*
* Cache the extent and update transaction to commit on fdatasync only
* when it is _not_ an uninitialized extent.
*/
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c818972..77ff941 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1053,7 +1053,7 @@ static int ext4_calc_metadata_amount(struct inode *inode, sector_t lblock)
* Called with i_data_sem down, which is important since we can call
* ext4_discard_preallocations() from here.
*/
-static void ext4_da_update_reserve_space(struct inode *inode, int used)
+void ext4_da_update_reserve_space(struct inode *inode, int used, int quota_claim)
{
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
struct ext4_inode_info *ei = EXT4_I(inode);
@@ -1090,9 +1090,17 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
/* Update quota subsystem */
- vfs_dq_claim_block(inode, used);
- if (mdb_free)
- vfs_dq_release_reservation_block(inode, mdb_free);
+ if (quota_claim) {
+ vfs_dq_claim_block(inode, used);
+ if (mdb_free)
+ vfs_dq_release_reservation_block(inode, mdb_free);
+ } else {
+ /*
+ * This is a request to cancel the reservation. So just
+ * update the resevation and cancel the quota reservation
+ */
+ vfs_dq_release_reservation_block(inode, mdb_free + used);
+ }
/*
* If we have done all the pending block allocations and if
@@ -1292,18 +1300,20 @@ int ext4_get_blocks(handle_t *handle, struct inode *inode, sector_t block,
*/
EXT4_I(inode)->i_state &= ~EXT4_STATE_EXT_MIGRATE;
}
- }
+ /*
+ * Update reserved blocks/metadata blocks after successful
+ * block allocation which had been deferred till now. We don't
+ * support fallocate for non extent files. So we can update
+ * reserve space here.
+ */
+ if ((retval > 0) &&
+ (flags & EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE))
+ ext4_da_update_reserve_space(inode, retval, 1);
+ }
if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
EXT4_I(inode)->i_delalloc_reserved_flag = 0;
- /*
- * Update reserved blocks/metadata blocks after successful
- * block allocation which had been deferred till now.
- */
- if ((retval > 0) && (flags & EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE))
- ext4_da_update_reserve_space(inode, retval);
prev parent reply other threads:[~2010-01-06 8:50 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-12-31 15:50 fsstress-induced corruption reproduced Theodore Ts'o
2010-01-04 20:13 ` Eric Sandeen
2010-01-04 23:08 ` Eric Sandeen
2010-01-05 6:17 ` Aneesh Kumar K.V
2010-01-05 14:40 ` Eric Sandeen
2010-01-05 23:37 ` Eric Sandeen
2010-01-06 8:49 ` Aneesh Kumar K.V [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100106084918.GA3475@skywalker.linux.vnet.ibm.com \
--to=aneesh.kumar@linux.vnet.ibm.com \
--cc=linux-ext4@vger.kernel.org \
--cc=sandeen@redhat.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox