From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Mingming Cao <cmm@us.ibm.com>
Cc: tytso@mit.edu, sandeen@redhat.com, linux-ext4@vger.kernel.org
Subject: Re: [RFC PATCH] mark buffer_head mapping preallocate area as new during write_begin with delayed allocation
Date: Tue, 28 Apr 2009 15:01:45 +0530
Message-ID: <20090428093145.GA13719@skywalker>
In-Reply-To: <20090428042049.GA6520@skywalker>
On Tue, Apr 28, 2009 at 09:50:49AM +0530, Aneesh Kumar K.V wrote:
> On Mon, Apr 27, 2009 at 04:04:54PM -0700, Mingming Cao wrote:
> .....
>
> >
> > Index: linux-2.6.28-rc6/fs/ext4/inode.c
> > ===================================================================
> > --- linux-2.6.28-rc6.orig/fs/ext4/inode.c 2009-03-12 10:21:05.000000000 -0700
> > +++ linux-2.6.28-rc6/fs/ext4/inode.c 2009-04-27 14:35:21.000000000 -0700
> > @@ -2177,7 +2177,10 @@ static int ext4_da_get_block_prep(struct
> > set_buffer_new(bh_result);
> > set_buffer_delay(bh_result);
> > } else if (ret > 0) {
> > + if (buffer_unwritten(bh_result))
> > + set_buffer_new(bh_result);
> > bh_result->b_size = (ret << inode->i_blkbits);
> > + bh_result->b_bdev = inode->i_sb->s_bdev;
>
>
> Updated patch to set bh_result->b_bdev. I also added comments in the
> source to explain why we need to mark buffer_head new. Also updated
> single line patch summary. I will send the update (-v2) patch.
Looking at the source again, I guess setting just b_bdev is not enough.
unmap_underlying_metadata() looks at the mapped block number, which we
don't have in the case of an unwritten buffer_head. How about the patch
below? It involves VFS changes, but I guess it is correct with respect
to the meaning of BH_New (disk mapping was newly created by get_block).
I guess BH_New implies BH_Mapped.
I haven't tested the patch yet, and it should be split into multiple
patches. It also fixes a problem where we missed an
unmap_underlying_metadata() call for delayed-allocated blocks. I guess
that can also cause corruption with delayed allocation.
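To make the intent easier to review, here is a minimal user-space sketch of the two rules the fs/buffer.c hunk below tries to encode: only a buffer marked new has a block number worth passing to unmap_underlying_metadata(), while new, unwritten and delayed buffers all need the bytes outside the write range zeroed. The MODEL_BH_* bits and the model_* helpers are hypothetical names made up for this illustration. They are not kernel APIs, and the byte array only stands in for page cache contents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's BH_* state bits. */
enum {
	MODEL_BH_MAPPED    = 1u << 0,
	MODEL_BH_NEW       = 1u << 1,
	MODEL_BH_DELAY     = 1u << 2,
	MODEL_BH_UNWRITTEN = 1u << 3,
};

/* Only a newly created disk mapping carries a block number to unmap. */
static int model_needs_unmap(unsigned int state)
{
	return (state & MODEL_BH_NEW) != 0;
}

/* New, unwritten and delayed buffers hold no valid on-disk data yet. */
static int model_needs_zeroing(unsigned int state)
{
	return (state & (MODEL_BH_NEW | MODEL_BH_UNWRITTEN |
			 MODEL_BH_DELAY)) != 0;
}

/*
 * Zero the bytes of the block [block_start, block_end) that fall
 * outside the write range [from, to). Bytes inside the write range
 * are left alone because the caller is about to overwrite them.
 */
static void model_zero_outside_write(unsigned char *page,
				     size_t block_start, size_t block_end,
				     size_t from, size_t to)
{
	assert(block_start <= block_end);
	assert(from <= to);

	if (block_start < from) {
		/* Head of the block, before the write starts. */
		size_t end = from < block_end ? from : block_end;

		memset(page + block_start, 0, end - block_start);
	}
	if (block_end > to) {
		/* Tail of the block, after the write ends. */
		size_t start = to > block_start ? to : block_start;

		memset(page + start, 0, block_end - start);
	}
}
```

For example, with a block covering bytes 4..11 of a stale buffer and a write covering bytes 6..8, the sketch zeroes bytes 4..5 and 9..11 and leaves everything else untouched. Without that zeroing, a later read of the block would return whatever stale data was in the page, which is the corruption described in the patch description.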
From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Subject: [PATCH -V3] ext4: Fix sub-block zeroing for buffered writes into unwritten extents.
We need to mark the buffer_head mapping prealloc space
as new during write_begin. Otherwise we don't zero out the
page cache content properly for a partial write. This will
cause file corruption with preallocation.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
fs/buffer.c | 9 ++++++++-
fs/ext4/inode.c | 8 +++++---
2 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index b3e5be7..13f0d52 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1867,15 +1867,22 @@ static int __block_prepare_write(struct inode *inode, struct page *page,
err = get_block(inode, block, bh, 1);
if (err)
break;
- if (buffer_new(bh)) {
+ if (buffer_new(bh))
unmap_underlying_metadata(bh->b_bdev,
bh->b_blocknr);
+ if (buffer_new(bh) || buffer_unwritten(bh) ||
+ buffer_delay(bh)) {
if (PageUptodate(page)) {
clear_buffer_new(bh);
set_buffer_uptodate(bh);
mark_buffer_dirty(bh);
continue;
}
+ /*
+ * sub-block writes into unwritten or
+ * delayed buffer should result in zero out
+ * of the rest of the buffer
+ */
if (block_end > to || block_start < from)
zero_user_segments(page,
to, block_end,
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index e91f978..504afb7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1892,13 +1892,17 @@ static void mpage_put_bnr_to_bhs(struct mpage_da_data *mpd, sector_t logical,
if (buffer_delay(bh)) {
bh->b_blocknr = pblock;
clear_buffer_delay(bh);
+ set_buffer_mapped(bh);
bh->b_bdev = inode->i_sb->s_bdev;
+ unmap_underlying_metadata(bh->b_bdev,
+ pblock);
} else if (buffer_unwritten(bh)) {
bh->b_blocknr = pblock;
clear_buffer_unwritten(bh);
set_buffer_mapped(bh);
- set_buffer_new(bh);
bh->b_bdev = inode->i_sb->s_bdev;
+ unmap_underlying_metadata(bh->b_bdev,
+ pblock);
} else if (buffer_mapped(bh))
BUG_ON(bh->b_blocknr != pblock);
@@ -2318,8 +2322,6 @@ static int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
/* not enough space to reserve */
return ret;
- map_bh(bh_result, inode->i_sb, 0);
- set_buffer_new(bh_result);
set_buffer_delay(bh_result);
} else if (ret > 0) {
bh_result->b_size = (ret << inode->i_blkbits);
--
tg: (2084a87..) preallocate_corruption (depends on: ext4_lock_group_conversion)