Linux EXT4 FS development
From: Curt Wohlgemuth <curtw@google.com>
To: Eric Sandeen <sandeen@redhat.com>
Cc: ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH] ext4: Ensure zeroout blocks have no dirty metadata
Date: Thu, 10 Dec 2009 10:01:21 -0800
Message-ID: <6601abe90912101001o1f6f6ed6jc6ecb6e4c26d2463@mail.gmail.com>
In-Reply-To: <4B213391.4050603@redhat.com>

On Thu, Dec 10, 2009 at 9:44 AM, Eric Sandeen <sandeen@redhat.com> wrote:
> Curt Wohlgemuth wrote:
>> This fixes a bug in which new blocks returned from an extent created with
>> ext4_ext_zeroout() can have dirty metadata still associated with them.
>
> Do you have a testcase or at least end result details for this corruption?

<sigh>  I wish I had a testcase.  I do know some facts about the
workload(s) involved:

1. No journal
2. Nearly full ext2 partition, mounted as ext4 -- tune2fs used to turn
on "extent dir_index"
3. Existing non-extent based files are removed, new (extent-based)
files are created.
4. Files are opened with O_DIRECT and are ~8MB; fallocate(KEEP_SIZE) is
used; writes are generally in increasing offset order, usually ~64K each.
5. Some (~1%) writes are to holes in the fallocate'd file; we're not
yet using Mingming's patches, so these writes fall back to buffered
writes.
6. Lots of processes, lots of threads.
7. End result is a block (or sometimes 2 blocks) of all zeros, at a
block-aligned offset.  This zero'ed block is *always* somewhere in a
14-block extent created from using ext4_ext_zeroout().

Lots and lots of tracing showed that these blocks were originally
dirtied metadata blocks from unlinked (hence truncated) non-extent
based files.  These indirect blocks in ext4 have their direct block
pointers turned to zero as the blocks are truncated, and these
indirect blocks are marked as dirty.  Even though bforget() is called
on these metadata blocks, bforget() won't wait on the buffer.  Waiting
on the buffer is meant to happen at allocate time, for "new" buffers.
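
To make the sequence above concrete, here is a minimal userspace sketch
of the race as I read it from this description. It is a toy model, not
kernel code: every name in it (toy_buffer, toy_zeroout, toy_run_scenario,
and so on) is hypothetical, the "disk" and "buffer cache" are plain
arrays, and toy_unmap_metadata() is only a stand-in for the effect
unmap_underlying_metadata() is expected to have, namely that a stale
dirty buffer for a reused block can no longer be written back.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS    16
#define BLOCK_SIZE 8

/* Toy buffer cache entry: at most one cached buffer per disk block. */
struct toy_buffer {
	bool dirty;
	unsigned char data[BLOCK_SIZE];
};

static unsigned char disk[NBLOCKS][BLOCK_SIZE];
static struct toy_buffer cache[NBLOCKS];

/* Truncating an old indirect block: its pointers are zeroed in the
 * cached buffer, which is left dirty (nothing waits on it). */
static void toy_truncate_indirect(unsigned blk)
{
	assert(blk < NBLOCKS);
	memset(cache[blk].data, 0, BLOCK_SIZE);
	cache[blk].dirty = true;
}

/* Stand-in for unmap_underlying_metadata(): forget stale dirty state. */
static void toy_unmap_metadata(unsigned blk)
{
	assert(blk < NBLOCKS);
	cache[blk].dirty = false;
}

/* Zeroout writes zeros straight to disk, bypassing the cache.
 * With apply_fix set, it also unmaps each block, as the patch does. */
static void toy_zeroout(unsigned start, unsigned len, bool apply_fix)
{
	assert(start + len <= NBLOCKS);
	for (unsigned b = start; b < start + len; b++) {
		memset(disk[b], 0, BLOCK_SIZE);
		if (apply_fix)
			toy_unmap_metadata(b);
	}
}

/* O_DIRECT-style data write: goes to disk without touching the cache. */
static void toy_direct_write(unsigned blk, unsigned char byte)
{
	assert(blk < NBLOCKS);
	memset(disk[blk], byte, BLOCK_SIZE);
}

/* Background writeback: flush every buffer still marked dirty. */
static void toy_writeback(void)
{
	for (unsigned b = 0; b < NBLOCKS; b++) {
		if (cache[b].dirty) {
			memcpy(disk[b], cache[b].data, BLOCK_SIZE);
			cache[b].dirty = false;
		}
	}
}

/* Returns the first byte of block 5 after the whole sequence. */
unsigned char toy_run_scenario(bool apply_fix)
{
	memset(disk, 0xFF, sizeof(disk));
	memset(cache, 0, sizeof(cache));

	toy_truncate_indirect(5);      /* old file unlinked */
	toy_zeroout(0, 14, apply_fix); /* block 5 reused in a 14-block extent */
	toy_direct_write(5, 0xAB);     /* new file data lands on disk */
	toy_writeback();               /* stale metadata may now hit disk */
	return disk[5][0];
}
```

In this model, toy_run_scenario(false) ends with block 5 overwritten by
the stale zeroed buffer, while toy_run_scenario(true) leaves the 0xAB
data intact. That matches the symptom in item 7, but the real kernel
paths involve locking and I/O completion that the sketch ignores.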

>
>>       Signed-off-by: Curt Wohlgemuth <curtw@google.com>
>> ---
>>
>> This is for the problem I reported on 23 Nov ("Bug in extent zeroout: blocks
>> not marked as new").  I'm not seeing the corruption with this fix that I was
>> seeing without it.
>>
>> diff -uprN orig/fs/ext4/extents.c new/fs/ext4/extents.c
>> --- orig/fs/ext4/extents.c    2009-12-09 15:09:25.000000000 -0800
>> +++ new/fs/ext4/extents.c     2009-12-09 15:09:37.000000000 -0800
>> @@ -2474,9 +2474,21 @@ static int ext4_ext_zeroout(struct inode
>>               submit_bio(WRITE, bio);
>>               wait_for_completion(&event);
>>
>> -             if (test_bit(BIO_UPTODATE, &bio->bi_flags))
>> +             /* On success, we need to insure all metadata associated
>
> nitpick, "ensure" I think, although I guess they're mostly synonymous
> today so do with that what you will :)

I'll let Ted decide :-)

>
> -Eric
>
>> +              * with each of these blocks is unmapped. */
>> +             if (test_bit(BIO_UPTODATE, &bio->bi_flags)) {
>> +                     sector_t block = ee_pblock;
>> +
>>                       ret = 0;
>> -             else {
>> +                     done = 0;
>> +                     while (done < len) {
>> +                             unmap_underlying_metadata(inode->i_sb->s_bdev,
>> +                                                       block);
>> +
>> +                             done++;
>> +                             block++;
>> +                     }
>> +             } else {
>>                       ret = -EIO;
>>                       break;
>>               }
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
