public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed
From: "Frédéric Bohé" <frederic.bohe@bull.net>
To: "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH v2] ext4: fix initialization of UNINIT bitmap blocks
Date: Thu, 18 Sep 2008 15:45:14 +0200	[thread overview]
Message-ID: <1221745514.3550.83.camel@frecb007923.frec.bull.fr> (raw)
In-Reply-To: <1221489026.6733.36.camel@frecb007923.frec.bull.fr>

I am not very confident with buffer's behavior, but I think I have
understood what happens. Correct me if I am wrong.
Let's see the second test case, which was :

dd if=/dev/urandom of=/dev/md0 bs=1M count=20
mkfs.ext4 -t ext4dev /dev/md0 10M
mount -t ext4dev /dev/md0 /mnt/test
resize2fs /dev/md0 20M
for i in $(seq 1 3800); do touch /mnt/test/file${i} 2>&1; done
touch: cannot touch `/mnt/test/file3188': No space left on device
touch: cannot touch `/mnt/test/file3189': No space left on device
...
...


The issue here is that you can't use all inode of the second group of
the fs.

This happens because resize2fs make a call to ext2fs_read_bitmaps. This
function reads all bitmaps while paying attention not to read the
uninited bitmap. This works well as long as the fs block size is equal
to the page size. But in the above test case, the fs use 1k blocks and
we have an issue. 

That's because the "read" function issued by ext2fs_read_bitmaps is a
call to kernel's block_read_full_page function. So when a single bitmap
block is asked for, 4 blocks (for 1k blocks fs on x86) are actually read
(including the uninited ones) and their respective buffer set to
uptodate. 

As we rely on the buffer's uptodate flags to initialize or not this
buffer, it may happen that certain bitmap blocks are not initialized at
all. So their buffer contains the random garbage that was present on the
disk prior to the mkfs ( In the above test case, the inode bitmap of the
second group is full a random bits so I can't use all of its inodes ).

If the bitmap block corresponding to this buffer is later changed, its
UNINIT flag will be cleared and the content of the buffer written to the
disk, including the garbage.


I am a bit lost on how to fix this. Aneesh was right, I think it's an
ext2fs_read_bitmaps bug, not a kernel bug. I guess we need a userland
function to read a single block whatever the block size and page size
are. I've made a try using O_DIRECT flag but I was unsuccessful. Any
ideas/suggestions ?

Fred


Le lundi 15 septembre 2008 à 16:30 +0200, Frédéric Bohé a écrit :
> Le lundi 15 septembre 2008 à 19:06 +0530, Aneesh Kumar K.V a écrit :
> > On Mon, Sep 15, 2008 at 02:16:47PM +0200, Frédéric Bohé wrote:
> > > From: Frederic Bohe <frederic.bohe@bull.net>
> > > 
> > > Do not rely on buffer head's uptodate flag to initialize
> > > uninitialized bitmap blocks.
> > > 
> > > Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
> > > ---
> > > Sorry there was a copy/paste error in the previous mail !
> > > 
> > > This patch makes sure to initialize uninited bitmap blocks.
> > > These are two test cases where bugs appear because of uninited blocks :
> > > 
> > > 1- This test case lead to uninited block bitmap and an error message
> > > from the mballocator during the second dd.
> > > 
> > > dd if=/dev/urandom of=/dev/md0 bs=1M count=300
> > > mkfs.ext4 -t ext4dev /dev/md0 1G
> > > mount -t ext4dev /dev/md0 /mnt/test
> > > resize2fs /dev/md0 2G
> > > dd if=/dev/zero of=/mnt/test/dummy bs=1M count=1500
> > > 
> > > Note that the first dd is to make sure we have random garbage in the
> > > uninited blocks. If not, you could miss the issue depending what was in
> > > those blocks before running mkfs.
> > > 
> > > 2- This test case lead to uninited inode bitmap blocks, making it
> > > impossible to use all the inodes of the fs.
> > > 
> > > dd if=/dev/urandom of=/dev/md0 bs=1M count=20
> > > mkfs.ext4 -t ext4dev /dev/md0 10M
> > > mount -t ext4dev /dev/md0 /mnt/test
> > > resize2fs /dev/md0 20M
> > > for i in $(seq 1 3800); do touch /mnt/test/file${i} 2>&1; done
> > > 
> > >  balloc.c  |    4 +++-
> > >  ialloc.c  |    4 +++-
> > >  mballoc.c |    4 +++-
> > >  3 files changed, 9 insertions(+), 3 deletions(-)
> > > 
> > > Index: linux-2.6.27-rc5+patch_queue/fs/ext4/balloc.c
> > > ===================================================================
> > > --- linux-2.6.27-rc5+patch_queue.orig/fs/ext4/balloc.c	2008-09-15 10:59:27.000000000 +0200
> > > +++ linux-2.6.27-rc5+patch_queue/fs/ext4/balloc.c	2008-09-15 14:03:04.000000000 +0200
> > > @@ -318,9 +318,11 @@ ext4_read_block_bitmap(struct super_bloc
> > >  			    block_group, bitmap_blk);
> > >  		return NULL;
> > >  	}
> > > -	if (bh_uptodate_or_lock(bh))
> > > +	if (buffer_uptodate(bh) &&
> > > +	    !(desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))
> > >  		return bh;
> > > 
> > > +	lock_buffer(bh);
> > >  	spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
> > >  	if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
> > >  		ext4_init_block_bitmap(sb, bh, block_group, desc);
> > 
> > Why ? I guess resize should mark those buffer_heads as not uptodate so
> > that we do a reinit of block bitmap again later. The above change will
> > result in calling ext4_init_block_bitmap everytime we do a
> > read_block_bitmap on an uninit group
> 
> Thanks for your comment Aneesh. I thought ext4_init_block_bitmap was
> setting the EXT4_BG_BLOCK_UNINIT flags, but it seems it is not true.
> I will try to fix it on the resize side.
> 
> 
> > 
> > 
> > 
> > 
> > > Index: linux-2.6.27-rc5+patch_queue/fs/ext4/ialloc.c
> > > ===================================================================
> > > --- linux-2.6.27-rc5+patch_queue.orig/fs/ext4/ialloc.c	2008-09-15 10:59:27.000000000 +0200
> > > +++ linux-2.6.27-rc5+patch_queue/fs/ext4/ialloc.c	2008-09-15 11:12:16.000000000 +0200
> > > @@ -115,9 +115,11 @@ ext4_read_inode_bitmap(struct super_bloc
> > >  			    block_group, bitmap_blk);
> > >  		return NULL;
> > >  	}
> > > -	if (bh_uptodate_or_lock(bh))
> > > +	if (buffer_uptodate(bh) &&
> > > +	    !(desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)))
> > >  		return bh;
> > > 
> > > +	lock_buffer(bh);
> > >  	spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
> > >  	if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
> > >  		ext4_init_inode_bitmap(sb, bh, block_group, desc);
> > > Index: linux-2.6.27-rc5+patch_queue/fs/ext4/mballoc.c
> > > ===================================================================
> > > --- linux-2.6.27-rc5+patch_queue.orig/fs/ext4/mballoc.c	2008-09-15 10:59:27.000000000 +0200
> > > +++ linux-2.6.27-rc5+patch_queue/fs/ext4/mballoc.c	2008-09-15 14:02:44.000000000 +0200
> > > @@ -785,9 +785,11 @@ static int ext4_mb_init_cache(struct pag
> > >  		if (bh[i] == NULL)
> > >  			goto out;
> > > 
> > > -		if (bh_uptodate_or_lock(bh[i]))
> > > +		if (buffer_uptodate(bh[i]) &&
> > > +		    !(desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))
> > >  			continue;
> > > 
> > > +		lock_buffer(bh[i]);
> > >  		spin_lock(sb_bgl_lock(EXT4_SB(sb), first_group + i));
> > >  		if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
> > >  			ext4_init_block_bitmap(sb, bh[i],
> > > 
> > 
> > -aneesh
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2008-09-18 13:43 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-15 11:41 [PATCH] ext4: fix initialization of UNINIT bitmap blocks Frédéric Bohé
2008-09-15 12:16 ` [PATCH v2] " Frédéric Bohé
2008-09-15 13:36   ` Aneesh Kumar K.V
2008-09-15 14:30     ` Frédéric Bohé
2008-09-18 13:45       ` Frédéric Bohé [this message]
2008-09-21  0:44         ` Theodore Tso
2008-09-22  8:09           ` Frédéric Bohé
2008-09-22  8:47             ` Aneesh Kumar K.V
2008-09-22  9:32               ` Frédéric Bohé
2008-09-23 23:13                 ` Andreas Dilger
2008-09-24 12:57                   ` Frédéric Bohé
2008-09-24 16:23                     ` Theodore Tso
2008-09-25 23:04                       ` Andreas Dilger
2008-09-24 12:38                 ` Frédéric Bohé
2008-09-26 13:17                   ` Frédéric Bohé
2008-09-28 22:49   ` Theodore Tso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1221745514.3550.83.camel@frecb007923.frec.bull.fr \
    --to=frederic.bohe@bull.net \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox