linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ext4: fix oops when online resizing a filesystem with flex_bg
@ 2009-01-23 20:41 Thadeu Lima de Souza Cascardo
  2009-01-27  0:07 ` Theodore Tso
  2009-01-27  0:22 ` Theodore Tso
  0 siblings, 2 replies; 4+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2009-01-23 20:41 UTC (permalink / raw)
  To: linux-ext4; +Cc: Thadeu Lima de Souza Cascardo

http://kerneloops.org/raw.php?rawid=176715 and
http://bugzilla.kernel.org/show_bug.cgi?id=12433 report a bug when
resizing a mounted ext4 filesystem. I could easily reproduce it using
2.6.29-rc2, 2.6.28 and 2.6.27.

ext4_error was called just before the NULL derefence from
ext4_get_group_desc, indicating a block group beyond s_groups_count.
This function returned a NULL. The requested block group was beyond
s_groups_count because this is the last changed bit when resizing.

The caller, ext4_group_used_meta_blocks, did not check the returned
pointer. It needs the gdp anyway. Fortunately, its only caller,
ext4_init_block_bitmap, has a gdp and it was simply a matter of adding
this parameter to ext4_group_used_meta_blocks and use it.

Tested and the oops is gone.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
---
 fs/ext4/balloc.c |    8 ++------
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 9a50b80..d7dc22a 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -55,7 +55,7 @@ static int ext4_block_in_group(struct super_block *sb, ext4_fsblk_t block,
 }
 
 static int ext4_group_used_meta_blocks(struct super_block *sb,
-				ext4_group_t block_group)
+		ext4_group_t block_group, struct ext4_group_desc *gdp)
 {
 	ext4_fsblk_t tmp;
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -63,10 +63,6 @@ static int ext4_group_used_meta_blocks(struct super_block *sb,
 	int used_blocks = sbi->s_itb_per_group + 2;
 
 	if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_FLEX_BG)) {
-		struct ext4_group_desc *gdp;
-		struct buffer_head *bh;

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] ext4: fix oops when online resizing a filesystem with flex_bg
  2009-01-23 20:41 [PATCH] ext4: fix oops when online resizing a filesystem with flex_bg Thadeu Lima de Souza Cascardo
@ 2009-01-27  0:07 ` Theodore Tso
  2009-01-27 15:25   ` Thadeu Lima de Souza Cascardo
  2009-01-27  0:22 ` Theodore Tso
  1 sibling, 1 reply; 4+ messages in thread
From: Theodore Tso @ 2009-01-27  0:07 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo; +Cc: linux-ext4

On Fri, Jan 23, 2009 at 06:41:49PM -0200, Thadeu Lima de Souza Cascardo wrote:
> http://kerneloops.org/raw.php?rawid=176715 and
> http://bugzilla.kernel.org/show_bug.cgi?id=12433 report a bug when
> resizing a mounted ext4 filesystem. I could easily reproduce it using
> 2.6.29-rc2, 2.6.28 and 2.6.27.
> 
> ext4_error was called just before the NULL derefence from
> ext4_get_group_desc, indicating a block group beyond s_groups_count.
> This function returned a NULL. The requested block group was beyond
> s_groups_count because this is the last changed bit when resizing.
> 
> The caller, ext4_group_used_meta_blocks, did not check the returned
> pointer. It needs the gdp anyway. Fortunately, its only caller,
> ext4_init_block_bitmap, has a gdp and it was simply a matter of adding
> this parameter to ext4_group_used_meta_blocks and use it.
> 
> Tested and the oops is gone.

This fixes the symptom, and the patch is a good thing to do since it
reduces the code size; but I don't think it fixes the root cause of
the problem.  The much bigger problem is in ext4_group_add around line
869 of fs/ext4/resize.c:

	gdp->bg_flags |= cpu_to_le16(EXT4_BG_INODE_ZEROED);   <-------

The problem here is that the group descriptor contains arbitrary
garbage here, so this should be:

	gdp->bg_flags = cpu_to_le16(EXT4_BG_INODE_ZEROED);

instead.  What's happening is bg_flags contains random garbage, and if
happens to have the EXT4_BG_BLOCK_UNINIT bit set, then we end up
calling ext4_init_block_bitmap(), which causes the OOPS.  However, if
EXT4_BG_BLOCK_UNINIT is not set, then we don't end up going down this
code path --- which is why we didn't catch it in our testing, and why
the oops in BZ #12433 is somewhat hard to reproduce.

Fortunately having random bits in bg_flags is otherwise mostly
harmless, but this is clearly a bug.  Can you test and confirm this
also fixes your failure case?

						- Ted

commit 13e707e95258844c8872f1c4dcac4f0a545b8ef6
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Mon Jan 26 19:06:41 2009 -0500

    ext4: Initialize the new group descriptor when resizing the filesystem
    
    Make sure all of the fields of the group descriptor are properly
    initialized.  Previously, we allowed bg_flags field to be contain
    random garbage, which could trigger non-deterministic behavior,
    including a kernel OOPS.
    
    http://bugzilla.kernel.org/show_bug.cgi?id=12433
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Cc: stable@kernel.org

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index c328be5..c06886a 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -861,12 +861,13 @@ int ext4_group_add(struct super_block *sb, struct ext4_new_group_data *input)
 	gdp = (struct ext4_group_desc *)((char *)primary->b_data +
 					 gdb_off * EXT4_DESC_SIZE(sb));
 
+	memset(gdp, 0, EXT4_DESC_SIZE(sb));
 	ext4_block_bitmap_set(sb, gdp, input->block_bitmap); /* LV FIXME */
 	ext4_inode_bitmap_set(sb, gdp, input->inode_bitmap); /* LV FIXME */
 	ext4_inode_table_set(sb, gdp, input->inode_table); /* LV FIXME */
 	ext4_free_blks_set(sb, gdp, input->free_blocks_count);
 	ext4_free_inodes_set(sb, gdp, EXT4_INODES_PER_GROUP(sb));
-	gdp->bg_flags |= cpu_to_le16(EXT4_BG_INODE_ZEROED);
+	gdp->bg_flags = cpu_to_le16(EXT4_BG_INODE_ZEROED);
 	gdp->bg_checksum = ext4_group_desc_csum(sbi, input->group, gdp);
 
 	/*

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] ext4: fix oops when online resizing a filesystem with flex_bg
  2009-01-23 20:41 [PATCH] ext4: fix oops when online resizing a filesystem with flex_bg Thadeu Lima de Souza Cascardo
  2009-01-27  0:07 ` Theodore Tso
@ 2009-01-27  0:22 ` Theodore Tso
  1 sibling, 0 replies; 4+ messages in thread
From: Theodore Tso @ 2009-01-27  0:22 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo; +Cc: linux-ext4

I'm going to include this patch as well, but with a slightly different
changelog description, since I think making sure the group descriptor
is initialized addresses the root cause of the problem.  This still
saves a small amount of code space, so it's still a good patch, but it
shouldn't be strictly speaking necessary.

							- Ted

commit 030d677ef688dea36245bdeaab74826aece02ee8
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Mon Jan 26 19:20:18 2009 -0500

    ext4: remove call to ext4_group_desc() in ext4_group_used_meta_blocks()
    
    The static function ext4_group_used_meta_blocks() only has one caller,
    who already has access to the block group's group descriptor.  So it's
    better to have ext4_init_block_bitmap() pass the group descriptor to
    ext4_group_used_meta_blocks(), so it doesn't need to call
    ext4_group_desc().  Previously this function did not check if
    ext4_group_desc() returned NULL due to an error, potentially causing a
    kernel OOPS report.  This avoids the issue entirely.
    
    Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ext4: fix oops when online resizing a filesystem with flex_bg
  2009-01-27  0:07 ` Theodore Tso
@ 2009-01-27 15:25   ` Thadeu Lima de Souza Cascardo
  0 siblings, 0 replies; 4+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2009-01-27 15:25 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-ext4

[-- Attachment #1: Type: text/plain, Size: 399 bytes --]

On Mon, Jan 26, 2009 at 07:07:43PM -0500, Theodore Tso wrote:
> Fortunately having random bits in bg_flags is otherwise mostly
> harmless, but this is clearly a bug.  Can you test and confirm this
> also fixes your failure case?
> 
> 						- Ted

Yes, it fix my test case. No oops while online resizing anymore with
only your patch applied on top of 2.6.29-rc2.

Best Regards,
Cascardo.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2009-01-27 15:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-23 20:41 [PATCH] ext4: fix oops when online resizing a filesystem with flex_bg Thadeu Lima de Souza Cascardo
2009-01-27  0:07 ` Theodore Tso
2009-01-27 15:25   ` Thadeu Lima de Souza Cascardo
2009-01-27  0:22 ` Theodore Tso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).