* [PATCH] make ext4_valid_block_bitmap() more verbose
@ 2010-11-12 23:26 Bernd Schubert
2010-11-12 23:56 ` Andreas Dilger
2010-11-15 16:21 ` Eric Sandeen
0 siblings, 2 replies; 5+ messages in thread
From: Bernd Schubert @ 2010-11-12 23:26 UTC (permalink / raw)
To: linux-ext4; +Cc: Bernd Schubert, Andreas Dilger
The real issue we want to debug with the patch below actually came up while
stress testing Lustre using the RHEL5.5 kernel (so 2.6.32'ish ext4), but a
more verbose error function should not hurt for vanilla ext4 either.
make ext4_valid_block_bitmap() more verbose
While running our stress test suite, ext4_valid_block_bitmap()
frequently complains about an invalid block bitmap.
However, e2fsck does not find anything. So in oder to be able
to better debug this issue, make the function more verbose and
let it complain about the two possible invalid bitmaps.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
---
fs/ext4/balloc.c | 44 +++++++++++++++++++++++++++++++-------------
1 file changed, 31 insertions(+), 13 deletions(-)
Index: linux-git/fs/ext4/balloc.c
===================================================================
--- linux-git.orig/fs/ext4/balloc.c
+++ linux-git/fs/ext4/balloc.c
@@ -230,6 +230,16 @@ struct ext4_group_desc * ext4_get_group_
return desc;
}
+/**
+ * ext4_get_group_desc() -- load group descriptor from disk
+ * @sb: super block
+ * @ext4_group_desc: blocks group descriptor
+ * @block_group block group to check
+ * @bh: pointer to the buffer head to store the block
+ * group descriptor
+ *
+ * return 0 on error or 1 if valid
+ */
static int ext4_valid_block_bitmap(struct super_block *sb,
struct ext4_group_desc *desc,
unsigned int block_group,
@@ -239,6 +249,7 @@ static int ext4_valid_block_bitmap(struc
ext4_grpblk_t next_zero_bit;
ext4_fsblk_t bitmap_blk;
ext4_fsblk_t group_first_block;
+ int valid = 1;
if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_FLEX_BG)) {
/* with FLEX_BG, the inode/block bitmaps and itable
@@ -254,16 +265,20 @@ static int ext4_valid_block_bitmap(struc
/* check whether block bitmap block number is set */
bitmap_blk = ext4_block_bitmap(sb, desc);
offset = bitmap_blk - group_first_block;
- if (!ext4_test_bit(offset, bh->b_data))
- /* bad block bitmap */
- goto err_out;
+ if (!ext4_test_bit(offset, bh->b_data)) {
+ ext4_warning(sb, "bad block bitmap block = %llu, offset = %d",
+ bitmap_blk, (int) offset);
+ valid = 0;
+ }
/* check whether the inode bitmap block number is set */
bitmap_blk = ext4_inode_bitmap(sb, desc);
offset = bitmap_blk - group_first_block;
- if (!ext4_test_bit(offset, bh->b_data))
- /* bad block bitmap */
- goto err_out;
+ if (!ext4_test_bit(offset, bh->b_data)) {
+ ext4_warning(sb, "bad inode bitmap block = %llu, offset = %d",
+ bitmap_blk, (int) offset);
+ valid = 0;
+ }
/* check whether the inode table block number is set */
bitmap_blk = ext4_inode_table(sb, desc);
@@ -271,14 +286,17 @@ static int ext4_valid_block_bitmap(struc
next_zero_bit = ext4_find_next_zero_bit(bh->b_data,
offset + EXT4_SB(sb)->s_itb_per_group,
offset);
- if (next_zero_bit >= offset + EXT4_SB(sb)->s_itb_per_group)
- /* good bitmap for inode tables */
- return 1;
+ /* good bitmap for inode tables */
+ if (next_zero_bit < offset + EXT4_SB(sb)->s_itb_per_group) {
+ ext4_warning(sb, "bad inode table block = %llu, offset = %d",
+ bitmap_blk, (int) offset);
+ valid = 0;
+ }
-err_out:
- ext4_error(sb, "Invalid block bitmap - block_group = %d, block = %llu",
- block_group, bitmap_blk);
- return 0;
+ if (!valid)
+ ext4_error(sb, "Invalid block bitmap - block_group = %d",
+ block_group);
+ return valid;
}
/**
* ext4_read_block_bitmap()
--
Bernd Schubert
DataDirect Networks
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] make ext4_valid_block_bitmap() more verbose
2010-11-12 23:26 [PATCH] make ext4_valid_block_bitmap() more verbose Bernd Schubert
@ 2010-11-12 23:56 ` Andreas Dilger
2010-11-15 19:15 ` Bernd Schubert
2010-11-15 16:21 ` Eric Sandeen
1 sibling, 1 reply; 5+ messages in thread
From: Andreas Dilger @ 2010-11-12 23:56 UTC (permalink / raw)
To: Bernd Schubert; +Cc: linux-ext4, Bernd Schubert
On 2010-11-12, at 16:26, Bernd Schubert wrote:
> The real issue we want to debug with the patch below actually came up while
> stress testing Lustre using the RHEL5.5 kernel (so 2.6.32'ish ext4), but a
> more verbose error function should not hurt for vanilla ext4 either.
>
> make ext4_valid_block_bitmap() more verbose
>
> While running our stress test suite, ext4_valid_block_bitmap()
> frequently complains about an invalid block bitmap.
> However, e2fsck does not find anything. So in oder to be able
> to better debug this issue, make the function more verbose and
> let it complain about the two possible invalid bitmaps.
Bernd,
thanks for sending this in. I like the idea of making these messages more verbose, since they should rarely be hit and when they are it would be good to know why these checks failed.
> +/**
> + * ext4_get_group_desc() -- load group descriptor from disk
> + * @sb: super block
> + * @ext4_group_desc: blocks group descriptor
> + * @block_group block group to check
> + * @bh: pointer to the buffer head to store the block
> + * group descriptor
> + *
> + * return 0 on error or 1 if valid
> + */
> static int ext4_valid_block_bitmap(struct super_block *sb,
> struct ext4_group_desc *desc,
> unsigned int block_group,
The function name in the comment block does not actually match the function...
> @@ -254,16 +265,20 @@ static int ext4_valid_block_bitmap(struc
> + if (!ext4_test_bit(offset, bh->b_data)) {
> + ext4_warning(sb, "bad block bitmap block = %llu, offset = %d",
> + bitmap_blk, (int) offset);
> + valid = 0;
> + }
>
> /* check whether the inode bitmap block number is set */
> bitmap_blk = ext4_inode_bitmap(sb, desc);
> offset = bitmap_blk - group_first_block;
> + if (!ext4_test_bit(offset, bh->b_data)) {
> + ext4_warning(sb, "bad inode bitmap block = %llu, offset = %d",
> + bitmap_blk, (int) offset);
> + valid = 0;
This message (while correct) is a bit confusing, since it implies there is something wrong with the inode bitmap block itself, rather than the the bit in the block bitmap. I think it would be clearer to rewrite this as follows:
"unset bit for inode bitmap: block %llu, bit %u"
for consistency we should then change the block bitmap error to match.
While we are changing this code, all of the ext4_test_bit() checks should be wrapped in unlikely(), since we should very rarely be seeing these errors.
> @@ -271,14 +286,17 @@ static int ext4_valid_block_bitmap(struc
> + if (next_zero_bit < offset + EXT4_SB(sb)->s_itb_per_group) {
> + ext4_warning(sb, "bad inode table block = %llu, offset = %d",
> + bitmap_blk, (int) offset);
> + valid = 0;
Similarly, this implies that there is a bad inode table block, so I prefer:
"unset bit for inode table block %u: block %llu, bit %u"
where the "inode table block" value is, I think, (next_zero_bit - offset)
> + if (!valid)
> + ext4_error(sb, "Invalid block bitmap - block_group = %d",
> + block_group);
It would probably be worthwhile to also print the block number of the bitmap itself.
Cheers, Andreas
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] make ext4_valid_block_bitmap() more verbose
2010-11-12 23:26 [PATCH] make ext4_valid_block_bitmap() more verbose Bernd Schubert
2010-11-12 23:56 ` Andreas Dilger
@ 2010-11-15 16:21 ` Eric Sandeen
2010-11-15 19:22 ` Bernd Schubert
1 sibling, 1 reply; 5+ messages in thread
From: Eric Sandeen @ 2010-11-15 16:21 UTC (permalink / raw)
To: Bernd Schubert; +Cc: linux-ext4, Bernd Schubert, Andreas Dilger
On 11/12/10 5:26 PM, Bernd Schubert wrote:
> The real issue we want to debug with the patch below actually came up while
> stress testing Lustre using the RHEL5.5 kernel (so 2.6.32'ish ext4), but a
> more verbose error function should not hurt for vanilla ext4 either.
>
> make ext4_valid_block_bitmap() more verbose
>
> While running our stress test suite, ext4_valid_block_bitmap()
> frequently complains about an invalid block bitmap.
> However, e2fsck does not find anything. So in oder to be able
> to better debug this issue, make the function more verbose and
> let it complain about the two possible invalid bitmaps.
Making a raw e2image of the problematic filesystem would let us take a
look at why e2fsck isn't finding problems; unless you plan to fix that
yourself as well, which is just fine of course. :)
-Eric
> Signed-off-by: Bernd Schubert <bschubert@ddn.com>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] make ext4_valid_block_bitmap() more verbose
2010-11-12 23:56 ` Andreas Dilger
@ 2010-11-15 19:15 ` Bernd Schubert
0 siblings, 0 replies; 5+ messages in thread
From: Bernd Schubert @ 2010-11-15 19:15 UTC (permalink / raw)
To: Andreas Dilger; +Cc: Bernd Schubert, linux-ext4@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1455 bytes --]
On 11/13/2010 12:56 AM, Andreas Dilger wrote:
> On 2010-11-12, at 16:26, Bernd Schubert wrote:
>> The real issue we want to debug with the patch below actually came
>> up while stress testing Lustre using the RHEL5.5 kernel (so
>> 2.6.32'ish ext4), but a more verbose error function should not hurt
>> for vanilla ext4 either.
>>
>> make ext4_valid_block_bitmap() more verbose
>>
>> While running our stress test suite, ext4_valid_block_bitmap()
>> frequently complains about an invalid block bitmap. However, e2fsck
>> does not find anything. So in oder to be able to better debug this
>> issue, make the function more verbose and let it complain about the
>> two possible invalid bitmaps.
>
> Bernd, thanks for sending this in. I like the idea of making these
> messages more verbose, since they should rarely be hit and when they
> are it would be good to know why these checks failed.
Andreas, thanks for your helpful review, I will send an updated patch on
Wednesday.
>> + if (!valid) + ext4_error(sb, "Invalid block bitmap - block_group
>> = %d", + block_group);
>
> It would probably be worthwhile to also print the block number of the
> bitmap itself.
I guess you mean bitmap_blk here? But that changes for every of the
possible checks, so I already printed it above. Is it worth to print it
again? And what if more than one problem is found, might become a bit
confusing then?
Thanks,
Bernd
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] make ext4_valid_block_bitmap() more verbose
2010-11-15 16:21 ` Eric Sandeen
@ 2010-11-15 19:22 ` Bernd Schubert
0 siblings, 0 replies; 5+ messages in thread
From: Bernd Schubert @ 2010-11-15 19:22 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Bernd Schubert, linux-ext4@vger.kernel.org, Andreas Dilger
[-- Attachment #1: Type: text/plain, Size: 1214 bytes --]
On 11/15/2010 05:21 PM, Eric Sandeen wrote:
> On 11/12/10 5:26 PM, Bernd Schubert wrote:
>> The real issue we want to debug with the patch below actually came up while
>> stress testing Lustre using the RHEL5.5 kernel (so 2.6.32'ish ext4), but a
>> more verbose error function should not hurt for vanilla ext4 either.
>>
>> make ext4_valid_block_bitmap() more verbose
>>
>> While running our stress test suite, ext4_valid_block_bitmap()
>> frequently complains about an invalid block bitmap.
>> However, e2fsck does not find anything. So in oder to be able
>> to better debug this issue, make the function more verbose and
>> let it complain about the two possible invalid bitmaps.
>
> Making a raw e2image of the problematic filesystem would let us take a
> look at why e2fsck isn't finding problems; unless you plan to fix that
> yourself as well, which is just fine of course. :)
>
I already captured an e2image and will do again on Wednesday, now that I
could reproduce it again with the updated code. Unfortunately no time
before. Will try to check out myself with debugfs first, if there is
really something wrong. It might be a race in the code as well...
Cheers,
Bernd
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-11-15 19:23 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-12 23:26 [PATCH] make ext4_valid_block_bitmap() more verbose Bernd Schubert
2010-11-12 23:56 ` Andreas Dilger
2010-11-15 19:15 ` Bernd Schubert
2010-11-15 16:21 ` Eric Sandeen
2010-11-15 19:22 ` Bernd Schubert
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).