* s_first_meta_bg treatment incompatibility between kernel and e2fsprogs
@ 2009-11-11 12:09 Damien Guibouret
2009-11-15 4:20 ` Theodore Tso
0 siblings, 1 reply; 5+ messages in thread
From: Damien Guibouret @ 2009-11-11 12:09 UTC (permalink / raw)
To: linux-ext4
Hello,
I have taken a look at META_BG feature and think there is some
incoherency between kernel and e2fsprogs about s_first_meta_bg handling.
When considering initialisation of unitialised block bitmaps for groups
before first meta group one:
- kernel considers that descriptors blocks occupy
EXT4_SB(sb)->s_gdb_count blocks (see ext4_bg_num_gdb_nometa at
balloc.c:764 that is indirectly called from ext4_init_block_bitmap),
s_gdb_count being the number of blocks to store all descriptors
(computed from super.c).
- e2fsprogs considers that descriptors blocks occupy s_first_meta_bg
(see ext2fs_reserve_super_and_bgd at alloc_sb.c:55-68).
The difference of behaviour is wrong as e2fsck will certainly complain
there is a bitmap marked unused when it should be if the bitmap was
initialised by kernel and number of descriptors blocks is lower than
s_first_meta_bg or reverse if number of descriptors blocks is higher
than s_first_meta_bg.
So, the kernel behaviour seems to be wrong in the META_BG case when
s_first_meta_bg is not 0: ext4_bg_num_gdb returns either 1 (META_BG
feature present and group being one of the meta group) or s_gdb_count
(META_BG feature not present or group being one before first meta
group). As s_gdb_count is number of blocks for all groups, I think it
should returns either 1 (META_BG present and group being one of the meta
group) or s_first_meta_bg (META_BG present and group being one before
first meta group) or s_gdb_count (META_BG not set).
I see also that the resize2fs does not handle the s_first_meta_bg flag
in case a filesystem is shrunk such as number of descriptor blocks goes
below s_first_meta_bg. To what I looked, it does not seem to be a
problem (apart from the problem in kernel described above), but I did
not perform a complete check about that. At least there is some blocks
still allocated when there is no more need for that (but e2fsck does not
complain as it uses also the s_first_meta_bg value). I do not know if it
is desired behaviour. In case the s_first_meta_bg is lowered and blocks
freed, it will certainly be better to add a check into e2fsck to check
that s_first_meta_bg is coherent with number of descriptor blocks
(s_first_meta_bg <= fs->desc_blocks).
If you want to perform some tests on that, I modified tune2fs to allow
setting the META_BG flag on a filesystem that does not have it with
setting s_first_meta_bg to the current number of blocks for descriptors
(it is how I understand META_BG/s_first_meta_group should be used).
Regards,
Damien
*** misc/tune2fs.old 2009-11-11 12:20:33.698192912 +0100
--- misc/tune2fs.c 2009-11-11 11:38:20.265333248 +0100
***************
*** 121,126 ****
--- 121,127 ----
EXT2_FEATURE_COMPAT_DIR_INDEX,
/* Incompat */
EXT2_FEATURE_INCOMPAT_FILETYPE |
+ EXT2_FEATURE_INCOMPAT_META_BG |
EXT3_FEATURE_INCOMPAT_EXTENTS |
EXT4_FEATURE_INCOMPAT_FLEX_BG,
/* R/O compat */
***************
*** 418,423 ****
--- 419,440 ----
}
}
+ if (FEATURE_ON(E2P_FEATURE_INCOMPAT, EXT2_FEATURE_INCOMPAT_META_BG)) {
+ if (mount_flags & EXT2_MF_MOUNTED) {
+ fputs(_("The meta_bg feature may only be "
+ "set when the filesystem is\n"
+ "unmounted.\n"), stderr);
+ exit(1);
+ }
+ if (sb->s_feature_compat & EXT2_FEATURE_COMPAT_RESIZE_INODE) {
+ fputs(_("The meta_bg feature cannot be "
+ "set when the resize_inode is\n"
+ "set.\n"), stderr);
+ exit(1);
+ }
+ sb->s_first_meta_bg = fs->desc_blocks;
+ }
+
if (FEATURE_OFF(E2P_FEATURE_RO_INCOMPAT,
EXT4_FEATURE_RO_COMPAT_HUGE_FILE)) {
if ((mount_flags & EXT2_MF_MOUNTED) &&
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: s_first_meta_bg treatment incompatibility between kernel and e2fsprogs 2009-11-11 12:09 s_first_meta_bg treatment incompatibility between kernel and e2fsprogs Damien Guibouret @ 2009-11-15 4:20 ` Theodore Tso 2009-11-15 10:28 ` Damien Guibouret 0 siblings, 1 reply; 5+ messages in thread From: Theodore Tso @ 2009-11-15 4:20 UTC (permalink / raw) To: Damien Guibouret; +Cc: linux-ext4 On Wed, Nov 11, 2009 at 01:09:00PM +0100, Damien Guibouret wrote: > Hello, > > I have taken a look at META_BG feature and think there is some > incoherency between kernel and e2fsprogs about s_first_meta_bg handling. > > When considering initialisation of unitialised block bitmaps for groups > before first meta group one: > - kernel considers that descriptors blocks occupy > EXT4_SB(sb)->s_gdb_count blocks (see ext4_bg_num_gdb_nometa at > balloc.c:764 that is indirectly called from ext4_init_block_bitmap), > s_gdb_count being the number of blocks to store all descriptors > (computed from super.c). > - e2fsprogs considers that descriptors blocks occupy s_first_meta_bg > (see ext2fs_reserve_super_and_bgd at alloc_sb.c:55-68). Yup, you're right. Ouch. That is a kernel bug, and it means that if we resize a filesystem to the point where we need to use meta_bg (because we've run out of blocks to reserve), if there are uninitialized block bitmaps, kernels that don't have a fix will misbehave by reserving too many file system metadata blocks. This will waste bit of disk space, which fsck will fix. (s_first_meta_bg by definition is always less than or equal to s_gdb_count.) I think this patch should fix things up. - Ted commit b33c339814f97fc48a843f45f6068f84bc735141 Author: Theodore Ts'o <tytso@mit.edu> Date: Sat Nov 14 23:20:30 2009 -0500 ext4: Fix uninit block bitmap initialization when s_meta_first_bg is non-zero The number of old-style block group descriptor blocks is s_meta_first_bg when the meta_bg feature flag is set. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c index 1d04189..f3032c9 100644 --- a/fs/ext4/balloc.c +++ b/fs/ext4/balloc.c @@ -761,7 +761,13 @@ static unsigned long ext4_bg_num_gdb_meta(struct super_block *sb, static unsigned long ext4_bg_num_gdb_nometa(struct super_block *sb, ext4_group_t group) { - return ext4_bg_has_super(sb, group) ? EXT4_SB(sb)->s_gdb_count : 0; + if (!ext4_bg_has_super(sb, group)) + return 0; + + if (EXT4_HAS_INCOMPAT_FEATURE(sb,EXT4_FEATURE_INCOMPAT_META_BG)) + return le32_to_cpu(EXT4_SB(sb)->s_es->s_first_meta_bg); + else + return EXT4_SB(sb)->s_gdb_count; } /** ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: s_first_meta_bg treatment incompatibility between kernel and e2fsprogs 2009-11-15 4:20 ` Theodore Tso @ 2009-11-15 10:28 ` Damien Guibouret 2009-11-15 19:23 ` Theodore Tso 0 siblings, 1 reply; 5+ messages in thread From: Damien Guibouret @ 2009-11-15 10:28 UTC (permalink / raw) To: Theodore Tso; +Cc: linux-ext4 Theodore Tso wrote: > On Wed, Nov 11, 2009 at 01:09:00PM +0100, Damien Guibouret wrote: > > Yup, you're right. Ouch. That is a kernel bug, and it means that if > we resize a filesystem to the point where we need to use meta_bg > (because we've run out of blocks to reserve), if there are > uninitialized block bitmaps, kernels that don't have a fix will > misbehave by reserving too many file system metadata blocks. This > will waste bit of disk space, which fsck will fix. > > (s_first_meta_bg by definition is always less than or equal to > s_gdb_count.) > > I think this patch should fix things up. > > - Ted > > commit b33c339814f97fc48a843f45f6068f84bc735141 > Author: Theodore Ts'o <tytso@mit.edu> > Date: Sat Nov 14 23:20:30 2009 -0500 > > ext4: Fix uninit block bitmap initialization when s_meta_first_bg is non-zero > > The number of old-style block group descriptor blocks is > s_meta_first_bg when the meta_bg feature flag is set. > > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> > > diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c > index 1d04189..f3032c9 100644 > --- a/fs/ext4/balloc.c > +++ b/fs/ext4/balloc.c > @@ -761,7 +761,13 @@ static unsigned long ext4_bg_num_gdb_meta(struct super_block *sb, > static unsigned long ext4_bg_num_gdb_nometa(struct super_block *sb, > ext4_group_t group) > { > - return ext4_bg_has_super(sb, group) ? EXT4_SB(sb)->s_gdb_count : 0; > + if (!ext4_bg_has_super(sb, group)) > + return 0; > + > + if (EXT4_HAS_INCOMPAT_FEATURE(sb,EXT4_FEATURE_INCOMPAT_META_BG)) > + return le32_to_cpu(EXT4_SB(sb)->s_es->s_first_meta_bg); > + else > + return EXT4_SB(sb)->s_gdb_count; > } > > /** > > Hello, I've open a kernel bug since: http://bugzilla.kernel.org/show_bug.cgi?id=14601 with a proposed patch (little different from yours but it is matter of taste :) And I think there is some other places where kernel should be fixed when it uses s_gdb_count (but here my knowledge of the sources are not deep enough to be sure on what shall be performed). Regards, Damien ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: s_first_meta_bg treatment incompatibility between kernel and e2fsprogs 2009-11-15 10:28 ` Damien Guibouret @ 2009-11-15 19:23 ` Theodore Tso 2009-11-16 15:51 ` Damien Guibouret 0 siblings, 1 reply; 5+ messages in thread From: Theodore Tso @ 2009-11-15 19:23 UTC (permalink / raw) To: Damien Guibouret; +Cc: linux-ext4 On Sun, Nov 15, 2009 at 11:28:13AM +0100, Damien Guibouret wrote: > I've open a kernel bug since: > http://bugzilla.kernel.org/show_bug.cgi?id=14601 > with a proposed patch (little different from yours but it is matter of > taste :) For future references, patches are less likely to slip through the cracks if they are sent to the linux-ext4 mailing list as opposed to having a BZ bug opened. (Yeah, I know, that's unusual). The reason for that is that patches are tracked via patchwork, here: http://patchwork.ozlabs.org/project/linux-ext4 Basically, anything that looks like a patch which is sent to linux-ext4 gets snagged by patchwork, and it's a good place to look for stuff that hasn't yet been merged. In some cases there are good reasons why a patch has been kept out, and in other cases patches have been merged or definitely rejected and I don't get to getting that status updated in patchwork, but overall I've found it to work very well. As far as the matter of taste issue is concerned, I think we already have too many static functions with a single caller, and it actually makes the code harder to understand. So adding yet another simple static function I think is a bad thing, not a good thing. > And I think there is some other places where kernel should be fixed when > it uses s_gdb_count (but here my knowledge of the sources are not deep > enough to be sure on what shall be performed). I've looked through the other areas, and the one place where I see a problem is in the block validity checks in ext4_iget() for the extended attribute block and in block_validity.c. The former can and should be fixed to use the latter. Here's the fix that I plan to be using. Comments, anyone? - Ted ext4: fix block validity checks so they work correctly with meta_bg The block validity checks used by ext4_data_block_valid() wasn't correctly written to check file systems with the meta_bg feature. Fix this. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> --- fs/ext4/block_validity.c | 2 +- fs/ext4/inode.c | 5 +---- 2 files changed, 2 insertions(+), 5 deletions(-) diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c index 50784ef..dc79b75 100644 --- a/fs/ext4/block_validity.c +++ b/fs/ext4/block_validity.c @@ -160,7 +160,7 @@ int ext4_setup_system_zone(struct super_block *sb) if (ext4_bg_has_super(sb, i) && ((i < 5) || ((i % flex_size) == 0))) add_system_zone(sbi, ext4_group_first_block_no(sb, i), - sbi->s_gdb_count + 1); + ext4_bg_num_gdb(sb, i) + 1); gdp = ext4_get_group_desc(sb, i, NULL); ret = add_system_zone(sbi, ext4_block_bitmap(sb, gdp), 1); if (ret) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index b5cdb88..c62ca93 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4886,10 +4886,7 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino) ret = 0; if (ei->i_file_acl && - ((ei->i_file_acl < - (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) + - EXT4_SB(sb)->s_gdb_count)) || - (ei->i_file_acl >= ext4_blocks_count(EXT4_SB(sb)->s_es)))) { + !ext4_data_block_valid(EXT4_SB(sb), ei->i_file_acl, 1)) { ext4_error(sb, __func__, "bad extended attribute block %llu in inode #%lu", ei->i_file_acl, inode->i_ino); ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: s_first_meta_bg treatment incompatibility between kernel and e2fsprogs 2009-11-15 19:23 ` Theodore Tso @ 2009-11-16 15:51 ` Damien Guibouret 0 siblings, 0 replies; 5+ messages in thread From: Damien Guibouret @ 2009-11-16 15:51 UTC (permalink / raw) To: Theodore Tso; +Cc: linux-ext4 Hello, Theodore Tso wrote: > On Sun, Nov 15, 2009 at 11:28:13AM +0100, Damien Guibouret wrote: [...] > As far as the matter of taste issue is concerned, I think we already > have too many static functions with a single caller, and it actually > makes the code harder to understand. So adding yet another simple > static function I think is a bad thing, not a good thing. > It was just to mimic the existing function, but I agree with you. The other difference is that it shall be applied on ext3 also. > >>And I think there is some other places where kernel should be fixed when >>it uses s_gdb_count (but here my knowledge of the sources are not deep >>enough to be sure on what shall be performed). > > > I've looked through the other areas, and the one place where I see a > problem is in the block validity checks in ext4_iget() for the > extended attribute block and in block_validity.c. The former can and > should be fixed to use the latter. > > Here's the fix that I plan to be using. Comments, anyone? > For the first one (on block_validity.c), as far as I understand, presence of superblock and descriptors blocks in a group are no more related in case of meta_bg group, so shouldn't be the code divided into 2 distincts part: one to treat super block, second to treat descriptor blocks (I do not understand the ((i < 5) || ((i % flex_size) == 0) part into the test, so add it if it is trully needed), something as: ext4_fsblk_t firstSystemBlock = ext4_group_first_block_no(sb, i); unsigned long nbDescBlocks; if (ext4_bg_has_super(sb, i)) { add_system_zone(sbi, firstSystemBlock, 1); firstSystemBlock++; } nbDescBlocks = ext4_bg_num_gdb(sb, i); if (nbDescBlocks != 0) add_system_zone(sbi, firstSystemBlock, nbDescBlocks); Regards, Damien > - Ted > > ext4: fix block validity checks so they work correctly with meta_bg > > The block validity checks used by ext4_data_block_valid() wasn't > correctly written to check file systems with the meta_bg feature. Fix > this. > > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> > --- > fs/ext4/block_validity.c | 2 +- > fs/ext4/inode.c | 5 +---- > 2 files changed, 2 insertions(+), 5 deletions(-) > > diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c > index 50784ef..dc79b75 100644 > --- a/fs/ext4/block_validity.c > +++ b/fs/ext4/block_validity.c > @@ -160,7 +160,7 @@ int ext4_setup_system_zone(struct super_block *sb) > if (ext4_bg_has_super(sb, i) && > ((i < 5) || ((i % flex_size) == 0))) > add_system_zone(sbi, ext4_group_first_block_no(sb, i), > - sbi->s_gdb_count + 1); > + ext4_bg_num_gdb(sb, i) + 1); > gdp = ext4_get_group_desc(sb, i, NULL); > ret = add_system_zone(sbi, ext4_block_bitmap(sb, gdp), 1); > if (ret) > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index b5cdb88..c62ca93 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -4886,10 +4886,7 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino) > > ret = 0; > if (ei->i_file_acl && > - ((ei->i_file_acl < > - (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) + > - EXT4_SB(sb)->s_gdb_count)) || > - (ei->i_file_acl >= ext4_blocks_count(EXT4_SB(sb)->s_es)))) { > + !ext4_data_block_valid(EXT4_SB(sb), ei->i_file_acl, 1)) { > ext4_error(sb, __func__, > "bad extended attribute block %llu in inode #%lu", > ei->i_file_acl, inode->i_ino); > > ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2009-11-16 15:50 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-11-11 12:09 s_first_meta_bg treatment incompatibility between kernel and e2fsprogs Damien Guibouret 2009-11-15 4:20 ` Theodore Tso 2009-11-15 10:28 ` Damien Guibouret 2009-11-15 19:23 ` Theodore Tso 2009-11-16 15:51 ` Damien Guibouret
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox