* [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set
2017-08-24 13:35 [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
@ 2017-08-24 13:35 ` Lukas Czerner
2017-10-14 14:50 ` [2/2] " Theodore Ts'o
2017-09-12 11:20 ` [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Lukas Czerner @ 2017-08-24 13:35 UTC (permalink / raw)
To: linux-ext4; +Cc: Lukas Czerner
Currently some stride optimization is done in
ext2fs_allocate_group_table() by adjusting start_blk block where we
start allocating block, or inode bitmaps.
However in flex_bg case this is currently useless since the values are
going to be overridden anyway. Moreover in flex_bg case the group might
already be full and the stride optimization will fail. As a result file
system resize might fail needlessly in some situations.
It can be shown by this example:
mke2fs -b 1024 -i 1024 -E stride=8192 -t ext4 /dev/loop0 1024000
resize2fs /dev/loop0 102400000
resize2fs 1.43.5 (04-Aug-2017)
Resizing the filesystem on /dev/loop0 to 102400000 (1k) blocks.
./resize/resize2fs: Could not allocate block in ext2 filesystem while trying to resize /dev/loop0
Please run 'e2fsck -fy /dev/loop0' to fix the filesystem
after the aborted resize operation.
Fix this by not doing the stride adjustment in case of flex_bg.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
lib/ext2fs/alloc_tables.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/ext2fs/alloc_tables.c b/lib/ext2fs/alloc_tables.c
index 407283f..0a36630 100644
--- a/lib/ext2fs/alloc_tables.c
+++ b/lib/ext2fs/alloc_tables.c
@@ -107,7 +107,7 @@ errcode_t ext2fs_allocate_group_table(ext2_filsys fs, dgrp_t group,
/*
* Allocate the block and inode bitmaps, if necessary
*/
- if (fs->stride) {
+ if (fs->stride && !flexbg_size) {
retval = ext2fs_get_free_blocks2(fs, group_blk, last_blk,
1, bmap, &start_blk);
if (retval)
--
2.7.5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set
2017-08-24 13:35 ` [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set Lukas Czerner
@ 2017-10-14 14:50 ` Theodore Ts'o
0 siblings, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2017-10-14 14:50 UTC (permalink / raw)
To: Lukas Czerner; +Cc: linux-ext4
On Thu, Aug 24, 2017 at 03:35:39PM +0200, Lukas Czerner wrote:
> Currently some stride optimization is done in
> ext2fs_allocate_group_table() by adjusting start_blk block where we
> start allocating block, or inode bitmaps.
>
> However in flex_bg case this is currently useless since the values are
> going to be overridden anyway. Moreover in flex_bg case the group might
> already be full and the stride optimization will fail. As a result file
> system resize might fail needlessly in some situations.
>
> It can be shown by this example:
>
> mke2fs -b 1024 -i 1024 -E stride=8192 -t ext4 /dev/loop0 1024000
> resize2fs /dev/loop0 102400000
> resize2fs 1.43.5 (04-Aug-2017)
> Resizing the filesystem on /dev/loop0 to 102400000 (1k) blocks.
> ./resize/resize2fs: Could not allocate block in ext2 filesystem while trying to resize /dev/loop0
> Please run 'e2fsck -fy /dev/loop0' to fix the filesystem
> after the aborted resize operation.
>
> Fix this by not doing the stride adjustment in case of flex_bg.
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Thanks, applied.
- Ted
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks
2017-08-24 13:35 [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
2017-08-24 13:35 ` [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set Lukas Czerner
@ 2017-09-12 11:20 ` Lukas Czerner
2017-10-12 13:54 ` Lukas Czerner
2017-10-14 14:50 ` [1/2] " Theodore Ts'o
3 siblings, 0 replies; 6+ messages in thread
From: Lukas Czerner @ 2017-09-12 11:20 UTC (permalink / raw)
To: linux-ext4
On Thu, Aug 24, 2017 at 03:35:38PM +0200, Lukas Czerner wrote:
> Currently it's possible for ext2fs_allocate_group_table() to place inode
> tables to blocks that are already occupied by different inode table.
> This can be reproduced by resize2fs on the file system where we need to
> move more than one inode table to a different location due to increase
> in group descriptor blocks, inode and block bitmaps.
>
> Best way I can reproduce this is to create big enough file system with
> huge amount of inodes and without resize_inode
>
> mke2fs -F -b 1024 -i 1024 -O ^resize_inode -t ext4 /dev/loop0 1024000
> resize2fs /dev/loop0 10240000
>
> e2fsck -fn /dev/loop0 | less
> e2fsck 1.43.5 (04-Aug-2017)
> ext2fs_check_desc: Corrupt group descriptor: bad block for inode table
> e2fsck: Group descriptors look bad... trying backup blocks...
> e2fsck: The journal superblock is corrupt while checking journal for /dev/loop0
> e2fsck: Cannot proceed with file system check
> Superblock has an invalid journal (inode 8).
> Clear? no
>
> /dev/loop0: ********** WARNING: Filesystem still has errors **********
>
> None of the settings are strictly necessary and it can be reproducer in
> various ways. This is just an example of one easy way to reproduce this.
>
> This bug was introduced with commit fccdbac39454 ("libext2fs: optimize
> ext2fs_allocate_group_table()") and is caused by the fact that wrong
> bitmap is used to mark the blocks as used.
>
> Fix this by using ext2fs_mark_block_bitmap_range2() in both (flex_bg and
> non flex_bg) cases and handle flex_bg case manually instead of relying
> on ext2fs_block_alloc_stats_range() because there is no way in that
> function to use different bitmap than fs->block_map.
ping
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
> lib/ext2fs/alloc_tables.c | 32 ++++++++++++++++++++++++++------
> 1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ext2fs/alloc_tables.c b/lib/ext2fs/alloc_tables.c
> index da0b15b..407283f 100644
> --- a/lib/ext2fs/alloc_tables.c
> +++ b/lib/ext2fs/alloc_tables.c
> @@ -222,12 +222,32 @@ errcode_t ext2fs_allocate_group_table(ext2_filsys fs, dgrp_t group,
> bmap, &new_blk);
> if (retval)
> return retval;
> - if (flexbg_size)
> - ext2fs_block_alloc_stats_range(fs, new_blk,
> - fs->inode_blocks_per_group, +1);
> - else
> - ext2fs_mark_block_bitmap_range2(fs->block_map,
> - new_blk, fs->inode_blocks_per_group);
> +
> + ext2fs_mark_block_bitmap_range2(bmap,
> + new_blk, fs->inode_blocks_per_group);
> + if (flexbg_size) {
> + blk64_t num, blk;
> + num = fs->inode_blocks_per_group;
> + blk = new_blk;
> + while (num) {
> + int gr = ext2fs_group_of_blk2(fs, blk);
> + last_blk = ext2fs_group_last_block2(fs, gr);
> + blk64_t n = num;
> +
> + if (blk + num > last_blk)
> + n = last_blk - blk + 1;
> +
> + ext2fs_bg_free_blocks_count_set(fs, gr,
> + ext2fs_bg_free_blocks_count(fs, gr) -
> + n/EXT2FS_CLUSTER_RATIO(fs));
> + ext2fs_bg_flags_clear(fs, gr,
> + EXT2_BG_BLOCK_UNINIT);
> + ext2fs_group_desc_csum_set(fs, gr);
> + ext2fs_free_blocks_count_add(fs->super, -n);
> + blk += n;
> + num -= n;
> + }
> + }
> ext2fs_inode_table_loc_set(fs, group, new_blk);
> }
> ext2fs_group_desc_csum_set(fs, group);
> --
> 2.7.5
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks
2017-08-24 13:35 [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
2017-08-24 13:35 ` [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set Lukas Czerner
2017-09-12 11:20 ` [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
@ 2017-10-12 13:54 ` Lukas Czerner
2017-10-14 14:50 ` [1/2] " Theodore Ts'o
3 siblings, 0 replies; 6+ messages in thread
From: Lukas Czerner @ 2017-10-12 13:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso
Hi Ted,
this patchset is sitting here with no response on the list for almost
two months now. Do yo have any comments ? What I can do to move this
forward ? We have real customers hitting this issue and given our
"upstream first" policy we're waiting for your response.
Thanks!
-Lukas
On Thu, Aug 24, 2017 at 03:35:38PM +0200, Lukas Czerner wrote:
> Currently it's possible for ext2fs_allocate_group_table() to place inode
> tables to blocks that are already occupied by different inode table.
> This can be reproduced by resize2fs on the file system where we need to
> move more than one inode table to a different location due to increase
> in group descriptor blocks, inode and block bitmaps.
>
> Best way I can reproduce this is to create big enough file system with
> huge amount of inodes and without resize_inode
>
> mke2fs -F -b 1024 -i 1024 -O ^resize_inode -t ext4 /dev/loop0 1024000
> resize2fs /dev/loop0 10240000
>
> e2fsck -fn /dev/loop0 | less
> e2fsck 1.43.5 (04-Aug-2017)
> ext2fs_check_desc: Corrupt group descriptor: bad block for inode table
> e2fsck: Group descriptors look bad... trying backup blocks...
> e2fsck: The journal superblock is corrupt while checking journal for /dev/loop0
> e2fsck: Cannot proceed with file system check
> Superblock has an invalid journal (inode 8).
> Clear? no
>
> /dev/loop0: ********** WARNING: Filesystem still has errors **********
>
> None of the settings are strictly necessary and it can be reproducer in
> various ways. This is just an example of one easy way to reproduce this.
>
> This bug was introduced with commit fccdbac39454 ("libext2fs: optimize
> ext2fs_allocate_group_table()") and is caused by the fact that wrong
> bitmap is used to mark the blocks as used.
>
> Fix this by using ext2fs_mark_block_bitmap_range2() in both (flex_bg and
> non flex_bg) cases and handle flex_bg case manually instead of relying
> on ext2fs_block_alloc_stats_range() because there is no way in that
> function to use different bitmap than fs->block_map.
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
> lib/ext2fs/alloc_tables.c | 32 ++++++++++++++++++++++++++------
> 1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ext2fs/alloc_tables.c b/lib/ext2fs/alloc_tables.c
> index da0b15b..407283f 100644
> --- a/lib/ext2fs/alloc_tables.c
> +++ b/lib/ext2fs/alloc_tables.c
> @@ -222,12 +222,32 @@ errcode_t ext2fs_allocate_group_table(ext2_filsys fs, dgrp_t group,
> bmap, &new_blk);
> if (retval)
> return retval;
> - if (flexbg_size)
> - ext2fs_block_alloc_stats_range(fs, new_blk,
> - fs->inode_blocks_per_group, +1);
> - else
> - ext2fs_mark_block_bitmap_range2(fs->block_map,
> - new_blk, fs->inode_blocks_per_group);
> +
> + ext2fs_mark_block_bitmap_range2(bmap,
> + new_blk, fs->inode_blocks_per_group);
> + if (flexbg_size) {
> + blk64_t num, blk;
> + num = fs->inode_blocks_per_group;
> + blk = new_blk;
> + while (num) {
> + int gr = ext2fs_group_of_blk2(fs, blk);
> + last_blk = ext2fs_group_last_block2(fs, gr);
> + blk64_t n = num;
> +
> + if (blk + num > last_blk)
> + n = last_blk - blk + 1;
> +
> + ext2fs_bg_free_blocks_count_set(fs, gr,
> + ext2fs_bg_free_blocks_count(fs, gr) -
> + n/EXT2FS_CLUSTER_RATIO(fs));
> + ext2fs_bg_flags_clear(fs, gr,
> + EXT2_BG_BLOCK_UNINIT);
> + ext2fs_group_desc_csum_set(fs, gr);
> + ext2fs_free_blocks_count_add(fs->super, -n);
> + blk += n;
> + num -= n;
> + }
> + }
> ext2fs_inode_table_loc_set(fs, group, new_blk);
> }
> ext2fs_group_desc_csum_set(fs, group);
> --
> 2.7.5
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [1/2] libext2fs: Prevent allocating inode table from already used blocks
2017-08-24 13:35 [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
` (2 preceding siblings ...)
2017-10-12 13:54 ` Lukas Czerner
@ 2017-10-14 14:50 ` Theodore Ts'o
3 siblings, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2017-10-14 14:50 UTC (permalink / raw)
To: Lukas Czerner; +Cc: linux-ext4
On Thu, Aug 24, 2017 at 03:35:38PM +0200, Lukas Czerner wrote:
> Currently it's possible for ext2fs_allocate_group_table() to place inode
> tables to blocks that are already occupied by different inode table.
> This can be reproduced by resize2fs on the file system where we need to
> move more than one inode table to a different location due to increase
> in group descriptor blocks, inode and block bitmaps.
>
> Best way I can reproduce this is to create big enough file system with
> huge amount of inodes and without resize_inode
>
> mke2fs -F -b 1024 -i 1024 -O ^resize_inode -t ext4 /dev/loop0 1024000
> resize2fs /dev/loop0 10240000
>
> e2fsck -fn /dev/loop0 | less
> e2fsck 1.43.5 (04-Aug-2017)
> ext2fs_check_desc: Corrupt group descriptor: bad block for inode table
> e2fsck: Group descriptors look bad... trying backup blocks...
> e2fsck: The journal superblock is corrupt while checking journal for /dev/loop0
> e2fsck: Cannot proceed with file system check
> Superblock has an invalid journal (inode 8).
> Clear? no
>
> /dev/loop0: ********** WARNING: Filesystem still has errors **********
>
> None of the settings are strictly necessary and it can be reproducer in
> various ways. This is just an example of one easy way to reproduce this.
>
> This bug was introduced with commit fccdbac39454 ("libext2fs: optimize
> ext2fs_allocate_group_table()") and is caused by the fact that wrong
> bitmap is used to mark the blocks as used.
>
> Fix this by using ext2fs_mark_block_bitmap_range2() in both (flex_bg and
> non flex_bg) cases and handle flex_bg case manually instead of relying
> on ext2fs_block_alloc_stats_range() because there is no way in that
> function to use different bitmap than fs->block_map.
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Thanks, applied. Apologies for taking so long to apply this.
- Ted
^ permalink raw reply [flat|nested] 6+ messages in thread