* [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks
@ 2017-08-24 13:35 Lukas Czerner
2017-08-24 13:35 ` [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set Lukas Czerner
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Lukas Czerner @ 2017-08-24 13:35 UTC (permalink / raw)
To: linux-ext4; +Cc: Lukas Czerner
Currently it's possible for ext2fs_allocate_group_table() to place inode
tables to blocks that are already occupied by different inode table.
This can be reproduced by resize2fs on the file system where we need to
move more than one inode table to a different location due to increase
in group descriptor blocks, inode and block bitmaps.
Best way I can reproduce this is to create big enough file system with
huge amount of inodes and without resize_inode
mke2fs -F -b 1024 -i 1024 -O ^resize_inode -t ext4 /dev/loop0 1024000
resize2fs /dev/loop0 10240000
e2fsck -fn /dev/loop0 | less
e2fsck 1.43.5 (04-Aug-2017)
ext2fs_check_desc: Corrupt group descriptor: bad block for inode table
e2fsck: Group descriptors look bad... trying backup blocks...
e2fsck: The journal superblock is corrupt while checking journal for /dev/loop0
e2fsck: Cannot proceed with file system check
Superblock has an invalid journal (inode 8).
Clear? no
/dev/loop0: ********** WARNING: Filesystem still has errors **********
None of the settings are strictly necessary and it can be reproducer in
various ways. This is just an example of one easy way to reproduce this.
This bug was introduced with commit fccdbac39454 ("libext2fs: optimize
ext2fs_allocate_group_table()") and is caused by the fact that wrong
bitmap is used to mark the blocks as used.
Fix this by using ext2fs_mark_block_bitmap_range2() in both (flex_bg and
non flex_bg) cases and handle flex_bg case manually instead of relying
on ext2fs_block_alloc_stats_range() because there is no way in that
function to use different bitmap than fs->block_map.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
lib/ext2fs/alloc_tables.c | 32 ++++++++++++++++++++++++++------
1 file changed, 26 insertions(+), 6 deletions(-)
diff --git a/lib/ext2fs/alloc_tables.c b/lib/ext2fs/alloc_tables.c
index da0b15b..407283f 100644
--- a/lib/ext2fs/alloc_tables.c
+++ b/lib/ext2fs/alloc_tables.c
@@ -222,12 +222,32 @@ errcode_t ext2fs_allocate_group_table(ext2_filsys fs, dgrp_t group,
bmap, &new_blk);
if (retval)
return retval;
- if (flexbg_size)
- ext2fs_block_alloc_stats_range(fs, new_blk,
- fs->inode_blocks_per_group, +1);
- else
- ext2fs_mark_block_bitmap_range2(fs->block_map,
- new_blk, fs->inode_blocks_per_group);
+
+ ext2fs_mark_block_bitmap_range2(bmap,
+ new_blk, fs->inode_blocks_per_group);
+ if (flexbg_size) {
+ blk64_t num, blk;
+ num = fs->inode_blocks_per_group;
+ blk = new_blk;
+ while (num) {
+ int gr = ext2fs_group_of_blk2(fs, blk);
+ last_blk = ext2fs_group_last_block2(fs, gr);
+ blk64_t n = num;
+
+ if (blk + num > last_blk)
+ n = last_blk - blk + 1;
+
+ ext2fs_bg_free_blocks_count_set(fs, gr,
+ ext2fs_bg_free_blocks_count(fs, gr) -
+ n/EXT2FS_CLUSTER_RATIO(fs));
+ ext2fs_bg_flags_clear(fs, gr,
+ EXT2_BG_BLOCK_UNINIT);
+ ext2fs_group_desc_csum_set(fs, gr);
+ ext2fs_free_blocks_count_add(fs->super, -n);
+ blk += n;
+ num -= n;
+ }
+ }
ext2fs_inode_table_loc_set(fs, group, new_blk);
}
ext2fs_group_desc_csum_set(fs, group);
--
2.7.5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set
2017-08-24 13:35 [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
@ 2017-08-24 13:35 ` Lukas Czerner
2017-10-14 14:50 ` [2/2] " Theodore Ts'o
2017-09-12 11:20 ` [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Lukas Czerner @ 2017-08-24 13:35 UTC (permalink / raw)
To: linux-ext4; +Cc: Lukas Czerner
Currently some stride optimization is done in
ext2fs_allocate_group_table() by adjusting start_blk block where we
start allocating block, or inode bitmaps.
However in flex_bg case this is currently useless since the values are
going to be overridden anyway. Moreover in flex_bg case the group might
already be full and the stride optimization will fail. As a result file
system resize might fail needlessly in some situations.
It can be shown by this example:
mke2fs -b 1024 -i 1024 -E stride=8192 -t ext4 /dev/loop0 1024000
resize2fs /dev/loop0 102400000
resize2fs 1.43.5 (04-Aug-2017)
Resizing the filesystem on /dev/loop0 to 102400000 (1k) blocks.
./resize/resize2fs: Could not allocate block in ext2 filesystem while trying to resize /dev/loop0
Please run 'e2fsck -fy /dev/loop0' to fix the filesystem
after the aborted resize operation.
Fix this by not doing the stride adjustment in case of flex_bg.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
lib/ext2fs/alloc_tables.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/ext2fs/alloc_tables.c b/lib/ext2fs/alloc_tables.c
index 407283f..0a36630 100644
--- a/lib/ext2fs/alloc_tables.c
+++ b/lib/ext2fs/alloc_tables.c
@@ -107,7 +107,7 @@ errcode_t ext2fs_allocate_group_table(ext2_filsys fs, dgrp_t group,
/*
* Allocate the block and inode bitmaps, if necessary
*/
- if (fs->stride) {
+ if (fs->stride && !flexbg_size) {
retval = ext2fs_get_free_blocks2(fs, group_blk, last_blk,
1, bmap, &start_blk);
if (retval)
--
2.7.5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks
2017-08-24 13:35 [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
2017-08-24 13:35 ` [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set Lukas Czerner
@ 2017-09-12 11:20 ` Lukas Czerner
2017-10-12 13:54 ` Lukas Czerner
2017-10-14 14:50 ` [1/2] " Theodore Ts'o
3 siblings, 0 replies; 6+ messages in thread
From: Lukas Czerner @ 2017-09-12 11:20 UTC (permalink / raw)
To: linux-ext4
On Thu, Aug 24, 2017 at 03:35:38PM +0200, Lukas Czerner wrote:
> Currently it's possible for ext2fs_allocate_group_table() to place inode
> tables to blocks that are already occupied by different inode table.
> This can be reproduced by resize2fs on the file system where we need to
> move more than one inode table to a different location due to increase
> in group descriptor blocks, inode and block bitmaps.
>
> Best way I can reproduce this is to create big enough file system with
> huge amount of inodes and without resize_inode
>
> mke2fs -F -b 1024 -i 1024 -O ^resize_inode -t ext4 /dev/loop0 1024000
> resize2fs /dev/loop0 10240000
>
> e2fsck -fn /dev/loop0 | less
> e2fsck 1.43.5 (04-Aug-2017)
> ext2fs_check_desc: Corrupt group descriptor: bad block for inode table
> e2fsck: Group descriptors look bad... trying backup blocks...
> e2fsck: The journal superblock is corrupt while checking journal for /dev/loop0
> e2fsck: Cannot proceed with file system check
> Superblock has an invalid journal (inode 8).
> Clear? no
>
> /dev/loop0: ********** WARNING: Filesystem still has errors **********
>
> None of the settings are strictly necessary and it can be reproducer in
> various ways. This is just an example of one easy way to reproduce this.
>
> This bug was introduced with commit fccdbac39454 ("libext2fs: optimize
> ext2fs_allocate_group_table()") and is caused by the fact that wrong
> bitmap is used to mark the blocks as used.
>
> Fix this by using ext2fs_mark_block_bitmap_range2() in both (flex_bg and
> non flex_bg) cases and handle flex_bg case manually instead of relying
> on ext2fs_block_alloc_stats_range() because there is no way in that
> function to use different bitmap than fs->block_map.
ping
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
> lib/ext2fs/alloc_tables.c | 32 ++++++++++++++++++++++++++------
> 1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ext2fs/alloc_tables.c b/lib/ext2fs/alloc_tables.c
> index da0b15b..407283f 100644
> --- a/lib/ext2fs/alloc_tables.c
> +++ b/lib/ext2fs/alloc_tables.c
> @@ -222,12 +222,32 @@ errcode_t ext2fs_allocate_group_table(ext2_filsys fs, dgrp_t group,
> bmap, &new_blk);
> if (retval)
> return retval;
> - if (flexbg_size)
> - ext2fs_block_alloc_stats_range(fs, new_blk,
> - fs->inode_blocks_per_group, +1);
> - else
> - ext2fs_mark_block_bitmap_range2(fs->block_map,
> - new_blk, fs->inode_blocks_per_group);
> +
> + ext2fs_mark_block_bitmap_range2(bmap,
> + new_blk, fs->inode_blocks_per_group);
> + if (flexbg_size) {
> + blk64_t num, blk;
> + num = fs->inode_blocks_per_group;
> + blk = new_blk;
> + while (num) {
> + int gr = ext2fs_group_of_blk2(fs, blk);
> + last_blk = ext2fs_group_last_block2(fs, gr);
> + blk64_t n = num;
> +
> + if (blk + num > last_blk)
> + n = last_blk - blk + 1;
> +
> + ext2fs_bg_free_blocks_count_set(fs, gr,
> + ext2fs_bg_free_blocks_count(fs, gr) -
> + n/EXT2FS_CLUSTER_RATIO(fs));
> + ext2fs_bg_flags_clear(fs, gr,
> + EXT2_BG_BLOCK_UNINIT);
> + ext2fs_group_desc_csum_set(fs, gr);
> + ext2fs_free_blocks_count_add(fs->super, -n);
> + blk += n;
> + num -= n;
> + }
> + }
> ext2fs_inode_table_loc_set(fs, group, new_blk);
> }
> ext2fs_group_desc_csum_set(fs, group);
> --
> 2.7.5
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks
2017-08-24 13:35 [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
2017-08-24 13:35 ` [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set Lukas Czerner
2017-09-12 11:20 ` [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
@ 2017-10-12 13:54 ` Lukas Czerner
2017-10-14 14:50 ` [1/2] " Theodore Ts'o
3 siblings, 0 replies; 6+ messages in thread
From: Lukas Czerner @ 2017-10-12 13:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso
Hi Ted,
this patchset is sitting here with no response on the list for almost
two months now. Do yo have any comments ? What I can do to move this
forward ? We have real customers hitting this issue and given our
"upstream first" policy we're waiting for your response.
Thanks!
-Lukas
On Thu, Aug 24, 2017 at 03:35:38PM +0200, Lukas Czerner wrote:
> Currently it's possible for ext2fs_allocate_group_table() to place inode
> tables to blocks that are already occupied by different inode table.
> This can be reproduced by resize2fs on the file system where we need to
> move more than one inode table to a different location due to increase
> in group descriptor blocks, inode and block bitmaps.
>
> Best way I can reproduce this is to create big enough file system with
> huge amount of inodes and without resize_inode
>
> mke2fs -F -b 1024 -i 1024 -O ^resize_inode -t ext4 /dev/loop0 1024000
> resize2fs /dev/loop0 10240000
>
> e2fsck -fn /dev/loop0 | less
> e2fsck 1.43.5 (04-Aug-2017)
> ext2fs_check_desc: Corrupt group descriptor: bad block for inode table
> e2fsck: Group descriptors look bad... trying backup blocks...
> e2fsck: The journal superblock is corrupt while checking journal for /dev/loop0
> e2fsck: Cannot proceed with file system check
> Superblock has an invalid journal (inode 8).
> Clear? no
>
> /dev/loop0: ********** WARNING: Filesystem still has errors **********
>
> None of the settings are strictly necessary and it can be reproducer in
> various ways. This is just an example of one easy way to reproduce this.
>
> This bug was introduced with commit fccdbac39454 ("libext2fs: optimize
> ext2fs_allocate_group_table()") and is caused by the fact that wrong
> bitmap is used to mark the blocks as used.
>
> Fix this by using ext2fs_mark_block_bitmap_range2() in both (flex_bg and
> non flex_bg) cases and handle flex_bg case manually instead of relying
> on ext2fs_block_alloc_stats_range() because there is no way in that
> function to use different bitmap than fs->block_map.
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
> lib/ext2fs/alloc_tables.c | 32 ++++++++++++++++++++++++++------
> 1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ext2fs/alloc_tables.c b/lib/ext2fs/alloc_tables.c
> index da0b15b..407283f 100644
> --- a/lib/ext2fs/alloc_tables.c
> +++ b/lib/ext2fs/alloc_tables.c
> @@ -222,12 +222,32 @@ errcode_t ext2fs_allocate_group_table(ext2_filsys fs, dgrp_t group,
> bmap, &new_blk);
> if (retval)
> return retval;
> - if (flexbg_size)
> - ext2fs_block_alloc_stats_range(fs, new_blk,
> - fs->inode_blocks_per_group, +1);
> - else
> - ext2fs_mark_block_bitmap_range2(fs->block_map,
> - new_blk, fs->inode_blocks_per_group);
> +
> + ext2fs_mark_block_bitmap_range2(bmap,
> + new_blk, fs->inode_blocks_per_group);
> + if (flexbg_size) {
> + blk64_t num, blk;
> + num = fs->inode_blocks_per_group;
> + blk = new_blk;
> + while (num) {
> + int gr = ext2fs_group_of_blk2(fs, blk);
> + last_blk = ext2fs_group_last_block2(fs, gr);
> + blk64_t n = num;
> +
> + if (blk + num > last_blk)
> + n = last_blk - blk + 1;
> +
> + ext2fs_bg_free_blocks_count_set(fs, gr,
> + ext2fs_bg_free_blocks_count(fs, gr) -
> + n/EXT2FS_CLUSTER_RATIO(fs));
> + ext2fs_bg_flags_clear(fs, gr,
> + EXT2_BG_BLOCK_UNINIT);
> + ext2fs_group_desc_csum_set(fs, gr);
> + ext2fs_free_blocks_count_add(fs->super, -n);
> + blk += n;
> + num -= n;
> + }
> + }
> ext2fs_inode_table_loc_set(fs, group, new_blk);
> }
> ext2fs_group_desc_csum_set(fs, group);
> --
> 2.7.5
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [1/2] libext2fs: Prevent allocating inode table from already used blocks
2017-08-24 13:35 [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
` (2 preceding siblings ...)
2017-10-12 13:54 ` Lukas Czerner
@ 2017-10-14 14:50 ` Theodore Ts'o
3 siblings, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2017-10-14 14:50 UTC (permalink / raw)
To: Lukas Czerner; +Cc: linux-ext4
On Thu, Aug 24, 2017 at 03:35:38PM +0200, Lukas Czerner wrote:
> Currently it's possible for ext2fs_allocate_group_table() to place inode
> tables to blocks that are already occupied by different inode table.
> This can be reproduced by resize2fs on the file system where we need to
> move more than one inode table to a different location due to increase
> in group descriptor blocks, inode and block bitmaps.
>
> Best way I can reproduce this is to create big enough file system with
> huge amount of inodes and without resize_inode
>
> mke2fs -F -b 1024 -i 1024 -O ^resize_inode -t ext4 /dev/loop0 1024000
> resize2fs /dev/loop0 10240000
>
> e2fsck -fn /dev/loop0 | less
> e2fsck 1.43.5 (04-Aug-2017)
> ext2fs_check_desc: Corrupt group descriptor: bad block for inode table
> e2fsck: Group descriptors look bad... trying backup blocks...
> e2fsck: The journal superblock is corrupt while checking journal for /dev/loop0
> e2fsck: Cannot proceed with file system check
> Superblock has an invalid journal (inode 8).
> Clear? no
>
> /dev/loop0: ********** WARNING: Filesystem still has errors **********
>
> None of the settings are strictly necessary and it can be reproducer in
> various ways. This is just an example of one easy way to reproduce this.
>
> This bug was introduced with commit fccdbac39454 ("libext2fs: optimize
> ext2fs_allocate_group_table()") and is caused by the fact that wrong
> bitmap is used to mark the blocks as used.
>
> Fix this by using ext2fs_mark_block_bitmap_range2() in both (flex_bg and
> non flex_bg) cases and handle flex_bg case manually instead of relying
> on ext2fs_block_alloc_stats_range() because there is no way in that
> function to use different bitmap than fs->block_map.
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Thanks, applied. Apologies for taking so long to apply this.
- Ted
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set
2017-08-24 13:35 ` [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set Lukas Czerner
@ 2017-10-14 14:50 ` Theodore Ts'o
0 siblings, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2017-10-14 14:50 UTC (permalink / raw)
To: Lukas Czerner; +Cc: linux-ext4
On Thu, Aug 24, 2017 at 03:35:39PM +0200, Lukas Czerner wrote:
> Currently some stride optimization is done in
> ext2fs_allocate_group_table() by adjusting start_blk block where we
> start allocating block, or inode bitmaps.
>
> However in flex_bg case this is currently useless since the values are
> going to be overridden anyway. Moreover in flex_bg case the group might
> already be full and the stride optimization will fail. As a result file
> system resize might fail needlessly in some situations.
>
> It can be shown by this example:
>
> mke2fs -b 1024 -i 1024 -E stride=8192 -t ext4 /dev/loop0 1024000
> resize2fs /dev/loop0 102400000
> resize2fs 1.43.5 (04-Aug-2017)
> Resizing the filesystem on /dev/loop0 to 102400000 (1k) blocks.
> ./resize/resize2fs: Could not allocate block in ext2 filesystem while trying to resize /dev/loop0
> Please run 'e2fsck -fy /dev/loop0' to fix the filesystem
> after the aborted resize operation.
>
> Fix this by not doing the stride adjustment in case of flex_bg.
>
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Thanks, applied.
- Ted
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-10-14 14:51 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-24 13:35 [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
2017-08-24 13:35 ` [PATCH 2/2] libext2fs: Skip start_blk adjustment when stride and flex_bg is set Lukas Czerner
2017-10-14 14:50 ` [2/2] " Theodore Ts'o
2017-09-12 11:20 ` [PATCH 1/2] libext2fs: Prevent allocating inode table from already used blocks Lukas Czerner
2017-10-12 13:54 ` Lukas Czerner
2017-10-14 14:50 ` [1/2] " Theodore Ts'o
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).