* Re: [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
2026-03-02 13:46 [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal() Ye Bin
@ 2026-03-02 16:27 ` Jan Kara
2026-03-02 19:41 ` Andreas Dilger
` (4 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2026-03-02 16:27 UTC (permalink / raw)
To: Ye Bin; +Cc: tytso, adilger.kernel, linux-ext4, jack
On Mon 02-03-26 21:46:19, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
>
> There's issue as follows:
> ...
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): error count since last fsck: 1
> EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
> EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
> ...
>
> According to the log analysis, blocks are always requested from the
> corrupted block group. This may happen as follows:
> ext4_mb_find_by_goal
> ext4_mb_load_buddy
> ext4_mb_load_buddy_gfp
> ext4_mb_init_cache
> ext4_read_block_bitmap_nowait
> ext4_wait_block_bitmap
> ext4_validate_block_bitmap
> if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
> return -EFSCORRUPTED; // There's no logs.
> if (err)
> return err; // Will return error
> ext4_lock_group(ac->ac_sb, group);
> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
> goto out;
>
> After commit 9008a58e5dce ("ext4: make the bitmap read routines return
> real error codes") merged, Commit 163a203ddb36 ("ext4: mark block group
> as corrupt on block bitmap error") is no real solution for allocating
> blocks from corrupted block groups. This is because if
> 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
> 'ext4_mb_load_buddy()' may return an error. This means that the block
> allocation will fail.
> Therefore, check block group if corrupted when ext4_mb_load_buddy()
> returns error.
>
> Fixes: 163a203ddb36 ("ext4: mark block group as corrupt on block bitmap error")
> Fixes: 9008a58e5dce ("ext4: make the bitmap read routines return real error codes")
> Signed-off-by: Ye Bin <yebin10@huawei.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/mballoc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index e2341489f4d0..ffa6886de8a3 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2443,8 +2443,12 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
> return 0;
>
> err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
> - if (err)
> + if (err) {
> + if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) &&
> + !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
> + return 0;
> return err;
> + }
>
> ext4_lock_group(ac->ac_sb, group);
> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)))
> --
> 2.34.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
2026-03-02 13:46 [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal() Ye Bin
2026-03-02 16:27 ` Jan Kara
@ 2026-03-02 19:41 ` Andreas Dilger
2026-03-03 2:36 ` Baokun Li
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Andreas Dilger @ 2026-03-02 19:41 UTC (permalink / raw)
To: Ye Bin; +Cc: tytso, linux-ext4, jack
On Mar 2, 2026, at 06:46, Ye Bin <yebin@huaweicloud.com> wrote:
>
> From: Ye Bin <yebin10@huawei.com>
>
> There's issue as follows:
> ...
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): error count since last fsck: 1
> EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
> EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
> ...
>
> According to the log analysis, blocks are always requested from the
> corrupted block group. This may happen as follows:
> ext4_mb_find_by_goal
> ext4_mb_load_buddy
> ext4_mb_load_buddy_gfp
> ext4_mb_init_cache
> ext4_read_block_bitmap_nowait
> ext4_wait_block_bitmap
> ext4_validate_block_bitmap
> if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
> return -EFSCORRUPTED; // There's no logs.
> if (err)
> return err; // Will return error
> ext4_lock_group(ac->ac_sb, group);
> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
> goto out;
>
> After commit 9008a58e5dce ("ext4: make the bitmap read routines return
> real error codes") merged, Commit 163a203ddb36 ("ext4: mark block group
> as corrupt on block bitmap error") is no real solution for allocating
> blocks from corrupted block groups. This is because if
> 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
> 'ext4_mb_load_buddy()' may return an error. This means that the block
> allocation will fail.
> Therefore, check block group if corrupted when ext4_mb_load_buddy()
> returns error.
>
> Fixes: 163a203ddb36 ("ext4: mark block group as corrupt on block bitmap error")
> Fixes: 9008a58e5dce ("ext4: make the bitmap read routines return real error codes")
> Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
> ---
> fs/ext4/mballoc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index e2341489f4d0..ffa6886de8a3 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2443,8 +2443,12 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
> return 0;
>
> err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
> - if (err)
> + if (err) {
> + if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) &&
> + !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
> + return 0;
> return err;
> + }
>
> ext4_lock_group(ac->ac_sb, group);
> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)))
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
2026-03-02 13:46 [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal() Ye Bin
2026-03-02 16:27 ` Jan Kara
2026-03-02 19:41 ` Andreas Dilger
@ 2026-03-03 2:36 ` Baokun Li
2026-03-03 7:55 ` yebin (H)
2026-03-07 8:07 ` Zhang Yi
` (2 subsequent siblings)
5 siblings, 1 reply; 8+ messages in thread
From: Baokun Li @ 2026-03-03 2:36 UTC (permalink / raw)
To: Ye Bin; +Cc: jack, tytso, adilger.kernel, linux-ext4
On 3/2/26 9:46 PM, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
>
> There's issue as follows:
> ...
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): error count since last fsck: 1
> EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
> EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
> ...
>
> According to the log analysis, blocks are always requested from the
> corrupted block group. This may happen as follows:
> ext4_mb_find_by_goal
> ext4_mb_load_buddy
> ext4_mb_load_buddy_gfp
> ext4_mb_init_cache
> ext4_read_block_bitmap_nowait
> ext4_wait_block_bitmap
> ext4_validate_block_bitmap
> if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
> return -EFSCORRUPTED; // There's no logs.
> if (err)
> return err; // Will return error
> ext4_lock_group(ac->ac_sb, group);
> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
> goto out;
>
> After commit 9008a58e5dce ("ext4: make the bitmap read routines return
> real error codes") merged, Commit 163a203ddb36 ("ext4: mark block group
> as corrupt on block bitmap error") is no real solution for allocating
> blocks from corrupted block groups. This is because if
> 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
> 'ext4_mb_load_buddy()' may return an error. This means that the block
> allocation will fail.
> Therefore, check block group if corrupted when ext4_mb_load_buddy()
> returns error.
Good catch!
Agreed, we should try other groups upon failure unless it's a goal-only
allocation.
But note that e4b->bd_info might be uninitialized if ext4_mb_load_buddy()
fails.
I think we can optimize this in ext4_mb_regular_allocator(): we can record
the error from ext4_mb_find_by_goal() but avoid an early exit.
Specifically, after checking that EXT4_MB_HINT_GOAL_ONLY is not set,
we can assign the error to ac->ac_first_err. This way, if subsequent
allocation attempts still fail, we can preserve the original error.
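A rough user-space sketch of that idea follows. It is not kernel code: every model_* name, the flag value, and the struct layouts are hypothetical stand-ins, and only the first_err field is meant to mirror ac->ac_first_err. The goal group is tried first; unless the request is goal-only, a failure there is recorded and the scan continues over the other groups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define MODEL_EFSCORRUPTED   117   /* same number as "error 117" in the log */
#define MODEL_HINT_GOAL_ONLY 0x1u  /* arbitrary bit, models EXT4_MB_HINT_GOAL_ONLY */

struct model_group {
	bool bbitmap_corrupt;      /* models EXT4_MB_GRP_BBITMAP_CORRUPT() */
	unsigned int free_blocks;
};

struct model_ac {
	unsigned int flags;
	int first_err;             /* models ac->ac_first_err */
	int found_group;           /* -1 until a group satisfies the request */
};

/* Returns <0 on error, 1 if the group can satisfy the request, 0 otherwise. */
static int model_try_group(const struct model_group *grp)
{
	if (grp->bbitmap_corrupt)
		return -MODEL_EFSCORRUPTED;
	return grp->free_blocks > 0 ? 1 : 0;
}

/*
 * Try the goal group first (caller guarantees goal < ngroups). On failure,
 * record the error instead of exiting early, unless the request is goal-only.
 * Returns 0 on success or when no error was seen (check found_group),
 * otherwise the first error recorded.
 */
static int model_regular_allocator(struct model_ac *ac,
				   const struct model_group *groups,
				   size_t ngroups, size_t goal)
{
	int ret;

	ac->first_err = 0;
	ac->found_group = -1;

	ret = model_try_group(&groups[goal]);
	if (ret > 0) {
		ac->found_group = (int)goal;
		return 0;
	}
	if (ac->flags & MODEL_HINT_GOAL_ONLY)
		return ret;            /* goal-only: no fallback to other groups */
	if (ret < 0)
		ac->first_err = ret;   /* remember the error, keep going */

	for (size_t i = 0; i < ngroups; i++) {
		if (i == goal)
			continue;
		ret = model_try_group(&groups[i]);
		if (ret > 0) {
			ac->found_group = (int)i;
			return 0;
		}
		if (ret < 0 && !ac->first_err)
			ac->first_err = ret;
	}
	return ac->first_err;          /* original error survives a failed scan */
}

/* Small usage check: corrupted goal group 0, healthy group 1. */
void model_selfcheck(void)
{
	const struct model_group groups[2] = { { true, 0 }, { false, 8 } };
	struct model_ac ac = { 0, 0, -1 };

	assert(model_regular_allocator(&ac, groups, 2, 0) == 0);
	assert(ac.found_group == 1);
}
```

In this model the caller still sees the recorded error in first_err even when a later group succeeds, which is one way to keep the diagnostic without failing the allocation.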
Cheers,
Baokun
> Fixes: 163a203ddb36 ("ext4: mark block group as corrupt on block bitmap error")
> Fixes: 9008a58e5dce ("ext4: make the bitmap read routines return real error codes")
> Signed-off-by: Ye Bin <yebin10@huawei.com>
> ---
> fs/ext4/mballoc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index e2341489f4d0..ffa6886de8a3 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2443,8 +2443,12 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
> return 0;
>
> err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
> - if (err)
> + if (err) {
> + if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) &&
> + !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
> + return 0;
> return err;
> + }
>
> ext4_lock_group(ac->ac_sb, group);
> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)))
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
2026-03-03 2:36 ` Baokun Li
@ 2026-03-03 7:55 ` yebin (H)
0 siblings, 0 replies; 8+ messages in thread
From: yebin (H) @ 2026-03-03 7:55 UTC (permalink / raw)
To: Baokun Li, Ye Bin; +Cc: jack, tytso, adilger.kernel, linux-ext4
On 2026/3/3 10:36, Baokun Li wrote:
>
> On 3/2/26 9:46 PM, Ye Bin wrote:
>> From: Ye Bin <yebin10@huawei.com>
>>
>> There's issue as follows:
>> ...
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): error count since last fsck: 1
>> EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
>> EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
>> ...
>>
>> According to the log analysis, blocks are always requested from the
>> corrupted block group. This may happen as follows:
>> ext4_mb_find_by_goal
>> ext4_mb_load_buddy
>> ext4_mb_load_buddy_gfp
>> ext4_mb_init_cache
>> ext4_read_block_bitmap_nowait
>> ext4_wait_block_bitmap
>> ext4_validate_block_bitmap
>> if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
>> return -EFSCORRUPTED; // There's no logs.
>> if (err)
>> return err; // Will return error
>> ext4_lock_group(ac->ac_sb, group);
>> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
>> goto out;
>>
>> After commit 9008a58e5dce ("ext4: make the bitmap read routines return
>> real error codes") merged, Commit 163a203ddb36 ("ext4: mark block group
>> as corrupt on block bitmap error") is no real solution for allocating
>> blocks from corrupted block groups. This is because if
>> 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
>> 'ext4_mb_load_buddy()' may return an error. This means that the block
>> allocation will fail.
>> Therefore, check block group if corrupted when ext4_mb_load_buddy()
>> returns error.
>
> Good catch!
>
> Agreed, we should try other groups upon failure unless it's a goal-only
> allocation.
>
> But note that e4b->bd_info might be uninitialized if ext4_mb_load_buddy()
> fails.
>
The situation you mentioned probably doesn't exist.
ext4_mb_find_by_goal
struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
if (!grp) // The possibility that e4b->bd_info is not initialized has been avoided.
return -EFSCORRUPTED;
err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_NOFS);
grp = ext4_get_group_info(sb, group);
if (!grp) // This condition probably will not be met.
return -EFSCORRUPTED;
e4b->bd_info = grp;
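The ordering argument above can be modelled in user space as below. This is only a sketch under the assumption that bd_info is assigned before the bitmap is read; the model_* names and struct layouts are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EFSCORRUPTED 117  /* hypothetical stand-in for EFSCORRUPTED */

struct model_group_info {
	bool bbitmap_corrupt;   /* models EXT4_MB_GRP_BBITMAP_CORRUPT() */
};

struct model_e4b {
	struct model_group_info *bd_info;  /* models e4b->bd_info */
};

/* Models ext4_mb_load_buddy_gfp(): bd_info is set before the bitmap read. */
static int model_load_buddy(struct model_group_info *grp, struct model_e4b *e4b)
{
	if (!grp)
		return -MODEL_EFSCORRUPTED;  /* bd_info left untouched here */
	e4b->bd_info = grp;                  /* assumed to precede the bitmap read */
	if (grp->bbitmap_corrupt)
		return -MODEL_EFSCORRUPTED;  /* models the bitmap validation failure */
	return 0;
}

/* Models ext4_mb_find_by_goal(): the NULL group is rejected up front. */
static int model_find_by_goal(struct model_group_info *grp, struct model_e4b *e4b)
{
	if (!grp)
		return -MODEL_EFSCORRUPTED;
	return model_load_buddy(grp, e4b);
}

/* Usage check: after a corrupted-bitmap failure, bd_info is still valid. */
void model_selfcheck(void)
{
	struct model_group_info grp = { true };
	struct model_e4b e4b = { NULL };

	assert(model_find_by_goal(&grp, &e4b) == -MODEL_EFSCORRUPTED);
	assert(e4b.bd_info == &grp);
}
```

Under that assumption, the only failure that leaves bd_info unset is the NULL group case, which the caller has already excluded.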
> I think we can optimize this in ext4_mb_regular_allocator(): we can record
> the error from ext4_mb_find_by_goal() but avoid an early exit.
>
> Specifically, after checking that EXT4_MB_HINT_GOAL_ONLY is not set,
> we can assign the error to ac->ac_first_err. This way, if subsequent
> allocation attempts still fail, we can preserve the original.
>
>
> Cheers,
> Baokun
>
>> Fixes: 163a203ddb36 ("ext4: mark block group as corrupt on block bitmap error")
>> Fixes: 9008a58e5dce ("ext4: make the bitmap read routines return real error codes")
>> Signed-off-by: Ye Bin <yebin10@huawei.com>
>> ---
>> fs/ext4/mballoc.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>> index e2341489f4d0..ffa6886de8a3 100644
>> --- a/fs/ext4/mballoc.c
>> +++ b/fs/ext4/mballoc.c
>> @@ -2443,8 +2443,12 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
>> return 0;
>>
>> err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
>> - if (err)
>> + if (err) {
>> + if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) &&
>> + !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
>> + return 0;
>> return err;
>> + }
>>
>> ext4_lock_group(ac->ac_sb, group);
>> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)))
>
>
> .
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
2026-03-02 13:46 [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal() Ye Bin
` (2 preceding siblings ...)
2026-03-03 2:36 ` Baokun Li
@ 2026-03-07 8:07 ` Zhang Yi
2026-03-14 8:39 ` Ritesh Harjani
2026-03-27 4:06 ` Theodore Ts'o
5 siblings, 0 replies; 8+ messages in thread
From: Zhang Yi @ 2026-03-07 8:07 UTC (permalink / raw)
To: Ye Bin, tytso, adilger.kernel, linux-ext4; +Cc: jack
On 3/2/2026 9:46 PM, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
>
> There's issue as follows:
> ...
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): error count since last fsck: 1
> EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
> EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
> ...
>
> According to the log analysis, blocks are always requested from the
> corrupted block group. This may happen as follows:
> ext4_mb_find_by_goal
> ext4_mb_load_buddy
> ext4_mb_load_buddy_gfp
> ext4_mb_init_cache
> ext4_read_block_bitmap_nowait
> ext4_wait_block_bitmap
> ext4_validate_block_bitmap
> if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
> return -EFSCORRUPTED; // There's no logs.
> if (err)
> return err; // Will return error
> ext4_lock_group(ac->ac_sb, group);
> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
> goto out;
>
> After commit 9008a58e5dce ("ext4: make the bitmap read routines return
> real error codes") merged, Commit 163a203ddb36 ("ext4: mark block group
> as corrupt on block bitmap error") is no real solution for allocating
> blocks from corrupted block groups. This is because if
> 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
> 'ext4_mb_load_buddy()' may return an error. This means that the block
> allocation will fail.
> Therefore, check block group if corrupted when ext4_mb_load_buddy()
> returns error.
>
> Fixes: 163a203ddb36 ("ext4: mark block group as corrupt on block bitmap error")
> Fixes: 9008a58e5dce ("ext4: make the bitmap read routines return real error codes")
> Signed-off-by: Ye Bin <yebin10@huawei.com>
Looks good to me.
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
> ---
> fs/ext4/mballoc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index e2341489f4d0..ffa6886de8a3 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2443,8 +2443,12 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
> return 0;
>
> err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
> - if (err)
> + if (err) {
> + if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) &&
> + !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
> + return 0;
> return err;
> + }
>
> ext4_lock_group(ac->ac_sb, group);
> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)))
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
2026-03-02 13:46 [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal() Ye Bin
` (3 preceding siblings ...)
2026-03-07 8:07 ` Zhang Yi
@ 2026-03-14 8:39 ` Ritesh Harjani
2026-03-27 4:06 ` Theodore Ts'o
5 siblings, 0 replies; 8+ messages in thread
From: Ritesh Harjani @ 2026-03-14 8:39 UTC (permalink / raw)
To: Ye Bin, tytso, adilger.kernel, linux-ext4; +Cc: jack
Ye Bin <yebin@huaweicloud.com> writes:
> From: Ye Bin <yebin10@huawei.com>
>
> There's issue as follows:
> ...
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): error count since last fsck: 1
> EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
> EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
> ...
>
> According to the log analysis, blocks are always requested from the
> corrupted block group. This may happen as follows:
> ext4_mb_find_by_goal
> ext4_mb_load_buddy
> ext4_mb_load_buddy_gfp
> ext4_mb_init_cache
> ext4_read_block_bitmap_nowait
> ext4_wait_block_bitmap
> ext4_validate_block_bitmap
> if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
> return -EFSCORRUPTED; // There's no logs.
> if (err)
> return err; // Will return error
> ext4_lock_group(ac->ac_sb, group);
> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
> goto out;
>
> After commit 9008a58e5dce ("ext4: make the bitmap read routines return
> real error codes") merged, Commit 163a203ddb36 ("ext4: mark block group
> as corrupt on block bitmap error") is no real solution for allocating
> blocks from corrupted block groups. This is because if
> 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
> 'ext4_mb_load_buddy()' may return an error. This means that the block
> allocation will fail.
> Therefore, check block group if corrupted when ext4_mb_load_buddy()
> returns error.
>
> Fixes: 163a203ddb36 ("ext4: mark block group as corrupt on block bitmap error")
> Fixes: 9008a58e5dce ("ext4: make the bitmap read routines return real error codes")
> Signed-off-by: Ye Bin <yebin10@huawei.com>
> ---
> fs/ext4/mballoc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index e2341489f4d0..ffa6886de8a3 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2443,8 +2443,12 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
> return 0;
>
> err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
> - if (err)
> + if (err) {
> + if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) &&
> + !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
> + return 0;
> return err;
> + }
So, if we had to load the buddy info and the group's block bitmap was
marked as corrupted, we always returned an error instead of
searching for free blocks in other block groups (even for non-goal-only
allocations).
This patch fixes that path.
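To make the before/after behaviour concrete, here is a small user-space model of the error path. It is a sketch, not the kernel function: the model_* names, the flag value, and the structs are hypothetical stand-ins for ext4_mb_load_buddy(), EXT4_MB_HINT_GOAL_ONLY, and the group info.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; values are illustrative, not the kernel's. */
#define MODEL_EFSCORRUPTED   117   /* same number as "error 117" in the log */
#define MODEL_HINT_GOAL_ONLY 0x1u  /* arbitrary bit, models EXT4_MB_HINT_GOAL_ONLY */

struct model_group {
	bool bbitmap_corrupt;      /* models EXT4_MB_GRP_BBITMAP_CORRUPT() */
};

struct model_ac {
	unsigned int flags;        /* models ac->ac_flags */
};

/* Models ext4_mb_load_buddy(): fails when the bitmap is marked corrupt. */
static int model_load_buddy(const struct model_group *grp)
{
	return grp->bbitmap_corrupt ? -MODEL_EFSCORRUPTED : 0;
}

/* Before the patch: any load error is propagated to the caller. */
static int model_find_by_goal_old(const struct model_ac *ac,
				  const struct model_group *grp)
{
	int err = model_load_buddy(grp);

	(void)ac;
	if (err)
		return err;
	return 0;
}

/*
 * After the patch: a corrupted goal group is skipped (return 0) so the
 * caller can try other groups, unless the request is goal-only.
 */
static int model_find_by_goal_new(const struct model_ac *ac,
				  const struct model_group *grp)
{
	int err = model_load_buddy(grp);

	if (err) {
		if (grp->bbitmap_corrupt &&
		    !(ac->flags & MODEL_HINT_GOAL_ONLY))
			return 0;
		return err;
	}
	return 0;
}

/* Usage check for a corrupted goal group and a regular allocation. */
void model_selfcheck(void)
{
	const struct model_group bad = { true };
	const struct model_ac normal = { 0 };

	assert(model_find_by_goal_old(&normal, &bad) == -MODEL_EFSCORRUPTED);
	assert(model_find_by_goal_new(&normal, &bad) == 0);
}
```

With a goal-only request the patched model still returns the error, which matches the `EXT4_MB_HINT_GOAL_ONLY` check in the diff.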
Nice catch! Was this happening as part of some xfstests?
Feel free to add:
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
2026-03-02 13:46 [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal() Ye Bin
` (4 preceding siblings ...)
2026-03-14 8:39 ` Ritesh Harjani
@ 2026-03-27 4:06 ` Theodore Ts'o
5 siblings, 0 replies; 8+ messages in thread
From: Theodore Ts'o @ 2026-03-27 4:06 UTC (permalink / raw)
To: adilger.kernel, linux-ext4, Ye Bin; +Cc: Theodore Ts'o, jack
On Mon, 02 Mar 2026 21:46:19 +0800, Ye Bin wrote:
> There's issue as follows:
> ...
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>
> [...]
Applied, thanks!
[1/1] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
commit: 4a1e038b056fca4a9644de1af8009c4980e158e3
Best regards,
--
Theodore Ts'o <tytso@mit.edu>
^ permalink raw reply [flat|nested] 8+ messages in thread