From: "yebin (H)" <yebin10@huawei.com>
To: Baokun Li <libaokun@linux.alibaba.com>, Ye Bin <yebin@huaweicloud.com>
Cc: <jack@suse.cz>, <tytso@mit.edu>, <adilger.kernel@dilger.ca>,
<linux-ext4@vger.kernel.org>
Subject: Re: [PATCH] ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
Date: Tue, 3 Mar 2026 15:55:49 +0800
Message-ID: <69A69405.3050109@huawei.com>
In-Reply-To: <80be4b7a-976f-44cc-a50d-66a0e9ed05a0@linux.alibaba.com>
On 2026/3/3 10:36, Baokun Li wrote:
>
> On 3/2/26 9:46 PM, Ye Bin wrote:
>> From: Ye Bin <yebin10@huawei.com>
>>
>> There's an issue as follows:
>> ...
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
>> EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
>>
>> EXT4-fs (mmcblk0p1): error count since last fsck: 1
>> EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
>> EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
>> ...
>>
>> According to the log analysis, blocks are always requested from the
>> corrupted block group. This may happen as follows:
>> ext4_mb_find_by_goal
>>   ext4_mb_load_buddy
>>     ext4_mb_load_buddy_gfp
>>       ext4_mb_init_cache
>>         ext4_read_block_bitmap_nowait
>>         ext4_wait_block_bitmap
>>           ext4_validate_block_bitmap
>>             if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
>>               return -EFSCORRUPTED; // There's no logs.
>>   if (err)
>>     return err; // Will return error
>>   ext4_lock_group(ac->ac_sb, group);
>>   if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
>>     goto out;
>>
>> Since commit 9008a58e5dce ("ext4: make the bitmap read routines return
>> real error codes") was merged, commit 163a203ddb36 ("ext4: mark block group
>> as corrupt on block bitmap error") no longer really solves the problem of
>> allocating blocks from corrupted block groups. This is because if
>> 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
>> 'ext4_mb_load_buddy()' may return an error, which makes the block
>> allocation fail.
>> Therefore, check whether the block group is corrupted when
>> ext4_mb_load_buddy() returns an error.
>
> Good catch!
>
> Agreed, we should try other groups upon failure unless it's a goal-only
> allocation.
>
> But note that e4b->bd_info might be uninitialized if ext4_mb_load_buddy()
> fails.
>
The situation you mentioned probably cannot happen:

ext4_mb_find_by_goal
  struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
  if (!grp)  // This rules out e4b->bd_info being left uninitialized.
    return -EFSCORRUPTED;
  err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
    ext4_mb_load_buddy_gfp(sb, group, e4b, GFP_NOFS);
      grp = ext4_get_group_info(sb, group);
      if (!grp)  // This condition will probably not be met.
        return -EFSCORRUPTED;
      e4b->bd_info = grp;

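
To make the ordering argument above concrete, here is a minimal userspace sketch, not kernel code. All `model_*` names, the reduced structs and the `MODEL_EFSCORRUPTED` constant (set to 117 to match the error number in the log) are hypothetical stand-ins; the sketch assumes that ext4_mb_load_buddy_gfp() assigns e4b->bd_info right after its own NULL check and before any step that can fail later, as the call chain above describes.

```c
#include <stddef.h>

/* Hypothetical constant for this model; 117 matches the error in the log. */
#define MODEL_EFSCORRUPTED 117

/* Heavily reduced stand-ins for ext4_group_info and ext4_buddy. */
struct model_group_info {
	int bitmap_corrupt;	/* models EXT4_MB_GRP_BBITMAP_CORRUPT(grp) */
};

struct model_buddy {
	struct model_group_info *bd_info;
};

/* Models ext4_get_group_info(): NULL when the group has no info. */
static struct model_group_info *
model_get_group_info(struct model_group_info **table, size_t ngroups,
		     size_t group)
{
	return group < ngroups ? table[group] : NULL;
}

/* Models the start of ext4_mb_load_buddy_gfp(). */
static int model_load_buddy(struct model_group_info **table, size_t ngroups,
			    size_t group, struct model_buddy *e4b)
{
	struct model_group_info *grp;

	grp = model_get_group_info(table, ngroups, group);
	if (!grp)
		return -MODEL_EFSCORRUPTED;

	/* bd_info is assigned before any later step can fail. */
	e4b->bd_info = grp;

	/* Stands in for the bitmap validation failing further down. */
	if (grp->bitmap_corrupt)
		return -MODEL_EFSCORRUPTED;
	return 0;
}

/* Models the top of ext4_mb_find_by_goal(). */
static int model_find_by_goal(struct model_group_info **table, size_t ngroups,
			      size_t group, struct model_buddy *e4b)
{
	/* The caller's own NULL check runs before the buddy load. */
	if (!model_get_group_info(table, ngroups, group))
		return -MODEL_EFSCORRUPTED;
	return model_load_buddy(table, ngroups, group, e4b);
}
```

In this model, whenever model_load_buddy() is reached and fails, e4b->bd_info already points at the group info, so testing the corrupt flag on it in the error path is safe. If the real function could fail before the assignment for a non-NULL group, the argument would not hold.
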
> I think we can optimize this in ext4_mb_regular_allocator(): we can record
> the error from ext4_mb_find_by_goal() but avoid an early exit.
>
> Specifically, after checking that EXT4_MB_HINT_GOAL_ONLY is not set,
> we can assign the error to ac->ac_first_err. This way, if subsequent
> allocation attempts still fail, we can preserve the original error.
>
>
> Cheers,
> Baokun
>
>> Fixes: 163a203ddb36 ("ext4: mark block group as corrupt on block bitmap error")
>> Fixes: 9008a58e5dce ("ext4: make the bitmap read routines return real error codes")
>> Signed-off-by: Ye Bin <yebin10@huawei.com>
>> ---
>> fs/ext4/mballoc.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>> index e2341489f4d0..ffa6886de8a3 100644
>> --- a/fs/ext4/mballoc.c
>> +++ b/fs/ext4/mballoc.c
>> @@ -2443,8 +2443,12 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
>> return 0;
>>
>> err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
>> - if (err)
>> + if (err) {
>> + if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) &&
>> + !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
>> + return 0;
>> return err;
>> + }
>>
>> ext4_lock_group(ac->ac_sb, group);
>> if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)))
>
>
> .
>
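
For comparison, the two approaches discussed in this thread can be sketched side by side in userspace C. This is an illustrative model under stated assumptions, not the actual patch: every `model_*` name, the flag value and the reduced allocation context are hypothetical, and the group scan is collapsed into a simple loop. The first function mirrors the error path from the diff above; the second mirrors the suggestion to record the goal error in ac->ac_first_err and keep scanning.

```c
#include <stdbool.h>

/* Hypothetical constants for this model only. */
#define MODEL_EFSCORRUPTED	117
#define MODEL_HINT_GOAL_ONLY	0x1u

/* Reduced stand-in for ext4_allocation_context. */
struct model_ac {
	unsigned int flags;
	int first_err;		/* models ac->ac_first_err */
};

/*
 * Error path as in the diff: when the buddy load fails on a corrupted
 * group and the request is not goal-only, return 0 so that the caller
 * goes on to scan other groups.
 */
static int model_goal_load_failed(const struct model_ac *ac,
				  bool group_corrupt, int err)
{
	if (group_corrupt && !(ac->flags & MODEL_HINT_GOAL_ONLY))
		return 0;
	return err;
}

/*
 * Suggested variant in the caller: remember the goal error, keep
 * scanning, and report the remembered error only if nothing is found.
 */
static int model_regular_allocator(struct model_ac *ac, int goal_err,
				   const bool *group_ok, int ngroups,
				   int *found)
{
	int i;

	*found = -1;
	if (goal_err) {
		if (ac->flags & MODEL_HINT_GOAL_ONLY)
			return goal_err;
		if (!ac->first_err)
			ac->first_err = goal_err;
	}

	for (i = 0; i < ngroups; i++) {
		if (group_ok[i]) {
			*found = i;
			return 0;
		}
	}
	return ac->first_err;
}
```

The difference between the two is where the error is absorbed: the first hides it inside the goal lookup, while the second keeps it available to the caller so the original failure is still reported when every other group fails as well.
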