From: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>,
syzbot+f3185be57d7e8dda32b8@syzkaller.appspotmail.com,
Ahmet Eray Karadag <eraykrdg1@gmail.com>,
linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] ext4: synchronize free block counter when detecting corruption
Date: Tue, 11 Nov 2025 03:45:37 -0500 [thread overview]
Message-ID: <aRL3sT7AbUgZTbem@arch-box> (raw)
In-Reply-To: <20251106153035.GA3125470@mit.edu>
Hey Ted, Thanks for the feedback.
On Thu, Nov 06, 2025 at 10:30:35AM -0500, Theodore Ts'o wrote:
> On Fri, Oct 10, 2025 at 03:38:00AM -0400, Albin Babu Varghese wrote:
> > When ext4_mb_generate_buddy() detects block group descriptor
> > corruption (free block count mismatch between descriptor and
> > bitmap), it corrects the in-memory group descriptor (grp->bb_free)
> > but does not synchronize the percpu free clusters counter.
>
> Actually, we do. This happens in ext4_mark_group_bitmap_corrupted in
> fs/ext4/super.c.
>
> if (flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) {
> ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT,
> &grp->bb_state);
> if (!ret)
> percpu_counter_sub(&sbi->s_freeclusters_counter,
> grp->bb_free);
> }
>
> So we've *already* subtracted out the blocks that were in the block
> group which we've busied out.
Thanks for pointing that out. It was naive of me to overlook the other
occurences.
> > This causes delayed allocation to read stale counter values when
> > checking for available space. The allocator believes space is
> > available based on the stale counter, makes reservation promises,
> > but later fails during writeback when trying to allocate actual
> > blocks from the bitmap. This results in "Delayed block allocation
> > failed" errors and potential system crashes.
>
> I suspect there is something else going on with s_freeclusters_counter
> being incorrect, but adding an additional correction to
> s_freeclusters_counter is not the answer.
>
> How is the system crashing? If we have errors=continue, then we
> really shouldn't let the system crash. If there is delayed allocation
> failures, the user might lose data, but if the user really cares about
> that, they shouldn't be using errors=continue.
I think the existing check in `ext4_mb_generate_buddy` is for runtime errors,
and the issue here is happening at mount time due to an already corrupted
filesystem. The value of `grp->bb_free` and `s_freeclusters_counter` was
`150994969` vs `25`, which is the actual free clusters count. The existing
update call subtracts `grp->bb_free` from `s_freeclusters_counter` assuming
that the group descriptor is accurate, but in this case it is not. So we
still end up with an incorrect global counter.
I tried the patch here:
https://syzkaller.appspot.com/text?tag=Patch&x=1771a7cd980000
In that version, I attempted to compute and pass the corrected value to the
update function, but it failed with the warning at `ext4_dirty_folio`:
https://syzkaller.appspot.com/x/report.txt?x=1306a7cd980000
From what I understand, even after adjusting the counter, the dirty buffers had
already been created, so it returns an error that unmapped dirty buffers
remain.
My earlier fix ended up making `s_freeclusters_counter` become 0 due to the
update in `ext4_mark_group_bitmap_corrupted()` that I had overlooked. As a
result, no warnings or errors were triggered at that time.
I might be off here, and I’m not sure how best to proceed.
Best,
Albin
prev parent reply other threads:[~2025-11-11 8:45 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-10 7:38 [PATCH v2] ext4: synchronize free block counter when detecting corruption Albin Babu Varghese
2025-11-06 15:30 ` Theodore Ts'o
2025-11-11 8:45 ` Albin Babu Varghese [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aRL3sT7AbUgZTbem@arch-box \
--to=albinbabuvarghese20@gmail.com \
--cc=adilger.kernel@dilger.ca \
--cc=eraykrdg1@gmail.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=syzbot+f3185be57d7e8dda32b8@syzkaller.appspotmail.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).