From: "Diangang Li" <lidiangang@bytedance.com>
To: "Andreas Dilger" <adilger@dilger.ca>,
"Diangang Li" <diangangli@gmail.com>
Cc: <tytso@mit.edu>, <linux-ext4@vger.kernel.org>,
<linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<changfengnan@bytedance.com>
Subject: Re: [RFC 1/1] ext4: fail fast on repeated metadata reads after IO failure
Date: Wed, 25 Mar 2026 19:13:21 +0800
Message-ID: <c6f4b982-c6e4-4f77-a16d-0c381c1e25f0@bytedance.com>
In-Reply-To: <B53E253C-F314-4376-BD9D-58867FC8D3F6@dilger.ca>

Hi Andreas,

BH_Read_EIO is cleared on successful read or write.

In practice bad blocks are typically repaired/remapped on write, so we
expect recovery after a successful rewrite. If the block is never
rewritten, repeatedly issuing the same failing read does not help.

We clear the flag on successful reads so the buffer can recover
immediately if the error was transient. Since read-ahead reads are not
blocked, a later successful read-ahead will clear the flag and allow
subsequent synchronous readers to proceed normally.

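To make the flag lifecycle above concrete, here is a minimal user-space
sketch. It is not the patch and not kernel code: struct sim_buffer and the
sim_*() helpers are hypothetical names invented for illustration, and the
assumption that a successful write repairs the simulated block is just a
model of the remap-on-write behaviour described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a buffer head; not kernel code. */
struct sim_buffer {
	bool uptodate;     /* models BH_Uptodate */
	bool read_eio;     /* models the proposed BH_Read_EIO */
	bool media_bad;    /* simulated device state: reads fail while true */
	int  device_reads; /* reads actually submitted to the simulated device */
};

/* Submit one read to the simulated device. */
static int sim_device_read(struct sim_buffer *b)
{
	assert(b != NULL);
	b->device_reads++;
	return b->media_bad ? -EIO : 0;
}

/* Synchronous metadata read: fail fast if an earlier read already failed. */
static int sim_read_sync(struct sim_buffer *b)
{
	int err;

	if (b->uptodate)
		return 0;
	if (b->read_eio)
		return -EIO;	/* no new I/O is submitted */

	err = sim_device_read(b);
	if (err) {
		b->read_eio = true;	/* remember the failure */
		return err;
	}
	b->uptodate = true;
	return 0;
}

/* Read-ahead: not blocked by the flag, never sets it, clears it on success. */
static void sim_read_ahead(struct sim_buffer *b)
{
	if (b->uptodate)
		return;
	if (sim_device_read(b) == 0) {
		b->uptodate = true;
		b->read_eio = false;
	}
}

/* Write: assumed to succeed and to repair/remap the simulated block. */
static int sim_write(struct sim_buffer *b)
{
	b->media_bad = false;	/* assumption: device remaps on write */
	b->uptodate = true;
	b->read_eio = false;
	return 0;
}
```

In this model, a second sim_read_sync() on a buffer that already failed
returns -EIO without touching device_reads, while sim_read_ahead() and
sim_write() remain the two paths that can clear read_eio.
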
Best,
Diangang

On 3/25/26 6:15 PM, Andreas Dilger wrote:
> On Mar 25, 2026, at 03:33, Diangang Li <diangangli@gmail.com> wrote:
>>
>> From: Diangang Li <lidiangang@bytedance.com>
>>
>> ext4 metadata reads serialize on BH_Lock (lock_buffer). If the read fails,
>> the buffer remains !Uptodate. With concurrent callers, each waiter can
>> retry the same failing read after the previous holder drops BH_Lock. This
>> amplifies device retry latency and may trigger hung tasks.
>>
>> In the normal read path the block driver already performs its own retries.
>> Once the retries keep failing, re-submitting the same metadata read from
>> the filesystem just amplifies the latency by serializing waiters on
>> BH_Lock.
>>
>> Remember read failures on buffer_head and fail fast for ext4 metadata reads
>> once a buffer has already failed to read. Clear the flag on successful
>> read/write completion so the buffer can recover. ext4 read-ahead uses
>> ext4_read_bh_nowait(), so it does not set the failure flag and remains
>> best-effort.
>
> Not that the patch is bad, but if the BH_Read_EIO flag is set on a buffer
> and it prevents other tasks from reading that block again, how would the
> buffer ever become Uptodate to clear the flag? There isn't enough state
> in a 1-bit flag to have any kind of expiry and later retry.
>
> Cheers, Andreas