From: "Diangang Li"
To: "Andreas Dilger", "Diangang Li"
Subject: Re: [RFC 1/1] ext4: fail fast on repeated metadata reads after IO failure
Date: Wed, 25 Mar 2026 19:13:21 +0800
X-Mailing-List: linux-fsdevel@vger.kernel.org
References: <20260325093349.630193-1-diangangli@gmail.com> <20260325093349.630193-2-diangangli@gmail.com>

Hi Andreas,

BH_Read_EIO is cleared on successful read or write. In practice bad blocks
are typically repaired/remapped on write, so we expect recovery after a
successful rewrite. If the block is never rewritten, repeatedly issuing the
same failing read does not help.

We clear the flag on successful reads so the buffer can recover immediately
if the error was transient. Since read-ahead reads are not blocked, a later
successful read-ahead will clear the flag and allow subsequent synchronous
readers to proceed normally.

Best,
Diangang

On 3/25/26 6:15 PM, Andreas Dilger wrote:
> On Mar 25, 2026, at 03:33, Diangang Li wrote:
>>
>> From: Diangang Li
>>
>> ext4 metadata reads serialize on BH_Lock (lock_buffer). If the read fails,
>> the buffer remains !Uptodate. With concurrent callers, each waiter can
>> retry the same failing read after the previous holder drops BH_Lock. This
>> amplifies device retry latency and may trigger hung tasks.
>>
>> In the normal read path the block driver already performs its own retries.
>> Once the retries keep failing, re-submitting the same metadata read from
>> the filesystem just amplifies the latency by serializing waiters on
>> BH_Lock.
>>
>> Remember read failures on buffer_head and fail fast for ext4 metadata reads
>> once a buffer has already failed to read. Clear the flag on successful
>> read/write completion so the buffer can recover. ext4 read-ahead uses
>> ext4_read_bh_nowait(), so it does not set the failure flag and remains
>> best-effort.
>
> Not that the patch is bad, but if the BH_Read_EIO flag is set on a buffer
> and it prevents other tasks from reading that block again, how would the
> buffer ever become Uptodate to clear the flag? There isn't enough state
> in a 1-bit flag to have any kind of expiry and later retry.
>
> Cheers, Andreas
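
The flag lifecycle discussed in this thread can be sketched as a small
userspace model. This is not the patch code: struct model_bh and the
model_*() helpers are hypothetical names invented for illustration, the
"device" is just a boolean outcome passed in by the caller, and the write
helper assumes the writer has already filled the buffer. It only tries to
mirror the behaviour described above: a failed synchronous read sets the
flag, later synchronous reads fail fast without being submitted, and a
successful read-ahead or write clears the flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the buffer_head state bits. */
struct model_bh {
	bool uptodate;  /* models BH_Uptodate */
	bool read_eio;  /* models the BH_Read_EIO flag from the RFC */
	int submitted;  /* I/Os actually sent to the modelled device */
};

/* Synchronous metadata read: fails fast if an earlier read failed. */
static int model_read_sync(struct model_bh *bh, bool device_ok)
{
	assert(bh != NULL);
	if (bh->uptodate)
		return 0;
	if (bh->read_eio)
		return -EIO;            /* fail fast, nothing is submitted */
	bh->submitted++;
	if (!device_ok) {
		bh->read_eio = true;    /* remember the failure */
		return -EIO;
	}
	bh->uptodate = true;
	bh->read_eio = false;
	return 0;
}

/* Read-ahead: never blocked by the flag and never sets it. */
static void model_readahead(struct model_bh *bh, bool device_ok)
{
	assert(bh != NULL);
	if (bh->uptodate)
		return;
	bh->submitted++;
	if (device_ok) {
		bh->uptodate = true;
		bh->read_eio = false;   /* successful read clears the flag */
	}
}

/* Write: assumes the caller filled the buffer before writing it out. */
static void model_write(struct model_bh *bh, bool device_ok)
{
	assert(bh != NULL);
	bh->submitted++;
	if (device_ok) {
		bh->uptodate = true;
		bh->read_eio = false;   /* successful write clears the flag */
	}
}
```

In this model the recovery paths from the reply are the only ways out of the
failed state: once read_eio is set, model_read_sync() keeps returning -EIO
until model_readahead() or model_write() succeeds.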