From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7C5FC3C7DF7
	for <linux-fsdevel@vger.kernel.org>; Mon, 13 Apr 2026 12:48:39 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=18.9.28.11
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1776084520; cv=none; b=r0bPARPgdyfB7xmDX2SQfbcPKlUNJw+QEOG6tsv503E8jPV5+WYeS22/Fb1O/7pU3Q6zRDPNs+F1EeVRKI8PCpUYIwHHJtUD/HrwBEC8dxLuYeBWYTyPqoYGoDXQZBZ3B8GGJCAp+/EDvuq27Pw1gj2dDXRWtmIoA+XCeIbiKm0=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1776084520; c=relaxed/simple;
	bh=v7asg6qA7pO2JxM8JSNzirNqzT0RT3u69UBi0wGyXEE=;
	h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version:
	 Content-Type:Content-Disposition:In-Reply-To; b=ip1a6dK5INorGucRAgw8WdDCYvrZ3KKFLXmtHNr0oHsZUgHX7M2xMUboIwecJSIrkHOGTsP6IsYreQJe6dw6/hmQU/aHxaE1zS0cM5O2Qfcnw6VODxHAqWYT4TVXjuHGtuOGrG5IWn71+pnAH8WPdAZUGK7uRnwc9I9ytSZQEk8=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=mit.edu; spf=pass smtp.mailfrom=mit.edu; dkim=pass (2048-bit key) header.d=mit.edu header.i=@mit.edu header.b=Dr7NHjXJ; arc=none smtp.client-ip=18.9.28.11
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=mit.edu
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=mit.edu
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=mit.edu header.i=@mit.edu header.b="Dr7NHjXJ"
Received: from macsyma.thunk.org (pool-173-48-113-10.bstnma.fios.verizon.net [173.48.113.10])
	(authenticated bits=0)
        (User authenticated as tytso@ATHENA.MIT.EDU)
	by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 63DCm4lu013301
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Mon, 13 Apr 2026 08:48:05 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mit.edu; s=outgoing;
	t=1776084487; bh=ew0N4RW/+mMo1okUz8IGgXrBvzrPN7l+s8J2ydS6XZI=;
	h=Date:From:Subject:Message-ID:MIME-Version:Content-Type;
	b=Dr7NHjXJPY+Ug6d5M4eVJih7sF68tlFdq1cMBnCxIwJJkQob4ee6dd0dXgT5zcXOs
	 Aq4t9D3pjx4bp/bw7i3vf9AJdc2XGYB7eyq+0PKUaLa6ZurQunYb9bJLVHsJVoobxF
	 bOwGP6sk/x8lXvfQib94symCoyaKSJTSYQlki3v0bsLZr7HjZ9gP3rbY3B17qvwo7K
	 E6pChC1Yjfh1spZwnVOqZ4+o4sxXzEWLrHUvFzsqlknMx03hiSGq+Eh2SAeIuxlwkQ
	 nZ2numKdP/Y+6lmfv89AUJY8cqA5FZLODZlW60zdD0KdNdRTwEd6TD43h3WJ4xbT+/
	 6c/5wrg0ukgPQ==
Received: by macsyma.thunk.org (Postfix, from userid 15806)
	id EEB1962D9DC2; Mon, 13 Apr 2026 08:47:03 -0400 (EDT)
Date: Mon, 13 Apr 2026 08:47:03 -0400
From: "Theodore Tso" <tytso@mit.edu>
To: Diangang Li <diangangli@gmail.com>
Cc: adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
        linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
        changfengnan@bytedance.com, yizhang089@gmail.com, willy@infradead.org,
        Diangang Li <lidiangang@bytedance.com>
Subject: Re: [RFC v2 0/1] ext4: fail fast on repeated buffer_head reads after
 IO failure
Message-ID: <20260413124703.GA20496@macsyma-wired.lan>
References: <20260325093349.630193-1-diangangli@gmail.com>
 <20260413062500.1380307-1-diangangli@gmail.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id: <linux-fsdevel.vger.kernel.org>
List-Subscribe: <mailto:linux-fsdevel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-fsdevel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20260413062500.1380307-1-diangangli@gmail.com>

On Mon, Apr 13, 2026 at 02:24:59PM +0800, Diangang Li wrote:
> From: Diangang Li <lidiangang@bytedance.com>
> 
> A production system reported hung tasks blocked for 300s+ in ext4
> buffer_head paths....
> 
>   [Tue Mar 24 14:16:24 2026] blk_update_request: I/O error, dev sdi,
>       sector 10704150288 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
>   [Tue Mar 24 14:16:25 2026] blk_update_request: I/O error, dev sdi,
>       sector 10704488160 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
>   [Tue Mar 24 14:16:26 2026] blk_update_request: I/O error, dev sdi,
>       sector 10704382912 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0

I wonder whether the ext4 layer is the right place to handle this
sort of issue.  For example, it could be handled by having a subsystem
scan dmesg (or by wiring up notifications so block device errors
get sent to a userspace daemon), and when certain criteria are met, the
machine is automatically sent to hardware operations to run
diagnostics and (most likely) replace the failing disk.
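
As a rough illustration of the dmesg-scanning idea, here is a minimal
sketch.  The regular expression, the threshold of 3, and the function
name failing_devices() are assumptions made up for this example, not
an existing tool; a real deployment would choose its own criteria.

```python
import re
from collections import Counter

# Matches kernel log lines of the form quoted above, e.g.
#   blk_update_request: I/O error, dev sdi, sector 10704150288 op 0x0:(READ) ...
IO_ERROR_RE = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def failing_devices(log_lines, threshold=3):
    """Return a sorted list of devices with at least `threshold` I/O errors.

    `threshold` is a hypothetical policy knob chosen for illustration.
    """
    counts = Counter()
    for line in log_lines:
        match = IO_ERROR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(dev for dev, n in counts.items() if n >= threshold)


# The three errors from the report, as single (unwrapped) log lines.
SAMPLE = [
    "[Tue Mar 24 14:16:24 2026] blk_update_request: I/O error, dev sdi, "
    "sector 10704150288 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0",
    "[Tue Mar 24 14:16:25 2026] blk_update_request: I/O error, dev sdi, "
    "sector 10704488160 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0",
    "[Tue Mar 24 14:16:26 2026] blk_update_request: I/O error, dev sdi, "
    "sector 10704382912 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0",
]

if __name__ == "__main__":
    # A real daemon would read from the kernel log instead of SAMPLE and
    # hand flagged devices to whatever repair workflow the site uses.
    print(failing_devices(SAMPLE))
```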

It could also be handled in the driver or SCSI layer, so that the
"fail fast" semantics live there and apply to all file systems, not
just ext4.  The SCSI layer also has more information
about the type of error; you might want to handle things like media
errors differently from Fibre Channel or iSCSI timeouts (which might
be something where "fail fast" is not appropriate).
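
To make the distinction concrete, a toy sketch of such a policy might
look like the following.  The ErrorKind names and should_fail_fast()
are hypothetical and exist only for this example; they are not kernel
or SCSI APIs, and the policy shown is just one possible choice.

```python
from enum import Enum, auto


class ErrorKind(Enum):
    """Hypothetical error classes a lower layer could still tell apart."""
    MEDIA = auto()              # e.g. an unreadable sector on the disk
    TRANSPORT_TIMEOUT = auto()  # e.g. a Fibre Channel or iSCSI timeout


def should_fail_fast(kind: ErrorKind) -> bool:
    """Fail fast on media errors; keep retrying on transport timeouts.

    A transport timeout may clear up once the link recovers, so failing
    fast there might not be appropriate.
    """
    return kind is ErrorKind.MEDIA
```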

By the time the error gets propagated up to the buffer head, we lose a
lot of detail about why the error took place.  Also, in the long term
we will hopefully be moving away from using the buffer cache.

   		     	    	      	    - Ted