Linux XFS filesystem development
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: Eric Sandeen <sandeen@redhat.com>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	Donald Douwsma <ddouwsma@redhat.com>,
	Dave Chinner <dchinner@redhat.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code
Date: Wed, 27 Aug 2025 08:56:20 -0700	[thread overview]
Message-ID: <20250827155620.GA8117@frogsfrogsfrogs> (raw)
In-Reply-To: <aK61FCz0wgz1s2Ab@infradead.org>

On Wed, Aug 27, 2025 at 12:34:44AM -0700, Christoph Hellwig wrote:
> On Mon, Aug 25, 2025 at 08:34:14AM -0700, Darrick J. Wong wrote:
> > > +	case BLK_STS_NOSPC:
> > > +		return -ENOSPC;
> > > +	case BLK_STS_OFFLINE:
> > > +		return -ENODEV;
> > > +	default:
> > > +		return -EIO;
> > 
> > Well as I pointed out earlier, one interesting "quality" of the current
> > behavior is that online fsck captures the ENODATA and turns that into a
> > metadata corruption report.  I'd like to keep that behavior.
> 
> -EIO is just as much of a metadata corruption, so if you only catch
> ENODATA you're missing most of them.

Hrmm, well an EIO (or an ENODATA) coming from the block layer causes the
scrub code to return to userspace with EIO, and xfs_scrub will complain
about the IO error and exit.

It doesn't explicitly mark the data structure as corrupt, but scrub
failing should be enough to conclude that the fs is corrupt.

I could patch the kernel to set the CORRUPT flag on the data structure
and keep going, since the likelihood of random bit errors causing media
errors is pretty high now that we have disks that store more than 1e15
bits.

> > >  	if (bio->bi_status)
> > > -		xfs_buf_ioerror(bp, blk_status_to_errno(bio->bi_status));
> > > +		xfs_buf_ioerror(bp, xfs_buf_bio_status(bio));
> > 
> > I think you'd also want to wrap all the submit_bio_wait here too, right?
> > 
> > Hrm, only discard bios, log writes, and zonegc use that function.  Maybe
> > not?  I think a failed log write takes down the system no matter what
> > error code, nobody cares about failing discard, and I think zonegc write
> > failures just lead to the gc ... aborting?
> 
> Yes.  In Linux -EIO means an unrecoverable I/O error that the lower
> layers gave up retrying. Not much we can do about that.

<nod>

--D

      reply	other threads:[~2025-08-27 15:56 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-21 21:36 [PATCH] xfs: do not propagate ENODATA disk errors into xattr code Eric Sandeen
2025-08-22 15:21 ` Darrick J. Wong
2025-08-22 17:52   ` Eric Sandeen
2025-08-25  7:55   ` Christoph Hellwig
2025-08-25 15:31     ` Eric Sandeen
2025-08-25 15:34     ` Darrick J. Wong
2025-08-27  7:34       ` Christoph Hellwig
2025-08-27 15:56         ` Darrick J. Wong [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250827155620.GA8117@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=dchinner@redhat.com \
    --cc=ddouwsma@redhat.com \
    --cc=hch@infradead.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox