public inbox for linux-xfs@vger.kernel.org
From: Noah Misch <noah@leadboat.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: After block device error, FICLONE and sync_file_range() make NULs, unlike read()
Date: Fri, 9 Dec 2022 23:43:44 -0800
Message-ID: <20221210074344.GA646514@rfd.leadboat.com>
In-Reply-To: <Y4Vzk54RzjjEApOR@magnolia>

On Mon, Nov 28, 2022 at 06:50:59PM -0800, Darrick J. Wong wrote:
> On Sat, Nov 19, 2022 at 05:34:12PM -0800, Noah Misch wrote:
> > On Tue, Nov 15, 2022 at 07:14:47PM -0800, Darrick J. Wong wrote:
> > > On Wed, Nov 09, 2022 at 08:54:52PM -0800, Noah Misch wrote:
> > > > Subject line has my typo: s/sync_file_range/copy_file_range/

> > > Another dumb thing about how the pagecache tracks errors is that it sets
> > > a single state bit for the whole mapping, which means that we can't
> > > actually /tell/ userspace which part of their file is now busted.  We
> > > can't even tell if userspace has successfully rewrite()d all the regions
> > > where writeback failed, which leads me to...
> > > 
> > > Another another dumb thing about how the pagecache tracks errors is that
> > > any fsync-like operation will test_and_clear_bit the EIO state, which
> > > means that if we find a past EIO, we'll clear that state and return the
> > > EIO to userspace.
> > > 
> > > We /could/ change FICLONE to flush the dirty pagecache, sample the EIO
> > > status *without* clearing it, and return EIO if it's set.  That's
> > > probably the most unabsurd way to deal with this, but it's unsettling
> > > that even cp ignores errno returns now.  The manpage for FICLONE doesn't
> > > explicitly mention any fsync behaviors, so perhaps "flush and retain
> > > EIO" is the right choice here.
> > 
> > That reminds me of
> > https://postgr.es/m//20180427222842.in2e4mibx45zdth5@alap3.anarazel.de.  Its
> > summary of a LSF/MM 2018 discussion mentioned NFS writeback errors detected
> > and cleared at close(), which I find similar.  I might favor a uniform policy,
> > one of:
> > 
> > a. Any syscall with a file descriptor argument might return EIO.  If it does,
> >    it clears the EIO.
> > b. Any syscall with a file descriptor argument might return EIO.  Only a
> >    specific list of syscalls, having writeback-oriented names, clear EIO:
> >    fsync(), syncfs(), [...].  Others report EIO without clearing it.
> > 
> > One argument for (b) is that, on EIO from FICLONE or copy_file_range(), the
> > caller can't know whether the broken file is the source or the destination.  A
> > cautious caller should assume both are broken.  What other considerations
> > should influence the decision?
> 
> That's a very good point you've raised -- userspace can't associate an
> EIO return value with either of the fds in use.  It can't even tell if
> the filesystem itself hit some metadata error somewhere else (e.g.
> refcount data), and that's the real reason why EIO got thrown back to
> userspace.
> 
> On those grounds, I think FICLONE/FIDEDUPERANGE need to preserve the
> AS_EIO/AS_ENOSPC state in the address_space so that actual fsync (or
> syncfs, or any of the known 'persist me now' calls) can also return the
> status.
> 
> I'll try to push that for 6.3.

That sounds good.  Thank you.  Please CC me on any threads you create for
this, if not inconvenient.
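
For readers following the thread, here is a toy model of the difference
between the "test and clear" behavior and the "flush and retain EIO" behavior
discussed above.  It is only a sketch: the Mapping class and the clone_*()
and fsync() helpers are hypothetical names invented for illustration, not
kernel or libc interfaces, and a single boolean stands in for the per-mapping
error state described in the quoted text.

```python
import errno


class Mapping:
    """Toy stand-in for one file's single writeback-error bit (hypothetical)."""

    def __init__(self):
        self.eio = False

    def writeback_failed(self):
        # A failed writeback sets one flag for the whole file.
        self.eio = True

    def test_and_clear(self):
        # Report a past error once, then forget it.
        seen, self.eio = self.eio, False
        return errno.EIO if seen else 0

    def sample(self):
        # Report a past error but leave it set for later callers.
        return errno.EIO if self.eio else 0


def fsync(mapping):
    # In both policies, a "persist me now" call reports and clears the error.
    return mapping.test_and_clear()


def clone_clearing(src):
    # Policy (a): the clone reports the error and clears it.
    return src.test_and_clear()


def clone_retaining(src):
    # Policy (b): the clone reports the error without clearing it.
    return src.sample()
```

Under clone_clearing(), a later fsync() on the same file returns 0 even though
the data was never written.  Under clone_retaining(), the later fsync() still
returns EIO, so a caller that cannot tell which file the clone's EIO referred
to can still learn about the failure from fsync() on each file.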


Thread overview: 8+ messages
2022-11-08 17:24 After block device error, FICLONE and sync_file_range() make NULs, unlike read() Noah Misch
2022-11-09 16:47 ` Darrick J. Wong
2022-11-10  4:54   ` Noah Misch
2022-11-16  3:14     ` Darrick J. Wong
2022-11-20  1:34       ` Noah Misch
2022-11-29  2:50         ` Darrick J. Wong
2022-12-10  7:43           ` Noah Misch [this message]
2022-12-13 19:20             ` Darrick J. Wong
