All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Carlos Maiolino <cem@kernel.org>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-raid@vger.kernel.org,
	linux-block@vger.kernel.org
Subject: Re: fall back from direct to buffered I/O when stable writes are required
Date: Fri, 31 Oct 2025 10:18:46 +1100	[thread overview]
Message-ID: <aQPyVtkvTg4W1nyz@dread.disaster.area> (raw)
In-Reply-To: <20251030143324.GA31550@lst.de>

On Thu, Oct 30, 2025 at 03:33:24PM +0100, Christoph Hellwig wrote:
> On Thu, Oct 30, 2025 at 10:20:02PM +1100, Dave Chinner wrote:
> > > use cases, so I'm not exactly happy about.
> > 
> > How many applications actually have this problem? I've not heard of
> > anyone encoutnering such RAID corruption problems on production
> > XFS filesystems -ever-, so it cannot be a common thing.
> 
> The most common application to hit this is probably the most common
> use of O_DIRECT: qemu.  Look up for btrfs errors with PI, caused by
> the interaction of checksumming.  Btrfs finally fixed this a short
> while ago, and there are reports for other applications a swell.

I'm not asking about btrfs - I'm asking about actual, real world
problems reported in production XFS environments.

> For RAID you probably won't see too many reports, as with RAID the
> problem will only show up as silent corruption long after a rebuild
> rebuild happened that made use of the racy data.

Yet we are not hearing about this, either. Nobody is reporting that
their data is being found to be corrupt days/weeks/months/years down
the track.

This is important, because software RAID5 is pretty much the -only-
common usage of BLK_FEAT_STABLE_WRITES that users are exposed to.
This patch set is effectively disallowing direct IO for anyone
using software RAID5.

That is simply not an acceptible outcome here.

> With checksums
> it is much easier to reproduce and trivially shown by various xfstests.

Such as? 

> With increasing storage capacities checksums are becoming more and
> more important, and I'm trying to get Linux in general and XFS
> specifically to use them well.

So when XFS implements checksums and that implementation is
incompatible with Direct IO, then we can talk about disabling Direct
IO on XFS when that feature is enabled. But right now, that feature
does not exist, and ....

> Right now I don't think anyone is
> using PI with XFS or any Linux file system given the amount of work
> I had to put in to make it work well, and how often I see regressions
> with it.

.... as you say, "nobody is using PI with XFS".

So patchset is a "fix" for a problem that no-one is actually having
right now.

> > Forcing a performance regression on users, then telling them "you
> > need to work around the performance regression" is a pretty horrible
> > thing to do in the first place.
> 
> I disagree.  Not corruption user data for applications that use the
> interface correctly per all documentation is a prime priority.

Modifying an IO buffer whilst a DIO is in flight on that buffer has
-always- been an application bug.  It is a vector for torn writes
that don't get detected until the next read. It is a vector for
in-memory data corruption of read buffers.

Indeed, it does not matter if the underlying storage asserts
BLK_FEAT_STABLE_WRITES or not, modifying DIO buffers that are under
IO will (eventually) result in data corruption.  Hence, by your
logic, we should disable Direct IO for everyone.

That's just .... insane.

Remember: O_DIRECT means the application takes full responsibility
for ensuring IO concurrency semantics are correctly implemented.
Modifying IO buffers whilst the IO buffer is being read from or
written to by the hardware has always been an IO concurrency bug in
the application.

The behaviour being talked about here is, and always has been, an
application IO concurrency bug, regardless of PI, stable writes,
etc. Such an application bug existing is *not a valid reason for the
kernel or filesystem to disable Direct IO*.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2025-10-30 23:18 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-29  7:15 fall back from direct to buffered I/O when stable writes are required Christoph Hellwig
2025-10-29  7:15 ` [PATCH 1/4] fs: replace FOP_DIO_PARALLEL_WRITE with a fmode bits Christoph Hellwig
2025-10-29 16:01   ` Darrick J. Wong
2025-11-04  7:00   ` Nirjhar Roy (IBM)
2025-11-05 14:04     ` Christoph Hellwig
2025-11-11  9:44   ` Christian Brauner
2025-10-29  7:15 ` [PATCH 2/4] fs: return writeback errors for IOCB_DONTCACHE in generic_write_sync Christoph Hellwig
2025-10-29 16:01   ` Darrick J. Wong
2025-10-29 16:37     ` Christoph Hellwig
2025-10-29 18:12       ` Darrick J. Wong
2025-10-30  5:59         ` Christoph Hellwig
2025-11-04 12:04       ` Nirjhar Roy (IBM)
2025-11-04 15:53         ` Christoph Hellwig
2025-10-29  7:15 ` [PATCH 3/4] xfs: use IOCB_DONTCACHE when falling back to buffered writes Christoph Hellwig
2025-10-29 15:57   ` Darrick J. Wong
2025-11-04 12:33   ` Nirjhar Roy (IBM)
2025-11-04 15:52     ` Christoph Hellwig
2025-10-29  7:15 ` [PATCH 4/4] xfs: fallback to buffered I/O for direct I/O when stable writes are required Christoph Hellwig
2025-10-29 15:53   ` Darrick J. Wong
2025-10-29 16:35     ` Christoph Hellwig
2025-10-29 21:23       ` Qu Wenruo
2025-10-30  5:58         ` Christoph Hellwig
2025-10-30  6:37           ` Qu Wenruo
2025-10-30  6:49             ` Christoph Hellwig
2025-10-30  6:53               ` Qu Wenruo
2025-10-30  6:55                 ` Christoph Hellwig
2025-10-30  7:14                   ` Qu Wenruo
2025-10-30  7:17                     ` Christoph Hellwig
2025-11-10 13:38   ` Nirjhar Roy (IBM)
2025-11-10 13:59     ` Christoph Hellwig
2025-11-12  7:13       ` Nirjhar Roy (IBM)
2025-10-29 15:58 ` fall back from direct to buffered " Bart Van Assche
2025-10-29 16:14   ` Darrick J. Wong
2025-10-29 16:33   ` Christoph Hellwig
2025-10-30 11:20 ` Dave Chinner
2025-10-30 12:00   ` Geoff Back
2025-10-30 12:54     ` Jan Kara
2025-10-30 14:35     ` Christoph Hellwig
2025-10-30 22:02     ` Dave Chinner
2025-10-30 14:33   ` Christoph Hellwig
2025-10-30 23:18     ` Dave Chinner [this message]
2025-10-31 13:00       ` Christoph Hellwig
2025-10-31 15:57         ` Keith Busch
2025-10-31 16:47           ` Christoph Hellwig
2025-11-03 11:14             ` Jan Kara
2025-11-03 12:21               ` Christoph Hellwig
2025-11-03 22:47                 ` Keith Busch
2025-11-04 23:38                 ` Darrick J. Wong
2025-11-05 14:11                   ` Christoph Hellwig
2025-11-05 21:44                     ` Darrick J. Wong
2025-11-06  9:50                       ` Johannes Thumshirn
2025-11-06 12:49                         ` hch
2025-11-12 14:18                           ` Ming Lei
2025-11-12 14:38                             ` hch
2025-11-13 17:39                 ` Kevin Wolf
2025-11-14  5:39                   ` Christoph Hellwig
2025-11-14  9:29                     ` Kevin Wolf
2025-11-14 12:01                       ` Christoph Hellwig
2025-11-14 12:31                         ` Kevin Wolf
2025-11-14 15:36                           ` Christoph Hellwig
2025-11-14 16:55                             ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aQPyVtkvTg4W1nyz@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=brauner@kernel.org \
    --cc=cem@kernel.org \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.