From: Kevin Wolf <kwolf@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>, Keith Busch <kbusch@kernel.org>,
    Dave Chinner <david@fromorbit.com>,
    Carlos Maiolino <cem@kernel.org>,
    Christian Brauner <brauner@kernel.org>,
    "Martin K. Petersen" <martin.petersen@oracle.com>,
    linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-raid@vger.kernel.org,
    linux-block@vger.kernel.org
Subject: Re: fall back from direct to buffered I/O when stable writes are required
Date: Fri, 14 Nov 2025 13:31:20 +0100
Message-ID: <aRchGBJA1ExoGi8W@redhat.com>
In-Reply-To: <20251114120152.GA13689@lst.de>

On 14.11.2025 at 13:01, Christoph Hellwig wrote:
> On Fri, Nov 14, 2025 at 10:29:39AM +0100, Kevin Wolf wrote:
> > Right, but since this is direct I/O and the approach with only declaring
> > I/O from the page cache safe without a bounce buffer means that RAID has
> > to use a bounce buffer here anyway (with or without PI), doesn't this
> > automatically solve it?
> >
> > So if it's only PI, it's the problem of userspace, and if you add RAID
> > on top, then the normal rules for RAID apply. (And that the buffer
> > doesn't get modified and PI doesn't become invalid until RAID does its
> > thing is still a userspace problem.)
>
> Well, only if we have different levels of I/O stability guarantees:
>
> Level 0:
> - trusted caller guarantees pages are stable (buffered I/O,
> in-kernel direct I/O callers that control the buffer)
>
> Level 1:
> - untrusted caller declares the pages are stable
> (direct I/O with PI)
>
> Level 2:
> - no one guarantees nothing
> (other direct I/O directly or indirectly fed from userspace)
>
> PI formatted devices would only bounce for 1, parity would bounce for
> 1 and 2. Software checksums could probably get away with only 1,
> although 2 would feel safer.

My main point above was that RAID and (potentially passed through) PI
are independent of each other, and I think that's still true with or
without multiple stability levels.

If you don't have these levels, you just have to treat levels 1 and 2
the same, i.e. bounce all the time if the kernel needs the guarantee
(which it does not for userspace PI, unless the same request needs the
bounce buffer for another reason in a different place, like RAID). That
might be less than optimal, but it is still correct, and better than
what happens today because at least you don't bounce for level 0 any
more.
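
To make the rule without separate levels concrete, here is a minimal
sketch in C. All names in it (the enum, its values and needs_bounce())
are made up for illustration and are not existing kernel interfaces;
kernel_needs_stable stands for "some layer handling this request, such
as RAID parity, needs the data to stay unchanged", and is assumed to be
false for plain userspace PI passthrough.

```c
#include <stdbool.h>

/* Hypothetical stability levels, numbered as in the quoted list. */
enum io_stability {
	IO_STABLE_TRUSTED  = 0,	/* buffered I/O, in-kernel direct I/O callers */
	IO_STABLE_DECLARED = 1,	/* untrusted caller declares pages stable */
	IO_STABLE_NONE     = 2,	/* no guarantee at all */
};

/*
 * Simplified rule: levels 1 and 2 are treated the same.  Bounce only
 * when the kernel itself needs the guarantee and the caller is not
 * trusted; never bounce for level 0.
 */
bool needs_bounce(enum io_stability level, bool kernel_needs_stable)
{
	if (!kernel_needs_stable)
		return false;
	return level != IO_STABLE_TRUSTED;
}
```

With this, needs_bounce(IO_STABLE_NONE, true) models untrusted direct
I/O going through RAID and returns true, while
needs_bounce(IO_STABLE_TRUSTED, true) models buffered I/O and returns
false.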

If there is something you can optimise by delegating the responsibility
to userspace in some cases - for example, if you can prove that only the
application itself would be harmed by doing things wrong - then having
level 1 separate could certainly be interesting. In this case, I'd
consider adding an RWF_* flag for userspace to make the promise even
outside PI passthrough. But while potentially worthwhile, it feels like
this is a separate optimisation from what you tried to address here.

Kevin