From: Bart Van Assche <bart.vanassche@gmail.com>
To: Christoph Hellwig <hch@lst.de>, Carlos Maiolino <cem@kernel.org>,
Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-raid@vger.kernel.org,
linux-block@vger.kernel.org
Subject: Re: fall back from direct to buffered I/O when stable writes are required
Date: Wed, 29 Oct 2025 08:58:52 -0700 [thread overview]
Message-ID: <ea07dede-5baa-41e5-ad5d-a9f6a90ac6e8@gmail.com> (raw)
In-Reply-To: <20251029071537.1127397-1-hch@lst.de>
On 10/29/25 12:15 AM, Christoph Hellwig wrote:
> we've had a long standing issue that direct I/O to and from devices that
> require stable writes can corrupt data because the user memory can be
> modified while in flight. This series tries to address this by falling
> back to uncached buffered I/O. Given that this requires an extra copy it
> is usually going to be a slow down, especially for very high bandwith
> use cases, so I'm not exactly happy about.
>
> I suspect we need a way to opt out of this for applications that know
> what they are doing, and I can think of a few ways to do that:
>
> 1a) Allow a mount option to override the behavior
>
> This allows the sysadmin to get back to the previous state.
> This is fairly easy to implement, but the scope might be to wide.
>
> 1b) Sysfs attribute
>
> Same as above. Slightly easier to modify, but a more unusual
> interface.
>
> 2) Have a per-inode attribute
>
> Allows to set it on a specific file. Would require an on-disk
> format change for the usual attr options.
>
> 3) Have a fcntl or similar to allow an application to override it
>
> Fine granularity. Requires application change. We might not
> allow any application to force this as it could be used to inject
> corruption.
>
> In other words, they are all kinda horrible.
Hi Christoph,
Has the opposite been considered: only fall back to buffered I/O for
buggy software that modifies direct I/O buffers before I/O has
completed?
Regarding selecting the direct I/O behavior for a process, how about
introducing a new prctl() flag and introducing a new command-line
utility that follows the style of ionice and sets the new flag before
any code runs in the started process?
Thanks,
Bart.
next prev parent reply other threads:[~2025-10-29 15:58 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-29 7:15 fall back from direct to buffered I/O when stable writes are required Christoph Hellwig
2025-10-29 7:15 ` [PATCH 1/4] fs: replace FOP_DIO_PARALLEL_WRITE with a fmode bits Christoph Hellwig
2025-10-29 16:01 ` Darrick J. Wong
2025-11-04 7:00 ` Nirjhar Roy (IBM)
2025-11-05 14:04 ` Christoph Hellwig
2025-11-11 9:44 ` Christian Brauner
2025-10-29 7:15 ` [PATCH 2/4] fs: return writeback errors for IOCB_DONTCACHE in generic_write_sync Christoph Hellwig
2025-10-29 16:01 ` Darrick J. Wong
2025-10-29 16:37 ` Christoph Hellwig
2025-10-29 18:12 ` Darrick J. Wong
2025-10-30 5:59 ` Christoph Hellwig
2025-11-04 12:04 ` Nirjhar Roy (IBM)
2025-11-04 15:53 ` Christoph Hellwig
2025-10-29 7:15 ` [PATCH 3/4] xfs: use IOCB_DONTCACHE when falling back to buffered writes Christoph Hellwig
2025-10-29 15:57 ` Darrick J. Wong
2025-11-04 12:33 ` Nirjhar Roy (IBM)
2025-11-04 15:52 ` Christoph Hellwig
2025-10-29 7:15 ` [PATCH 4/4] xfs: fallback to buffered I/O for direct I/O when stable writes are required Christoph Hellwig
2025-10-29 15:53 ` Darrick J. Wong
2025-10-29 16:35 ` Christoph Hellwig
2025-10-29 21:23 ` Qu Wenruo
2025-10-30 5:58 ` Christoph Hellwig
2025-10-30 6:37 ` Qu Wenruo
2025-10-30 6:49 ` Christoph Hellwig
2025-10-30 6:53 ` Qu Wenruo
2025-10-30 6:55 ` Christoph Hellwig
2025-10-30 7:14 ` Qu Wenruo
2025-10-30 7:17 ` Christoph Hellwig
2025-11-10 13:38 ` Nirjhar Roy (IBM)
2025-11-10 13:59 ` Christoph Hellwig
2025-11-12 7:13 ` Nirjhar Roy (IBM)
2025-10-29 15:58 ` Bart Van Assche [this message]
2025-10-29 16:14 ` fall back from direct to buffered " Darrick J. Wong
2025-10-29 16:33 ` Christoph Hellwig
2025-10-30 11:20 ` Dave Chinner
2025-10-30 12:00 ` Geoff Back
2025-10-30 12:54 ` Jan Kara
2025-10-30 14:35 ` Christoph Hellwig
2025-10-30 22:02 ` Dave Chinner
2025-10-30 14:33 ` Christoph Hellwig
2025-10-30 23:18 ` Dave Chinner
2025-10-31 13:00 ` Christoph Hellwig
2025-10-31 15:57 ` Keith Busch
2025-10-31 16:47 ` Christoph Hellwig
2025-11-03 11:14 ` Jan Kara
2025-11-03 12:21 ` Christoph Hellwig
2025-11-03 22:47 ` Keith Busch
2025-11-04 23:38 ` Darrick J. Wong
2025-11-05 14:11 ` Christoph Hellwig
2025-11-05 21:44 ` Darrick J. Wong
2025-11-06 9:50 ` Johannes Thumshirn
2025-11-06 12:49 ` hch
2025-11-12 14:18 ` Ming Lei
2025-11-12 14:38 ` hch
2025-11-13 17:39 ` Kevin Wolf
2025-11-14 5:39 ` Christoph Hellwig
2025-11-14 9:29 ` Kevin Wolf
2025-11-14 12:01 ` Christoph Hellwig
2025-11-14 12:31 ` Kevin Wolf
2025-11-14 15:36 ` Christoph Hellwig
2025-11-14 16:55 ` Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ea07dede-5baa-41e5-ad5d-a9f6a90ac6e8@gmail.com \
--to=bart.vanassche@gmail.com \
--cc=brauner@kernel.org \
--cc=cem@kernel.org \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=linux-block@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-raid@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=martin.petersen@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).