From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org, linux-block@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: [PATCH v3 0/4] btrfs: use IOMAP_DIO_BOUNCE flag instead of falling back to buffered IO
Date: Fri, 12 Jun 2026 19:21:11 +0930
Message-ID: <cover.1781253428.git.wqu@suse.com>

[CHANGELOG]
v3:
- Fix a bug in the error handling of bio_iov_iter_bounce_write()
  The bug can lead to generic/708 failure on btrfs.

- Respect the nofault flag in bio_iov_iter_bounce_write()
  This avoids btrfs-specific deadlocks.

- Reject NOWAIT and BOUNCE direct IOs
  BOUNCE always allocates pages using GFP_KERNEL, which can sleep and
  break the NOWAIT requirement, so this combination has to be rejected.

v2:
- Rework the comment in btrfs_dio_write()

Commit 968f19c5b1b7 ("btrfs: always fallback to buffered write if the
inode requires checksum") solved the csum mismatch caused by unstable
direct IO buffers, but it has a pretty hefty performance penalty.

Meanwhile upstream iomap has introduced the IOMAP_DIO_BOUNCE flag to get
stable buffers without falling back to buffered IO.

Using that flag, btrfs can reach 95% of the original zero-copy direct IO
performance, almost 2x the current buffered fallback performance.

However, during my tests there were several bugs related to iomap that
can lead to direct IO test case failures:

- generic/708
  Garbage ends up at the end of the writes, caused by a bug in the error
  handling of a short copy.
  Fixed in the first patch.

- Deadlock if using the page cache as the direct IO buffer
  This is because bio_iov_iter_bounce_write() doesn't respect the
  iov_iter::nofault flag.
  Fixed in the second patch.

- Possible NOWAIT and BOUNCE conflicts
  The BOUNCE flag, for both reads and writes, allocates new folios using
  GFP_KERNEL, which can sleep and break the NOWAIT requirement.
  The third patch rejects this combination directly in
  iomap_dio_bio_iter().

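To illustrate the short copy problem behind generic/708, here is a
minimal userspace sketch. It is not the kernel code and not the actual
fix: struct toy_iter and the toy_* helpers are hypothetical stand-ins
for struct iov_iter, a copy-from-iterator helper and iov_iter_revert().
It assumes the fix amounts to rewinding the iterator by the number of
bytes already consumed when the copy into the bounce buffer comes up
short, so that the caller never sees a partially consumed iterator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an iov_iter: a flat source buffer plus a cursor. */
struct toy_iter {
	const char *base;	/* start of the source data */
	size_t pos;		/* bytes consumed so far */
	size_t len;		/* total bytes available */
	size_t fault_at;	/* copying stops here, emulating a fault */
};

/* Copy up to n bytes and advance the cursor; may return less than n. */
static size_t toy_copy_from_iter(void *dst, size_t n, struct toy_iter *it)
{
	size_t limit = it->fault_at < it->len ? it->fault_at : it->len;
	size_t avail = it->pos < limit ? limit - it->pos : 0;
	size_t done = n < avail ? n : avail;

	memcpy(dst, it->base + it->pos, done);
	it->pos += done;
	return done;
}

/* Rewind the cursor by 'unroll' bytes (modelled on iov_iter_revert()). */
static void toy_iter_revert(struct toy_iter *it, size_t unroll)
{
	assert(unroll <= it->pos);
	it->pos -= unroll;
}

/*
 * Fill a bounce buffer with 'want' bytes. On a short copy, rewind the
 * iterator before reporting failure. Returns 0 on success, -1 on a
 * short copy.
 */
static int toy_bounce_fill(char *bounce, size_t want, struct toy_iter *it)
{
	size_t copied = toy_copy_from_iter(bounce, want, it);

	if (copied < want) {
		toy_iter_revert(it, copied);
		return -1;
	}
	return 0;
}
```

Without the revert, a retry after the short copy would start from the
advanced cursor and the bytes in between would never reach the bounce
buffer, which is one way stale data could end up in a write.
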
The final patch enables btrfs to use the IOMAP_DIO_BOUNCE flag, so that
even with data checksums we no longer need to fall back to buffered IO,
and can reclaim most of the lost direct IO performance.

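The flag handling described for the third and fourth patches can be
sketched in userspace as follows. This is only an illustration under
assumptions: the SKETCH_* flag bits and the sketch_* helpers are
hypothetical and are not the kernel's definitions, and returning
-EAGAIN for the rejected combination is an assumption based on the
usual NOWAIT convention, not something stated in this cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the kernel's values. */
#define SKETCH_IOCB_NOWAIT	(1u << 0)
#define SKETCH_DIO_BOUNCE	(1u << 1)

/*
 * Pick the direct IO flags for an inode. Inodes with data checksums
 * need stable buffers, so they request bouncing instead of falling
 * back to buffered IO.
 */
static unsigned int sketch_dio_flags(bool inode_has_csum)
{
	return inode_has_csum ? SKETCH_DIO_BOUNCE : 0;
}

/*
 * Reject NOWAIT together with BOUNCE: bounce buffers are allocated
 * with a sleeping allocation, which a NOWAIT request must not do.
 * Returns 0 if the combination is fine, -EAGAIN otherwise (assumed).
 */
static int sketch_check_flags(unsigned int iocb_flags,
			      unsigned int dio_flags)
{
	if ((iocb_flags & SKETCH_IOCB_NOWAIT) &&
	    (dio_flags & SKETCH_DIO_BOUNCE))
		return -EAGAIN;
	return 0;
}

/* Usage demo covering the three cases of interest. */
static void sketch_demo(void)
{
	/* Blocking direct IO on a csum inode: bouncing is allowed. */
	assert(sketch_check_flags(0, sketch_dio_flags(true)) == 0);
	/* NOWAIT direct IO on a nodatasum inode: no bouncing needed. */
	assert(sketch_check_flags(SKETCH_IOCB_NOWAIT,
				  sketch_dio_flags(false)) == 0);
	/* NOWAIT direct IO on a csum inode: rejected. */
	assert(sketch_check_flags(SKETCH_IOCB_NOWAIT,
				  sketch_dio_flags(true)) == -EAGAIN);
}
```

Keeping the rejection in one shared check means every filesystem that
sets the bounce flag gets the same NOWAIT behavior without repeating
the test in its own code.
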
Qu Wenruo (4):
  block: revert the iov_iter after a short copy in
    bio_iov_iter_bounce_write()
  block: respect iov_iter::nofault flag in bio_iov_iter_bounce_write()
  iomap: reject NOWAIT and BOUNCE direct IOs
  btrfs: use IOMAP_DIO_BOUNCE flag instead of falling back to buffered
    IO

 block/bio.c          | 10 ++++++---
 fs/btrfs/direct-io.c | 53 ++++++++++++++++++++------------------------
 fs/iomap/direct-io.c |  4 ++++
 3 files changed, 35 insertions(+), 32 deletions(-)

--
2.54.0