From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org, linux-block@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH v4 3/3] btrfs: use IOMAP_DIO_BOUNCE flag instead of falling back to buffered IO
Date: Tue, 30 Jun 2026 17:11:45 +0200
Message-ID: <20260630151145.GA2907432@twin.jikos.cz>
In-Reply-To: <1a89047dac91b6b12d190c37cd7bb3d8328b2073.1781597506.git.wqu@suse.com>
On Tue, Jun 16, 2026 at 05:42:37PM +0930, Qu Wenruo wrote:
> Previously btrfs forces direct writes to fall back to buffered ones if the
> inode has data checksum or the profile has duplication.
>
> That fallback avoids the content being modified during the write, which
> could make the final content mismatch the checksum or the other mirrors.
>
> That brings a pretty huge performance cost, which already caused some
> concern at that time.
>
> But later upstream commit c9d114846b38 ("iomap: add a flag to bounce
> buffer direct I/O") introduced a new method by copying the content into
> new pages, and do all the operations based on the newly allocated pages.
>
> So let btrfs utilize the new flag for direct writes if we require
> stable folios.
>
> There is a quick benchmark, using the following fio setup:
>
>   fio --name=randwrite --filename $mnt/foobar --ioengine=libaio --size=4G \
>       --rw=randwrite --iodepth=64 --runtime=60 --time_based --direct=1 \
>       --bs=$blocksize
>
> Unit is MiB/s.
>
>  Blocksize | Zero-copy (*) | Buffered | Bounce
> -----------+---------------+----------+-----------
>  4K        | 35.1          | 17.1     | 33.8
>  64K       | 522           | 251      | 492
>
> *: This is done by reverting the commit 968f19c5b1b7 ("btrfs: always
> fallback to buffered write if the inode requires checksum")
>
> Although with page bouncing the performance is only around 95% of
> true-zero copy, it's still almost double the performance of buffered
> fallback.
>
> There will be a small change in behavior, since we're using
> IOMAP_DIO_BOUNCE flag to allocate new folios, NOWAIT flag will
> immediately fail.
>
> So for true NOWAIT direct IOs, NODATASUM and RAID0/SINGLE profiles are
> still required.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
The block layer patches have been merged and our for-next is now based
on 7.2-rc1, so please add this one too so we can get back the dio
performance. Thanks.
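
For reference, the two ratios quoted in the patch description ("around 95%
of true-zero copy" and "almost double the performance of buffered
fallback") can be recomputed from the benchmark table. The snippet below is
only a minimal sketch: the numbers are copied from the table above, while
the names `results` and `ratios` are made up for this illustration and are
not part of the patch.

```python
# Throughput in MiB/s, copied from the benchmark table in the quoted
# patch description. The dictionary layout is an assumption made for
# this illustration only.
results = {
    "4K": {"zero_copy": 35.1, "buffered": 17.1, "bounce": 33.8},
    "64K": {"zero_copy": 522.0, "buffered": 251.0, "bounce": 492.0},
}


def ratios(row):
    """Return (bounce / zero-copy, bounce / buffered) for one table row."""
    return row["bounce"] / row["zero_copy"], row["bounce"] / row["buffered"]


for blocksize, row in results.items():
    vs_zero_copy, vs_buffered = ratios(row)
    print(f"{blocksize}: {vs_zero_copy:.1%} of zero-copy, "
          f"{vs_buffered:.2f}x buffered")
```

For both block sizes the bounce path lands at roughly 94-96% of the
zero-copy throughput and a bit under 2x the buffered fallback, which matches
the wording in the patch description.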
Thread overview: 10+ messages
2026-06-16 8:12 [PATCH v4 0/3] btrfs: use IOMAP_DIO_BOUNCE flag instead of falling back to buffered IO Qu Wenruo
2026-06-16 8:12 ` [PATCH v4 1/3] block: revert the iov_iter after a short copy in bio_iov_iter_bounce_write() Qu Wenruo
2026-06-16 12:44 ` Christoph Hellwig
2026-06-16 8:12 ` [PATCH v4 2/3] block: respect iov_iter::nofault flag " Qu Wenruo
2026-06-16 12:44 ` Christoph Hellwig
2026-06-16 8:12 ` [PATCH v4 3/3] btrfs: use IOMAP_DIO_BOUNCE flag instead of falling back to buffered IO Qu Wenruo
2026-06-30 15:11 ` David Sterba [this message]
2026-06-16 12:45 ` [PATCH v4 0/3] " Christoph Hellwig
2026-06-16 20:48 ` Jens Axboe
2026-06-16 20:51 ` (subset) " Jens Axboe