public inbox for linux-block@vger.kernel.org
From: Qu Wenruo <wqu@suse.com>
To: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	Christian Brauner <brauner@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	Carlos Maiolino <cem@kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-block@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: bounce buffer direct I/O when stable pages are required
Date: Wed, 14 Jan 2026 20:22:27 +1030	[thread overview]
Message-ID: <f5568a83-75df-4e84-8cf0-01df6dd4e810@suse.com> (raw)
In-Reply-To: <20260114074145.3396036-1-hch@lst.de>



On 2026/1/14 18:10, Christoph Hellwig wrote:
> Hi all,
> 
> this series tries to address the problem that pages can be modified
> while under direct I/O, even when the device or file system requires
> stable pages during I/O to calculate checksums, parity or other data
> operations.  It does so by adding block layer helpers to bounce buffer
> an iov_iter into a bio, then wires that up in iomap and ultimately
> XFS.
> 
> The reason the file system even needs to know about this is that
> reads need a user context to copy the data back, and the infrastructure
> to defer ioends to a workqueue currently sits in XFS.  I'm going to look
> into moving that into the ioend code and enabling it for other file
> systems.  Additionally, btrfs already has its own infrastructure for
> this, and actually an urgent need to bounce buffer, so this should be
> useful there and could be wired up easily.  In fact the idea comes from
> patches by Qu that did this in btrfs.

I guess the main reason to bounce, rather than falling back to buffered 
I/O, is still performance, especially for AIO cases?

If iomap is going to handle the page bouncing, I guess we btrfs people 
will be pretty happy to use it instead of implementing our own bouncing 
code.

My previous tests didn't show much difference between falling back to 
buffered I/O and bouncing pages, although no AIO/io_uring was involved 
in that case.

Thanks,
Qu

> 
> This series fixes all but one of the xfstests failures on T10 PI
> capable devices (generic/095 still seems to have issues with a mix of
> mmap and splice; I'm looking into that separately), and makes qemu VMs
> running Windows, or Linux with swap enabled, work fine on an XFS file
> on a device using PI.
> 
> Performance numbers on my (not exactly state of the art) NVMe PI test
> setup:
> 
>    Sequential reads using io_uring, QD=16.
>    Bandwidth and CPU usage (usr/sys):
> 
>    | size |        zero copy         |          bounce          |
>    +------+--------------------------+--------------------------+
>    |   4k | 1316MiB/s (12.65/55.40%) | 1081MiB/s (11.76/49.78%) |
>    |  64K | 3370MiB/s ( 5.46/18.20%) | 3365MiB/s ( 4.47/15.68%) |
>    |   1M | 3401MiB/s ( 0.76/23.05%) | 3400MiB/s ( 0.80/09.06%) |
>    +------+--------------------------+--------------------------+
> 
>    Sequential writes using io_uring, QD=16.
>    Bandwidth and CPU usage (usr/sys):
> 
>    | size |        zero copy         |          bounce          |
>    +------+--------------------------+--------------------------+
>    |   4k |  882MiB/s (11.83/33.88%) |  750MiB/s (10.53/34.08%) |
>    |  64K | 2009MiB/s ( 7.33/15.80%) | 2007MiB/s ( 7.47/24.71%) |
>    |   1M | 1992MiB/s ( 7.26/ 9.13%) | 1992MiB/s ( 9.21/19.11%) |
>    +------+--------------------------+--------------------------+
> 
> Note that the 64k read numbers look really odd to me for the baseline
> zero copy case, but are reproducible over many repeated runs.
> 
> The bounce read numbers should further improve when moving the PI
> validation to the file system and removing the double context switch,
> which I have patches for that will be sent as soon as we are done with
> this series.
> 
> Diffstat:
>   block/bio.c           |  323 ++++++++++++++++++++++++++++++--------------------
>   block/blk.h           |   11 -
>   fs/iomap/direct-io.c  |  189 +++++++++++++++--------------
>   fs/iomap/ioend.c      |    8 +
>   fs/xfs/xfs_aops.c     |    8 -
>   fs/xfs/xfs_file.c     |   41 +++++-
>   include/linux/bio.h   |   26 ++++
>   include/linux/iomap.h |    9 +
>   include/linux/uio.h   |    3
>   lib/iov_iter.c        |   98 +++++++++++++++
>   10 files changed, 490 insertions(+), 226 deletions(-)


Thread overview: 35+ messages
2026-01-14  7:40 bounce buffer direct I/O when stable pages are required Christoph Hellwig
2026-01-14  7:40 ` [PATCH 01/14] block: refactor get_contig_folio_len Christoph Hellwig
2026-01-14  7:41 ` [PATCH 02/14] block: open code bio_add_page and fix handling of mismatching P2P ranges Christoph Hellwig
2026-01-14 12:46   ` Johannes Thumshirn
2026-01-14 13:01     ` hch
2026-01-14  7:41 ` [PATCH 03/14] iov_iter: extract a iov_iter_extract_bvecs helper from bio code Christoph Hellwig
2026-01-14  7:41 ` [PATCH 04/14] block: remove bio_release_page Christoph Hellwig
2026-01-14  7:41 ` [PATCH 05/14] block: add helpers to bounce buffer an iov_iter into bios Christoph Hellwig
2026-01-14 12:51   ` Johannes Thumshirn
2026-01-14  7:41 ` [PATCH 06/14] iomap: fix submission side handling of completion side errors Christoph Hellwig
2026-01-14 22:35   ` Darrick J. Wong
2026-01-15  6:17     ` Christoph Hellwig
2026-01-14  7:41 ` [PATCH 07/14] iomap: simplify iomap_dio_bio_iter Christoph Hellwig
2026-01-14 22:51   ` Darrick J. Wong
2026-01-15  6:20     ` Christoph Hellwig
2026-01-14  7:41 ` [PATCH 08/14] iomap: split out the per-bio logic from iomap_dio_bio_iter Christoph Hellwig
2026-01-14 22:53   ` Darrick J. Wong
2026-01-14  7:41 ` [PATCH 09/14] iomap: share code between iomap_dio_bio_end_io and iomap_finish_ioend_direct Christoph Hellwig
2026-01-14 22:54   ` Darrick J. Wong
2026-01-14  7:41 ` [PATCH 10/14] iomap: free the bio before completing the dio Christoph Hellwig
2026-01-14 22:55   ` Darrick J. Wong
2026-01-15  6:21     ` Christoph Hellwig
2026-01-14  7:41 ` [PATCH 11/14] iomap: rename IOMAP_DIO_DIRTY to IOMAP_DIO_USER_BACKED Christoph Hellwig
2026-01-14 22:56   ` Darrick J. Wong
2026-01-14  7:41 ` [PATCH 12/14] iomap: support ioends for direct reads Christoph Hellwig
2026-01-14 22:57   ` Darrick J. Wong
2026-01-15  6:21     ` Christoph Hellwig
2026-01-14  7:41 ` [PATCH 13/14] iomap: add a flag to bounce buffer direct I/O Christoph Hellwig
2026-01-14 22:59   ` Darrick J. Wong
2026-01-15  6:21     ` Christoph Hellwig
2026-01-14  7:41 ` [PATCH 14/14] xfs: use bounce buffering direct I/O when the device requires stable pages Christoph Hellwig
2026-01-14 23:07   ` Darrick J. Wong
2026-01-15  6:24     ` Christoph Hellwig
2026-01-14  9:52 ` Qu Wenruo [this message]
2026-01-14 12:39   ` bounce buffer direct I/O when stable pages are required Christoph Hellwig
