From: Zhu Yanjun <yanjun.zhu@linux.dev>
To: Chuck Lever <cel@kernel.org>, Jason Gunthorpe <jgg@nvidia.com>,
Leon Romanovsky <leon@kernel.org>, Christoph Hellwig <hch@lst.de>
Cc: NeilBrown <neilb@ownmail.net>, Jeff Layton <jlayton@kernel.org>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org,
Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [PATCH v3 0/5] Add a bio_vec based API to core/rw.c
Date: Thu, 22 Jan 2026 22:04:39 -0800
Message-ID: <c02ab348-5243-4e97-b916-6bd59ffe769a@linux.dev>
In-Reply-To: <20260122220401.1143331-1-cel@kernel.org>
On 2026/1/22 14:03, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> This series introduces a bio_vec based API for RDMA read and write
> operations in the RDMA core, eliminating unnecessary scatterlist
> conversions for callers that already work with bvecs.
>
> Current users of rdma_rw_ctx_init() must convert their native data
> structures into scatterlists. For subsystems like svcrdma that
> maintain data in bvec format, this conversion adds overhead both in
> CPU cycles and memory footprint. The new API accepts bvec arrays
> directly.
>
> For hardware RDMA devices, the implementation uses the IOVA-based
> DMA mapping API to reduce IOTLB synchronization overhead from O(n)
> per-page syncs to a single O(1) sync after all mappings complete.
> Software RDMA devices (rxe, siw) continue using virtual addressing.
>
> The series includes MR registration support for bvec arrays,
> enabling iWARP devices and the force_mr debug parameter. The MR
> path reuses existing ib_map_mr_sg() infrastructure by constructing
> a synthetic scatterlist from the bvec DMA addresses.
Hi Chuck,
I've read through the patch series. As I understand it, the new
bio_vec-based RDMA read/write API lets callers that already operate on
bvecs (for example, svcrdma and potentially NVMe-oF) skip the conversion
of their data into scatterlists, which should reduce CPU overhead and
memory usage in the data path.
For hardware RDMA devices, the use of the IOVA-based DMA mapping API
also seems likely to reduce IOTLB synchronization overhead compared to
the existing per-page approach, while software devices (rxe, siw) retain
the current virtual-addressing model.
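To make the O(n) versus O(1) point concrete, here is a toy user-space model. It is not kernel code and does not call any real DMA or RDMA API: struct toy_iommu, toy_map_per_page() and toy_map_batched() are hypothetical names invented for this illustration, and the model only counts how many IOTLB syncs each approach would issue.

```c
#include <stddef.h>

/* Toy bookkeeping only: counts mapped pages and IOTLB syncs. */
struct toy_iommu {
	size_t pages_mapped;
	size_t iotlb_syncs;
};

/* Per-page model: each page mapping is followed by its own sync (O(n)). */
void toy_map_per_page(struct toy_iommu *iommu, size_t npages)
{
	for (size_t i = 0; i < npages; i++) {
		iommu->pages_mapped++;
		iommu->iotlb_syncs++;
	}
}

/* Batched model: map every page first, then sync once at the end (O(1)). */
void toy_map_batched(struct toy_iommu *iommu, size_t npages)
{
	for (size_t i = 0; i < npages; i++)
		iommu->pages_mapped++;
	if (npages > 0)
		iommu->iotlb_syncs++;
}
```

Under this model, mapping 16 pages costs 16 syncs in the per-page case and a single sync in the batched case. The real saving depends on the IOMMU and the workload, which is why measured numbers would be useful.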
Do you happen to have any performance or functional test results you
could share for this series, in particular for:
- Hardware RDMA devices (e.g., latency, bandwidth, or CPU utilization
  changes), and/or
- Software RDMA devices such as rxe or siw?
Any data points or qualitative observations would be very helpful for
evaluating the impact of the new API.
Zhu Yanjun
>
> The final patch adds the first consumer for the new API: svcrdma.
>
> Based on v6.19-rc6.
>
> ---
>
> Changes since v2:
> - Add bvec iter arguments to the new API
> - Add a synthetic SGL in the MR mapping function
> - Try IOVA coalescing before max_sgl_rd triggers MR in bvec path
> - Attempt once again to address SQ/CQ/max_rdma_ctxs sizing issues
>
> Changes since v1:
> - Simplify rw.c by using bvec iters internally
> - IOVA mapping produces a contiguous DMA address range
> - Clarify the comment that documents struct svc_rdma_rw_ctxt
> - svcrdma now uses pre-allocated bio_vec arrays
>
> Chuck Lever (5):
> RDMA/core: add bio_vec based RDMA read/write API
> RDMA/core: use IOVA-based DMA mapping for bvec RDMA operations
> RDMA/core: add MR support for bvec-based RDMA operations
> RDMA/core: add rdma_rw_max_sge() helper for SQ sizing
> svcrdma: use bvec-based RDMA read/write API
>
> drivers/infiniband/core/rw.c | 591 ++++++++++++++++++++---
> drivers/infiniband/ulp/isert/ib_isert.c | 4 +-
> drivers/nvme/target/rdma.c | 4 +-
> include/rdma/ib_verbs.h | 42 ++
> include/rdma/rw.h | 36 +-
> net/sunrpc/xprtrdma/svc_rdma_rw.c | 155 +++---
> net/sunrpc/xprtrdma/svc_rdma_transport.c | 8 +-
> 7 files changed, 699 insertions(+), 141 deletions(-)
>
Thread overview: 21+ messages
2026-01-22 22:03 [PATCH v3 0/5] Add a bio_vec based API to core/rw.c Chuck Lever
2026-01-22 22:03 ` [PATCH v3 1/5] RDMA/core: add bio_vec based RDMA read/write API Chuck Lever
2026-01-23 6:26 ` Christoph Hellwig
2026-01-22 22:03 ` [PATCH v3 2/5] RDMA/core: use IOVA-based DMA mapping for bvec RDMA operations Chuck Lever
2026-01-23 6:28 ` Christoph Hellwig
2026-01-23 15:04 ` Chuck Lever
2026-01-26 6:14 ` Christoph Hellwig
2026-01-22 22:03 ` [PATCH v3 3/5] RDMA/core: add MR support for bvec-based " Chuck Lever
2026-01-23 6:36 ` Christoph Hellwig
2026-01-23 15:06 ` Chuck Lever
2026-01-26 6:17 ` Christoph Hellwig
2026-01-26 16:48 ` Chuck Lever
2026-01-23 16:47 ` Chuck Lever
2026-01-26 6:16 ` Christoph Hellwig
2026-01-22 22:04 ` [PATCH v3 4/5] RDMA/core: add rdma_rw_max_sge() helper for SQ sizing Chuck Lever
2026-01-23 6:36 ` Christoph Hellwig
2026-01-22 22:04 ` [PATCH v3 5/5] svcrdma: use bvec-based RDMA read/write API Chuck Lever
2026-01-23 6:04 ` Zhu Yanjun [this message]
2026-01-23 14:13 ` [PATCH v3 0/5] Add a bio_vec based API to core/rw.c Chuck Lever
2026-01-24 18:19 ` Zhu Yanjun
2026-01-26 17:13 ` Jason Gunthorpe