public inbox for linux-rdma@vger.kernel.org
From: Chuck Lever <cel@kernel.org>
To: Zhu Yanjun <yanjun.zhu@linux.dev>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Leon Romanovsky <leon@kernel.org>, Christoph Hellwig <hch@lst.de>
Cc: NeilBrown <neilb@ownmail.net>, Jeff Layton <jlayton@kernel.org>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org,
	Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [PATCH v3 0/5] Add a bio_vec based API to core/rw.c
Date: Fri, 23 Jan 2026 09:13:47 -0500
Message-ID: <d67a30a0-5ff1-4e31-a168-81f8b7bee97f@kernel.org>
In-Reply-To: <c02ab348-5243-4e97-b916-6bd59ffe769a@linux.dev>

On 1/23/26 1:04 AM, Zhu Yanjun wrote:
> On 2026/1/22 14:03, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> This series introduces a bio_vec based API for RDMA read and write
>> operations in the RDMA core, eliminating unnecessary scatterlist
>> conversions for callers that already work with bvecs.
>>
>> Current users of rdma_rw_ctx_init() must convert their native data
>> structures into scatterlists. For subsystems like svcrdma that
>> maintain data in bvec format, this conversion adds overhead both in
>> CPU cycles and memory footprint. The new API accepts bvec arrays
>> directly.
>>
>> For hardware RDMA devices, the implementation uses the IOVA-based
>> DMA mapping API to reduce IOTLB synchronization overhead from O(n)
>> per-page syncs to a single O(1) sync after all mappings complete.
>> Software RDMA devices (rxe, siw) continue using virtual addressing.
>>
>> The series includes MR registration support for bvec arrays,
>> enabling iWARP devices and the force_mr debug parameter. The MR
>> path reuses existing ib_map_mr_sg() infrastructure by constructing
>> a synthetic scatterlist from the bvec DMA addresses.
> 
> Hi, Chuck Lever
> 
> I’ve read through the patch series. As I understand it, the new
> bio_vec-based RDMA read/write API allows callers that already operate
> on bvecs (for example, svcrdma and potentially NVMe-oF) to avoid
> converting their data into scatterlists, which should reduce CPU
> overhead and memory usage in the data path.
> 
> For hardware RDMA devices, the use of the IOVA-based DMA mapping API
> also seems likely to reduce IOTLB synchronization overhead compared to
> the existing per-page approach, while software devices (rxe, siw) retain
> the current virtual-addressing model.
> 
> Do you happen to have any performance or functional test results you
> could share for this series, in particular:
> 
> Hardware RDMA devices (e.g., latency, bandwidth, or CPU utilization
> changes), and/or

Functional tests with CX-5 InfiniBand and NFS/RDMA show no regression.

Performance tests are difficult to evaluate because I don't have a
multi-client setup here to drive a heavy workload, and filesystems
bottleneck long before the network transport does. The changes are
designed to improve scalability (e.g., lower CPU utilization for the
same workload and less interaction between host and RNIC) more than to
improve raw throughput. So far I have seen no throughput regression and
perhaps a bit of improvement for tail latencies.
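
To make the cover letter's O(n) versus O(1) sync claim concrete, here is
a toy userspace model. It is only a sketch: struct toy_iommu and both
toy_map_*() helpers are hypothetical names invented for illustration,
no kernel DMA API is called, and the counters stand in for IOTLB
synchronization events rather than measuring anything real.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only (hypothetical): counts mappings and "IOTLB sync" events. */
struct toy_iommu {
	size_t mapped_pages;
	size_t syncs;
};

/* Per-page style: each page mapping is followed by its own sync, so O(n). */
static void toy_map_per_page(struct toy_iommu *mmu, size_t nr_pages)
{
	assert(mmu != NULL);
	for (size_t i = 0; i < nr_pages; i++) {
		mmu->mapped_pages++;
		mmu->syncs++;
	}
}

/* Batched style: map every page first, then issue a single sync, so O(1). */
static void toy_map_batched(struct toy_iommu *mmu, size_t nr_pages)
{
	assert(mmu != NULL);
	for (size_t i = 0; i < nr_pages; i++)
		mmu->mapped_pages++;
	if (nr_pages > 0)
		mmu->syncs++;
}
```

In this model, mapping 256 pages through toy_map_per_page() records 256
syncs, while toy_map_batched() records one for the same number of mapped
pages. How much of that difference shows up on real hardware depends on
the IOMMU and the workload.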

The main purpose of the series, however, is to support an effort to
replace scatter-gather lists, which are technical debt, throughout the
kernel. Socket APIs already support struct bio_vec.


> Software RDMA devices such as rxe or siw?

Software providers are not likely to see much change. However, you will
need to test the series with your own preferred configuration and
workload to assess any performance and scalability delta.


-- 
Chuck Lever
