From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-183.mta1.migadu.com (out-183.mta1.migadu.com [95.215.58.183]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 024B533CEA1 for ; Fri, 23 Jan 2026 06:05:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.183 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1769148309; cv=none; b=fra9qJA8GjWho3A+M1bqtdftyZdcIyN+s1ytsk3VAkguaSw53S10Ii7/tWe1oj8m8cQGPu4koTB7K8RwitPqM3D4rgJt0bkwiBk9NiVdYl7d1/SHEvnbFG2yGNJZRdHsO60c634CUhMjvyEZ/gcnQOy5TpVuvFUrip12uavbWPg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1769148309; c=relaxed/simple; bh=h9g2gyRuNShziaYijgJlBgvgS/UY03RUaLaxQiejYLI=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=eNzdVN7e/niu0A8EpnB9838JiSU4E/WD52zOgPeehgVWW2ZCvSFhFL7lfqRAFyDV5bPmaKsIDYn9FiqiDnCjRSaqhL/VWRIIXVMFLW3GmQ98q8btFGcXmLsvhFNbXKMy9Zy85Bs2F0ASlrAk977HUCIRP02L5r1BYOuJPS3UJcs= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=FDjGp9D5; arc=none smtp.client-ip=95.215.58.183 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="FDjGp9D5" Message-ID: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1769148291; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HBjPnAPLtIJQgYd9Yf7ajYQCnwWpwqRQJkI1YWyJuP0=; b=FDjGp9D52m5EaikR8iM3b52LZyCo3lZ7rh025hmcIikxPAPpLNFlEbkeEgP0tS1PK5pm+7 fDqoWZYrIxP5XPp3VZdbUt6jUec/nApqH+bJgrFORFTwOg0lJGpHDZzwoTXxaVgZ+dYkzf TKbvZ8/IoWLQUKHp33HeKtn/J2RPTNo= Date: Thu, 22 Jan 2026 22:04:39 -0800 Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Subject: Re: [PATCH v3 0/5] Add a bio_vec based API to core/rw.c To: Chuck Lever , Jason Gunthorpe , Leon Romanovsky , Christoph Hellwig Cc: NeilBrown , Jeff Layton , Olga Kornievskaia , Dai Ngo , Tom Talpey , linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org, Chuck Lever References: <20260122220401.1143331-1-cel@kernel.org> X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Zhu Yanjun In-Reply-To: <20260122220401.1143331-1-cel@kernel.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT 在 2026/1/22 14:03, Chuck Lever 写道: > From: Chuck Lever > > This series introduces a bio_vec based API for RDMA read and write > operations in the RDMA core, eliminating unnecessary scatterlist > conversions for callers that already work with bvecs. > > Current users of rdma_rw_ctx_init() must convert their native data > structures into scatterlists. For subsystems like svcrdma that > maintain data in bvec format, this conversion adds overhead both in > CPU cycles and memory footprint. The new API accepts bvec arrays > directly. > > For hardware RDMA devices, the implementation uses the IOVA-based > DMA mapping API to reduce IOTLB synchronization overhead from O(n) > per-page syncs to a single O(1) sync after all mappings complete. > Software RDMA devices (rxe, siw) continue using virtual addressing. > > The series includes MR registration support for bvec arrays, > enabling iWARP devices and the force_mr debug parameter. The MR > path reuses existing ib_map_mr_sg() infrastructure by constructing > a synthetic scatterlist from the bvec DMA addresses. Hi, Chuck Lever I’ve read through the patch series. As I understand it, the new bio_vec–based RDMA read/write API allows callers that already operate on bvecs (for example, svcrdma and potentially NVMe-oF) to avoid converting their data into scatterlists, which should reduce CPU overhead and memory usage in the data path. For hardware RDMA devices, the use of the IOVA-based DMA mapping API also seems likely to reduce IOTLB synchronization overhead compared to the existing per-page approach, while software devices (rxe, siw) retain the current virtual-addressing model. Do you happen to have any performance or functional test results you could share for this series, in particular: Hardware RDMA devices (e.g., latency, bandwidth, or CPU utilization changes), and/or Software RDMA devices such as rxe or siw? Any data points or qualitative observations would be very helpful for evaluating the impact of the new API. Zhu Yanjun > > The final patch adds the first consumer for the new API: svcrdma. > > Based on v6.19-rc6. > > --- > > Changes since v2: > - Add bvec iter arguments to the new API > - Add a synthetic SGL in the MR mapping function > - Try IOVA coalescing before max_sgl_rd triggers MR in bvec path > - Attempt once again to address SQ/CQ/max_rdma_ctxs sizing issues > > Changes since v1: > - Simplify rw.c by using bvec iters internally > - IOVA mapping produces a contiguous DMA address range > - Clarify the comment that documents struct svc_rdma_rw_ctxt > - svcrdma now uses pre-allocated bio_vec arrays > > Chuck Lever (5): > RDMA/core: add bio_vec based RDMA read/write API > RDMA/core: use IOVA-based DMA mapping for bvec RDMA operations > RDMA/core: add MR support for bvec-based RDMA operations > RDMA/core: add rdma_rw_max_sge() helper for SQ sizing > svcrdma: use bvec-based RDMA read/write API > > drivers/infiniband/core/rw.c | 591 ++++++++++++++++++++--- > drivers/infiniband/ulp/isert/ib_isert.c | 4 +- > drivers/nvme/target/rdma.c | 4 +- > include/rdma/ib_verbs.h | 42 ++ > include/rdma/rw.h | 36 +- > net/sunrpc/xprtrdma/svc_rdma_rw.c | 155 +++--- > net/sunrpc/xprtrdma/svc_rdma_transport.c | 8 +- > 7 files changed, 699 insertions(+), 141 deletions(-) >