From: Chuck Lever <cel@kernel.org>
To: Jason Gunthorpe <jgg@nvidia.com>,
	Leon Romanovsky <leon@kernel.org>, Christoph Hellwig <hch@lst.de>
Cc: <linux-rdma@vger.kernel.org>, Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH] RDMA/rw: Fix MR pool exhaustion in bvec RDMA READ path
Date: Mon,  9 Mar 2026 23:46:21 -0400
Message-ID: <20260310034621.5799-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

When IOVA-based DMA mapping is unavailable (e.g., IOMMU passthrough
mode), rdma_rw_ctx_init_bvec() falls back to checking
rdma_rw_io_needs_mr() with the raw bvec count. Unlike the
scatterlist path in rdma_rw_ctx_init(), which passes a
post-DMA-mapping entry count that reflects coalescing of
physically contiguous pages, the bvec path passes the
pre-mapping page count. This overstates the number of DMA
entries, causing every multi-bvec RDMA READ to consume an MR
from the QP's pool.

Under NFS WRITE workloads, the server performs RDMA READs to
pull data from the client. With the inflated MR demand, the
pool is rapidly exhausted, ib_mr_pool_get() returns NULL, and
rdma_rw_init_one_mr() returns -EAGAIN. svcrdma treats this as
a DMA mapping failure, closes the connection, and the client
reconnects -- producing a cycle of 71% RPC retransmissions and
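~100 reconnections per test run. RDMA WRITEs (NFS READ
direction) are unaffected because DMA_TO_DEVICE never triggers
the max_sgl_rd check.

Remove the rdma_rw_io_needs_mr() check from the non-IOVA
fallback in rdma_rw_ctx_init_bvec() so that this path always
maps each bvec individually via rdma_rw_init_map_wrs_bvec().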

Fixes: bea28ac14cab ("RDMA/core: add MR support for bvec-based RDMA operations")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 drivers/infiniband/core/rw.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c
index fc45c384833f..9e227b7746a1 100644
--- a/drivers/infiniband/core/rw.c
+++ b/drivers/infiniband/core/rw.c
@@ -686,14 +686,15 @@ int rdma_rw_ctx_init_bvec(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
 		return ret;
 
 	/*
-	 * IOVA mapping not available. Check if MR registration provides
-	 * better performance than multiple SGE entries.
+	 * IOVA not available; map each bvec individually. Do not
+	 * check max_sgl_rd here: nr_bvec is a raw page count that
+	 * overstates DMA entry demand and exhausts the MR pool.
+	 *
+	 * TODO: A bulk DMA mapping API for bvecs analogous to
+	 * dma_map_sgtable() would provide a proper post-DMA-
+	 * coalescing segment count here, enabling the map_wrs
+	 * path in more cases.
 	 */
-	if (rdma_rw_io_needs_mr(dev, port_num, dir, nr_bvec))
-		return rdma_rw_init_mr_wrs_bvec(ctx, qp, port_num, bvecs,
-						nr_bvec, &iter, remote_addr,
-						rkey, dir);
-
 	return rdma_rw_init_map_wrs_bvec(ctx, qp, bvecs, nr_bvec, &iter,
 			remote_addr, rkey, dir);
 }
-- 
2.53.0

