From: Chuck Lever <cel@kernel.org>
To: Anna Schumaker <anna@kernel.org>
Cc: <linux-rdma@vger.kernel.org>, <linux-nfs@vger.kernel.org>,
Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH v3 3/5] xprtrdma: Add request-pool slack for delayed recycling
Date: Tue, 26 May 2026 10:14:03 -0400
Message-ID: <20260526141405.39877-4-cel@kernel.org>
In-Reply-To: <20260526141405.39877-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

After the previous patch gates req recycling on Send completion,
a completed RPC's rpcrdma_req can remain pinned by the sendctx
ring until the next signaled Send completion releases it. The
transmitted-RPC ceiling is unchanged: xprt_request_get_cong()
gates Sends against xprt->cwnd, the RPC/RDMA credit window fed
by server-granted credits and capped at re_max_requests. The
req pool, however, must exceed max_reqs by enough that this
recycle delay does not stall a slot allocation that the credit
window would admit.

The headroom is bounded. frwr_open() sets re_send_batch to
re_max_requests >> 3 -- one Send is signaled per batch of that
size -- so at most re_send_batch unsignaled Sends can be
outstanding before the next signaled completion releases them.
That equals max_reqs / 8 reqs in the worst case, with a one-slot
floor for small max_reqs values where the right-shift yields zero.

The sendctx ring and the hardware Send Queue are not enlarged
to match. Both are sized in rpcrdma_sendctxs_create() and
frwr_query_device() for re_max_requests in-flight Sends, which
is the ceiling the credit window enforces. The pool slack does
not raise that ceiling -- it only lets allocation keep pace
with the credit window during the brief interval in which
earlier reqs are pinned waiting for the next signaled
completion. At any moment, at most re_send_batch sendctxes are
held by unswept unsignaled Sends, leaving the rest of the ring
available for newly admitted Sends.

Allocate max_reqs + DIV_ROUND_UP(max_reqs, 8) request objects
and name the slack calculation at the allocation site so the
1/8 bound stays tied to the Send-signaling batch size.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
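Not part of the commit message: a minimal userspace sketch of the
sizing arithmetic described above. It assumes only what the patch
states, namely that re_send_batch is re_max_requests >> 3 and that
the slack is DIV_ROUND_UP(max_reqs, 8). The names send_batch(),
req_pool_slack(), and req_pool_size() are hypothetical stand-ins, and
DIV_ROUND_UP is redefined locally because the kernel macro is not
available in userspace.

```c
#include <stdio.h>

/* Local stand-in for the kernel's DIV_ROUND_UP() macro (assumption:
 * same semantics for positive operands).
 */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical model of the batch size described in the patch:
 * re_send_batch = re_max_requests >> 3.
 */
static unsigned int send_batch(unsigned int max_reqs)
{
	return max_reqs >> 3;
}

/* Mirrors the arithmetic of rpcrdma_req_pool_slack() in this patch. */
static unsigned int req_pool_slack(unsigned int max_reqs)
{
	return DIV_ROUND_UP(max_reqs, 8);
}

/* Hypothetical helper: number of reqs the allocation loop would create. */
static unsigned int req_pool_size(unsigned int max_reqs)
{
	return max_reqs + req_pool_slack(max_reqs);
}

/* Print one row comparing the truncating shift with the rounded-up slack. */
static void print_pool_sizing(unsigned int max_reqs)
{
	printf("max_reqs=%u batch=%u slack=%u pool=%u\n",
	       max_reqs, send_batch(max_reqs),
	       req_pool_slack(max_reqs), req_pool_size(max_reqs));
}
```

Under these assumptions, max_reqs = 128 gives a batch of 16, a slack
of 16, and a pool of 144 reqs. For max_reqs = 4 the shift yields 0
while the rounded-up slack is still 1, which is the one-slot floor.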
net/sunrpc/xprtrdma/verbs.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 97b8b2376602..98bd965787e6 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1080,6 +1080,22 @@ static void rpcrdma_reps_destroy(struct rpcrdma_buffer *buf)
 	spin_unlock(&buf->rb_lock);
 }
 
+static unsigned int rpcrdma_req_pool_slack(unsigned int max_reqs)
+{
+	/* The sendctx ring can hold up to one Send-signaling batch
+	 * (re_send_batch, set by frwr_open() to re_max_requests >> 3)
+	 * of unfinished Sends. Each pins its req until a signaled Send
+	 * completion releases the sendctx. Size the pool above max_reqs
+	 * by that batch so the recycle delay does not stall a slot
+	 * allocation that the RPC/RDMA credit window would admit.
+	 *
+	 * Round up: re_max_requests >> 3 is zero when max_reqs < 8, but
+	 * a single unsignaled Send is still enough to pin one req. One
+	 * slack slot covers that case.
+	 */
+	return DIV_ROUND_UP(max_reqs, 8);
+}
+
 /**
  * rpcrdma_buffer_create - Create initial set of req/rep objects
  * @r_xprt: transport instance to (re)initialize
@@ -1089,6 +1105,7 @@ static void rpcrdma_reps_destroy(struct rpcrdma_buffer *buf)
 int rpcrdma_buffer_create(struct rpcrdma_xprt *r_xprt)
 {
 	struct rpcrdma_buffer *buf = &r_xprt->rx_buf;
+	unsigned int max_reqs;
 	int i, rc;
 
 	buf->rb_bc_srv_max_requests = 0;
@@ -1102,7 +1119,9 @@ int rpcrdma_buffer_create(struct rpcrdma_xprt *r_xprt)
 	INIT_LIST_HEAD(&buf->rb_all_reps);
 
 	rc = -ENOMEM;
-	for (i = 0; i < r_xprt->rx_xprt.max_reqs; i++) {
+	max_reqs = r_xprt->rx_xprt.max_reqs;
+	max_reqs += rpcrdma_req_pool_slack(max_reqs);
+	for (i = 0; i < max_reqs; i++) {
 		struct rpcrdma_req *req;
 
 		req = rpcrdma_req_create(r_xprt,
--
2.54.0
Thread overview:
2026-05-26 14:14 [PATCH v3 0/5] xprtrdma: Decouple req recycling from RPC completion Chuck Lever
2026-05-26 14:14 ` [PATCH v3 1/5] xprtrdma: Use sendctx DMA state for Send signaling Chuck Lever
2026-05-26 14:14 ` [PATCH v3 2/5] xprtrdma: Decouple req recycling from RPC completion Chuck Lever
2026-05-26 14:14 ` Chuck Lever [this message]
2026-05-26 14:14 ` [PATCH v3 4/5] xprtrdma: Clear receive-side ownership pointers on release Chuck Lever
2026-05-26 14:14 ` [PATCH v3 5/5] xprtrdma: Document and assert reply-handler invariants Chuck Lever