public inbox for linux-nfs@vger.kernel.org
From: Chuck Lever <cel@kernel.org>
To: NeilBrown <neilb@ownmail.net>, Jeff Layton <jlayton@kernel.org>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: <linux-nfs@vger.kernel.org>, Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH 1/3] sunrpc: skip svc_xprt_enqueue when no work is pending
Date: Tue, 24 Mar 2026 09:04:47 -0400
Message-ID: <20260324130449.16437-2-cel@kernel.org>
In-Reply-To: <20260324130449.16437-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

svc_reserve() and svc_xprt_release_slot() call
svc_xprt_enqueue() after modifying xpt_reserved or
xpt_nr_rqsts. The purpose is to re-dispatch the
transport when write-space or a slot becomes available.
However, when neither XPT_DATA nor XPT_DEFERRED is
set, no thread can make progress on the transport and
the enqueue accomplishes nothing.

Trace data from a 256KB NFSv3 WRITE workload over RDMA
shows 11.2 svc_xprt_enqueue() calls per RPC. Of these,
6.9 per RPC lack XPT_DATA and exit svc_xprt_ready()
immediately after executing the smp_rmb(), READ_ONCE(),
and tracepoint. svc_reserve() and svc_xprt_release_slot()
account for roughly five of these per RPC.

A new helper, svc_xprt_resource_released(), checks
XPT_DATA | XPT_DEFERRED before calling
svc_xprt_enqueue(). The existing smp_wmb() barriers
are upgraded to smp_mb() to ensure the flags check
observes a concurrent producer's set_bit(XPT_DATA).
Each producer (svc_rdma_wc_receive, etc.) both sets
XPT_DATA and calls svc_xprt_enqueue(), so even if the
check reads a stale value, the producer's own enqueue
provides a fallback path.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
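Illustration only, not part of the patch: a minimal user-space
sketch of the check-before-enqueue pattern described above, written
with C11 atomics. All model_* names are hypothetical stand-ins, and
atomic_thread_fence(memory_order_seq_cst) is assumed here as a rough
analogue of smp_mb(). It does not reproduce the kernel's memory model
or the real svc_xprt_enqueue() path.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for XPT_DATA and XPT_DEFERRED. */
enum {
	MODEL_XPT_DATA     = 1u << 0,
	MODEL_XPT_DEFERRED = 1u << 1,
};

struct model_xprt {
	atomic_uint flags;	/* models xpt_flags */
	atomic_int  nr_rqsts;	/* models xpt_nr_rqsts */
	atomic_int  enqueues;	/* counts calls to the modelled enqueue */
};

/* Stand-in for svc_xprt_enqueue(): only counts invocations. */
static void model_enqueue(struct model_xprt *x)
{
	atomic_fetch_add_explicit(&x->enqueues, 1, memory_order_relaxed);
}

/*
 * Releaser side: update the counter, issue a full fence, then read
 * the flags. Returns true if the transport was enqueued.
 */
static bool model_resource_released(struct model_xprt *x)
{
	atomic_fetch_sub_explicit(&x->nr_rqsts, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* assumed smp_mb() analogue */
	if (atomic_load_explicit(&x->flags, memory_order_relaxed) &
	    (MODEL_XPT_DATA | MODEL_XPT_DEFERRED)) {
		model_enqueue(x);
		return true;
	}
	return false;
}

/*
 * Producer side: publish the flag, then enqueue unconditionally.
 * This unconditional enqueue is the fallback path when a releaser
 * misses the flag.
 */
static void model_data_ready(struct model_xprt *x)
{
	atomic_fetch_or_explicit(&x->flags, MODEL_XPT_DATA,
				 memory_order_seq_cst);
	model_enqueue(x);
}
```

In this model, a release on an idle transport performs the counter
update and the fence but skips the enqueue. Once model_data_ready()
has set the flag, a later release enqueues as before.
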
 net/sunrpc/svc_xprt.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 56a663b8939f..73149280167c 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -425,13 +425,28 @@ static bool svc_xprt_reserve_slot(struct svc_rqst *rqstp, struct svc_xprt *xprt)
 	return true;
 }
 
+/*
+ * After a caller releases write-space or a request slot,
+ * re-enqueue the transport only when there is pending
+ * work that a thread could act on. The smp_mb() pairs
+ * with the smp_rmb() in svc_xprt_ready() and orders the
+ * preceding counter update before the flags read so a
+ * concurrent set_bit(XPT_DATA) is visible here.
+ */
+static void svc_xprt_resource_released(struct svc_xprt *xprt)
+{
+	smp_mb();
+	if (READ_ONCE(xprt->xpt_flags) &
+	    (BIT(XPT_DATA) | BIT(XPT_DEFERRED)))
+		svc_xprt_enqueue(xprt);
+}
+
 static void svc_xprt_release_slot(struct svc_rqst *rqstp)
 {
 	struct svc_xprt	*xprt = rqstp->rq_xprt;
 	if (test_and_clear_bit(RQ_DATA, &rqstp->rq_flags)) {
 		atomic_dec(&xprt->xpt_nr_rqsts);
-		smp_wmb(); /* See smp_rmb() in svc_xprt_ready() */
-		svc_xprt_enqueue(xprt);
+		svc_xprt_resource_released(xprt);
 	}
 }
 
@@ -525,10 +540,10 @@ void svc_reserve(struct svc_rqst *rqstp, int space)
 	space += rqstp->rq_res.head[0].iov_len;
 
 	if (xprt && space < rqstp->rq_reserved) {
-		atomic_sub((rqstp->rq_reserved - space), &xprt->xpt_reserved);
+		atomic_sub((rqstp->rq_reserved - space),
+			   &xprt->xpt_reserved);
 		rqstp->rq_reserved = space;
-		smp_wmb(); /* See smp_rmb() in svc_xprt_ready() */
-		svc_xprt_enqueue(xprt);
+		svc_xprt_resource_released(xprt);
 	}
 }
 EXPORT_SYMBOL_GPL(svc_reserve);
-- 
2.53.0



Thread overview: 7+ messages
2026-03-24 13:04 [PATCH 0/3] Avoid no-op transport enqueues Chuck Lever
2026-03-24 13:04 ` Chuck Lever [this message]
2026-03-24 13:26   ` [PATCH 1/3] sunrpc: skip svc_xprt_enqueue when no work is pending Jeff Layton
2026-03-24 13:04 ` [PATCH 2/3] sunrpc: skip svc_xprt_enqueue in svc_xprt_received when idle Chuck Lever
2026-03-24 13:39   ` Jeff Layton
2026-03-24 13:04 ` [PATCH 3/3] sunrpc: skip svc_xprt_enqueue when transport is busy Chuck Lever
2026-03-24 13:42   ` Jeff Layton
