From: Jeff Layton <jlayton@kernel.org>
To: Trond Myklebust <trondmy@kernel.org>,
Anna Schumaker <anna@kernel.org>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
Chuck Lever <cel@kernel.org>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org,
Jeff Layton <jlayton@kernel.org>
Subject: [PATCH v3 1/4] sunrpc: route to a populated pool in svc_pool_for_cpu()
Date: Mon, 29 Jun 2026 13:48:05 -0400
Message-ID: <20260629-sunrpc-pool-mode-v3-1-d92676606dfd@kernel.org>
In-Reply-To: <20260629-sunrpc-pool-mode-v3-0-d92676606dfd@kernel.org>

svc_set_num_threads() spreads the requested threads evenly across the
service's pools (base = nrservs / sv_nrpools). When a service runs
fewer threads than it has pools -- e.g. an nfsd configured with fewer
threads than the host has NUMA nodes in "pernode" mode, or fewer than it
has CPUs in "percpu" mode -- the trailing pools are left with no threads
at all.

svc_xprt_enqueue() selects a pool based on the CPU servicing the
transport, queues the transport on that pool's sp_xprts, and only wakes
a thread from the same pool. Each thread services only its own pool, so
a transport that lands on a threadless pool sits on sp_xprts and is
never picked up: the connection hangs indefinitely.

Have svc_pool_for_cpu() skip pools that currently have no threads,
falling back to the next populated pool. This trades NUMA locality for
a guarantee that the work is actually serviced. sp_nrthreads is only
updated under the service mutex; the lockless read here is a best-effort
routing hint, so annotate it with data_race().

Fixes: 0f0257eaa5d2 ("svc: Move the xprt independent code to the svc_xprt.c file")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
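For reviewers who want to see the starvation case concretely, here is a
minimal userspace sketch of the even spread described in the changelog.
It is not kernel code: model_spread_threads() and its parameters are
hypothetical names, and handing the remainder to the lowest-numbered
pools is an assumption made to match the "trailing pools" wording above.

```c
#include <stddef.h>

/*
 * Hypothetical userspace model (not kernel code): give each of the
 * nrpools pools nrservs / nrpools threads, then hand the remainder to
 * the lowest-numbered pools, one thread each. The remainder policy is
 * an assumption. The caller must pass nrpools > 0 and an array with
 * room for nrpools entries.
 */
static void model_spread_threads(unsigned int nrservs, unsigned int nrpools,
				 unsigned int *nrthreads)
{
	unsigned int base = nrservs / nrpools;
	unsigned int remain = nrservs % nrpools;
	unsigned int i;

	for (i = 0; i < nrpools; i++)
		nrthreads[i] = base + (i < remain ? 1 : 0);
}
```

With nrservs = 2 and nrpools = 4 the model leaves pools 2 and 3 with no
threads, which is the situation in which an enqueued transport is never
picked up.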
net/sunrpc/svc.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
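The fallback scan added by this patch can also be modelled in userspace.
The sketch below is illustrative only: model_pick_pool() is a
hypothetical name, the nrthreads[] array stands in for the per-pool
sp_nrthreads counters, and it returns a pool index rather than a
struct svc_pool pointer.

```c
/*
 * Hypothetical userspace model of the fallback scan (not kernel code).
 * nrthreads[i] stands in for sv_pools[i].sp_nrthreads and pidx is the
 * preferred pool index derived from the current CPU or node. The
 * caller must pass nrpools > 0.
 */
static unsigned int model_pick_pool(const unsigned int *nrthreads,
				    unsigned int nrpools, unsigned int pidx)
{
	unsigned int i;

	pidx %= nrpools;

	/* Visit each pool at most once, starting at the preferred one. */
	for (i = 0; i < nrpools; i++) {
		if (nrthreads[pidx])
			return pidx;

		if (++pidx >= nrpools)
			pidx = 0;
	}

	/* Every pool is empty; the scan wrapped back to the preferred pool. */
	return pidx;
}
```

With thread counts {1, 1, 0, 0}, a transport whose preferred pool is 2 or
3 is routed to pool 0, while one whose preferred pool is 1 stays on
pool 1. When every pool is empty the preferred index is returned
unchanged.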
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index dd80a2eaaa74..82fb7faf563f 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -402,6 +402,7 @@ struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv)
 	struct svc_pool_map *m = &svc_pool_map;
 	int cpu = raw_smp_processor_id();
 	unsigned int pidx = 0;
+	unsigned int i;
 
 	if (serv->sv_nrpools <= 1)
 		return serv->sv_pools;
@@ -414,8 +415,31 @@ struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv)
 		pidx = m->to_pool[cpu_to_node(cpu)];
 		break;
 	}
+	pidx %= serv->sv_nrpools;
+
+	/*
+	 * Threads are spread evenly across the pools, but when there are
+	 * fewer threads than pools some pools can end up with none. A
+	 * transport enqueued on a threadless pool would never be picked
+	 * up, since each thread only services its own pool. Fall back to
+	 * the next populated pool, trading NUMA locality for a guarantee
+	 * that the transport is serviced.
+	 */
+	for (i = 0; i < serv->sv_nrpools; i++) {
+		struct svc_pool *pool = &serv->sv_pools[pidx];
+
+		/* sp_nrthreads is only updated under the service mutex
+		 * and rarely changes. A data race here is harmless.
+		 */
+		if (data_race(pool->sp_nrthreads))
+			return pool;
+
+		if (++pidx >= serv->sv_nrpools)
+			pidx = 0;
+	}
 
-	return &serv->sv_pools[pidx % serv->sv_nrpools];
+	/* No pool has any threads; nothing can service the transport. */
+	return &serv->sv_pools[pidx];
 }
 
 static int svc_rpcb_setup(struct svc_serv *serv, struct net *net)
--
2.54.0