Linux NFS development
From: Jeff Layton <jlayton@kernel.org>
To: Trond Myklebust <trondmy@kernel.org>,
	Anna Schumaker <anna@kernel.org>, NeilBrown <neil@brown.name>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	Chuck Lever <cel@kernel.org>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	Jeff Layton <jlayton@kernel.org>
Subject: [PATCH v3 1/4] sunrpc: route to a populated pool in svc_pool_for_cpu()
Date: Mon, 29 Jun 2026 13:48:05 -0400
Message-ID: <20260629-sunrpc-pool-mode-v3-1-d92676606dfd@kernel.org>
In-Reply-To: <20260629-sunrpc-pool-mode-v3-0-d92676606dfd@kernel.org>

svc_set_num_threads() spreads the requested threads evenly across the
service's pools (base = nrservs / sv_nrpools).  When a service runs
fewer threads than it has pools -- e.g. an nfsd configured with fewer
threads than the host has NUMA nodes while running in "pernode" or
"percpu" mode -- the trailing pools are left with no threads at all.
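
The arithmetic above can be sketched in userspace C. This is an
illustration only, not the kernel code: threads_for_pool() is a
hypothetical helper, and handing the remainder to the leading pools is
an assumption inferred from the "trailing pools" wording above.

```c
/*
 * Hypothetical helper (not a kernel function): how many threads pool
 * `idx` receives when `nrservs` threads are spread over `nrpools` pools.
 * Assumes nrpools >= 1 and that the remainder goes to the leading pools.
 */
static unsigned int threads_for_pool(unsigned int nrservs,
				     unsigned int nrpools,
				     unsigned int idx)
{
	unsigned int base = nrservs / nrpools;
	unsigned int remain = nrservs % nrpools;

	/* The first `remain` pools each get one extra thread. */
	return base + (idx < remain ? 1 : 0);
}
```

With nrservs = 2 and nrpools = 4, base is 0 and the remainder is 2, so
under this assumption pools 0 and 1 get one thread each while pools 2
and 3 get none.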

svc_xprt_enqueue() selects a pool from the CPU servicing the transport,
queues the transport on that pool's sp_xprts, and only wakes a thread
from the same pool.  Each thread services exclusively its own pool, so a
transport that lands on a threadless pool is enqueued on sp_xprts and
never picked up: the connection hangs indefinitely.

Have svc_pool_for_cpu() skip pools that currently have no threads,
falling back to the next populated pool.  This trades NUMA locality for
a guarantee that the work is actually serviced.  sp_nrthreads is only
updated under the service mutex; the lockless read here is a best-effort
routing hint, so annotate it with data_race().
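
The fallback search can be modelled outside the kernel roughly as
follows. This is a simplified sketch: struct pool, struct serv and
pick_populated_pool() are hypothetical stand-ins for struct svc_pool,
struct svc_serv and the loop added to svc_pool_for_cpu(), and the
data_race() annotation has no userspace equivalent here.

```c
/* Hypothetical stand-ins for struct svc_pool and struct svc_serv. */
struct pool {
	unsigned int nrthreads;
};

struct serv {
	unsigned int nrpools;	/* assumed to be >= 1 */
	struct pool *pools;
};

/*
 * Starting from the preferred index, return the first pool that has
 * at least one thread, wrapping around at the end of the array.
 */
static struct pool *pick_populated_pool(struct serv *serv, unsigned int pidx)
{
	unsigned int i;

	pidx %= serv->nrpools;

	for (i = 0; i < serv->nrpools; i++) {
		struct pool *pool = &serv->pools[pidx];

		if (pool->nrthreads)
			return pool;

		if (++pidx >= serv->nrpools)
			pidx = 0;
	}

	/* Every pool is empty; pidx has wrapped back to the preferred pool. */
	return &serv->pools[pidx];
}
```

With four pools whose thread counts are {1, 1, 0, 0}, a preferred index
of 2 walks pools 2 and 3, wraps, and returns pool 0. When every pool is
empty the loop runs exactly nrpools times, so the preferred pool itself
is returned.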

Fixes: 0f0257eaa5d2 ("svc: Move the xprt independent code to the svc_xprt.c file")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 net/sunrpc/svc.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index dd80a2eaaa74..82fb7faf563f 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -402,6 +402,7 @@ struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv)
 	struct svc_pool_map *m = &svc_pool_map;
 	int cpu = raw_smp_processor_id();
 	unsigned int pidx = 0;
+	unsigned int i;
 
 	if (serv->sv_nrpools <= 1)
 		return serv->sv_pools;
@@ -414,8 +415,31 @@ struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv)
 		pidx = m->to_pool[cpu_to_node(cpu)];
 		break;
 	}
+	pidx %= serv->sv_nrpools;
+
+	/*
+	 * Threads are spread evenly across the pools, but when there are
+	 * fewer threads than pools some pools can end up with none. A
+	 * transport enqueued on a threadless pool would never be picked
+	 * up, since each thread only services its own pool. Fall back to
+	 * the next populated pool, trading NUMA locality for a guarantee
+	 * that the transport is serviced.
+	 */
+	for (i = 0; i < serv->sv_nrpools; i++) {
+		struct svc_pool *pool = &serv->sv_pools[pidx];
+
+		/* sp_nrthreads is only updated under the service mutex and
+		 * rarely changes, so a racy read here is harmless.
+		 */
+		if (data_race(pool->sp_nrthreads))
+			return pool;
+
+		if (++pidx >= serv->sv_nrpools)
+			pidx = 0;
+	}
 
-	return &serv->sv_pools[pidx % serv->sv_nrpools];
+	/* No pool has any threads; nothing can service the transport. */
+	return &serv->sv_pools[pidx];
 }
 
 static int svc_rpcb_setup(struct svc_serv *serv, struct net *net)

-- 
2.54.0


Thread overview: 8+ messages
2026-06-29 17:48 [PATCH v3 0/4] sunrpc: hardcode pool_mode to pernode, remove other modes Jeff Layton
2026-06-29 17:48 ` Jeff Layton [this message]
2026-06-29 17:48 ` [PATCH v3 2/4] " Jeff Layton
2026-06-30 17:24   ` Chuck Lever
2026-06-29 17:48 ` [PATCH v3 3/4] sunrpc: guarantee a thread per CPU-bearing node when auto-distributing Jeff Layton
2026-06-29 17:48 ` [PATCH v3 4/4] sunrpc: eliminate a modulus operation from the enqueueing codepath Jeff Layton
2026-06-30 12:48 ` [PATCH 5/4] sunrpc: protect the svc_pool_map pool_to[] array with RCU Jeff Layton
2026-07-01  1:50   ` NeilBrown
