From: "Chuck Lever" <cel@kernel.org>
To: "Jeff Layton" <jlayton@kernel.org>,
"Trond Myklebust" <trondmy@kernel.org>,
"Anna Schumaker" <anna@kernel.org>,
"Chuck Lever" <chuck.lever@oracle.com>,
NeilBrown <neil@brown.name>,
"Olga Kornievskaia" <okorniev@redhat.com>,
"Dai Ngo" <Dai.Ngo@oracle.com>, "Tom Talpey" <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] sunrpc: hardcode pool_mode to pernode, remove other modes
Date: Fri, 24 Apr 2026 10:02:46 -0400 [thread overview]
Message-ID: <4b4fe625-b67b-46fc-8172-a9d0cf26c2c4@app.fastmail.com> (raw)
In-Reply-To: <55bfc1442ee1a80723520b714fc5f358d11c4c38.camel@kernel.org>
On Thu, Apr 23, 2026, at 8:26 PM, Jeff Layton wrote:
> On Thu, 2026-04-23 at 12:45 -0400, Chuck Lever wrote:
>> On Thu, Apr 23, 2026, at 10:39 AM, Jeff Layton wrote:
>> > The SVC_POOL_AUTO/GLOBAL/PERCPU/PERNODE pool mode selection machinery
>> > was added when NUMA was new and the right default was unclear. Today,
>> > pernode is the right choice everywhere:
>> >
>> > - On multi-NUMA hosts, it gives one pool per node with proper thread
>> > affinity and NUMA-local memory allocation.
>> > - On single-node hosts, pernode degenerates to exactly one pool,
>> > identical to the old "global" mode -- svc_pool_for_cpu() short-
>> > circuits when sv_nrpools <= 1, no CPU affinity is set, and memory
>> > is allocated from the single node.
>> >
>> > The percpu mode (one pool per CPU) created excessive pools relative to
>> > the number of threads most deployments run, and was only auto-selected
>> > in a narrow case (single node, >2 CPUs).
>> >
>> > Remove the SVC_POOL_* enum, mode selection heuristic,
>> > svc_pool_map_init_percpu(), and all mode-based switch statements.
>> > Simplify pool map functions to always use the pernode path.
>> >
>> > The module parameter and netlink interfaces are preserved for backward
>> > compatibility:
>> > - Writing "pernode" succeeds; any other value returns -EINVAL
>> > - Reading always returns "pernode"
>> > - Writing to the module parameter emits a deprecation notice
>> >
>> > Assisted-by: Claude:claude-opus-4-6
>> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
>> > ---
>> > In hindsight, I wish I had proposed this before adding the pool-mode
>> > settings to the new nfsd netlink interfaces.
>> >
>> > If this is too radical as a single step, we just could switch the
>> > default to "pernode", add a warning and leave the interfaces alone for
>> > now. Or maybe do that and go ahead and remove the percpu setting?
>> >
>> > Thoughts?
>>
>> Generally, I think the end goal of making "per-node" the default
>> and only setting is correct. My comments below are about how we
>> get there.
>>
>> The main concern is not perturbing any configuration that today
>> happens to set the module parameter.
>>
>> The proposed patch correctly preserves the shape of the user/kernel
>> interfaces (same module parameter name and perms, same netlink command
>> and attribute, same exported symbol names and signatures). The concern
>> is that the accepted input set has narrowed from four strings to one
>> and the setters reject the legacy three with -EINVAL. For the module
>> parameter that path runs at module load, so existing modprobe.d configs
>> written any time in the last ~19 years will cause load-time parameter
>> errors. Some might categorize that as a regression.
>>
>> The commit message documents that non-"pernode" writes now return
>> -EINVAL. The historically correct approach for this kind of obsoleted
>> tuning knob is to accept-and-ignore the old values (plus the pr_notice)
>> rather than hard-fail.
>>
>> Or, put another way, the proposed patch implements something slightly
>> different than true backwards compatibility. Userspace that previously
>> set pool_mode=global/percpu/auto now gets -EINVAL, which technically
>> speaking is a behavioral narrowing, not "backward compatibility."
>>
>> I might go even further and suggest that perhaps for v7.2:
>>
>> - Change the behavior of "auto" to be per-node
>
> That's already the case if you're on a NUMA box. The problem is that
> the default is not "auto" but "global". We could change that (easily),
> but I'd argue that "auto" mode just devolves into "pernode" for two
> reasons:
>
> - "pernode" is functionally identical to "global" when there is only a
> single NUMA node
>
> - "percpu" mode is basically useless
>
> If we want to "go slow" on this, then what I'd probably suggest is that
> we just change the default to "pernode" initially.
>
> Single node boxes that have this set to global today should see no real
> change (functionally identical). Boxes with multiple NUMA nodes that
> don't set it now, will start using pernode mode, but that's probably a
> good thing in most cases.
OK, there seems to be a difference between "auto" and "default". That's
a little whacky.
>> - Add a deprecation warning that is emitted when setting other modes,
>> but don't warn when the value is specifically the only accepted one.
>>
>
> We probably do still want to warn in that case. Once we remove the
> option, the module load can fail, so we probably want to discourage
> people from keeping the setting.
I'm saying that setting a deprecated pool mode should not cause
module loading to fail at all. Failing to load will be more painful
than loading and having different thread affinity behavior.
--
Chuck Lever
prev parent reply other threads:[~2026-04-24 14:03 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-23 14:39 [PATCH RFC] sunrpc: hardcode pool_mode to pernode, remove other modes Jeff Layton
2026-04-23 16:45 ` Chuck Lever
2026-04-24 0:26 ` Jeff Layton
2026-04-24 14:02 ` Chuck Lever [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4b4fe625-b67b-46fc-8172-a9d0cf26c2c4@app.fastmail.com \
--to=cel@kernel.org \
--cc=Dai.Ngo@oracle.com \
--cc=anna@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=jlayton@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neil@brown.name \
--cc=okorniev@redhat.com \
--cc=tom@talpey.com \
--cc=trondmy@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox