Linux NFS development
 help / color / mirror / Atom feed
From: "Chuck Lever" <cel@kernel.org>
To: "Jeff Layton" <jlayton@kernel.org>,
	"Trond Myklebust" <trondmy@kernel.org>,
	"Anna Schumaker" <anna@kernel.org>,
	"Chuck Lever" <chuck.lever@oracle.com>,
	NeilBrown <neil@brown.name>,
	"Olga Kornievskaia" <okorniev@redhat.com>,
	"Dai Ngo" <Dai.Ngo@oracle.com>, "Tom Talpey" <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] sunrpc: hardcode pool_mode to pernode, remove other modes
Date: Fri, 24 Apr 2026 10:02:46 -0400	[thread overview]
Message-ID: <4b4fe625-b67b-46fc-8172-a9d0cf26c2c4@app.fastmail.com> (raw)
In-Reply-To: <55bfc1442ee1a80723520b714fc5f358d11c4c38.camel@kernel.org>



On Thu, Apr 23, 2026, at 8:26 PM, Jeff Layton wrote:
> On Thu, 2026-04-23 at 12:45 -0400, Chuck Lever wrote:
>> On Thu, Apr 23, 2026, at 10:39 AM, Jeff Layton wrote:
>> > The SVC_POOL_AUTO/GLOBAL/PERCPU/PERNODE pool mode selection machinery
>> > was added when NUMA was new and the right default was unclear.  Today,
>> > pernode is the right choice everywhere:
>> > 
>> > - On multi-NUMA hosts, it gives one pool per node with proper thread
>> >   affinity and NUMA-local memory allocation.
>> > - On single-node hosts, pernode degenerates to exactly one pool,
>> >   identical to the old "global" mode -- svc_pool_for_cpu() short-
>> >   circuits when sv_nrpools <= 1, no CPU affinity is set, and memory
>> >   is allocated from the single node.
>> > 
>> > The percpu mode (one pool per CPU) created excessive pools relative to
>> > the number of threads most deployments run, and was only auto-selected
>> > in a narrow case (single node, >2 CPUs).
>> > 
>> > Remove the SVC_POOL_* enum, mode selection heuristic,
>> > svc_pool_map_init_percpu(), and all mode-based switch statements.
>> > Simplify pool map functions to always use the pernode path.
>> > 
>> > The module parameter and netlink interfaces are preserved for backward
>> > compatibility:
>> > - Writing "pernode" succeeds; any other value returns -EINVAL
>> > - Reading always returns "pernode"
>> > - Writing to the module parameter emits a deprecation notice
>> > 
>> > Assisted-by: Claude:claude-opus-4-6
>> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
>> > ---
>> > In hindsight, I wish I had proposed this before adding the pool-mode
>> > settings to the new nfsd netlink interfaces.
>> > 
>> > If this is too radical as a single step, we just could switch the
>> > default to "pernode", add a warning and leave the interfaces alone for
>> > now. Or maybe do that and go ahead and remove the percpu setting?
>> > 
>> > Thoughts?
>> 
>> Generally, I think the end goal of making "per-node" the default
>> and only setting is correct. My comments below are about how we
>> get there.
>> 
>> The main concern is not perturbing any configuration that today
>> happens to set the module parameter.
>> 
>> The proposed patch correctly preserves the shape of the user/kernel
>> interfaces (same module parameter name and perms, same netlink command
>> and attribute, same exported symbol names and signatures). The concern
>> is that the accepted input set has narrowed from four strings to one
>> and the setters reject the legacy three with -EINVAL. For the module
>> parameter that path runs at module load, so existing modprobe.d configs
>> written any time in the last ~19 years will cause load-time parameter
>> errors. Some might categorize that as a regression.
>>                                       
>> The commit message documents that non-"pernode" writes now return
>> -EINVAL. The historically correct approach for this kind of obsoleted
>> tuning knob is to accept-and-ignore the old values (plus the pr_notice)
>> rather than hard-fail.
>> 
>> Or, put another way, the proposed patch implements something slightly
>> different than true backwards compatibility. Userspace that previously
>> set pool_mode=global/percpu/auto now gets -EINVAL, which technically
>> speaking is a behavioral narrowing, not "backward compatibility."
>> 
>> I might go even further and suggest that perhaps for v7.2:
>> 
>> - Change the behavior of "auto" to be per-node
>
> That's already the case if you're on a NUMA box. The problem is that
> the default is not "auto" but "global". We could change that (easily),
> but I'd argue that "auto" mode just devolves into "pernode" for two
> reasons:
>
> - "pernode" is functionally identical to "global" when there is only a
> single NUMA node
>
> - "percpu" mode is basically useless
>
> If we want to "go slow" on this, then what I'd probably suggest is that
> we just change the default to "pernode" initially.
>
> Single node boxes that have this set to global today should see no real
> change (functionally identical). Boxes with multiple NUMA nodes that
> don't set it now, will start using pernode mode, but that's probably a
> good thing in most cases.

OK, there seems to be a difference between "auto" and "default". That's
a little whacky.


>> - Add a deprecation warning that is emitted when setting other modes,
>>   but don't warn when the value is specifically the only accepted one.
>> 
>
> We probably do still want to warn in that case. Once we remove the
> option, the module load can fail, so we probably want to discourage
> people from keeping the setting.

I'm saying that setting a deprecated pool mode should not cause
module loading to fail at all. Failing to load will be more painful
than loading and having different thread affinity behavior.


-- 
Chuck Lever

      reply	other threads:[~2026-04-24 14:03 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-23 14:39 [PATCH RFC] sunrpc: hardcode pool_mode to pernode, remove other modes Jeff Layton
2026-04-23 16:45 ` Chuck Lever
2026-04-24  0:26   ` Jeff Layton
2026-04-24 14:02     ` Chuck Lever [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4b4fe625-b67b-46fc-8172-a9d0cf26c2c4@app.fastmail.com \
    --to=cel@kernel.org \
    --cc=Dai.Ngo@oracle.com \
    --cc=anna@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=jlayton@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neil@brown.name \
    --cc=okorniev@redhat.com \
    --cc=tom@talpey.com \
    --cc=trondmy@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox