Linux NFS development
From: Chuck Lever III <chuck.lever@oracle.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	"kernel-team@fb.com" <kernel-team@fb.com>
Subject: Re: [PATCH v2 10/13] nfsd: move th_cnt into nfsd_net
Date: Fri, 26 Jan 2024 14:27:31 +0000
Message-ID: <52460FD0-34B8-4D07-8063-EA4BA5CB3D0B@oracle.com>
In-Reply-To: <3b20db588e31be6f39415aa18b5ffb13214c759d.camel@kernel.org>



> On Jan 26, 2024, at 9:08 AM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> On Fri, 2024-01-26 at 13:48 +0000, Chuck Lever III wrote:
>> 
>>> On Jan 26, 2024, at 8:01 AM, Jeff Layton <jlayton@kernel.org> wrote:
>>> 
>>> On Thu, 2024-01-25 at 16:56 -0500, Josef Bacik wrote:
>>>> On Thu, Jan 25, 2024 at 04:01:27PM -0500, Chuck Lever wrote:
>>>>> On Thu, Jan 25, 2024 at 02:53:20PM -0500, Josef Bacik wrote:
>>>>>> This is the last global stat, move it into nfsd_net and adjust all the
>>>>>> users to use that variant instead of the global one.
>>>>> 
>>>>> Hm. I thought nfsd threads were a global resource -- they service
>>>>> all network namespaces. So, shouldn't the same thread count be
>>>>> surfaced to all containers? Won't they all see all of the nfsd
>>>>> processes?
>>> 
>>> Each container is going to start /proc/fs/nfsd/threads number of threads
>>> regardless. I hadn't actually grokked that they just get tossed onto the
>>> pile of threads that service requests.
>>> 
>>> Is it possible for one container to start a small number of threads but
>>> have its client load be such that it spills over and ends up stealing
>>> threads from other containers?
>> 
>> I haven't seen any code that manages resources based on namespace,
>> except in filecache.c to restrict writeback per namespace.
>> 
>> My impression is that any nfsd thread can serve any namespace. I'm
>> not sure it is currently meaningful for a particular net namespace to
>> "create" more threads.
>> 
>> If someone would like that level of control, we could implement a
>> cgroup mechanism and have one or more separate svc_pools per net
>> namespace, maybe? </hand wave>
>> 
> 
> AFAICT, the total number of threads on the system will be the sum of the
> threads started in each of the containers. They do just go into a big
> pile, and whichever one wakes up will service the request, so the
> threads aren't associated with the netns per se. The svc_rqsts, however,
> _are_ per-netns. So, I don't see anything that ensures that a container
> doesn't exceed the number of threads it started on its own behalf.
> 
> <hand wave>
> I'm not sure we'd need to tie this in to cgroups.

Not a need, but cgroups are typically the way that such
resource constraints are managed, that's all.
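
To make my hand-wave above a little more concrete, here is roughly
what I was picturing: each nfsd_net gets its own svc_pool, so a
cgroup (or any other policy) would have something to hang a limit
on. Both fields below are hypothetical; nothing like this exists
today, and all the threads currently sit in one shared pile:

    struct nfsd_net {
            /* ... existing fields ... */

            /* hypothetical: a pool of threads dedicated to this netns */
            struct svc_pool         *nfsd_pool;

            /* hypothetical: the thread count this netns asked for */
            unsigned int            nfsd_nrthreads;
    };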


> Now that Josef is
> moving some of these key structures to be per-net, it should be fairly
> simple to have nfsd() just look at the th_cnt and the thread count in
> the current namespace, and just enqueue the RPC rather than servicing it
> immediately?
> </hand wave>
> 
> OTOH, maybe I'm overly concerned here.
> 
> 
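
If I follow the hand-wave, the admission check would be something
along these lines. Completely untested, and the names are invented
for illustration; nfsd_th_cnt and nfsd_nrthreads stand in for
whatever the per-net counters end up being called:

    /*
     * Hypothetical: true when this namespace is already using as
     * many threads as it configured for itself.
     */
    static bool nfsd_netns_over_limit(struct nfsd_net *nn)
    {
            return atomic_read(&nn->nfsd_th_cnt) >=
                   READ_ONCE(nn->nfsd_nrthreads);
    }

    /* in the nfsd() loop, before servicing a request: */
    if (nfsd_netns_over_limit(nn)) {
            /* hand-wave: requeue the rqst for this netns instead
             * of servicing it now; mechanism TBD */
    }
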
>> 
>>>> I don't think we want the network namespaces seeing how many threads exist in
>>>> the entire system, right?
>> 
>> If someone in a non-init net namespace does a "pgrep -c nfsd", don't
>> they see the total nfsd thread count for the host?
>> 
> 
> Yes, they're kernel threads and they aren't associated with a particular
> pid namespace.
> 
>> 
>>>> Additionally it appears that we can have multiple threads per network namespace,
>>>> so it's not like this will just show 1 for each individual nn, it'll show
>>>> however many threads have been configured for that nfsd in that network
>>>> namespace.
>> 
>> I've never tried this, so I'm speculating. But it seems like for
>> now, because all nfsd threads can serve all namespaces, they should
>> all see the global thread count stat.
>> 
>> Then later we can refine it.
>> 
> 
> I don't think that info is particularly useful though, and it certainly
> breaks expectations wrt container isolation.
> 
> Look at it this way:
> 
> Suppose I have access to a container and I spin up nfsd with a
> particular number of threads. I now want to know "did I spin up enough
> threads?"

It makes sense to me for a container to ask for one or more
NFSD listeners. But I'm not yet clear on what it means for
a container to try to alter the NFSD thread count, which is
currently global.
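
To restate what Jeff observed up-thread: writing N to a container's
/proc/fs/nfsd/threads today starts N threads that simply join the
shared pile. If that write is ever to mean something per-container,
one option is to reinterpret it as a cap rather than a count of
actual threads. Purely illustrative, names invented:

    /*
     * Hypothetical: a container's write to "threads" records a
     * per-netns ceiling instead of growing the global pile.
     */
    static void nfsd_set_netns_limit(struct nfsd_net *nn, unsigned int n)
    {
            WRITE_ONCE(nn->nfsd_nrthreads, n);
    }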


> By making this per-namespace, as Josef suggests, it should be
> fairly simple to tell whether my clients are regularly overrunning the
> threads I started. With this info as global, I have no idea what netns
> the RPCs being counted are against. I can't do anything with that info.

That's true. I'm just not sure we need to do anything
significant about it as part of this patch series.

I'm not at all advocating leaving it like this. I agree it
needs updating.

For now, just fill in that thread count stat with the global
thread count, and later when we have a better concept for
what it means to "adjust the nfsd thread count from inside a
container" we can come back and make it make sense.
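
Concretely, I mean something like this sketch in the per-net stats
show path, assuming the global counter keeps a name like th_cnt
(the "th" format string here is paraphrased from memory, not exact):

    /*
     * Interim: every namespace reports the single global thread
     * count until per-netns thread accounting means something.
     */
    seq_printf(seq, "th %u 0", atomic_read(&nfsd_th_cnt));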


--
Chuck Lever




Thread overview: 30+ messages
2024-01-25 19:53 [PATCH v2 00/13] Make nfs and nfsd stats visible in network ns Josef Bacik
2024-01-25 19:53 ` [PATCH v2 01/13] sunrpc: don't change ->sv_stats if it doesn't exist Josef Bacik
2024-01-25 19:53 ` [PATCH v2 02/13] nfs: stop setting ->pg_stats for unused stats Josef Bacik
2024-01-25 19:53 ` [PATCH v2 03/13] sunrpc: pass in the sv_stats struct through svc_create* Josef Bacik
2024-01-25 20:56   ` Chuck Lever
2024-01-25 21:56     ` Josef Bacik
2024-01-25 19:53 ` [PATCH v2 04/13] sunrpc: remove ->pg_stats from svc_program Josef Bacik
2024-01-25 19:53 ` [PATCH v2 05/13] sunrpc: add a struct rpc_stats arg to rpc_create_args Josef Bacik
2024-01-25 20:53   ` Chuck Lever
2024-01-25 21:54     ` Josef Bacik
2024-01-25 22:30       ` Jeff Layton
2024-01-26 13:49       ` Chuck Lever III
2024-01-25 19:53 ` [PATCH v2 06/13] sunrpc: use the struct net as the svc proc private Josef Bacik
2024-01-25 19:53 ` [PATCH v2 07/13] nfsd: rename NFSD_NET_* to NFSD_STATS_* Josef Bacik
2024-01-25 19:53 ` [PATCH v2 08/13] nfsd: expose /proc/net/sunrpc/nfsd in net namespaces Josef Bacik
2024-01-25 19:53 ` [PATCH v2 09/13] nfsd: make all of the nfsd stats per-network namespace Josef Bacik
2024-01-25 19:53 ` [PATCH v2 10/13] nfsd: move th_cnt into nfsd_net Josef Bacik
2024-01-25 21:01   ` Chuck Lever
2024-01-25 21:56     ` Josef Bacik
2024-01-26 13:01       ` Jeff Layton
2024-01-26 13:48         ` Chuck Lever III
2024-01-26 14:08           ` Jeff Layton
2024-01-26 14:27             ` Chuck Lever III [this message]
2024-01-26 15:03               ` Jeff Layton
2024-01-26 15:16                 ` Chuck Lever III
2024-01-26 15:35                   ` Jeff Layton
2024-01-25 19:53 ` [PATCH v2 11/13] nfsd: make svc_stat per-network namespace instead of global Josef Bacik
2024-01-25 19:53 ` [PATCH v2 12/13] nfs: expose /proc/net/sunrpc/nfs in net namespaces Josef Bacik
2024-01-25 19:53 ` [PATCH v2 13/13] nfs: make the rpc_stat per net namespace Josef Bacik
2024-01-26 13:12 ` [PATCH v2 00/13] Make nfs and nfsd stats visible in network ns Jeff Layton
