Linux NFS development
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever III <chuck.lever@oracle.com>,
	Josef Bacik <josef@toxicpanda.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	"kernel-team@fb.com" <kernel-team@fb.com>
Subject: Re: [PATCH v2 10/13] nfsd: move th_cnt into nfsd_net
Date: Fri, 26 Jan 2024 09:08:28 -0500	[thread overview]
Message-ID: <3b20db588e31be6f39415aa18b5ffb13214c759d.camel@kernel.org> (raw)
In-Reply-To: <1E87E98E-EA7B-4BBD-A9FA-EF4B217141E0@oracle.com>

On Fri, 2024-01-26 at 13:48 +0000, Chuck Lever III wrote:
> 
> > On Jan 26, 2024, at 8:01 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > On Thu, 2024-01-25 at 16:56 -0500, Josef Bacik wrote:
> > > On Thu, Jan 25, 2024 at 04:01:27PM -0500, Chuck Lever wrote:
> > > > On Thu, Jan 25, 2024 at 02:53:20PM -0500, Josef Bacik wrote:
> > > > > This is the last global stat, move it into nfsd_net and adjust all the
> > > > > users to use that variant instead of the global one.
> > > > 
> > > > Hm. I thought nfsd threads were a global resource -- they service
> > > > all network namespaces. So, shouldn't the same thread count be
> > > > surfaced to all containers? Won't they all see all of the nfsd
> > > > processes?
> > 
> > Each container is going to start /proc/fs/nfsd/threads number of threads
> > regardless. I hadn't actually grokked that they just get tossed onto the
> > pile of threads that service requests.
> > 
> > Is is possible for one container to start a small number of threads but
> > have its client load be such that it spills over and ends up stealing
> > threads from other containers?
> 
> I haven't seen any code that manages resources based on namespace,
> except in filecache.c to restrict writeback per namespace.
> 
> My impression is that any nfsd thread can serve any namespace. I'm
> not sure it is currently meaningful for a particular net namespace to
> "create" more threads.
> 
> If someone would like that level of control, we could implement a
> cgroup mechanism and have one or more separate svc_pools per net
> namespace, maybe? </hand wave>
> 

AFAICT, the total number of threads on the system will be the sum of the
threads started in each of the containers. They do just go into a big
pile, and whichever one wakes up will service the request, so the
threads aren't associated with the netns, per-se. The svc_rqst's however
_are_ per-netns. So, I don't see anything that ensures that a container
doesn't exceed the number of threads it started on its own behalf.

<hand wave>
I'm not sure we'd need to tie this in to cgroups. Now that Josef is
moving some of these key structures to be per-net, it should be fairly
simple to have nfsd() just look at the th_cnt and the thread count in
the current namespace, and just enqueue the RPC rather than doing it?
</hand wave>

OTOH, maybe I'm overly concerned here.


> 
> > > I don't think we want the network namespaces seeing how many threads exist in
> > > the entire system right?
> 
> If someone in a non-init net namespace does a "pgrep -c nfsd" don't
> they see the total nfsd thread count for the host?
> 

Yes, they're kernel threads and they aren't associated with a particular
pid namespace.

> 
> > > Additionally it appears that we can have multiple threads per network namespace,
> > > so it's not like this will just show 1 for each individual nn, it'll show
> > > however many threads have been configured for that nfsd in that network
> > > namespace.
> 
> I've never tried this, so I'm speculating. But it seems like for
> now, because all nfsd threads can serve all namespaces, they should
> all see the global thread count stat.
> 
> Then later we can refine it.
> 

I don't think that info is particularly useful though, and it certainly
breaks expectations wrt container isolation.

Look at it this way:

Suppose I have access to a container and I spin up nfsd with a
particular number of threads. I now want to know "did I spin up enough
threads?" By making this per-namespace as Josef suggests it should be
fairly simple to tell whether my clients are regularly overrunning the
threads I started. With this info as global, I have no idea what netns
the RPCs being counted are against. I can't do anything with that info.

> 
> > > I'm good either way, but it makes sense to me to only surface the network
> > > namespace related thread count.  I could probably have a global counter and only
> > > surface the global counter if net == &init_net.  Let me know what you prefer.
> 

-- 
Jeff Layton <jlayton@kernel.org>

  reply	other threads:[~2024-01-26 14:08 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-25 19:53 [PATCH v2 00/13] Make nfs and nfsd stats visible in network ns Josef Bacik
2024-01-25 19:53 ` [PATCH v2 01/13] sunrpc: don't change ->sv_stats if it doesn't exist Josef Bacik
2024-01-25 19:53 ` [PATCH v2 02/13] nfs: stop setting ->pg_stats for unused stats Josef Bacik
2024-01-25 19:53 ` [PATCH v2 03/13] sunrpc: pass in the sv_stats struct through svc_create* Josef Bacik
2024-01-25 20:56   ` Chuck Lever
2024-01-25 21:56     ` Josef Bacik
2024-01-25 19:53 ` [PATCH v2 04/13] sunrpc: remove ->pg_stats from svc_program Josef Bacik
2024-01-25 19:53 ` [PATCH v2 05/13] sunrpc: add a struct rpc_stats arg to rpc_create_args Josef Bacik
2024-01-25 20:53   ` Chuck Lever
2024-01-25 21:54     ` Josef Bacik
2024-01-25 22:30       ` Jeff Layton
2024-01-26 13:49       ` Chuck Lever III
2024-01-25 19:53 ` [PATCH v2 06/13] sunrpc: use the struct net as the svc proc private Josef Bacik
2024-01-25 19:53 ` [PATCH v2 07/13] nfsd: rename NFSD_NET_* to NFSD_STATS_* Josef Bacik
2024-01-25 19:53 ` [PATCH v2 08/13] nfsd: expose /proc/net/sunrpc/nfsd in net namespaces Josef Bacik
2024-01-25 19:53 ` [PATCH v2 09/13] nfsd: make all of the nfsd stats per-network namespace Josef Bacik
2024-01-25 19:53 ` [PATCH v2 10/13] nfsd: move th_cnt into nfsd_net Josef Bacik
2024-01-25 21:01   ` Chuck Lever
2024-01-25 21:56     ` Josef Bacik
2024-01-26 13:01       ` Jeff Layton
2024-01-26 13:48         ` Chuck Lever III
2024-01-26 14:08           ` Jeff Layton [this message]
2024-01-26 14:27             ` Chuck Lever III
2024-01-26 15:03               ` Jeff Layton
2024-01-26 15:16                 ` Chuck Lever III
2024-01-26 15:35                   ` Jeff Layton
2024-01-25 19:53 ` [PATCH v2 11/13] nfsd: make svc_stat per-network namespace instead of global Josef Bacik
2024-01-25 19:53 ` [PATCH v2 12/13] nfs: expose /proc/net/sunrpc/nfs in net namespaces Josef Bacik
2024-01-25 19:53 ` [PATCH v2 13/13] nfs: make the rpc_stat per net namespace Josef Bacik
2024-01-26 13:12 ` [PATCH v2 00/13] Make nfs and nfsd stats visible in network ns Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3b20db588e31be6f39415aa18b5ffb13214c759d.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox