Linux NFS development
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@kernel.org>
To: Jeff Layton <jlayton@kernel.org>, NeilBrown <neilb@suse.de>
Cc: linux-nfs@vger.kernel.org, Chuck Lever <chuck.lever@oracle.com>,
	Trond Myklebust <trondmy@hammerspace.com>,
	snitzer@hammerspace.com
Subject: Re: [for-6.11 PATCH 10/29] nfs/nfsd: add "local io" support
Date: Tue, 11 Jun 2024 22:25:58 -0400	[thread overview]
Message-ID: <ZmkHNr5jtGHDpko_@kernel.org> (raw)
In-Reply-To: <ZmctDkCCg331JS4M@kernel.org>

On Mon, Jun 10, 2024 at 12:42:54PM -0400, Mike Snitzer wrote:
> On Mon, Jun 10, 2024 at 08:43:34AM -0400, Jeff Layton wrote:
> > On Fri, 2024-06-07 at 10:26 -0400, Mike Snitzer wrote:
> > > From: Weston Andros Adamson <dros@primarydata.com>
> > > 
> > > Add client support for bypassing NFS for localhost reads, writes, and commits.
> > > 
> > > This is only useful when the client and the server are running on the same
> > > host and in the same container.
> > > 
> > > This has dynamic binding with the nfsd module. Local i/o will only work if
> > > nfsd is already loaded.
> > > 
> > > [snitm: rebase accounted for commit d8b26071e65e8 ("NFSD: simplify struct nfsfh")
> > >  and commit 7c98f7cb8fda ("remove call_{read,write}_iter() functions")]
> > > 
> > > Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
> > > Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> > > Signed-off-by: Peng Tao <tao.peng@primarydata.com>
> > > Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
> > > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
...
> > > diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
> > > new file mode 100644
> > > index 000000000000..ff68454a4017
> > > --- /dev/null
> > > +++ b/fs/nfsd/localio.c
> > > @@ -0,0 +1,179 @@
> > > +/*
> > > + * NFS server support for local clients to bypass network stack
> > > + *
> > > + * Copyright (C) 2014 Weston Andros Adamson <dros@primarydata.com>
> > > + */
> > > +
> > > +#include <linux/exportfs.h>
> > > +#include <linux/sunrpc/svcauth_gss.h>
> > > +#include <linux/sunrpc/clnt.h>
> > > +#include <linux/nfs.h>
> > > +#include <linux/string.h>
> > > +
> > > +#include "nfsd.h"
> > > +#include "vfs.h"
> > > +#include "netns.h"
> > > +#include "filecache.h"
> > > +
> > > +#define NFSDDBG_FACILITY		NFSDDBG_FH
> > > +
> > > +static void
> > > +nfsd_local_fakerqst_destroy(struct svc_rqst *rqstp)
> > > +{
> > > +	if (rqstp->rq_client)
> > > +		auth_domain_put(rqstp->rq_client);
> > > +	if (rqstp->rq_cred.cr_group_info)
> > > +		put_group_info(rqstp->rq_cred.cr_group_info);
> > > +	kfree(rqstp->rq_cred.cr_principal);
> > > +	kfree(rqstp->rq_xprt);
> > > +	kfree(rqstp);
> > > +}
> > > +
> > > +static struct svc_rqst *
> > > +nfsd_local_fakerqst_create(struct rpc_clnt *rpc_clnt, const struct cred *cred)
> > > +{
> > > +	struct svc_rqst *rqstp;
> > > +	struct net *net = rpc_net_ns(rpc_clnt);
> > > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > > +	int status;
> > > +
> > > +	if (!nn->nfsd_serv) {
> > > +		dprintk("%s: localio denied. Server not running\n", __func__);
> > > +		return ERR_PTR(-ENXIO);
> > > +	}
> > > +
> > 
> > Note that the above check is racy. The nfsd_serv can go away at any
> > time since you're not holding the (global) nfsd_mutex (I assume?).
> 
> Yes, worst case we should fallback to going over the network.

Actual worst case is we could crash... ;)

> > > +	rqstp = kzalloc(sizeof(*rqstp), GFP_KERNEL);
> > > +	if (!rqstp)
> > > +		return ERR_PTR(-ENOMEM);
> > > +
> > > +	rqstp->rq_xprt = kzalloc(sizeof(*rqstp->rq_xprt), GFP_KERNEL);
> > > +	if (!rqstp->rq_xprt) {
> > > +		status = -ENOMEM;
> > > +		goto out_err;
> > > +	}
> > > +
> > > +	rqstp->rq_xprt->xpt_net = net;
> > > +	__set_bit(RQ_SECURE, &rqstp->rq_flags);
> > > +	rqstp->rq_proc = 1;
> > > +	rqstp->rq_vers = 3;
> > > +	rqstp->rq_prot = IPPROTO_TCP;
> > > +	rqstp->rq_server = nn->nfsd_serv;
> > > +
> > 
> > I suspect you need to carry a reference of some sort so that the
> > nfsd_serv doesn't go away out from under you while this is running,
> > since this is not operating in nfsd thread context.
> > 
> > Typically, every nfsd thread holds a reference to the serv (in serv-
> > >sv_nrthreads), so that when you shut down all of the threads, it goes
> > away. The catch is that that refcount is currently under the protection
> > of the global nfsd_mutex and I doubt you want to take that in this
> > codepath.
> 
> OK, I can look closer at the implications.

SO I looked, and I'm saddened to see Neil's 6.8 commit 1e3577a4521e
("SUNRPC: discard sv_refcnt, and svc_get/svc_put").

[the lack of useful refcounting with the current code kind of blew me
away.. but nice to see it existed not too long ago.]

Rather than immediately invest the effort to revert commit
1e3577a4521e for my apparent needs... I'll send out v2 to allow for
further review and discussion.

But it really does feel like I _need_ svc_{get,put} and nfsd_{get,put}

Mike

  reply	other threads:[~2024-06-12  2:26 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-07 14:26 [for-6.11 PATCH 00/29] nfs/nfsd: add support for localio bypass Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 01/29] nfs: pass nfs_client to nfs_initiate_pgio Mike Snitzer
2024-06-10 12:02   ` Jeff Layton
2024-06-07 14:26 ` [for-6.11 PATCH 02/29] nfs: pass nfs_client to nfs_initiate_commit Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 03/29] nfs: pass descriptor thru nfs_initiate_pgio path Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 04/29] sunrpc: handle NULL req->defer in cache_defer_req Mike Snitzer
2024-06-10 12:21   ` Jeff Layton
2024-06-11  1:03     ` NeilBrown
2024-06-11  2:57       ` Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 05/29] sunrpc: export svc_defer Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 06/29] sunrpc: add rpcauth_map_to_svc_cred Mike Snitzer
2024-06-10 12:19   ` Jeff Layton
2024-06-07 14:26 ` [for-6.11 PATCH 07/29] sunrpc: add and export rpc_ntop6_addr_noscopeid Mike Snitzer
2024-06-09 12:36   ` Jeff Layton
2024-06-10 16:33     ` Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 08/29] nfs: move nfs_stat_to_errno to nfs.h Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 09/29] NFS: Manage boot verifier correctly in the case of localio Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 10/29] nfs/nfsd: add "local io" support Mike Snitzer
2024-06-10 12:43   ` Jeff Layton
2024-06-10 16:42     ` Mike Snitzer
2024-06-12  2:25       ` Mike Snitzer [this message]
2024-06-12  3:17         ` NeilBrown
2024-06-12  3:41           ` Mike Snitzer
2024-06-12  4:09             ` NeilBrown
2024-06-12  4:48               ` Mike Snitzer
2024-06-12  6:30                 ` NeilBrown
2024-06-07 14:26 ` [for-6.11 PATCH 11/29] NFS: Enable localio for non-pNFS I/O Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 12/29] nfs/flexfiles: check local DS when making DS connections Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 13/29] pnfs/flexfiles: Enable localio for flexfiles I/O Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 14/29] NFS: Add tracepoints for nfs_local_enable and nfs_local_disable Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 15/29] NFS: Don't call filesystem write() routines directly Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 16/29] NFS: Don't call filesystem read() " Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 17/29] NFS: Use completion rather than flush_work() in nfs_local_commit() Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 18/29] NFS: localio writes need to use a normal workqueue Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 19/29] nfs/write: fix nfs_initiate_commit to return error from nfs_local_commit Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 20/29] nfs/localio: discontinue network address based localio setup Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 21/29] nfs_common: add NFS v3 LOCALIO protocol extension enablement Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 22/29] nfs: implement v3 client support for NFS_LOCALIO_PROGRAM Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 23/29] nfsd: implement v3 server " Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 24/29] nfs_common: add NFS v4 LOCALIO protocol extension enablement Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 25/29] nfs: implement v4 client support for NFS_LOCALIO_PROGRAM Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 26/29] nfsd: implement v4 server " Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 27/29] nfs/nfsd: switch GETUUID to using {encode,decode}_opaque_fixed Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 28/29] nfs/nfsd: consolidate {encode,decode}_opaque_fixed in nfs_xdr.h Mike Snitzer
2024-06-07 14:26 ` [for-6.11 PATCH 29/29] nfs/localio: move managing nfsd_open_local_fh symbol to nfs_common Mike Snitzer
2024-06-07 18:06 ` [for-6.11 PATCH 30/29] nfs/nfsd: ensure localio server always uses its network namespace Mike Snitzer
2024-06-09 15:44   ` Chuck Lever
2024-06-10 16:50     ` Mike Snitzer
2024-06-10 22:37       ` Mike Snitzer
2024-06-07 18:09 ` [for-6.11 PATCH 00/29] nfs/nfsd: add support for localio bypass Mike Snitzer
2024-06-10 12:47 ` Jeff Layton
2024-06-10 16:47   ` Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZmkHNr5jtGHDpko_@kernel.org \
    --to=snitzer@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=jlayton@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=snitzer@hammerspace.com \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox