From: Mike Snitzer <snitzer@kernel.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: linux-nfs@vger.kernel.org, Chuck Lever <chuck.lever@oracle.com>,
Anna Schumaker <anna@kernel.org>,
Trond Myklebust <trondmy@hammerspace.com>,
NeilBrown <neilb@suse.de>,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v14 16/25] nfsd: add localio support
Date: Thu, 29 Aug 2024 12:59:20 -0400
Message-ID: <ZtCo6NrSQ6KR-MZf@kernel.org>
In-Reply-To: <30842ebcf33e97f2f9af8eb57b2eeaec05e7dea6.camel@kernel.org>
On Thu, Aug 29, 2024 at 12:49:23PM -0400, Jeff Layton wrote:
> On Wed, 2024-08-28 at 21:04 -0400, Mike Snitzer wrote:
> > From: Weston Andros Adamson <dros@primarydata.com>
> >
> > Add server support for bypassing NFS for localhost reads, writes, and
> > commits. This is only useful when both the client and server are
> > running on the same host.
> >
> > If nfsd_open_local_fh() fails then the NFS client will both retry and
> > fall back to normal network-based read, write and commit operations if
> > localio is no longer supported.
> >
> > Care is taken to ensure the same NFS security mechanisms are used
> > (authentication, etc) regardless of whether localio or regular NFS
> > access is used. The auth_domain established as part of the traditional
> > NFS client access to the NFS server is also used for localio. Store
> > auth_domain for localio in nfsd_uuid_t and transfer it to the client
> > if it is local to the server.
> >
> > Relative to containers, localio gives the client access to the network
> > namespace the server has. This is required to allow the client to
> > access the server's per-namespace nfsd_net struct.
> >
> > CONFIG_NFSD_LOCALIO controls the server enablement for localio.
> > A later commit will add CONFIG_NFS_LOCALIO to allow the client
> > enablement.
>
> Do we need separate CONFIG options? Surely if you have one, you'll
> always want the other.
We used to have 4 (2 for each)... yeah I hear you. It's fiddly, but I
can look at making it a single one with more feeling. Same as the
nfs_to ops work I just committed to: worst case we keep what we have
with the 2 CONFIG options, but 1 option _should_ be doable.
> > This commit also introduces the use of nfsd's percpu_ref to interlock
> > nfsd_destroy_serv and nfsd_open_local_fh, to ensure nn->nfsd_serv is
> > not destroyed while in use by nfsd_open_local_fh, and warrants a more
> > detailed explanation:
> >
> > nfsd_open_local_fh uses nfsd_serv_try_get before opening its file
> > handle and then the reference must be dropped by the caller using
> > nfsd_serv_put (via nfs_localio_ctx_free).
> >
> > This "interlock" working relies heavily on nfsd_open_local_fh()'s
> > maybe_get_net() safely dealing with the possibility that the struct
> > net (and nfsd_net by association) may have been destroyed by
> > nfsd_destroy_serv() via nfsd_shutdown_net().
This ^ 3rd paragraph is no longer applicable; the use of a proper
long-term ref on the 'nfsd_net', coupled with the use of RCU, makes it so.
> >
> > Verified to fix an easy to hit crash that would occur if an nfsd
> > instance running in a container, with a localio client mounted, is
> > shutdown. Upon restart of the container and associated nfsd the client
> > would go on to crash due to NULL pointer dereference that occurred due
> > to the nfs client's localio attempting to nfsd_open_local_fh(), using
> > nn->nfsd_serv, without having a proper reference on nn->nfsd_serv.
> >
>
> Maybe transplant a version of the above 4 paragraphs to the patch that
> adds the percpu_ref handling?
I think it best to mention this where the use of nfsd_serv_{try_get,put}
meets the road. Hopefully you're cool with the 3 paragraphs staying
in this header? ;)
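
For readers following the try_get/put pairing described in those
paragraphs, here is a minimal userspace sketch of the pattern. It is
not nfsd code: struct serv_gate and the gate_*() and open_local() names
are hypothetical, and a single C11 atomic counter stands in for the
kernel's percpu_ref, which additionally relies on per-CPU counters and
RCU. It only illustrates the contract: new users are refused once
shutdown has begun, and teardown completes when the last user drops
its reference.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the server state guarded by the reference. */
struct serv_gate {
	atomic_long refs;	/* starts at 1: the owner's initial reference */
	atomic_bool dead;	/* set once shutdown has begun */
};

static void gate_init(struct serv_gate *g)
{
	atomic_init(&g->refs, 1);
	atomic_init(&g->dead, false);
}

/* Take a reference unless shutdown has begun or the count reached zero. */
static bool gate_try_get(struct serv_gate *g)
{
	long old = atomic_load(&g->refs);

	while (old > 0 && !atomic_load(&g->dead)) {
		/* On failure, 'old' is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&g->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the last reference went away. */
static bool gate_put(struct serv_gate *g)
{
	long prev = atomic_fetch_sub(&g->refs, 1);

	assert(prev > 0);
	return prev == 1;
}

/* Begin shutdown: refuse new users, then drop the owner's reference. */
static bool gate_kill(struct serv_gate *g)
{
	atomic_store(&g->dead, true);
	return gate_put(g);
}

/*
 * Assumed shape of an "open" path: fail with -ENXIO if the server is
 * going away, otherwise return 0 with a reference held that the caller
 * must later release with gate_put().
 */
static int open_local(struct serv_gate *g)
{
	if (!gate_try_get(g))
		return -ENXIO;
	return 0;
}
```

A caller that gets 0 back from open_local() owns one reference and
must pair it with exactly one gate_put(); a caller that gets -ENXIO
holds nothing and would fall back to the regular network path.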
> > Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
> > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > Co-developed-by: Mike Snitzer <snitzer@kernel.org>
> > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > ---
> > fs/Kconfig | 3 ++
> > fs/nfsd/Kconfig | 16 +++++++
> > fs/nfsd/Makefile | 1 +
> > fs/nfsd/filecache.c | 2 +-
> > fs/nfsd/localio.c | 105 ++++++++++++++++++++++++++++++++++++++++++++
> > fs/nfsd/trace.h | 3 +-
> > fs/nfsd/vfs.h | 7 +++
> > 7 files changed, 135 insertions(+), 2 deletions(-)
> > create mode 100644 fs/nfsd/localio.c
> >
> > diff --git a/fs/Kconfig b/fs/Kconfig
> > index a46b0cbc4d8f..1b8a5edbddff 100644
> > --- a/fs/Kconfig
> > +++ b/fs/Kconfig
> > @@ -377,6 +377,9 @@ config NFS_ACL_SUPPORT
> > tristate
> > select FS_POSIX_ACL
> >
> > +config NFS_COMMON_LOCALIO_SUPPORT
> > + bool
> > +
> > config NFS_COMMON
> > bool
> > depends on NFSD || NFS_FS || LOCKD
> > diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
> > index c0bd1509ccd4..e6fa7eaa1db0 100644
> > --- a/fs/nfsd/Kconfig
> > +++ b/fs/nfsd/Kconfig
> > @@ -90,6 +90,22 @@ config NFSD_V4
> >
> > If unsure, say N.
> >
> > +config NFSD_LOCALIO
> > +	bool "NFS server support for the LOCALIO auxiliary protocol"
> > +	depends on NFSD
> > +	select NFS_COMMON_LOCALIO_SUPPORT
> > +	default n
> > +	help
> > +	  Some NFS servers support an auxiliary NFS LOCALIO protocol
> > +	  that is not an official part of the NFS protocol.
> > +
> > +	  This option enables support for the LOCALIO protocol in the
> > +	  kernel's NFS server. Enable this to permit local NFS clients
> > +	  to bypass the network when issuing reads and writes to the
> > +	  local NFS server.
> > +
> > +	  If unsure, say N.
> > +
> > config NFSD_PNFS
> > bool
> >
> > diff --git a/fs/nfsd/Makefile b/fs/nfsd/Makefile
> > index b8736a82e57c..78b421778a79 100644
> > --- a/fs/nfsd/Makefile
> > +++ b/fs/nfsd/Makefile
> > @@ -23,3 +23,4 @@ nfsd-$(CONFIG_NFSD_PNFS) += nfs4layouts.o
> > nfsd-$(CONFIG_NFSD_BLOCKLAYOUT) += blocklayout.o blocklayoutxdr.o
> > nfsd-$(CONFIG_NFSD_SCSILAYOUT) += blocklayout.o blocklayoutxdr.o
> > nfsd-$(CONFIG_NFSD_FLEXFILELAYOUT) += flexfilelayout.o flexfilelayoutxdr.o
> > +nfsd-$(CONFIG_NFSD_LOCALIO) += localio.o
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index a83d469bca6b..49f4aab3208a 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -53,7 +53,7 @@
> > #define NFSD_FILE_CACHE_UP (0)
> >
> > /* We only care about NFSD_MAY_READ/WRITE for this cache */
> > -#define NFSD_FILE_MAY_MASK (NFSD_MAY_READ|NFSD_MAY_WRITE)
> > +#define NFSD_FILE_MAY_MASK (NFSD_MAY_READ|NFSD_MAY_WRITE|NFSD_MAY_LOCALIO)
> >
> > static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits);
> > static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions);
> > diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
> > new file mode 100644
> > index 000000000000..4b65c66be129
> > --- /dev/null
> > +++ b/fs/nfsd/localio.c
> > @@ -0,0 +1,105 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * NFS server support for local clients to bypass network stack
> > + *
> > + * Copyright (C) 2014 Weston Andros Adamson <dros@primarydata.com>
> > + * Copyright (C) 2019 Trond Myklebust <trond.myklebust@hammerspace.com>
> > + * Copyright (C) 2024 Mike Snitzer <snitzer@hammerspace.com>
> > + */
> > +
> > +#include <linux/exportfs.h>
> > +#include <linux/sunrpc/svcauth.h>
> > +#include <linux/sunrpc/clnt.h>
> > +#include <linux/nfs.h>
> > +#include <linux/nfs_common.h>
> > +#include <linux/nfslocalio.h>
> > +#include <linux/string.h>
> > +
> > +#include "nfsd.h"
> > +#include "vfs.h"
> > +#include "netns.h"
> > +#include "filecache.h"
> > +
> > +/**
> > + * nfsd_open_local_fh - lookup a local filehandle @nfs_fh and map to nfsd_file
> > + *
> > + * @cl_nfssvc_net: the 'struct net' to use to get the proper nfsd_net
> > + * @cl_nfssvc_dom: the 'struct auth_domain' required for localio access
> > + * @rpc_clnt: rpc_clnt that the client established, used for sockaddr and cred
> > + * @cred: cred that the client established
> > + * @nfs_fh: filehandle to lookup
> > + * @fmode: fmode_t to use for open
> > + *
> > + * This function maps a local fh to a path on a local filesystem.
> > + * This is useful when the nfs client has the local server mounted - it can
> > + * avoid all the NFS overhead with reads, writes and commits.
> > + *
> > + * On successful return, returned nfs_localio_ctx will have its nfsd_file and
> > + * nfsd_net members set. Caller is responsible for calling nfsd_file_put and
> > + * nfsd_serv_put (via nfs_localio_ctx_free).
> > + */
> > +struct nfs_localio_ctx *
> > +nfsd_open_local_fh(struct net *cl_nfssvc_net, struct auth_domain *cl_nfssvc_dom,
> > +		   struct rpc_clnt *rpc_clnt, const struct cred *cred,
> > +		   const struct nfs_fh *nfs_fh, const fmode_t fmode)
> > +{
> > +	int mayflags = NFSD_MAY_LOCALIO;
> > +	int status = 0;
> > +	struct nfsd_net *nn;
> > +	struct svc_cred rq_cred;
> > +	struct svc_fh fh;
> > +	struct nfs_localio_ctx *localio;
> > +	__be32 beres;
> > +
> > +	if (nfs_fh->size > NFS4_FHSIZE)
> > +		return ERR_PTR(-EINVAL);
> > +
> > +	localio = nfs_localio_ctx_alloc();
> > +	if (!localio)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	/*
> > +	 * Not running in nfsd context, so must safely get reference on nfsd_serv.
> > +	 * But the server may already be shutting down, if so disallow new localio.
> > +	 */
> > +	nn = net_generic(cl_nfssvc_net, nfsd_net_id);
> > +	if (unlikely(!nfsd_serv_try_get(nn))) {
> > +		status = -ENXIO;
> > +		goto out_nfsd_serv;
> > +	}
> > +
> > +	/* nfs_fh -> svc_fh */
> > +	fh_init(&fh, NFS4_FHSIZE);
> > +	fh.fh_handle.fh_size = nfs_fh->size;
> > +	memcpy(fh.fh_handle.fh_raw, nfs_fh->data, nfs_fh->size);
> > +
> > +	if (fmode & FMODE_READ)
> > +		mayflags |= NFSD_MAY_READ;
> > +	if (fmode & FMODE_WRITE)
> > +		mayflags |= NFSD_MAY_WRITE;
> > +
> > +	svcauth_map_clnt_to_svc_cred_local(rpc_clnt, cred, &rq_cred);
> > +
> > +	beres = nfsd_file_acquire_local(cl_nfssvc_net, &rq_cred, cl_nfssvc_dom,
> > +					&fh, mayflags, &localio->nf);
> > +	if (beres) {
> > +		status = nfs_stat_to_errno(be32_to_cpu(beres));
> > +		goto out_fh_put;
> > +	}
> > +	localio->nn = nn;
> > +
> > +out_fh_put:
> > +	fh_put(&fh);
> > +	if (rq_cred.cr_group_info)
> > +		put_group_info(rq_cred.cr_group_info);
> > +out_nfsd_serv:
> > +	if (status) {
> > +		nfs_localio_ctx_free(localio);
> > +		return ERR_PTR(status);
> > +	}
> > +	return localio;
> > +}
> > +EXPORT_SYMBOL_GPL(nfsd_open_local_fh);
> > +
> > +/* Compile time type checking, not used by anything */
> > +static nfs_to_nfsd_open_local_fh_t __maybe_unused nfsd_open_local_fh_typecheck = nfsd_open_local_fh;
> > diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
> > index d22027e23761..82bcefcd1f21 100644
> > --- a/fs/nfsd/trace.h
> > +++ b/fs/nfsd/trace.h
> > @@ -86,7 +86,8 @@ DEFINE_NFSD_XDR_ERR_EVENT(cant_encode);
> > { NFSD_MAY_NOT_BREAK_LEASE, "NOT_BREAK_LEASE" }, \
> > { NFSD_MAY_BYPASS_GSS, "BYPASS_GSS" }, \
> > { NFSD_MAY_READ_IF_EXEC, "READ_IF_EXEC" }, \
> > - { NFSD_MAY_64BIT_COOKIE, "64BIT_COOKIE" })
> > + { NFSD_MAY_64BIT_COOKIE, "64BIT_COOKIE" }, \
> > + { NFSD_MAY_LOCALIO, "LOCALIO" })
> >
> > TRACE_EVENT(nfsd_compound,
> > TP_PROTO(
> > diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
> > index 01947561d375..e12310dd5f4c 100644
> > --- a/fs/nfsd/vfs.h
> > +++ b/fs/nfsd/vfs.h
> > @@ -33,6 +33,8 @@
> >
> > #define NFSD_MAY_64BIT_COOKIE 0x1000 /* 64 bit readdir cookies for >= NFSv3 */
> >
> > +#define NFSD_MAY_LOCALIO 0x2000 /* for tracing, reflects when localio used */
> > +
> > #define NFSD_MAY_CREATE (NFSD_MAY_EXEC|NFSD_MAY_WRITE)
> > #define NFSD_MAY_REMOVE (NFSD_MAY_EXEC|NFSD_MAY_WRITE|NFSD_MAY_TRUNC)
> >
> > @@ -158,6 +160,11 @@ __be32 nfsd_permission(struct svc_cred *cred, struct svc_export *exp,
> >
> > void nfsd_filp_close(struct file *fp);
> >
> > +struct nfs_localio_ctx *
> > +nfsd_open_local_fh(struct net *, struct auth_domain *,
> > + struct rpc_clnt *, const struct cred *,
> > + const struct nfs_fh *, const fmode_t);
> > +
> > static inline int fh_want_write(struct svc_fh *fh)
> > {
> > int ret;
>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>
Thanks,
Mike