From: Mike Snitzer <snitzer@kernel.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>,
linux-nfs@vger.kernel.org, Anna Schumaker <anna@kernel.org>,
Trond Myklebust <trondmy@hammerspace.com>,
NeilBrown <neilb@suse.de>,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v15 16/26] nfsd: add LOCALIO support
Date: Tue, 3 Sep 2024 11:00:05 -0400 [thread overview]
Message-ID: <ZtckdSIT6a3N-VTU@kernel.org> (raw)
In-Reply-To: <cd02bbdc0059afaff52d0aab1da0ecf91d101a0a.camel@kernel.org>
On Tue, Sep 03, 2024 at 10:40:28AM -0400, Jeff Layton wrote:
> On Tue, 2024-09-03 at 10:34 -0400, Chuck Lever wrote:
> > On Sat, Aug 31, 2024 at 06:37:36PM -0400, Mike Snitzer wrote:
> > > From: Weston Andros Adamson <dros@primarydata.com>
> > >
> > > Add server support for bypassing NFS for localhost reads, writes, and
> > > commits. This is only useful when both the client and server are
> > > running on the same host.
> > >
> > > If nfsd_open_local_fh() fails then the NFS client will both retry and
> > > fallback to normal network-based read, write and commit operations if
> > > localio is no longer supported.
> > >
> > > Care is taken to ensure the same NFS security mechanisms are used
> > > (authentication, etc) regardless of whether localio or regular NFS
> > > access is used. The auth_domain established as part of the traditional
> > > NFS client access to the NFS server is also used for localio. Store
> > > auth_domain for localio in nfsd_uuid_t and transfer it to the client
> > > if it is local to the server.
> > >
> > > Relative to containers, localio gives the client access to the network
> > > namespace the server has. This is required to allow the client to
> > > access the server's per-namespace nfsd_net struct.
> > >
> > > This commit also introduces the use of NFSD's percpu_ref to interlock
> > > nfsd_destroy_serv and nfsd_open_local_fh, to ensure nn->nfsd_serv is
> > > not destroyed while in use by nfsd_open_local_fh and other LOCALIO
> > > client code.
> > >
> > > CONFIG_NFS_LOCALIO enables NFS server support for LOCALIO.
> > >
> > > Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
> > > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > Co-developed-by: Mike Snitzer <snitzer@kernel.org>
> > > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > > Co-developed-by: NeilBrown <neilb@suse.de>
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > >
> > > Not-Acked-by: Chuck Lever <chuck.lever@oracle.com>
> > > Not-Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > > ---
> > > fs/nfsd/Makefile | 1 +
> > > fs/nfsd/filecache.c | 2 +-
> > > fs/nfsd/localio.c | 112 +++++++++++++++++++++++++++++++++++++
> > > fs/nfsd/netns.h | 4 ++
> > > fs/nfsd/nfsctl.c | 25 ++++++++-
> > > fs/nfsd/trace.h | 3 +-
> > > fs/nfsd/vfs.h | 2 +
> > > include/linux/nfslocalio.h | 8 +++
> > > 8 files changed, 154 insertions(+), 3 deletions(-)
> > > create mode 100644 fs/nfsd/localio.c
> > >
> > > diff --git a/fs/nfsd/Makefile b/fs/nfsd/Makefile
> > > index b8736a82e57c..18cbd3fa7691 100644
> > > --- a/fs/nfsd/Makefile
> > > +++ b/fs/nfsd/Makefile
> > > @@ -23,3 +23,4 @@ nfsd-$(CONFIG_NFSD_PNFS) += nfs4layouts.o
> > > nfsd-$(CONFIG_NFSD_BLOCKLAYOUT) += blocklayout.o blocklayoutxdr.o
> > > nfsd-$(CONFIG_NFSD_SCSILAYOUT) += blocklayout.o blocklayoutxdr.o
> > > nfsd-$(CONFIG_NFSD_FLEXFILELAYOUT) += flexfilelayout.o flexfilelayoutxdr.o
> > > +nfsd-$(CONFIG_NFS_LOCALIO) += localio.o
> > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > > index 89ff380ec31e..348c1b97092e 100644
> > > --- a/fs/nfsd/filecache.c
> > > +++ b/fs/nfsd/filecache.c
> > > @@ -52,7 +52,7 @@
> > > #define NFSD_FILE_CACHE_UP (0)
> > >
> > > /* We only care about NFSD_MAY_READ/WRITE for this cache */
> > > -#define NFSD_FILE_MAY_MASK (NFSD_MAY_READ|NFSD_MAY_WRITE)
> > > +#define NFSD_FILE_MAY_MASK (NFSD_MAY_READ|NFSD_MAY_WRITE|NFSD_MAY_LOCALIO)
> > >
> > > static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits);
> > > static DEFINE_PER_CPU(unsigned long, nfsd_file_acquisitions);
> > > diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
> > > new file mode 100644
> > > index 000000000000..75df709c6903
> > > --- /dev/null
> > > +++ b/fs/nfsd/localio.c
> > > @@ -0,0 +1,112 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +/*
> > > + * NFS server support for local clients to bypass network stack
> > > + *
> > > + * Copyright (C) 2014 Weston Andros Adamson <dros@primarydata.com>
> > > + * Copyright (C) 2019 Trond Myklebust <trond.myklebust@hammerspace.com>
> > > + * Copyright (C) 2024 Mike Snitzer <snitzer@hammerspace.com>
> > > + * Copyright (C) 2024 NeilBrown <neilb@suse.de>
> > > + */
> > > +
> > > +#include <linux/exportfs.h>
> > > +#include <linux/sunrpc/svcauth.h>
> > > +#include <linux/sunrpc/clnt.h>
> > > +#include <linux/nfs.h>
> > > +#include <linux/nfs_common.h>
> > > +#include <linux/nfslocalio.h>
> > > +#include <linux/string.h>
> > > +
> > > +#include "nfsd.h"
> > > +#include "vfs.h"
> > > +#include "netns.h"
> > > +#include "filecache.h"
> > > +
> > > +static const struct nfsd_localio_operations nfsd_localio_ops = {
> > > + .nfsd_open_local_fh = nfsd_open_local_fh,
> > > + .nfsd_file_put_local = nfsd_file_put_local,
> > > + .nfsd_file_file = nfsd_file_file,
> > > +};
> > > +
> > > +void nfsd_localio_ops_init(void)
> > > +{
> > > + memcpy(&nfs_to, &nfsd_localio_ops, sizeof(nfsd_localio_ops));
> > > +}
> >
> > Same comment as Neil: this should surface a pointer to the
> > localio_ops struct. Copying the whole set of function pointers is
> > generally unnecessary.
> >
> >
> > > +
> > > +/**
> > > + * nfsd_open_local_fh - lookup a local filehandle @nfs_fh and map to nfsd_file
> > > + *
> > > + * @uuid: nfs_uuid_t which provides the 'struct net' to get the proper nfsd_net
> > > + * and the 'struct auth_domain' required for LOCALIO access
> > > + * @rpc_clnt: rpc_clnt that the client established, used for sockaddr and cred
> > > + * @cred: cred that the client established
> > > + * @nfs_fh: filehandle to lookup
> > > + * @fmode: fmode_t to use for open
> > > + *
> > > + * This function maps a local fh to a path on a local filesystem.
> > > + * This is useful when the nfs client has the local server mounted - it can
> > > + * avoid all the NFS overhead with reads, writes and commits.
> > > + *
> > > + * On successful return, returned nfsd_file will have its nf_net member
> > > + * set. Caller (NFS client) is responsible for calling nfsd_serv_put and
> > > + * nfsd_file_put (via nfs_to.nfsd_file_put_local).
> > > + */
> > > +struct nfsd_file *
> > > +nfsd_open_local_fh(nfs_uuid_t *uuid,
> > > + struct rpc_clnt *rpc_clnt, const struct cred *cred,
> > > + const struct nfs_fh *nfs_fh, const fmode_t fmode)
> > > + __must_hold(rcu)
> > > +{
> > > + int mayflags = NFSD_MAY_LOCALIO;
> > > + struct nfsd_net *nn = NULL;
> > > + struct net *net;
> > > + struct svc_cred rq_cred;
> > > + struct svc_fh fh;
> > > + struct nfsd_file *localio;
> > > + __be32 beres;
> > > +
> > > + if (nfs_fh->size > NFS4_FHSIZE)
> > > + return ERR_PTR(-EINVAL);
> > > +
> > > + /*
> > > + * Not running in nfsd context, so must safely get reference on nfsd_serv.
> > > + * But the server may already be shutting down, if so disallow new localio.
> > > + * uuid->net is NOT a counted reference, but caller's rcu_read_lock() ensures
> > > + * that if uuid->net is not NULL, then calling nfsd_serv_try_get() is safe
> > > + * and if it succeeds we will have an implied reference to the net.
> > > + */
> > > + net = rcu_dereference(uuid->net);
> > > + if (net)
> > > + nn = net_generic(net, nfsd_net_id);
> > > + if (unlikely(!nn || !nfsd_serv_try_get(nn)))
> > > + return ERR_PTR(-ENXIO);
> > > +
> > > + /* Drop the rcu lock for nfsd_file_acquire_local() */
> > > + rcu_read_unlock();
> >
> > I'm struggling with the locking logistics. Caller takes the RCU read
> > lock, this function drops the lock, then takes it again. So:
> >
> > - A caller might rely on the lock being held continuously, but
> > - The API contract documented above doesn't indicate that this
> > function drops that lock
> > - The __must_hold(rcu) annotation doesn't indicate that this
> > function drops that lock, IIUC
> >
> > Dropping and retaking the lock in here is an anti-pattern that
> > should be avoided. I suggest we are better off in the long run if
> > the caller does not need to take the RCU read lock, but instead,
> > nfsd_open_local_fh takes it right here just for the rcu_dereference.
I thought so too when I first saw how Neil approached fixing this to
be safe. It was only after putting further time to it (and having the
benefit of being so close to all this) that I realized the nuance at
play (please see my reply to Jeff below for the nuance I'm speaking
of).
> >
> > OTOH, Why drop the lock before calling nfsd_file_acquire_local()?
> > The RCU read lock can safely be taken more than once in succession.
> >
> > Let's rethink the locking strategy.
> >
Yes, _that_ is a very valid point. I did wonder the same: it seems
perfectly fine to simply retain the RCU throughout the entirety of
nfsd_open_local_fh().
> Agreed. The only caller does this:
>
> rcu_read_lock();
> if (!rcu_access_pointer(uuid->net)) {
> rcu_read_unlock();
> return ERR_PTR(-ENXIO);
> }
> localio = nfs_to.nfsd_open_local_fh(uuid, rpc_clnt, cred,
> nfs_fh, fmode);
> rcu_read_unlock();
>
> Maybe just move the check for uuid->net down into nfsd_open_local_fh,
> and it can acquire the rcu_read_lock for itself?
No, sorry we cannot. The call to nfs_to.nfsd_open_local_fh (which is
a symbol provided by nfsd) is only safe if the RCU protected pre-check
shows the uuid->net valid.
Mike
next prev parent reply other threads:[~2024-09-03 15:00 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-31 22:37 [PATCH v15 00/26] nfs/nfsd: add support for LOCALIO Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 01/26] nfs_common: factor out nfs_errtbl and nfs_stat_to_errno Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 02/26] nfs_common: factor out nfs4_errtbl and nfs4_stat_to_errno Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 03/26] nfs: factor out {encode,decode}_opaque_fixed to nfs_xdr.h Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 04/26] NFSD: Handle @rqstp == NULL in check_nfsd_access() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 05/26] NFSD: Refactor nfsd_setuser_and_check_port() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 06/26] NFSD: Avoid using rqstp->rq_vers in nfsd_set_fh_dentry() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 07/26] NFSD: Short-circuit fh_verify tracepoints for LOCALIO Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 08/26] nfsd: factor out __fh_verify to allow NULL rqstp to be passed Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 09/26] nfsd: add nfsd_file_acquire_local() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 10/26] nfsd: add nfsd_serv_try_get and nfsd_serv_put Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 11/26] SUNRPC: remove call_allocate() BUG_ONs Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 12/26] SUNRPC: add svcauth_map_clnt_to_svc_cred_local Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 13/26] SUNRPC: replace program list with program array Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 14/26] nfs_common: add NFS LOCALIO auxiliary protocol enablement Mike Snitzer
2024-09-01 23:25 ` NeilBrown
2024-09-03 16:33 ` Mike Snitzer
2024-09-05 19:24 ` Anna Schumaker
2024-09-05 19:38 ` Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 15/26] nfs_common: prepare for the NFS client to use nfsd_file for LOCALIO Mike Snitzer
2024-09-01 23:37 ` NeilBrown
2024-08-31 22:37 ` [PATCH v15 16/26] nfsd: add LOCALIO support Mike Snitzer
2024-09-01 23:46 ` NeilBrown
2024-09-03 14:34 ` Chuck Lever
2024-09-03 14:40 ` Jeff Layton
2024-09-03 15:00 ` Mike Snitzer [this message]
2024-09-03 15:19 ` Jeff Layton
2024-09-03 15:29 ` Mike Snitzer
2024-09-03 15:59 ` Chuck Lever III
2024-09-03 16:09 ` Mike Snitzer
2024-09-03 17:07 ` Chuck Lever III
2024-09-03 22:31 ` NeilBrown
2024-09-04 5:01 ` NeilBrown
2024-09-04 13:47 ` Chuck Lever
2024-09-05 14:21 ` Mike Snitzer
2024-09-05 15:41 ` Chuck Lever III
2024-09-05 23:34 ` NeilBrown
2024-09-06 15:04 ` Mike Snitzer
2024-09-06 18:07 ` Mike Snitzer
2024-09-06 21:56 ` NeilBrown
2024-09-06 22:33 ` Chuck Lever III
2024-09-06 23:14 ` NeilBrown
2024-09-07 15:17 ` Mike Snitzer
2024-09-07 16:09 ` Chuck Lever III
2024-09-07 19:08 ` Mike Snitzer
2024-09-07 21:12 ` Chuck Lever III
2024-09-08 15:05 ` Chuck Lever III
2024-09-07 15:52 ` Mike Snitzer
2024-09-04 13:54 ` Jeff Layton
2024-09-04 13:56 ` Chuck Lever III
2024-08-31 22:37 ` [PATCH v15 17/26] nfsd: implement server support for NFS_LOCALIO_PROGRAM Mike Snitzer
2024-09-03 14:11 ` Chuck Lever
2024-08-31 22:37 ` [PATCH v15 18/26] nfs: pass struct nfsd_file to nfs_init_pgio and nfs_init_commit Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 19/26] nfs: add LOCALIO support Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 20/26] nfs: enable localio for non-pNFS IO Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 21/26] pnfs/flexfiles: enable localio support Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 22/26] nfs/localio: use dedicated workqueues for filesystem read and write Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 23/26] nfs: implement client support for NFS_LOCALIO_PROGRAM Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 24/26] nfs: add Documentation/filesystems/nfs/localio.rst Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 25/26] nfs: add FAQ section to Documentation/filesystems/nfs/localio.rst Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 26/26] nfs: add "NFS Client and Server Interlock" section to localio.rst Mike Snitzer
2024-09-01 23:52 ` [PATCH v15 00/26] nfs/nfsd: add support for LOCALIO NeilBrown
2024-09-03 14:49 ` Jeff Layton
2024-09-06 19:31 ` Anna Schumaker
2024-09-06 20:34 ` Mike Snitzer
2024-09-06 21:09 ` Chuck Lever III
2024-09-10 16:45 ` Mike Snitzer
2024-09-10 19:14 ` Mike Snitzer
2024-09-10 19:24 ` Anna Schumaker
2024-09-10 20:31 ` Anna Schumaker
2024-09-10 22:11 ` Mike Snitzer
2024-09-11 17:51 ` Mike Snitzer
2024-09-11 18:48 ` Mike Snitzer
2024-09-13 18:12 ` Mike Snitzer
2024-09-11 0:43 ` NeilBrown
2024-09-11 16:03 ` Chuck Lever III
2024-09-12 23:31 ` NeilBrown
2024-09-12 23:42 ` Chuck Lever III
2024-09-13 12:27 ` Mike Snitzer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZtckdSIT6a3N-VTU@kernel.org \
--to=snitzer@kernel.org \
--cc=anna@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=jlayton@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
--cc=trondmy@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).