From: Mike Snitzer <snitzer@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: Chuck Lever <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Anna Schumaker <anna@kernel.org>,
	Trond Myklebust <trondmy@hammerspace.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v15 16/26] nfsd: add LOCALIO support
Date: Fri, 6 Sep 2024 14:07:28 -0400
Message-ID: <ZttE4DKrqqVa0ACn@kernel.org>
In-Reply-To: <ZtsZ8IEoV-DyqAzj@kernel.org>

On Fri, Sep 06, 2024 at 11:04:16AM -0400, Mike Snitzer wrote:
> On Fri, Sep 06, 2024 at 09:34:08AM +1000, NeilBrown wrote:
> > On Fri, 06 Sep 2024, Mike Snitzer wrote:
> > > On Wed, Sep 04, 2024 at 09:47:07AM -0400, Chuck Lever wrote:
> > > > On Wed, Sep 04, 2024 at 03:01:46PM +1000, NeilBrown wrote:
> > > > > On Wed, 04 Sep 2024, NeilBrown wrote:
> > > > > > 
> > > > > > > I agree that dropping and reclaiming a lock is an anti-pattern and is
> > > > > > best avoided in general.  I cannot see a better alternative in this
> > > > > > case.
> > > > > 
> > > > > > It occurred to me that I should spell out the alternative that I DO see so
> > > > > you have the option of disagreeing with my assessment that it isn't
> > > > > "better".
> > > > > 
> > > > > We need RCU to call into nfsd, we need a per-cpu ref on the net (which
> > > > > we can only get inside nfsd) and NOT RCU to call
> > > > > nfsd_file_acquire_local().
> > > > > 
> > > > > > The current code combines these (because they are only used together),
> > > > > > hence the need to drop rcu.
> > > > > 
> > > > > I thought briefly that it could simply drop rcu and leave it dropped
> > > > > (__releases(rcu)) but not only do I generally like that LESS than
> > > > > dropping and reclaiming, I think it would be buggy.  While in the nfsd
> > > > > module code we need to be holding either rcu or a ref on the server else
> > > > > the code could disappear out from under the CPU.  So if we exit without
> > > > > a ref on the server - which we do if nfsd_file_acquire_local() fails -
> > > > > then we need to reclaim RCU *before* dropping the ref.  So the current
> > > > > code is slightly buggy.
> > > > > 
> > > > > We could instead split the combined call into multiple nfs_to
> > > > > interfaces.
> > > > > 
> > > > > So nfs_open_local_fh() in nfs_common/nfslocalio.c would be something
> > > > > like:
> > > > > 
> > > > >  rcu_read_lock();
> > > > >  net = READ_ONCE(uuid->net);
> > > > >  if (!net || !nfs_to.get_net(net)) {
> > > > >        rcu_read_unlock();
> > > > >        return ERR_PTR(-ENXIO);
> > > > >  }
> > > > >  rcu_read_unlock();
> > > > >  localio = nfs_to.nfsd_open_local_fh(....);
> > > > >  if (IS_ERR(localio))
> > > > >        nfs_to.put_net(net);
> > > > >  return localio;
> > > > > 
> > > > > So we have 3 interfaces instead of 1, but no hidden unlock/lock.
> > > > 
> > > > Splitting up the function call occurred to me as well, but I didn't
> > > > come up with a specific bit of surgery. Thanks for the suggestion.
> > > > 
> > > > At this point, my concern is that we will lose your cogent
> > > > explanation of why the release/lock is done. Having it in email is
> > > > great, but email is more ephemeral than actually putting it in the
> > > > code.
> > > > 
> > > > 
> > > > > As I said, I don't think this is a net win, but reasonable people might
> > > > > disagree with me.
> > > > 
> > > > The "win" here is that it makes this code self-documenting and
> > > > somewhat less likely to be broken down the road by changes in and
> > > > around this area. Since I'm more forgetful these days I lean towards
> > > > the more obvious kinds of coding solutions. ;-)
> > > > 
> > > > Mike, how do you feel about the 3-interface suggestion?
> > > 
> > > I dislike expanding from 1 indirect function call to 2 in rapid
> > > > succession (3 for the error path, not a problem, just being precise).
> > > But I otherwise like it.. maybe.. heh.
> > > 
> > > FYI, I did run with the suggestion to make nfs_to a pointer that just
> > > needs a simple assignment rather than memcpy to initialize.  So Neil's
> > > > above code becomes:
> > > 
> > >         rcu_read_lock();
> > >         net = rcu_dereference(uuid->net);
> > >         if (!net || !nfs_to->nfsd_serv_try_get(net)) {
> > >                 rcu_read_unlock();
> > >                 return ERR_PTR(-ENXIO);
> > >         }
> > >         rcu_read_unlock();
> > >         /* We have an implied reference to net thanks to nfsd_serv_try_get */
> > >         localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
> > >                                              cred, nfs_fh, fmode);
> > >         if (IS_ERR(localio))
> > >                 nfs_to->nfsd_serv_put(net);
> > >         return localio;
> > > 
> > > I do think it cleans the code up... full patch is here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next.v15-with-fixups&id=e85306941878a87070176702de687f2779436061
> > > 
> > > But I'm still on the fence.. someone help push me over!
> > 
> > > I think the new code is unquestionably clearer, and not taking this
> > approach would be a micro-optimisation which would need to be
> > numerically justified.  So I'm pushing for the three-interface version
> > (despite what I said before).
> > 
> > Unfortunately the new code is not bug-free - not quite.
> > As soon as nfs_to->nfsd_serv_put() calls percpu_ref_put() the nfsd
> > module can be unloaded, and the "return" instruction might not be
> > present.  For this to go wrong would require a lot of bad luck, but if
> > > the CPU took an interrupt at the wrong time there would be room.
> > 
> > [Ever since module_put_and_exit() was added (now ..and_kthread_exit)
> >  I've been sensitive to dropping the ref to a module in code running in
> >  the module]
> > 
> > > So I think nfsd_serv_put() (and nfsd_serv_try_get()) need
> > > __must_hold(RCU), and nfs_open_local_fh() needs rcu_read_lock() before
> > > calling nfs_to->nfsd_serv_put(net).
> 
> OK, yes I can see that, I implemented what you suggested at the end of
> your reply (see inline patch below)...
> 
> But I'd just like to point out that something like the below patch
> wouldn't be needed if we kept my "heavy" approach (nfs reference on
> nfsd modules via nfs_common using symbol_request):
> https://marc.info/?l=linux-nfs&m=172499445027800&w=2
> (that patch has stuff I since cleaned up, e.g. removed typedefs and
> EXPORT_SYMBOL_GPLs..)
> 
> I knew we were going to pay for being too cute with how nfs took its
> reference on nfsd.
> 
> So here we are, needing fiddly incremental fixes like this to close a
> really-small-yet-will-be-deadly race:

<snip required delicate rcu re-locking requirements patch>

I prefer this incremental re-implementation of my symbol_request patch
that eliminates all concerns about the validity of 'nfs_to' calls:

---
 fs/nfs/localio.c           |  5 +++
 fs/nfs_common/nfslocalio.c | 84 +++++++++++++++++++++++++++++++-------
 fs/nfsd/localio.c          |  2 +-
 include/linux/nfslocalio.h |  7 +++-
 4 files changed, 80 insertions(+), 18 deletions(-)

diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index c29cdf51c458..43520ac0fde8 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -124,6 +124,10 @@ const struct rpc_program nfslocalio_program = {
 static void nfs_local_enable(struct nfs_client *clp)
 {
 	spin_lock(&clp->cl_localio_lock);
+	if (!nfs_to_nfsd_localio_ops_get()) {
+		spin_unlock(&clp->cl_localio_lock);
+		return;
+	}
 	set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags);
 	trace_nfs_local_enable(clp);
 	spin_unlock(&clp->cl_localio_lock);
@@ -138,6 +142,7 @@ void nfs_local_disable(struct nfs_client *clp)
 	if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) {
 		trace_nfs_local_disable(clp);
 		nfs_uuid_invalidate_one_client(&clp->cl_uuid);
+		nfs_to_nfsd_localio_ops_put();
 	}
 	spin_unlock(&clp->cl_localio_lock);
 }
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index 42b479b9191f..9039e0f1afa3 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -7,6 +7,7 @@
 #include <linux/module.h>
 #include <linux/rculist.h>
 #include <linux/nfslocalio.h>
+#include <linux/refcount.h>
 #include <net/netns/generic.h>
 
 MODULE_LICENSE("GPL");
@@ -53,11 +54,8 @@ static nfs_uuid_t * nfs_uuid_lookup_locked(const uuid_t *uuid)
 	return NULL;
 }
 
-static struct module *nfsd_mod;
-
 void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list,
-		       struct net *net, struct auth_domain *dom,
-		       struct module *mod)
+		       struct net *net, struct auth_domain *dom)
 {
 	nfs_uuid_t *nfs_uuid;
 
@@ -73,9 +71,6 @@ void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list,
 		 */
 		list_move(&nfs_uuid->list, list);
 		rcu_assign_pointer(nfs_uuid->net, net);
-
-		__module_get(mod);
-		nfsd_mod = mod;
 	}
 	spin_unlock(&nfs_uuid_lock);
 }
@@ -83,10 +78,8 @@ EXPORT_SYMBOL_GPL(nfs_uuid_is_local);
 
 static void nfs_uuid_put_locked(nfs_uuid_t *nfs_uuid)
 {
-	if (nfs_uuid->net) {
-		module_put(nfsd_mod);
-		nfs_uuid->net = NULL;
-	}
+	if (nfs_uuid->net)
+		RCU_INIT_POINTER(nfs_uuid->net, NULL);
 	if (nfs_uuid->dom) {
 		auth_domain_put(nfs_uuid->dom);
 		nfs_uuid->dom = NULL;
@@ -123,14 +116,14 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
 	struct nfsd_file *localio;
 
 	/*
-	 * Not running in nfsd context, so must safely get reference on nfsd_serv.
+	 * NFS has a reference to NFSD and can safely make 'nfs_to' calls.
+	 *
+	 * But not running in NFSD context, so must safely get reference to nfsd_serv.
 	 * But the server may already be shutting down, if so disallow new localio.
+	 *
 	 * uuid->net is NOT a counted reference, but rcu_read_lock() ensures that
 	 * if uuid->net is not NULL, then calling nfsd_serv_try_get() is safe
 	 * and if it succeeds we will have an implied reference to the net.
-	 *
-	 * Otherwise NFS may not have ref on NFSD and therefore cannot safely
-	 * make 'nfs_to' calls.
 	 */
 	rcu_read_lock();
 	net = rcu_dereference(uuid->net);
@@ -153,6 +146,7 @@ EXPORT_SYMBOL_GPL(nfs_open_local_fh);
  * but cannot be statically linked, because that will make the NFS
  * module always depend on the NFSD module.
  *
+ * [FIXME: must adjust following 2 paragraphs]
  * 'nfs_to' provides NFS access to NFSD functions needed for LOCALIO,
  * its lifetime is tightly coupled to the NFSD module and will always
  * be available to NFS LOCALIO because any successful client<->server
@@ -170,3 +164,63 @@ EXPORT_SYMBOL_GPL(nfs_open_local_fh);
  */
 const struct nfsd_localio_operations *nfs_to;
 EXPORT_SYMBOL_GPL(nfs_to);
+
+static DEFINE_SPINLOCK(nfs_to_nfsd_lock);
+static refcount_t nfs_to_ref;
+
+bool nfs_to_nfsd_localio_ops_get(void)
+{
+	spin_lock(&nfs_to_nfsd_lock);
+
+	/* Only get nfsd_localio_operations on first reference */
+	if (refcount_read(&nfs_to_ref) == 0) {
+		refcount_set(&nfs_to_ref, 1);
+		/* fallthru */
+	} else {
+		refcount_inc(&nfs_to_ref);
+		spin_unlock(&nfs_to_nfsd_lock);
+		return true;
+	}
+
+	/* Must drop spinlock before call to symbol_request */
+	spin_unlock(&nfs_to_nfsd_lock);
+
+	/*
+	 * If NFSD isn't available LOCALIO isn't possible.
+	 * Use nfsd_open_local_fh symbol as the bellwether, if
+	 * available then nfs_common has NFSD module reference
+	 * on NFS's behalf and can safely call 'nfs_to' functions.
+	 */
+	if (!symbol_request(nfsd_open_local_fh))
+		return false;
+	return true;
+}
+EXPORT_SYMBOL_GPL(nfs_to_nfsd_localio_ops_get);
+
+void nfs_to_nfsd_localio_ops_put(void)
+{
+	spin_lock(&nfs_to_nfsd_lock);
+
+	if (!refcount_dec_and_test(&nfs_to_ref))
+		goto out;
+
+	symbol_put(nfsd_open_local_fh);
+	nfs_to = NULL;
+out:
+	spin_unlock(&nfs_to_nfsd_lock);
+}
+EXPORT_SYMBOL_GPL(nfs_to_nfsd_localio_ops_put);
+
+static int __init nfslocalio_init(void)
+{
+	refcount_set(&nfs_to_ref, 0);
+
+	return 0;
+}
+
+static void __exit nfslocalio_exit(void)
+{
+}
+
+module_init(nfslocalio_init);
+module_exit(nfslocalio_exit);
diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
index 291e9c69cae4..291ad916d67a 100644
--- a/fs/nfsd/localio.c
+++ b/fs/nfsd/localio.c
@@ -114,7 +114,7 @@ static __be32 localio_proc_uuid_is_local(struct svc_rqst *rqstp)
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
 	nfs_uuid_is_local(&argp->uuid, &nn->local_clients,
-			  net, rqstp->rq_client, THIS_MODULE);
+			  net, rqstp->rq_client);
 
 	return rpc_success;
 }
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index b353abe00357..2e6b9107a7d1 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -35,7 +35,7 @@ typedef struct {
 void nfs_uuid_begin(nfs_uuid_t *);
 void nfs_uuid_end(nfs_uuid_t *);
 void nfs_uuid_is_local(const uuid_t *, struct list_head *,
-		       struct net *, struct auth_domain *, struct module *);
+		       struct net *, struct auth_domain *);
 void nfs_uuid_invalidate_clients(struct list_head *list);
 void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid);
 
@@ -58,9 +58,12 @@ struct nfsd_localio_operations {
 	struct file *(*nfsd_file_file)(struct nfsd_file *);
 } ____cacheline_aligned;
 
-extern void nfsd_localio_ops_init(void);
 extern const struct nfsd_localio_operations *nfs_to;
 
+extern void nfsd_localio_ops_init(void);
+bool nfs_to_nfsd_localio_ops_get(void);
+void nfs_to_nfsd_localio_ops_put(void);
+
 struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *,
 		   struct rpc_clnt *, const struct cred *,
 		   const struct nfs_fh *, const fmode_t);
-- 
2.39.3
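
For illustration only, the first-get/last-put scheme in
nfs_to_nfsd_localio_ops_get() and nfs_to_nfsd_localio_ops_put() can be
modelled in userspace C. This is a minimal sketch under stated
assumptions, not the posted kernel code: provider_request() and
provider_release() are hypothetical stand-ins for symbol_request() and
symbol_put(), ops_get() and ops_put() are made-up names, and a pthread
mutex replaces the spinlock so that the request step can run with the
lock held. The patch as posted sets the count to 1 before attempting
symbol_request(); this variant bumps the count only after the request
succeeds, so a failed request leaves it at zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for symbol_request()/symbol_put(). */
static bool provider_available = true;
static int provider_pins;	/* how many times the provider is pinned */

static bool provider_request(void)
{
	if (!provider_available)
		return false;
	provider_pins++;
	return true;
}

static void provider_release(void)
{
	provider_pins--;
}

/* One lock covers both the user count and the pin/unpin step. */
static pthread_mutex_t ops_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int ops_users;

bool ops_get(void)
{
	bool ok = true;

	pthread_mutex_lock(&ops_lock);
	/* Only the first user pins the provider. */
	if (ops_users == 0)
		ok = provider_request();
	/* The count is bumped only when the provider is known to be pinned. */
	if (ok)
		ops_users++;
	pthread_mutex_unlock(&ops_lock);
	return ok;
}

void ops_put(void)
{
	pthread_mutex_lock(&ops_lock);
	/* An unbalanced put is a caller bug in this model. */
	assert(ops_users > 0);
	/* Only the last user unpins the provider. */
	if (--ops_users == 0)
		provider_release();
	pthread_mutex_unlock(&ops_lock);
}
```

With this shape a failed ops_get() has nothing to undo, and the last
ops_put() is the only place where the provider is released.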

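The three-call sequence discussed in the quoted text (try to get a
server reference, open, drop the reference if the open fails) can also
be modelled in userspace C. This is a minimal single-threaded sketch
in which every name is hypothetical (struct serv, struct file_handle,
serv_try_get(), serv_open(), serv_put(), open_local()). It deliberately
leaves out RCU, percpu_ref and module lifetime, which are the hard
parts being debated in this thread, and shows only the reference
bookkeeping on the success and error paths.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a server that refuses new references once dying. */
struct serv {
	int refs;	/* outstanding references */
	bool dying;	/* set once shutdown has begun */
};

struct file_handle {
	struct serv *owner;	/* server whose reference this handle holds */
};

static bool serv_try_get(struct serv *s)
{
	if (s == NULL || s->dying)
		return false;
	s->refs++;
	return true;
}

static void serv_put(struct serv *s)
{
	assert(s->refs > 0);
	s->refs--;
}

/* Hypothetical open step; 'fail' forces the error path. */
static int serv_open(struct serv *s, bool fail, struct file_handle *out)
{
	if (fail)
		return -ENOENT;
	out->owner = s;
	return 0;
}

int open_local(struct serv *s, bool fail, struct file_handle *out)
{
	int err;

	/* Step 1: take a reference, or refuse if the server is gone or dying. */
	if (!serv_try_get(s))
		return -ENXIO;

	/* Step 2: open while the reference is held. */
	err = serv_open(s, fail, out);

	/* Step 3: on failure, hand the reference back before returning. */
	if (err)
		serv_put(s);
	return err;
}
```

On success the caller keeps the reference taken in step 1 and is
responsible for dropping it when the handle is released.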
