From: Mike Snitzer <snitzer@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: Chuck Lever <chuck.lever@oracle.com>,
Jeff Layton <jlayton@kernel.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
Anna Schumaker <anna@kernel.org>,
Trond Myklebust <trondmy@hammerspace.com>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v15 16/26] nfsd: add LOCALIO support
Date: Fri, 6 Sep 2024 14:07:28 -0400 [thread overview]
Message-ID: <ZttE4DKrqqVa0ACn@kernel.org> (raw)
In-Reply-To: <ZtsZ8IEoV-DyqAzj@kernel.org>
On Fri, Sep 06, 2024 at 11:04:16AM -0400, Mike Snitzer wrote:
> On Fri, Sep 06, 2024 at 09:34:08AM +1000, NeilBrown wrote:
> > On Fri, 06 Sep 2024, Mike Snitzer wrote:
> > > On Wed, Sep 04, 2024 at 09:47:07AM -0400, Chuck Lever wrote:
> > > > On Wed, Sep 04, 2024 at 03:01:46PM +1000, NeilBrown wrote:
> > > > > On Wed, 04 Sep 2024, NeilBrown wrote:
> > > > > >
> > > > > > I agree that dropping and reclaiming a lock is an anti-pattern and in
> > > > > > best avoided in general. I cannot see a better alternative in this
> > > > > > case.
> > > > >
> > > > > It occurred to me what I should spell out the alternate that I DO see so
> > > > > you have the option of disagreeing with my assessment that it isn't
> > > > > "better".
> > > > >
> > > > > We need RCU to call into nfsd, we need a per-cpu ref on the net (which
> > > > > we can only get inside nfsd) and NOT RCU to call
> > > > > nfsd_file_acquire_local().
> > > > >
> > > > > The current code combines these (because they are only used together)
> > > > > and so the need to drop rcu.
> > > > >
> > > > > I thought briefly that it could simply drop rcu and leave it dropped
> > > > > (__releases(rcu)) but not only do I generally like that LESS than
> > > > > dropping and reclaiming, I think it would be buggy. While in the nfsd
> > > > > module code we need to be holding either rcu or a ref on the server else
> > > > > the code could disappear out from under the CPU. So if we exit without
> > > > > a ref on the server - which we do if nfsd_file_acquire_local() fails -
> > > > > then we need to reclaim RCU *before* dropping the ref. So the current
> > > > > code is slightly buggy.
> > > > >
> > > > > We could instead split the combined call into multiple nfs_to
> > > > > interfaces.
> > > > >
> > > > > So nfs_open_local_fh() in nfs_common/nfslocalio.c would be something
> > > > > like:
> > > > >
> > > > > rcu_read_lock();
> > > > > net = READ_ONCE(uuid->net);
> > > > > if (!net || !nfs_to.get_net(net)) {
> > > > > rcu_read_unlock();
> > > > > return ERR_PTR(-ENXIO);
> > > > > }
> > > > > rcu_read_unlock();
> > > > > localio = nfs_to.nfsd_open_local_fh(....);
> > > > > if (IS_ERR(localio))
> > > > > nfs_to.put_net(net);
> > > > > return localio;
> > > > >
> > > > > So we have 3 interfaces instead of 1, but no hidden unlock/lock.
> > > >
> > > > Splitting up the function call occurred to me as well, but I didn't
> > > > come up with a specific bit of surgery. Thanks for the suggestion.
> > > >
> > > > At this point, my concern is that we will lose your cogent
> > > > explanation of why the release/lock is done. Having it in email is
> > > > great, but email is more ephemeral than actually putting it in the
> > > > code.
> > > >
> > > >
> > > > > As I said, I don't think this is a net win, but reasonable people might
> > > > > disagree with me.
> > > >
> > > > The "win" here is that it makes this code self-documenting and
> > > > somewhat less likely to be broken down the road by changes in and
> > > > around this area. Since I'm more forgetful these days I lean towards
> > > > the more obvious kinds of coding solutions. ;-)
> > > >
> > > > Mike, how do you feel about the 3-interface suggestion?
> > >
> > > I dislike expanding from 1 indirect function call to 2 in rapid
> > > succession (3 for the error path, not a problem, just being precise.
> > > But I otherwise like it.. maybe.. heh.
> > >
> > > FYI, I did run with the suggestion to make nfs_to a pointer that just
> > > needs a simple assignment rather than memcpy to initialize. So Neil's
> > > above code becames:
> > >
> > > rcu_read_lock();
> > > net = rcu_dereference(uuid->net);
> > > if (!net || !nfs_to->nfsd_serv_try_get(net)) {
> > > rcu_read_unlock();
> > > return ERR_PTR(-ENXIO);
> > > }
> > > rcu_read_unlock();
> > > /* We have an implied reference to net thanks to nfsd_serv_try_get */
> > > localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
> > > cred, nfs_fh, fmode);
> > > if (IS_ERR(localio))
> > > nfs_to->nfsd_serv_put(net);
> > > return localio;
> > >
> > > I do think it cleans the code up... full patch is here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next.v15-with-fixups&id=e85306941878a87070176702de687f2779436061
> > >
> > > But I'm still on the fence.. someone help push me over!
> >
> > I think the new code is unquestionable clearer, and not taking this
> > approach would be a micro-optimisation which would need to be
> > numerically justified. So I'm pushing for the three-interface version
> > (despite what I said before).
> >
> > Unfortunately the new code is not bug-free - not quite.
> > As soon as nfs_to->nfsd_serv_put() calls percpu_ref_put() the nfsd
> > module can be unloaded, and the "return" instruction might not be
> > present. For this to go wrong would require a lot of bad luck, but if
> > the CPU took an interrupt at the wrong time were would be room.
> >
> > [Ever since module_put_and_exit() was added (now ..and_kthread_exit)
> > I've been sensitive to dropping the ref to a module in code running in
> > the module]
> >
> > So I think nfsd_serv_put (and nfsd_serv_try_get() __must_hold(RCU) and
> > nfs_open_local_fh() needs rcu_read_lock() before calling
> > nfs_to->nfsd_serv_put(net).
>
> OK, yes I can see that, I implemented what you suggested at the end of
> your reply (see inline patch below)...
>
> But I'd just like to point out that something like the below patch
> wouldn't be needed if we kept my "heavy" approach (nfs reference on
> nfsd modules via nfs_common using request_symbol):
> https://marc.info/?l=linux-nfs&m=172499445027800&w=2
> (that patch has stuff I since cleaned up, e.g. removed typedefs and
> EXPORT_SYMBOL_GPLs..)
>
> I knew we were going to pay for being too cute with how nfs took its
> reference on nfsd.
>
> So here we are, needing fiddly incremental fixes like this to close a
> really-small-yet-will-be-deadly race:
<snip required delicate rcu re-locking requirements patch>
I prefer this incremental re-implementation of my symbol_request patch
that eliminates all concerns about the validity of 'nfs_to' calls:
---
fs/nfs/localio.c | 5 +++
fs/nfs_common/nfslocalio.c | 84 +++++++++++++++++++++++++++++++-------
fs/nfsd/localio.c | 2 +-
include/linux/nfslocalio.h | 7 +++-
4 files changed, 80 insertions(+), 18 deletions(-)
diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index c29cdf51c458..43520ac0fde8 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -124,6 +124,10 @@ const struct rpc_program nfslocalio_program = {
static void nfs_local_enable(struct nfs_client *clp)
{
spin_lock(&clp->cl_localio_lock);
+ if (!nfs_to_nfsd_localio_ops_get()) {
+ spin_unlock(&clp->cl_localio_lock);
+ return;
+ }
set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags);
trace_nfs_local_enable(clp);
spin_unlock(&clp->cl_localio_lock);
@@ -138,6 +142,7 @@ void nfs_local_disable(struct nfs_client *clp)
if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) {
trace_nfs_local_disable(clp);
nfs_uuid_invalidate_one_client(&clp->cl_uuid);
+ nfs_to_nfsd_localio_ops_put();
}
spin_unlock(&clp->cl_localio_lock);
}
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index 42b479b9191f..9039e0f1afa3 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -7,6 +7,7 @@
#include <linux/module.h>
#include <linux/rculist.h>
#include <linux/nfslocalio.h>
+#include <linux/refcount.h>
#include <net/netns/generic.h>
MODULE_LICENSE("GPL");
@@ -53,11 +54,8 @@ static nfs_uuid_t * nfs_uuid_lookup_locked(const uuid_t *uuid)
return NULL;
}
-static struct module *nfsd_mod;
-
void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list,
- struct net *net, struct auth_domain *dom,
- struct module *mod)
+ struct net *net, struct auth_domain *dom)
{
nfs_uuid_t *nfs_uuid;
@@ -73,9 +71,6 @@ void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list,
*/
list_move(&nfs_uuid->list, list);
rcu_assign_pointer(nfs_uuid->net, net);
-
- __module_get(mod);
- nfsd_mod = mod;
}
spin_unlock(&nfs_uuid_lock);
}
@@ -83,10 +78,8 @@ EXPORT_SYMBOL_GPL(nfs_uuid_is_local);
static void nfs_uuid_put_locked(nfs_uuid_t *nfs_uuid)
{
- if (nfs_uuid->net) {
- module_put(nfsd_mod);
- nfs_uuid->net = NULL;
- }
+ if (nfs_uuid->net)
+ RCU_INIT_POINTER(nfs_uuid->net, NULL);
if (nfs_uuid->dom) {
auth_domain_put(nfs_uuid->dom);
nfs_uuid->dom = NULL;
@@ -123,14 +116,14 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
struct nfsd_file *localio;
/*
- * Not running in nfsd context, so must safely get reference on nfsd_serv.
+ * NFS has a reference to NFSD and can safely make 'nfs_to' calls.
+ *
+ * But not running in NFSD context, so must safely get reference to nfsd_serv.
* But the server may already be shutting down, if so disallow new localio.
+ *
* uuid->net is NOT a counted reference, but rcu_read_lock() ensures that
* if uuid->net is not NULL, then calling nfsd_serv_try_get() is safe
* and if it succeeds we will have an implied reference to the net.
- *
- * Otherwise NFS may not have ref on NFSD and therefore cannot safely
- * make 'nfs_to' calls.
*/
rcu_read_lock();
net = rcu_dereference(uuid->net);
@@ -153,6 +146,7 @@ EXPORT_SYMBOL_GPL(nfs_open_local_fh);
* but cannot be statically linked, because that will make the NFS
* module always depend on the NFSD module.
*
+ * [FIXME: must adjust following 2 paragraphs]
* 'nfs_to' provides NFS access to NFSD functions needed for LOCALIO,
* its lifetime is tightly coupled to the NFSD module and will always
* be available to NFS LOCALIO because any successful client<->server
@@ -170,3 +164,63 @@ EXPORT_SYMBOL_GPL(nfs_open_local_fh);
*/
const struct nfsd_localio_operations *nfs_to;
EXPORT_SYMBOL_GPL(nfs_to);
+
+static DEFINE_SPINLOCK(nfs_to_nfsd_lock);
+static refcount_t nfs_to_ref;
+
+bool nfs_to_nfsd_localio_ops_get(void)
+{
+ spin_lock(&nfs_to_nfsd_lock);
+
+ /* Only get nfsd_localio_operations on first reference */
+ if (refcount_read(&nfs_to_ref) == 0) {
+ refcount_set(&nfs_to_ref, 1);
+ /* fallthru */
+ } else {
+ refcount_inc(&nfs_to_ref);
+ spin_unlock(&nfs_to_nfsd_lock);
+ return true;
+ }
+
+ /* Must drop spinlock before call to symbol_request */
+ spin_unlock(&nfs_to_nfsd_lock);
+
+ /*
+ * If NFSD isn't available LOCALIO isn't possible.
+ * Use nfsd_open_local_fh symbol as the bellwether, if
+ * available then nfs_common has NFSD module reference
+ * on NFS's behalf and can safely call 'nfs_to' functions.
+ */
+ if (!symbol_request(nfsd_open_local_fh))
+ return false;
+ return true;
+}
+EXPORT_SYMBOL_GPL(nfs_to_nfsd_localio_ops_get);
+
+void nfs_to_nfsd_localio_ops_put(void)
+{
+ spin_lock(&nfs_to_nfsd_lock);
+
+ if (!refcount_dec_and_test(&nfs_to_ref))
+ goto out;
+
+ symbol_put(nfsd_open_local_fh);
+ nfs_to = NULL;
+out:
+ spin_unlock(&nfs_to_nfsd_lock);
+}
+EXPORT_SYMBOL_GPL(nfs_to_nfsd_localio_ops_put);
+
+static int __init nfslocalio_init(void)
+{
+ refcount_set(&nfs_to_ref, 0);
+
+ return 0;
+}
+
+static void __exit nfslocalio_exit(void)
+{
+}
+
+module_init(nfslocalio_init);
+module_exit(nfslocalio_exit);
diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
index 291e9c69cae4..291ad916d67a 100644
--- a/fs/nfsd/localio.c
+++ b/fs/nfsd/localio.c
@@ -114,7 +114,7 @@ static __be32 localio_proc_uuid_is_local(struct svc_rqst *rqstp)
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
nfs_uuid_is_local(&argp->uuid, &nn->local_clients,
- net, rqstp->rq_client, THIS_MODULE);
+ net, rqstp->rq_client);
return rpc_success;
}
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index b353abe00357..2e6b9107a7d1 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -35,7 +35,7 @@ typedef struct {
void nfs_uuid_begin(nfs_uuid_t *);
void nfs_uuid_end(nfs_uuid_t *);
void nfs_uuid_is_local(const uuid_t *, struct list_head *,
- struct net *, struct auth_domain *, struct module *);
+ struct net *, struct auth_domain *);
void nfs_uuid_invalidate_clients(struct list_head *list);
void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid);
@@ -58,9 +58,12 @@ struct nfsd_localio_operations {
struct file *(*nfsd_file_file)(struct nfsd_file *);
} ____cacheline_aligned;
-extern void nfsd_localio_ops_init(void);
extern const struct nfsd_localio_operations *nfs_to;
+extern void nfsd_localio_ops_init(void);
+bool nfs_to_nfsd_localio_ops_get(void);
+void nfs_to_nfsd_localio_ops_put(void);
+
struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *,
struct rpc_clnt *, const struct cred *,
const struct nfs_fh *, const fmode_t);
--
2.39.3
next prev parent reply other threads:[~2024-09-06 18:07 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-31 22:37 [PATCH v15 00/26] nfs/nfsd: add support for LOCALIO Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 01/26] nfs_common: factor out nfs_errtbl and nfs_stat_to_errno Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 02/26] nfs_common: factor out nfs4_errtbl and nfs4_stat_to_errno Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 03/26] nfs: factor out {encode,decode}_opaque_fixed to nfs_xdr.h Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 04/26] NFSD: Handle @rqstp == NULL in check_nfsd_access() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 05/26] NFSD: Refactor nfsd_setuser_and_check_port() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 06/26] NFSD: Avoid using rqstp->rq_vers in nfsd_set_fh_dentry() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 07/26] NFSD: Short-circuit fh_verify tracepoints for LOCALIO Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 08/26] nfsd: factor out __fh_verify to allow NULL rqstp to be passed Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 09/26] nfsd: add nfsd_file_acquire_local() Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 10/26] nfsd: add nfsd_serv_try_get and nfsd_serv_put Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 11/26] SUNRPC: remove call_allocate() BUG_ONs Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 12/26] SUNRPC: add svcauth_map_clnt_to_svc_cred_local Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 13/26] SUNRPC: replace program list with program array Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 14/26] nfs_common: add NFS LOCALIO auxiliary protocol enablement Mike Snitzer
2024-09-01 23:25 ` NeilBrown
2024-09-03 16:33 ` Mike Snitzer
2024-09-05 19:24 ` Anna Schumaker
2024-09-05 19:38 ` Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 15/26] nfs_common: prepare for the NFS client to use nfsd_file for LOCALIO Mike Snitzer
2024-09-01 23:37 ` NeilBrown
2024-08-31 22:37 ` [PATCH v15 16/26] nfsd: add LOCALIO support Mike Snitzer
2024-09-01 23:46 ` NeilBrown
2024-09-03 14:34 ` Chuck Lever
2024-09-03 14:40 ` Jeff Layton
2024-09-03 15:00 ` Mike Snitzer
2024-09-03 15:19 ` Jeff Layton
2024-09-03 15:29 ` Mike Snitzer
2024-09-03 15:59 ` Chuck Lever III
2024-09-03 16:09 ` Mike Snitzer
2024-09-03 17:07 ` Chuck Lever III
2024-09-03 22:31 ` NeilBrown
2024-09-04 5:01 ` NeilBrown
2024-09-04 13:47 ` Chuck Lever
2024-09-05 14:21 ` Mike Snitzer
2024-09-05 15:41 ` Chuck Lever III
2024-09-05 23:34 ` NeilBrown
2024-09-06 15:04 ` Mike Snitzer
2024-09-06 18:07 ` Mike Snitzer [this message]
2024-09-06 21:56 ` NeilBrown
2024-09-06 22:33 ` Chuck Lever III
2024-09-06 23:14 ` NeilBrown
2024-09-07 15:17 ` Mike Snitzer
2024-09-07 16:09 ` Chuck Lever III
2024-09-07 19:08 ` Mike Snitzer
2024-09-07 21:12 ` Chuck Lever III
2024-09-08 15:05 ` Chuck Lever III
2024-09-07 15:52 ` Mike Snitzer
2024-09-04 13:54 ` Jeff Layton
2024-09-04 13:56 ` Chuck Lever III
2024-08-31 22:37 ` [PATCH v15 17/26] nfsd: implement server support for NFS_LOCALIO_PROGRAM Mike Snitzer
2024-09-03 14:11 ` Chuck Lever
2024-08-31 22:37 ` [PATCH v15 18/26] nfs: pass struct nfsd_file to nfs_init_pgio and nfs_init_commit Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 19/26] nfs: add LOCALIO support Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 20/26] nfs: enable localio for non-pNFS IO Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 21/26] pnfs/flexfiles: enable localio support Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 22/26] nfs/localio: use dedicated workqueues for filesystem read and write Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 23/26] nfs: implement client support for NFS_LOCALIO_PROGRAM Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 24/26] nfs: add Documentation/filesystems/nfs/localio.rst Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 25/26] nfs: add FAQ section to Documentation/filesystems/nfs/localio.rst Mike Snitzer
2024-08-31 22:37 ` [PATCH v15 26/26] nfs: add "NFS Client and Server Interlock" section to localio.rst Mike Snitzer
2024-09-01 23:52 ` [PATCH v15 00/26] nfs/nfsd: add support for LOCALIO NeilBrown
2024-09-03 14:49 ` Jeff Layton
2024-09-06 19:31 ` Anna Schumaker
2024-09-06 20:34 ` Mike Snitzer
2024-09-06 21:09 ` Chuck Lever III
2024-09-10 16:45 ` Mike Snitzer
2024-09-10 19:14 ` Mike Snitzer
2024-09-10 19:24 ` Anna Schumaker
2024-09-10 20:31 ` Anna Schumaker
2024-09-10 22:11 ` Mike Snitzer
2024-09-11 17:51 ` Mike Snitzer
2024-09-11 18:48 ` Mike Snitzer
2024-09-13 18:12 ` Mike Snitzer
2024-09-11 0:43 ` NeilBrown
2024-09-11 16:03 ` Chuck Lever III
2024-09-12 23:31 ` NeilBrown
2024-09-12 23:42 ` Chuck Lever III
2024-09-13 12:27 ` Mike Snitzer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZttE4DKrqqVa0ACn@kernel.org \
--to=snitzer@kernel.org \
--cc=anna@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=jlayton@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
--cc=trondmy@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.