From: Mike Snitzer <snitzer@kernel.org>
To: linux-nfs@vger.kernel.org
Cc: Anna Schumaker <anna@kernel.org>,
Trond Myklebust <trondmy@hammerspace.com>,
Chuck Lever <chuck.lever@oracle.com>,
Jeff Layton <jlayton@kernel.org>, NeilBrown <neilb@suse.de>
Subject: [for-6.13 PATCH v2 08/15] nfsd: rename nfsd_serv_ prefixed methods and variables with nfsd_net_
Date: Wed, 13 Nov 2024 22:59:45 -0500 [thread overview]
Message-ID: <20241114035952.13889-9-snitzer@kernel.org> (raw)
In-Reply-To: <20241114035952.13889-1-snitzer@kernel.org>
Also update Documentation/filesystems/nfs/localio.rst accordingly
and reduce the technical documentation debt that was previously
captured in that document.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
Documentation/filesystems/nfs/localio.rst | 81 +++++++----------------
fs/nfs_common/nfslocalio.c | 10 ++-
fs/nfsd/filecache.c | 2 +-
fs/nfsd/localio.c | 4 +-
fs/nfsd/netns.h | 11 +--
fs/nfsd/nfssvc.c | 34 +++++-----
include/linux/nfslocalio.h | 12 ++--
7 files changed, 64 insertions(+), 90 deletions(-)
diff --git a/Documentation/filesystems/nfs/localio.rst b/Documentation/filesystems/nfs/localio.rst
index eb3dc764cfce8..9d991a958c81c 100644
--- a/Documentation/filesystems/nfs/localio.rst
+++ b/Documentation/filesystems/nfs/localio.rst
@@ -218,64 +218,30 @@ NFS Client and Server Interlock
===============================
LOCALIO provides the nfs_uuid_t object and associated interfaces to
-allow proper network namespace (net-ns) and NFSD object refcounting:
+allow proper network namespace (net-ns) and NFSD object refcounting.
- We don't want to keep a long-term counted reference on each NFSD's
- net-ns in the client because that prevents a server container from
- completely shutting down.
-
- So we avoid taking a reference at all and rely on the per-cpu
- reference to the server (detailed below) being sufficient to keep
- the net-ns active. This involves allowing the NFSD's net-ns exit
- code to iterate all active clients and clear their ->net pointers
- (which are needed to find the per-cpu-refcount for the nfsd_serv).
-
- Details:
-
- - Embed nfs_uuid_t in nfs_client. nfs_uuid_t provides a list_head
- that can be used to find the client. It does add the 16-byte
- uuid_t to nfs_client so it is bigger than needed (given that
- uuid_t is only used during the initial NFS client and server
- LOCALIO handshake to determine if they are local to each other).
- If that is really a problem we can find a fix.
-
- - When the nfs server confirms that the uuid_t is local, it moves
- the nfs_uuid_t onto a per-net-ns list in NFSD's nfsd_net.
-
- - When each server's net-ns is shutting down - in a "pre_exit"
- handler, all these nfs_uuid_t have their ->net cleared. There is
- an rcu_synchronize() call between pre_exit() handlers and exit()
- handlers so any caller that sees nfs_uuid_t ->net as not NULL can
- safely manage the per-cpu-refcount for nfsd_serv.
-
- - The client's nfs_uuid_t is passed to nfsd_open_local_fh() so it
- can safely dereference ->net in a private rcu_read_lock() section
- to allow safe access to the associated nfsd_net and nfsd_serv.
-
-So LOCALIO required the introduction and use of NFSD's percpu_ref to
-interlock nfsd_destroy_serv() and nfsd_open_local_fh(), to ensure each
-nn->nfsd_serv is not destroyed while in use by nfsd_open_local_fh(), and
+LOCALIO required the introduction and use of NFSD's percpu nfsd_net_ref
+to interlock nfsd_shutdown_net() and nfsd_open_local_fh(), to ensure
+each net-ns is not destroyed while in use by nfsd_open_local_fh(), and
warrants a more detailed explanation:
- nfsd_open_local_fh() uses nfsd_serv_try_get() before opening its
+ nfsd_open_local_fh() uses nfsd_net_try_get() before opening its
nfsd_file handle and then the caller (NFS client) must drop the
- reference for the nfsd_file and associated nn->nfsd_serv using
- nfs_file_put_local() once it has completed its IO.
+ reference for the nfsd_file and associated net-ns using
+ nfsd_file_put_local() once it has completed its IO.
This interlock working relies heavily on nfsd_open_local_fh() being
afforded the ability to safely deal with the possibility that the
NFSD's net-ns (and nfsd_net by association) may have been destroyed
- by nfsd_destroy_serv() via nfsd_shutdown_net() -- which is only
- possible given the nfs_uuid_t ->net pointer managemenet detailed
- above.
+ by nfsd_destroy_serv() via nfsd_shutdown_net().
-All told, this elaborate interlock of the NFS client and server has been
-verified to fix an easy to hit crash that would occur if an NFSD
-instance running in a container, with a LOCALIO client mounted, is
-shutdown. Upon restart of the container and associated NFSD the client
-would go on to crash due to NULL pointer dereference that occurred due
-to the LOCALIO client's attempting to nfsd_open_local_fh(), using
-nn->nfsd_serv, without having a proper reference on nn->nfsd_serv.
+This interlock of the NFS client and server has been verified to fix an
+easy to hit crash that would occur if an NFSD instance running in a
+container, with a LOCALIO client mounted, is shutdown. Upon restart of
+the container and associated NFSD, the client would go on to crash due
+to NULL pointer dereference that occurred due to the LOCALIO client's
+attempting to nfsd_open_local_fh() without having a proper reference on
+NFSD's net-ns.
NFS Client issues IO instead of Server
======================================
@@ -308,14 +274,17 @@ fs/nfs/localio.c:nfs_local_commit().
With normal NFS that makes use of RPC to issue IO to the server, if an
application uses O_DIRECT the NFS client will bypass the pagecache but
-the NFS server will not. Because the NFS server's use of buffered IO
-affords applications to be less precise with their alignment when
-issuing IO to the NFS client. LOCALIO can be configured to use O_DIRECT
-semantics by setting the 'localio_O_DIRECT_semantics' nfs module
+the NFS server will not. The NFS server's use of buffered IO affords
+applications to be less precise with their alignment when issuing IO to
+the NFS client. But if all applications properly align their IO, LOCALIO
+can be configured to use end-to-end O_DIRECT semantics from the NFS
+client to the underlying local filesystem, that it is sharing with
+the NFS server, by setting the 'localio_O_DIRECT_semantics' nfs module
parameter to Y, e.g.:
- echo Y > /sys/module/nfs/parameters/localio_O_DIRECT_semantics
-Once enabled, it will cause LOCALIO to use O_DIRECT semantics (this may
-cause IO to fail if applications do not properly align their IO).
+ echo Y > /sys/module/nfs/parameters/localio_O_DIRECT_semantics
+Once enabled, it will cause LOCALIO to use end-to-end O_DIRECT semantics
+(but again, this may cause IO to fail if applications do not properly
+align their IO).
Security
========
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index 35a2e48731df6..e75cd21f4c8bc 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -159,6 +159,10 @@ static void nfs_uuid_add_file(nfs_uuid_t *nfs_uuid, struct nfs_file_localio *nfl
spin_unlock(&nfs_uuid_lock);
}
+/*
+ * Caller is responsible for calling nfsd_net_put and
+ * nfsd_file_put (via nfs_to_nfsd_file_put_local).
+ */
struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
struct rpc_clnt *rpc_clnt, const struct cred *cred,
const struct nfs_fh *nfs_fh, struct nfs_file_localio *nfl,
@@ -171,7 +175,7 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
* Not running in nfsd context, so must safely get reference on nfsd_serv.
* But the server may already be shutting down, if so disallow new localio.
* uuid->net is NOT a counted reference, but rcu_read_lock() ensures that
- * if uuid->net is not NULL, then calling nfsd_serv_try_get() is safe
+ * if uuid->net is not NULL, then calling nfsd_net_try_get() is safe
* and if it succeeds we will have an implied reference to the net.
*
* Otherwise NFS may not have ref on NFSD and therefore cannot safely
@@ -179,12 +183,12 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
*/
rcu_read_lock();
net = rcu_dereference(uuid->net);
- if (!net || !nfs_to->nfsd_serv_try_get(net)) {
+ if (!net || !nfs_to->nfsd_net_try_get(net)) {
rcu_read_unlock();
return ERR_PTR(-ENXIO);
}
rcu_read_unlock();
- /* We have an implied reference to net thanks to nfsd_serv_try_get */
+ /* We have an implied reference to net thanks to nfsd_net_try_get */
localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
cred, nfs_fh, fmode);
if (IS_ERR(localio))
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 9a62b4da89bb7..fac98b2cb463a 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -391,7 +391,7 @@ nfsd_file_put(struct nfsd_file *nf)
}
/**
- * nfsd_file_put_local - put nfsd_file reference and arm nfsd_serv_put in caller
+ * nfsd_file_put_local - put nfsd_file reference and arm nfsd_net_put in caller
* @nf: nfsd_file of which to put the reference
*
* First save the associated net to return to caller, then put
diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
index 8beda4c851119..f9a91cd3b5ec7 100644
--- a/fs/nfsd/localio.c
+++ b/fs/nfsd/localio.c
@@ -25,8 +25,8 @@
#include "cache.h"
static const struct nfsd_localio_operations nfsd_localio_ops = {
- .nfsd_serv_try_get = nfsd_serv_try_get,
- .nfsd_serv_put = nfsd_serv_put,
+ .nfsd_net_try_get = nfsd_net_try_get,
+ .nfsd_net_put = nfsd_net_put,
.nfsd_open_local_fh = nfsd_open_local_fh,
.nfsd_file_put_local = nfsd_file_put_local,
.nfsd_file_get = nfsd_file_get,
diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 26f7b34d1a030..8faef59d71229 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -140,9 +140,10 @@ struct nfsd_net {
struct svc_info nfsd_info;
#define nfsd_serv nfsd_info.serv
- struct percpu_ref nfsd_serv_ref;
- struct completion nfsd_serv_confirm_done;
- struct completion nfsd_serv_free_done;
+
+ struct percpu_ref nfsd_net_ref;
+ struct completion nfsd_net_confirm_done;
+ struct completion nfsd_net_free_done;
/*
* clientid and stateid data for construction of net unique COPY
@@ -229,8 +230,8 @@ struct nfsd_net {
extern bool nfsd_support_version(int vers);
extern unsigned int nfsd_net_id;
-bool nfsd_serv_try_get(struct net *net);
-void nfsd_serv_put(struct net *net);
+bool nfsd_net_try_get(struct net *net);
+void nfsd_net_put(struct net *net);
void nfsd_copy_write_verifier(__be32 verf[2], struct nfsd_net *nn);
void nfsd_reset_write_verifier(struct nfsd_net *nn);
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 6ca5540424266..e937e2d0ce62c 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -214,32 +214,32 @@ int nfsd_minorversion(struct nfsd_net *nn, u32 minorversion, enum vers_op change
return 0;
}
-bool nfsd_serv_try_get(struct net *net) __must_hold(rcu)
+bool nfsd_net_try_get(struct net *net) __must_hold(rcu)
{
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
- return (nn && percpu_ref_tryget_live(&nn->nfsd_serv_ref));
+ return (nn && percpu_ref_tryget_live(&nn->nfsd_net_ref));
}
-void nfsd_serv_put(struct net *net) __must_hold(rcu)
+void nfsd_net_put(struct net *net) __must_hold(rcu)
{
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
- percpu_ref_put(&nn->nfsd_serv_ref);
+ percpu_ref_put(&nn->nfsd_net_ref);
}
-static void nfsd_serv_done(struct percpu_ref *ref)
+static void nfsd_net_done(struct percpu_ref *ref)
{
- struct nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_serv_ref);
+ struct nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_net_ref);
- complete(&nn->nfsd_serv_confirm_done);
+ complete(&nn->nfsd_net_confirm_done);
}
-static void nfsd_serv_free(struct percpu_ref *ref)
+static void nfsd_net_free(struct percpu_ref *ref)
{
- struct nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_serv_ref);
+ struct nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_net_ref);
- complete(&nn->nfsd_serv_free_done);
+ complete(&nn->nfsd_net_free_done);
}
/*
@@ -437,8 +437,8 @@ static void nfsd_shutdown_net(struct net *net)
if (!nn->nfsd_net_up)
return;
- percpu_ref_kill_and_confirm(&nn->nfsd_serv_ref, nfsd_serv_done);
- wait_for_completion(&nn->nfsd_serv_confirm_done);
+ percpu_ref_kill_and_confirm(&nn->nfsd_net_ref, nfsd_net_done);
+ wait_for_completion(&nn->nfsd_net_confirm_done);
nfsd_export_flush(net);
nfs4_state_shutdown_net(net);
@@ -449,8 +449,8 @@ static void nfsd_shutdown_net(struct net *net)
nn->lockd_up = false;
}
- wait_for_completion(&nn->nfsd_serv_free_done);
- percpu_ref_exit(&nn->nfsd_serv_ref);
+ wait_for_completion(&nn->nfsd_net_free_done);
+ percpu_ref_exit(&nn->nfsd_net_ref);
nn->nfsd_net_up = false;
nfsd_shutdown_generic();
@@ -654,12 +654,12 @@ int nfsd_create_serv(struct net *net)
if (nn->nfsd_serv)
return 0;
- error = percpu_ref_init(&nn->nfsd_serv_ref, nfsd_serv_free,
+ error = percpu_ref_init(&nn->nfsd_net_ref, nfsd_net_free,
0, GFP_KERNEL);
if (error)
return error;
- init_completion(&nn->nfsd_serv_free_done);
- init_completion(&nn->nfsd_serv_confirm_done);
+ init_completion(&nn->nfsd_net_free_done);
+ init_completion(&nn->nfsd_net_confirm_done);
if (nfsd_max_blksize == 0)
nfsd_max_blksize = nfsd_get_default_max_blksize();
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index 7cfc6720ed26d..aa2b5c6561ab0 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -52,8 +52,8 @@ nfsd_open_local_fh(struct net *, struct auth_domain *, struct rpc_clnt *,
void nfs_close_local_fh(struct nfs_file_localio *);
struct nfsd_localio_operations {
- bool (*nfsd_serv_try_get)(struct net *);
- void (*nfsd_serv_put)(struct net *);
+ bool (*nfsd_net_try_get)(struct net *);
+ void (*nfsd_net_put)(struct net *);
struct nfsd_file *(*nfsd_open_local_fh)(struct net *,
struct auth_domain *,
struct rpc_clnt *,
@@ -77,12 +77,12 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *,
static inline void nfs_to_nfsd_net_put(struct net *net)
{
/*
- * Once reference to nfsd_serv is dropped, NFSD could be
- * unloaded, so ensure safe return from nfsd_file_put_local()
- * by always taking RCU.
+ * Once reference to net (and associated nfsd_serv) is dropped, NFSD
+ * could be unloaded, so ensure safe return from nfsd_net_put() by
+ * always taking RCU.
*/
rcu_read_lock();
- nfs_to->nfsd_serv_put(net);
+ nfs_to->nfsd_net_put(net);
rcu_read_unlock();
}
--
2.44.0
next prev parent reply other threads:[~2024-11-14 4:00 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-14 3:59 [for-6.13 PATCH v2 00/15] nfs/nfsd: fixes and improvements for LOCALIO Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 01/15] nfs_common: must not hold RCU while calling nfsd_file_put_local Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 02/15] nfs/localio: add direct IO enablement with sync and async IO support Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 03/15] nfsd: add nfsd_file_{get,put} to 'nfs_to' nfsd_localio_operations Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 04/15] nfs_common: rename functions that invalidate LOCALIO nfs_clients Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 05/15] nfs_common: move localio_lock to new lock member of nfs_uuid_t Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 06/15] nfs: cache all open LOCALIO nfsd_file(s) in client Mike Snitzer
2024-11-15 3:12 ` NeilBrown
2024-11-15 16:49 ` Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 07/15] nfsd: update percpu_ref to manage references on nfsd_net Mike Snitzer
2024-11-14 3:59 ` Mike Snitzer [this message]
2024-11-14 3:59 ` [for-6.13 PATCH v2 09/15] nfsd: nfsd_file_acquire_local no longer returns GC'd nfsd_file Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 10/15] nfs_common: rename nfslocalio nfs_uuid_lock to nfs_uuids_lock Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 11/15] nfs_common: track all open nfsd_files per LOCALIO nfs_client Mike Snitzer
2024-11-14 18:53 ` Jeff Layton
2024-11-14 23:22 ` Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 12/15] nfs_common: add nfs_localio trace events Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 13/15] nfs/localio: remove redundant code and simplify LOCALIO enablement Mike Snitzer
2024-11-14 15:31 ` [for-6.13 PATCH v2-fixed " Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 14/15] nfs: probe for LOCALIO when v4 client reconnects to server Mike Snitzer
2024-11-14 3:59 ` [for-6.13 PATCH v2 15/15] nfs: probe for LOCALIO when v3 " Mike Snitzer
2024-11-15 3:55 ` NeilBrown
2024-11-14 16:14 ` (subset) [for-6.13 PATCH v2 00/15] nfs/nfsd: fixes and improvements for LOCALIO cel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241114035952.13889-9-snitzer@kernel.org \
--to=snitzer@kernel.org \
--cc=anna@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=jlayton@kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
--cc=trondmy@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox