* [PATCH 0/6] nfs_localio: fixes for races and errors from older compilers
@ 2025-04-30 4:01 NeilBrown
2025-04-30 4:01 ` [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio NeilBrown
` (5 more replies)
0 siblings, 6 replies; 13+ messages in thread
From: NeilBrown @ 2025-04-30 4:01 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, Mike Snitzer, Pali Rohár,
Vincent Mailhol
Cc: Chuck Lever, Jeff Layton, linux-nfs
Following the reports of older compilers complaining about the rcu
annotations in nfs-localio I reviewed the relevant code and found some
races and opportunities for simplification.
These patches address the various issues. They compile with old and new
versions of gcc and don't introduce new sprase warnings. I haven't
tested that localio, or even NFS, still work.
NeilBrown
[PATCH 1/6] nfs_localio: use cmpxchg() to install new
[PATCH 2/6] nfs_localio: always hold nfsd net ref with nfsd_file ref
[PATCH 3/6] nfs_localio: simplify interface to nfsd for getting
[PATCH 4/6] nfs_localio: change nfsd_file_put_local() to take a
[PATCH 5/6] nfs_localio: duplicate nfs_close_local_fh()
[PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio
2025-04-30 4:01 [PATCH 0/6] nfs_localio: fixes for races and errors from older compilers NeilBrown
@ 2025-04-30 4:01 ` NeilBrown
2025-04-30 22:21 ` Mike Snitzer
2025-04-30 4:01 ` [PATCH 2/6] nfs_localio: always hold nfsd net ref with nfsd_file ref NeilBrown
` (4 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: NeilBrown @ 2025-04-30 4:01 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, Mike Snitzer, Pali Rohár,
Vincent Mailhol
Cc: Chuck Lever, Jeff Layton, linux-nfs
Rather than using nfs_uuid.lock to protect installing
a new ro_file or rw_file, change to use cmpxchg().
Removing the file already uses xchg() so this improves symmetry
and also makes the code a little simpler.
Also remove the optimisation of not taking the lock, and not removing
the nfs_file_localio from the linked list, when both ->ro_file and
->rw_file are already NULL. Given that ->nfs_uuid was not NULL, it is
extremely unlikely that neither ->ro_file or ->rw_file is NULL so
this optimisation can be of little value and it complicates
understanding of the code - why can the list_del_init() be skipped?
Finally, move the assignment of NULL to ->nfs_uuid until after
the last action on the nfs_file_localio (the list_del_init). As soon as
this is NULL a racing nfs_close_local_fh() can bypass all the locking
and go on to free the nfs_file_localio, so we must be certain to be
finished with it first.
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/nfs/localio.c | 11 +++--------
fs/nfs_common/nfslocalio.c | 36 ++++++++++++++++--------------------
2 files changed, 19 insertions(+), 28 deletions(-)
diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index 5c21caeae075..3674dd86f095 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -279,14 +279,9 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
if (IS_ERR(new))
return NULL;
/* try to swap in the pointer */
- spin_lock(&clp->cl_uuid.lock);
- nf = rcu_dereference_protected(*pnf, 1);
- if (!nf) {
- nf = new;
- new = NULL;
- rcu_assign_pointer(*pnf, nf);
- }
- spin_unlock(&clp->cl_uuid.lock);
+ nf = unrcu_pointer(cmpxchg(pnf, NULL, RCU_INITIALIZER(new)));
+ if (!nf)
+ swap(nf, new);
rcu_read_lock();
}
nf = nfs_local_file_get(nf);
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index 6a0bdea6d644..d72eecb85ea9 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -285,28 +285,24 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
return;
}
- ro_nf = rcu_access_pointer(nfl->ro_file);
- rw_nf = rcu_access_pointer(nfl->rw_file);
- if (ro_nf || rw_nf) {
- spin_lock(&nfs_uuid->lock);
- if (ro_nf)
- ro_nf = rcu_dereference_protected(xchg(&nfl->ro_file, NULL), 1);
- if (rw_nf)
- rw_nf = rcu_dereference_protected(xchg(&nfl->rw_file, NULL), 1);
-
- /* Remove nfl from nfs_uuid->files list */
- RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
- list_del_init(&nfl->list);
- spin_unlock(&nfs_uuid->lock);
- rcu_read_unlock();
+ ro_nf = unrcu_pointer(xchg(&nfl->ro_file, NULL));
+ rw_nf = unrcu_pointer(xchg(&nfl->rw_file, NULL));
- if (ro_nf)
- nfs_to_nfsd_file_put_local(ro_nf);
- if (rw_nf)
- nfs_to_nfsd_file_put_local(rw_nf);
- return;
- }
+ spin_lock(&nfs_uuid->lock);
+ /* Remove nfl from nfs_uuid->files list */
+ list_del_init(&nfl->list);
+ spin_unlock(&nfs_uuid->lock);
rcu_read_unlock();
+ /* Now we can allow racing nfs_close_local_fh() to
+ * skip the locking.
+ */
+ RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
+
+ if (ro_nf)
+ nfs_to_nfsd_file_put_local(ro_nf);
+ if (rw_nf)
+ nfs_to_nfsd_file_put_local(rw_nf);
+ return;
}
EXPORT_SYMBOL_GPL(nfs_close_local_fh);
--
2.49.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/6] nfs_localio: always hold nfsd net ref with nfsd_file ref
2025-04-30 4:01 [PATCH 0/6] nfs_localio: fixes for races and errors from older compilers NeilBrown
2025-04-30 4:01 ` [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio NeilBrown
@ 2025-04-30 4:01 ` NeilBrown
2025-04-30 4:01 ` [PATCH 3/6] nfs_localio: simplify interface to nfsd for getting nfsd_file NeilBrown
` (3 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: NeilBrown @ 2025-04-30 4:01 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, Mike Snitzer, Pali Rohár,
Vincent Mailhol
Cc: Chuck Lever, Jeff Layton, linux-nfs
Having separate nfsd_file_put and nfsd_file_put_local in struct
nfsd_localio_operations doesn't make much sense. The difference is that
nfsd_file_put doesn't drop a reference to the nfs_net which is what
keeps nfsd from shutting down.
Currently, if nfsd tries to shutdown it will invalidate the files stored
in the list from the nfs_uuid and this will drop all references to the
nfsd net that the client holds. But the client could still hold some
references to nfsd_files for active IO. So nfsd might think is has
completely shut down local IO, but hasn't and has no way to wait for
those active IO requests to complete.
So this patch changes nfsd_file_get to nfsd_file_get_local and has it
increase the ref count on the nfsd net and it replaces all calls to
->nfsd_put_file to ->nfsd_put_file_local.
It also changes ->nfsd_open_local_fh to return with the refcount on the
net elevated precisely when a valid nfsd_file is returned.
This means that whenever the client holds a valid nfsd_file, there will
be an associated count on the nfsd net, and so the count can only reach
zero when all nfsd_files have been returned.
nfs_local_file_put() is changed to call nfs_to_nfsd_file_put_local()
instead of replacing calls to one with calls to the other because this
will help a later patch which changes nfs_to_nfsd_file_put_local() to
take an __rcu pointer while nfs_local_file_put() doesn't.
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/nfs/localio.c | 4 ++--
fs/nfs_common/nfslocalio.c | 5 ++---
fs/nfsd/filecache.c | 21 +++++++++++++++++++++
fs/nfsd/filecache.h | 1 +
fs/nfsd/localio.c | 9 +++++++--
include/linux/nfslocalio.h | 3 +--
6 files changed, 34 insertions(+), 9 deletions(-)
diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index 3674dd86f095..27d5fff9747b 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -209,12 +209,12 @@ EXPORT_SYMBOL_GPL(nfs_local_probe_async);
static inline struct nfsd_file *nfs_local_file_get(struct nfsd_file *nf)
{
- return nfs_to->nfsd_file_get(nf);
+ return nfs_to->nfsd_file_get_local(nf);
}
static inline void nfs_local_file_put(struct nfsd_file *nf)
{
- nfs_to->nfsd_file_put(nf);
+ nfs_to_nfsd_file_put_local(nf);
}
/*
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index d72eecb85ea9..7e73bbf593b9 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -262,9 +262,8 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
/* We have an implied reference to net thanks to nfsd_net_try_get */
localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
cred, nfs_fh, fmode);
- if (IS_ERR(localio))
- nfs_to_nfsd_net_put(net);
- else
+ nfs_to_nfsd_net_put(net);
+ if (!IS_ERR(localio))
nfs_uuid_add_file(uuid, nfl);
return localio;
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index ab85e6a2454f..473697278d8f 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -386,6 +386,27 @@ nfsd_file_put_local(struct nfsd_file *nf)
return net;
}
+/**
+ * nfsd_file_get_local - get nfsd_file reference and reference to net
+ * @nf: nfsd_file of which to put the reference
+ *
+ * Get reference to both
+ */
+struct nfsd_file *
+nfsd_file_get_local(struct nfsd_file *nf)
+{
+ struct net *net = nf->nf_net;
+
+ if (nfsd_net_try_get(net)) {
+ nf = nfsd_file_get(nf);
+ if (!nf)
+ nfsd_net_put(net);
+ } else {
+ nf = NULL;
+ }
+ return nf;
+}
+
/**
* nfsd_file_file - get the backing file of an nfsd_file
* @nf: nfsd_file of which to access the backing file.
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 5865f9c72712..cd02f91aaef1 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -63,6 +63,7 @@ int nfsd_file_cache_start_net(struct net *net);
void nfsd_file_cache_shutdown_net(struct net *net);
void nfsd_file_put(struct nfsd_file *nf);
struct net *nfsd_file_put_local(struct nfsd_file *nf);
+struct nfsd_file *nfsd_file_get_local(struct nfsd_file *nf);
struct nfsd_file *nfsd_file_get(struct nfsd_file *nf);
struct file *nfsd_file_file(struct nfsd_file *nf);
void nfsd_file_close_inode_sync(struct inode *inode);
diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
index 238647fa379e..2c0afd1067ea 100644
--- a/fs/nfsd/localio.c
+++ b/fs/nfsd/localio.c
@@ -29,8 +29,7 @@ static const struct nfsd_localio_operations nfsd_localio_ops = {
.nfsd_net_put = nfsd_net_put,
.nfsd_open_local_fh = nfsd_open_local_fh,
.nfsd_file_put_local = nfsd_file_put_local,
- .nfsd_file_get = nfsd_file_get,
- .nfsd_file_put = nfsd_file_put,
+ .nfsd_file_get_local = nfsd_file_get_local,
.nfsd_file_file = nfsd_file_file,
};
@@ -71,6 +70,9 @@ nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
if (nfs_fh->size > NFS4_FHSIZE)
return ERR_PTR(-EINVAL);
+ if (!nfsd_net_try_get(net))
+ return ERR_PTR(-ENXIO);
+
/* nfs_fh -> svc_fh */
fh_init(&fh, NFS4_FHSIZE);
fh.fh_handle.fh_size = nfs_fh->size;
@@ -92,6 +94,9 @@ nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
if (rq_cred.cr_group_info)
put_group_info(rq_cred.cr_group_info);
+ if (IS_ERR(localio))
+ nfsd_net_put(net);
+
return localio;
}
EXPORT_SYMBOL_GPL(nfsd_open_local_fh);
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index 9aa8a43843d7..c3f34bae60e1 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -66,8 +66,7 @@ struct nfsd_localio_operations {
const struct nfs_fh *,
const fmode_t);
struct net *(*nfsd_file_put_local)(struct nfsd_file *);
- struct nfsd_file *(*nfsd_file_get)(struct nfsd_file *);
- void (*nfsd_file_put)(struct nfsd_file *);
+ struct nfsd_file *(*nfsd_file_get_local)(struct nfsd_file *);
struct file *(*nfsd_file_file)(struct nfsd_file *);
} ____cacheline_aligned;
--
2.49.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 3/6] nfs_localio: simplify interface to nfsd for getting nfsd_file
2025-04-30 4:01 [PATCH 0/6] nfs_localio: fixes for races and errors from older compilers NeilBrown
2025-04-30 4:01 ` [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio NeilBrown
2025-04-30 4:01 ` [PATCH 2/6] nfs_localio: always hold nfsd net ref with nfsd_file ref NeilBrown
@ 2025-04-30 4:01 ` NeilBrown
2025-04-30 4:01 ` [PATCH 4/6] nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu struct NeilBrown
` (2 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: NeilBrown @ 2025-04-30 4:01 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, Mike Snitzer, Pali Rohár,
Vincent Mailhol
Cc: Chuck Lever, Jeff Layton, linux-nfs
The nfsd_localio_operations structure contains nfsd_file_get() to get a
reference to an nfsd_file. This is only used in one place, where
nfsd_open_local_fh() is also used.
This patch combines the two, calling nfsd_open_local_fh() passing a
pointer to where the nfsd_file pointer might be stored. If there is a
pointer there an nfsd_file_get() can get a reference, that reference is
returned. If not a new nfsd_file is acquired, stored at the pointer,
and returned.
We now get an extra reference *before* storing the new nfsd_file at the
given location. This avoids possible races with the nfsd_file being
freed before the final reference can be taken.
This patch moves the rcu_dereference() needed after fetching from
ro_file or rw_file into the nfsd code where the 'struct nfs_file' is
fully defined. This avoids an error reported by older versions of gcc
such as gcc-8 which complain about rcu_dereference() use in contexts
where the structure (which will supposedly be accessed) is not fully
defined.
Reported-by: Pali Rohár <pali@kernel.org>
Reported-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/nfs/localio.c | 31 ++++----------------
fs/nfs_common/nfslocalio.c | 3 +-
fs/nfsd/localio.c | 59 +++++++++++++++++++++++++++-----------
include/linux/nfslocalio.h | 6 ++--
4 files changed, 52 insertions(+), 47 deletions(-)
diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index 27d5fff9747b..030a54c8c9d8 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -207,11 +207,6 @@ void nfs_local_probe_async(struct nfs_client *clp)
}
EXPORT_SYMBOL_GPL(nfs_local_probe_async);
-static inline struct nfsd_file *nfs_local_file_get(struct nfsd_file *nf)
-{
- return nfs_to->nfsd_file_get_local(nf);
-}
-
static inline void nfs_local_file_put(struct nfsd_file *nf)
{
nfs_to_nfsd_file_put_local(nf);
@@ -226,12 +221,13 @@ static inline void nfs_local_file_put(struct nfsd_file *nf)
static struct nfsd_file *
__nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
struct nfs_fh *fh, struct nfs_file_localio *nfl,
+ struct nfsd_file __rcu **pnf,
const fmode_t mode)
{
struct nfsd_file *localio;
localio = nfs_open_local_fh(&clp->cl_uuid, clp->cl_rpcclient,
- cred, fh, nfl, mode);
+ cred, fh, nfl, pnf, mode);
if (IS_ERR(localio)) {
int status = PTR_ERR(localio);
trace_nfs_local_open_fh(fh, mode, status);
@@ -258,7 +254,7 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
struct nfs_fh *fh, struct nfs_file_localio *nfl,
const fmode_t mode)
{
- struct nfsd_file *nf, *new, __rcu **pnf;
+ struct nfsd_file *nf, __rcu **pnf;
if (!nfs_server_is_local(clp))
return NULL;
@@ -270,24 +266,9 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
else
pnf = &nfl->ro_file;
- new = NULL;
- rcu_read_lock();
- nf = rcu_dereference(*pnf);
- if (!nf) {
- rcu_read_unlock();
- new = __nfs_local_open_fh(clp, cred, fh, nfl, mode);
- if (IS_ERR(new))
- return NULL;
- /* try to swap in the pointer */
- nf = unrcu_pointer(cmpxchg(pnf, NULL, RCU_INITIALIZER(new)));
- if (!nf)
- swap(nf, new);
- rcu_read_lock();
- }
- nf = nfs_local_file_get(nf);
- rcu_read_unlock();
- if (new)
- nfs_to_nfsd_file_put_local(new);
+ nf = __nfs_local_open_fh(clp, cred, fh, nfl, pnf, mode);
+ if (IS_ERR(nf))
+ return NULL;
return nf;
}
EXPORT_SYMBOL_GPL(nfs_local_open_fh);
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index 7e73bbf593b9..d9e2f65912ef 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -237,6 +237,7 @@ static void nfs_uuid_add_file(nfs_uuid_t *nfs_uuid, struct nfs_file_localio *nfl
struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
struct rpc_clnt *rpc_clnt, const struct cred *cred,
const struct nfs_fh *nfs_fh, struct nfs_file_localio *nfl,
+ struct nfsd_file __rcu **pnf,
const fmode_t fmode)
{
struct net *net;
@@ -261,7 +262,7 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
rcu_read_unlock();
/* We have an implied reference to net thanks to nfsd_net_try_get */
localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
- cred, nfs_fh, fmode);
+ cred, nfs_fh, pnf, fmode);
nfs_to_nfsd_net_put(net);
if (!IS_ERR(localio))
nfs_uuid_add_file(uuid, nfl);
diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
index 2c0afd1067ea..cb9042761eaf 100644
--- a/fs/nfsd/localio.c
+++ b/fs/nfsd/localio.c
@@ -24,20 +24,6 @@
#include "filecache.h"
#include "cache.h"
-static const struct nfsd_localio_operations nfsd_localio_ops = {
- .nfsd_net_try_get = nfsd_net_try_get,
- .nfsd_net_put = nfsd_net_put,
- .nfsd_open_local_fh = nfsd_open_local_fh,
- .nfsd_file_put_local = nfsd_file_put_local,
- .nfsd_file_get_local = nfsd_file_get_local,
- .nfsd_file_file = nfsd_file_file,
-};
-
-void nfsd_localio_ops_init(void)
-{
- nfs_to = &nfsd_localio_ops;
-}
-
/**
* nfsd_open_local_fh - lookup a local filehandle @nfs_fh and map to nfsd_file
*
@@ -46,6 +32,7 @@ void nfsd_localio_ops_init(void)
* @rpc_clnt: rpc_clnt that the client established
* @cred: cred that the client established
* @nfs_fh: filehandle to lookup
+ * @nfp: place to find the nfsd_file, or store it if it was non-NULL
* @fmode: fmode_t to use for open
*
* This function maps a local fh to a path on a local filesystem.
@@ -56,10 +43,11 @@ void nfsd_localio_ops_init(void)
* set. Caller (NFS client) is responsible for calling nfsd_net_put and
* nfsd_file_put (via nfs_to_nfsd_file_put_local).
*/
-struct nfsd_file *
+static struct nfsd_file *
nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
struct rpc_clnt *rpc_clnt, const struct cred *cred,
- const struct nfs_fh *nfs_fh, const fmode_t fmode)
+ const struct nfs_fh *nfs_fh, struct nfsd_file __rcu **pnf,
+ const fmode_t fmode)
{
int mayflags = NFSD_MAY_LOCALIO;
struct svc_cred rq_cred;
@@ -73,6 +61,12 @@ nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
if (!nfsd_net_try_get(net))
return ERR_PTR(-ENXIO);
+ rcu_read_lock();
+ localio = nfsd_file_get(rcu_dereference(*pnf));
+ rcu_read_unlock();
+ if (localio)
+ return localio;
+
/* nfs_fh -> svc_fh */
fh_init(&fh, NFS4_FHSIZE);
fh.fh_handle.fh_size = nfs_fh->size;
@@ -94,12 +88,43 @@ nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
if (rq_cred.cr_group_info)
put_group_info(rq_cred.cr_group_info);
+ if (!IS_ERR(localio)) {
+ struct nfsd_file *new;
+ nfsd_file_get(localio);
+ again:
+ new = unrcu_pointer(cmpxchg(pnf, NULL, RCU_INITIALIZER(localio)));
+ if (new) {
+ /* Some other thread installed an nfsd_file */
+ if (nfsd_file_get(new) == NULL)
+ goto again;
+ /*
+ * Drop the ref we were going to install and the
+ * one we were going to return.
+ */
+ nfsd_file_put(localio);
+ nfsd_file_put(localio);
+ localio = new;
+ }
+ }
if (IS_ERR(localio))
nfsd_net_put(net);
return localio;
}
-EXPORT_SYMBOL_GPL(nfsd_open_local_fh);
+
+static const struct nfsd_localio_operations nfsd_localio_ops = {
+ .nfsd_net_try_get = nfsd_net_try_get,
+ .nfsd_net_put = nfsd_net_put,
+ .nfsd_open_local_fh = nfsd_open_local_fh,
+ .nfsd_file_put_local = nfsd_file_put_local,
+ .nfsd_file_get_local = nfsd_file_get_local,
+ .nfsd_file_file = nfsd_file_file,
+};
+
+void nfsd_localio_ops_init(void)
+{
+ nfs_to = &nfsd_localio_ops;
+}
/*
* UUID_IS_LOCAL XDR functions
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index c3f34bae60e1..e6cd6ec447f5 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -50,10 +50,6 @@ void nfs_localio_invalidate_clients(struct list_head *nn_local_clients,
spinlock_t *nn_local_clients_lock);
/* localio needs to map filehandle -> struct nfsd_file */
-extern struct nfsd_file *
-nfsd_open_local_fh(struct net *, struct auth_domain *, struct rpc_clnt *,
- const struct cred *, const struct nfs_fh *,
- const fmode_t) __must_hold(rcu);
void nfs_close_local_fh(struct nfs_file_localio *);
struct nfsd_localio_operations {
@@ -64,6 +60,7 @@ struct nfsd_localio_operations {
struct rpc_clnt *,
const struct cred *,
const struct nfs_fh *,
+ struct nfsd_file __rcu **pnf,
const fmode_t);
struct net *(*nfsd_file_put_local)(struct nfsd_file *);
struct nfsd_file *(*nfsd_file_get_local)(struct nfsd_file *);
@@ -76,6 +73,7 @@ extern const struct nfsd_localio_operations *nfs_to;
struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *,
struct rpc_clnt *, const struct cred *,
const struct nfs_fh *, struct nfs_file_localio *,
+ struct nfsd_file __rcu **pnf,
const fmode_t);
static inline void nfs_to_nfsd_net_put(struct net *net)
--
2.49.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 4/6] nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu struct
2025-04-30 4:01 [PATCH 0/6] nfs_localio: fixes for races and errors from older compilers NeilBrown
` (2 preceding siblings ...)
2025-04-30 4:01 ` [PATCH 3/6] nfs_localio: simplify interface to nfsd for getting nfsd_file NeilBrown
@ 2025-04-30 4:01 ` NeilBrown
2025-04-30 4:01 ` [PATCH 5/6] nfs_localio: duplicate nfs_close_local_fh() NeilBrown
2025-04-30 4:01 ` [PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh() NeilBrown
5 siblings, 0 replies; 13+ messages in thread
From: NeilBrown @ 2025-04-30 4:01 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, Mike Snitzer, Pali Rohár,
Vincent Mailhol
Cc: Chuck Lever, Jeff Layton, linux-nfs
Instead of calling rcu_dereference() before nfsd_file_put_local(), we
now pass a pointer to an __rcu value and call rcu_dereference() inside
that function.
Where rcu_dereference() is currently called the internals of "struct
nfsd_file" are not known and that causes older compilers such as gcc-8
to complain.
In some cases we have a __kernel (aka normal) pointer not an __rcu
pointer so we need to cast it to __rcu first. This is strictly a
weakening so no information is lost. Somewhat surprisingly, this cast
is accepted by gcc-8.
Also change nfs_to_nfsd_file_put_local() to handle receiving a NULL
pointer, as that makes some callers a bit simpler.
Reported-by: Pali Rohár <pali@kernel.org>
Reported-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/nfs/localio.c | 9 ++++++++-
fs/nfs_common/nfslocalio.c | 14 ++++++--------
fs/nfsd/filecache.c | 3 ++-
fs/nfsd/filecache.h | 2 +-
include/linux/nfslocalio.h | 11 ++++++++---
5 files changed, 25 insertions(+), 14 deletions(-)
diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index 030a54c8c9d8..157f5dd0ab22 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -207,8 +207,15 @@ void nfs_local_probe_async(struct nfs_client *clp)
}
EXPORT_SYMBOL_GPL(nfs_local_probe_async);
-static inline void nfs_local_file_put(struct nfsd_file *nf)
+static inline void nfs_local_file_put(struct nfsd_file *localio)
{
+ /* nfs_to_nfsd_file_put_local() expects an __rcu pointer
+ * but we have a __kernel pointer. It is always safe
+ * to cast a __kernel pointer to an __rcu pointer
+ * because the cast only weakens what is known about the pointer.
+ */
+ struct nfsd_file __rcu *nf = (struct nfsd_file __rcu*) localio;
+
nfs_to_nfsd_file_put_local(nf);
}
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index d9e2f65912ef..cbf3e38443f9 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -273,8 +273,8 @@ EXPORT_SYMBOL_GPL(nfs_open_local_fh);
void nfs_close_local_fh(struct nfs_file_localio *nfl)
{
- struct nfsd_file *ro_nf = NULL;
- struct nfsd_file *rw_nf = NULL;
+ struct nfsd_file __rcu *ro_nf;
+ struct nfsd_file __rcu *rw_nf;
nfs_uuid_t *nfs_uuid;
rcu_read_lock();
@@ -285,8 +285,8 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
return;
}
- ro_nf = unrcu_pointer(xchg(&nfl->ro_file, NULL));
- rw_nf = unrcu_pointer(xchg(&nfl->rw_file, NULL));
+ ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
+ rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
spin_lock(&nfs_uuid->lock);
/* Remove nfl from nfs_uuid->files list */
@@ -298,10 +298,8 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
*/
RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
- if (ro_nf)
- nfs_to_nfsd_file_put_local(ro_nf);
- if (rw_nf)
- nfs_to_nfsd_file_put_local(rw_nf);
+ nfs_to_nfsd_file_put_local(ro_nf);
+ nfs_to_nfsd_file_put_local(rw_nf);
return;
}
EXPORT_SYMBOL_GPL(nfs_close_local_fh);
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 473697278d8f..e1fdc8e2740f 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -378,8 +378,9 @@ nfsd_file_put(struct nfsd_file *nf)
* the reference of the nfsd_file.
*/
struct net *
-nfsd_file_put_local(struct nfsd_file *nf)
+nfsd_file_put_local(struct nfsd_file __rcu *nf_rcu)
{
+ struct nfsd_file *nf = rcu_dereference(nf_rcu);
struct net *net = nf->nf_net;
nfsd_file_put(nf);
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index cd02f91aaef1..e433ccbc31dc 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -62,7 +62,7 @@ void nfsd_file_cache_shutdown(void);
int nfsd_file_cache_start_net(struct net *net);
void nfsd_file_cache_shutdown_net(struct net *net);
void nfsd_file_put(struct nfsd_file *nf);
-struct net *nfsd_file_put_local(struct nfsd_file *nf);
+struct net *nfsd_file_put_local(struct nfsd_file __rcu *nf);
struct nfsd_file *nfsd_file_get_local(struct nfsd_file *nf);
struct nfsd_file *nfsd_file_get(struct nfsd_file *nf);
struct file *nfsd_file_file(struct nfsd_file *nf);
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index e6cd6ec447f5..e53fd61d0f8b 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -62,7 +62,7 @@ struct nfsd_localio_operations {
const struct nfs_fh *,
struct nfsd_file __rcu **pnf,
const fmode_t);
- struct net *(*nfsd_file_put_local)(struct nfsd_file *);
+ struct net *(*nfsd_file_put_local)(struct nfsd_file __rcu *);
struct nfsd_file *(*nfsd_file_get_local)(struct nfsd_file *);
struct file *(*nfsd_file_file)(struct nfsd_file *);
} ____cacheline_aligned;
@@ -88,14 +88,19 @@ static inline void nfs_to_nfsd_net_put(struct net *net)
rcu_read_unlock();
}
-static inline void nfs_to_nfsd_file_put_local(struct nfsd_file *localio)
+static inline void nfs_to_nfsd_file_put_local(struct nfsd_file __rcu *localio)
{
/*
* Must not hold RCU otherwise nfsd_file_put() can easily trigger:
* "Voluntary context switch within RCU read-side critical section!"
* by scheduling deep in underlying filesystem (e.g. XFS).
*/
- struct net *net = nfs_to->nfsd_file_put_local(localio);
+ struct net *net;
+
+ if (!localio)
+ return;
+
+ net = nfs_to->nfsd_file_put_local(localio);
nfs_to_nfsd_net_put(net);
}
--
2.49.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 5/6] nfs_localio: duplicate nfs_close_local_fh()
2025-04-30 4:01 [PATCH 0/6] nfs_localio: fixes for races and errors from older compilers NeilBrown
` (3 preceding siblings ...)
2025-04-30 4:01 ` [PATCH 4/6] nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu struct NeilBrown
@ 2025-04-30 4:01 ` NeilBrown
2025-04-30 4:01 ` [PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh() NeilBrown
5 siblings, 0 replies; 13+ messages in thread
From: NeilBrown @ 2025-04-30 4:01 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, Mike Snitzer, Pali Rohár,
Vincent Mailhol
Cc: Chuck Lever, Jeff Layton, linux-nfs
nfs_close_local_fh() is called from two different places for quite
different use case.
It is called from nfs_uuid_put() when the nfs_uuid is being detached -
possibly because the nfs server is not longer serving that filesystem.
In this case there will always be an nfs_uuid and so rcu_read_lock() is
not needed.
It is also called when the nfs_file_localio is no longer needed. In
this case there may not be an active nfs_uuid.
These two can race, and handling the race properly while avoiding
excessive locking will require different handling on each side.
This patch prepares the way by opencoding nfs_close_local_fh() into
nfs_uuid_put(), then simplifying the code there as befits the context.
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/nfs_common/nfslocalio.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index cbf3e38443f9..abf1591a3b7f 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -171,7 +171,24 @@ static bool nfs_uuid_put(nfs_uuid_t *nfs_uuid)
/* Walk list of files and ensure their last references dropped */
list_for_each_entry_safe(nfl, tmp, &local_files, list) {
- nfs_close_local_fh(nfl);
+ struct nfsd_file __rcu *ro_nf;
+ struct nfsd_file __rcu *rw_nf;
+
+ ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
+ rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
+
+ spin_lock(&nfs_uuid->lock);
+ /* Remove nfl from nfs_uuid->files list */
+ list_del_init(&nfl->list);
+ spin_unlock(&nfs_uuid->lock);
+ /* Now we can allow racing nfs_close_local_fh() to
+ * skip the locking.
+ */
+ RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
+
+ nfs_to_nfsd_file_put_local(ro_nf);
+ nfs_to_nfsd_file_put_local(rw_nf);
+
cond_resched();
}
--
2.49.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()
2025-04-30 4:01 [PATCH 0/6] nfs_localio: fixes for races and errors from older compilers NeilBrown
` (4 preceding siblings ...)
2025-04-30 4:01 ` [PATCH 5/6] nfs_localio: duplicate nfs_close_local_fh() NeilBrown
@ 2025-04-30 4:01 ` NeilBrown
2025-04-30 17:34 ` Mike Snitzer
5 siblings, 1 reply; 13+ messages in thread
From: NeilBrown @ 2025-04-30 4:01 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, Mike Snitzer, Pali Rohár,
Vincent Mailhol
Cc: Chuck Lever, Jeff Layton, linux-nfs
nfs_uuid_put() and nfs_close_local_fh() can race if a "struct
nfs_file_localio" is released at the same time that nfsd calls
nfs_localio_invalidate_clients().
It is important that neither of these functions completes after the
other has started looking at a given nfs_file_localio and before it
finishes.
If nfs_uuid_put() exits while nfs_close_local_fh() is closing ro_file
and rw_file it could return to __nfd_file_cache_purge() while some files
are still referenced so the purge may not succeed.
If nfs_close_local_fh() exits while nfsd_uuid_put() is still closing the
files then the "struct nfs_file_localio" could be freed while
nfsd_uuid_put() is still looking at it. This side is currently handled
by copying the pointers out of ro_file and rw_file before deleting from
the list in nfsd_uuid. We need to preserve this while ensuring that
nfsd_uuid_put() does wait for nfs_close_local_fh().
This patch use nfl->uuid and nfl->list to provide the required
interlock.
nfs_uuid_put() removes the nfs_file_localio from the list, then drops
locks and puts the two files, then reclaims the spinlock and sets
->nfs_uuid to NULL.
nfs_close_local_fh() operates in the reverse order, setting ->nfs_uuid
to NULL, then closing the files, then unlinking from the list.
If nfs_uuid_put() finds that ->nfs_uuid is already NULL, it waits for
the nfs_file_localio to be removed from the list. If
nfs_close_local_fh() find that it has already been unlinked it waits for
->nfs_uuid to become NULL. This ensure that one of the two tries to
close the files, but they each waits for the other.
As nfs_uuid_put() is making the list empty, change from a
list_for_each_safe loop to a while that always takes the first entry.
This makes the intent more clear.
Also don't move the list to a temporary local list as this would defeat
the guarantees required for the interlock.
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/nfs_common/nfslocalio.c | 83 +++++++++++++++++++++++++++-----------
1 file changed, 59 insertions(+), 24 deletions(-)
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index abf1591a3b7f..0aaf0abeb110 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -151,8 +151,7 @@ EXPORT_SYMBOL_GPL(nfs_localio_enable_client);
*/
static bool nfs_uuid_put(nfs_uuid_t *nfs_uuid)
{
- LIST_HEAD(local_files);
- struct nfs_file_localio *nfl, *tmp;
+ struct nfs_file_localio *nfl;
spin_lock(&nfs_uuid->lock);
if (unlikely(!rcu_access_pointer(nfs_uuid->net))) {
@@ -166,35 +165,47 @@ static bool nfs_uuid_put(nfs_uuid_t *nfs_uuid)
nfs_uuid->dom = NULL;
}
- list_splice_init(&nfs_uuid->files, &local_files);
- spin_unlock(&nfs_uuid->lock);
-
/* Walk list of files and ensure their last references dropped */
- list_for_each_entry_safe(nfl, tmp, &local_files, list) {
+
+ while ((nfl = list_first_entry_or_null(&nfs_uuid->files,
+ struct nfs_file_localio,
+ list)) != NULL) {
struct nfsd_file __rcu *ro_nf;
struct nfsd_file __rcu *rw_nf;
+ /* If nfs_uuid is already NULL, nfs_close_local_fh is
+ * closing and we must wait, else we unlink and close.
+ */
+ if (nfl->nfs_uuid == NULL) {
+ /* nfs_close_local_fh() is doing the
+ * close and we must wait. until it unlinks
+ */
+ wait_var_event_spinlock(nfl,
+ list_first_entry_or_null(
+ &nfs_uuid->files,
+ struct nfs_file_localio,
+ list) != nfl,
+ &nfs_uuid->lock);
+ continue;
+ }
+
ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
- spin_lock(&nfs_uuid->lock);
/* Remove nfl from nfs_uuid->files list */
list_del_init(&nfl->list);
spin_unlock(&nfs_uuid->lock);
+ nfs_to_nfsd_file_put_local(ro_nf);
+ nfs_to_nfsd_file_put_local(rw_nf);
+ cond_resched();
+ spin_lock(&nfs_uuid->lock);
/* Now we can allow racing nfs_close_local_fh() to
* skip the locking.
*/
RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
-
- nfs_to_nfsd_file_put_local(ro_nf);
- nfs_to_nfsd_file_put_local(rw_nf);
-
- cond_resched();
+ wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
}
- spin_lock(&nfs_uuid->lock);
- BUG_ON(!list_empty(&nfs_uuid->files));
-
/* Remove client from nn->local_clients */
if (nfs_uuid->list_lock) {
spin_lock(nfs_uuid->list_lock);
@@ -302,22 +313,46 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
return;
}
+ spin_lock(&nfs_uuid->lock);
+ if (!rcu_access_pointer(nfl->nfs_uuid)) {
+ /* nfs_uuid_put has finished here */
+ spin_unlock(&nfs_uuid->lock);
+ rcu_read_unlock();
+ return;
+ }
+ if (list_empty(&nfs_uuid->files)) {
+ /* nfs_uuid_put() has started closing files, wait for it
+ * to finished
+ */
+ spin_unlock(&nfs_uuid->lock);
+ rcu_read_unlock();
+ wait_var_event(&nfl->nfs_uuid,
+ rcu_access_pointer(nfl->nfs_uuid) == NULL);
+ return;
+ }
+ /* tell nfs_uuid_put() to wait for us */
+ RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
+ spin_unlock(&nfs_uuid->lock);
+ rcu_read_unlock();
+
ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
+ nfs_to_nfsd_file_put_local(ro_nf);
+ nfs_to_nfsd_file_put_local(rw_nf);
+ rcu_read_lock();
+ if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid)) {
+ rcu_read_unlock();
+ return;
+ }
+ /* Remove nfl from nfs_uuid->files list and signal nfs_uuid_put()
+ * that we are done.
+ */
spin_lock(&nfs_uuid->lock);
- /* Remove nfl from nfs_uuid->files list */
list_del_init(&nfl->list);
+ wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
spin_unlock(&nfs_uuid->lock);
rcu_read_unlock();
- /* Now we can allow racing nfs_close_local_fh() to
- * skip the locking.
- */
- RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
-
- nfs_to_nfsd_file_put_local(ro_nf);
- nfs_to_nfsd_file_put_local(rw_nf);
- return;
}
EXPORT_SYMBOL_GPL(nfs_close_local_fh);
--
2.49.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()
2025-04-30 4:01 ` [PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh() NeilBrown
@ 2025-04-30 17:34 ` Mike Snitzer
2025-04-30 20:20 ` Mike Snitzer
2025-04-30 22:12 ` NeilBrown
0 siblings, 2 replies; 13+ messages in thread
From: Mike Snitzer @ 2025-04-30 17:34 UTC (permalink / raw)
To: NeilBrown
Cc: Trond Myklebust, Anna Schumaker, Pali Rohár, Vincent Mailhol,
Chuck Lever, Jeff Layton, linux-nfs
Hi Neil,
With this change a simple write to a file (using pNFS flexfiles)
triggers this patch's nfs_close_local_fh WARN_ON:
[ 261.589009] ------------[ cut here ]------------
[ 261.589016] WARNING: CPU: 2 PID: 7220 at fs/nfs_common/nfslocalio.c:344 nfs_close_local_fh+0x1dd/0x1f0 [nfs_localio]
[ 261.589045] Modules linked in: tls nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfsidmap nfsd auth_rpcgss nfs_acl nft_nat nft_ct nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 veth bridge stp llc nfs lockd grace nfs_localio sunrpc netfs nf_tables nfnetlink overlay rfkill vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vfat fat intel_rapl_msr intel_rapl_common kvm_intel kvm nd_pmem dax_pmem nd_e820 pktcdvd libnvdimm irqbypass ppdev crct10dif_pclmul crc32_pclmul vmw_balloon i2c_piix4 ghash_clmulni_intel vmw_vmci pcspkr joydev i2c_smbus rapl parport_pc parport xfs sr_mod sd_mod cdrom sg ata_generic ata_piix mptspi crc32c_intel libata scsi_transport_spi serio_raw mptscsih vmxnet3 mptbase dm_mod fuse [last unloaded: sunrpc]
[ 261.589403] CPU: 2 UID: 0 PID: 7220 Comm: dd Kdump: loaded Tainted: G W ------- --- 6.12.24.0.hs.62.snitm+ #15
[ 261.589414] Tainted: [W]=WARN
[ 261.589417] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[ 261.589423] RIP: 0010:nfs_close_local_fh+0x1dd/0x1f0 [nfs_localio]
[ 261.589440] Code: e2 ba 02 00 00 00 48 89 ee 4c 89 e7 e8 9c 9f cb e1 48 8b 43 20 48 85 c0 75 e2 48 89 ee 4c 89 e7 e8 28 a7 cb e1 e9 6f ff ff ff <0f> 0b e8 bc 36 d0 e1 e9 63 ff ff ff e8 02 86 8a e2 66 90 90 90 90
[ 261.589447] RSP: 0018:ffffb0fac4d5bc98 EFLAGS: 00010282
[ 261.589455] RAX: 0000000000000000 RBX: ffff94354d5f0270 RCX: 0000000000000002
[ 261.589461] RDX: ffff943544085040 RSI: ffff9435455559e8 RDI: ffff943544085040
[ 261.589466] RBP: ffff943544199e80 R08: 58132c4e3594ffff R09: fffffffae10a4a38
[ 261.589472] R10: 0000000000000001 R11: 000000000000000f R12: ffff94354baae380
[ 261.589477] R13: ffff94354baae3c0 R14: ffff943546916d08 R15: ffff943546916cf0
[ 261.589499] FS: 00007fd6724ff580(0000) GS:ffff943773b00000(0000) knlGS:0000000000000000
[ 261.589512] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 261.589518] CR2: 00007fdade5900a8 CR3: 000000010f3a8002 CR4: 00000000001726f0
[ 261.589539] Call Trace:
[ 261.589545] <TASK>
[ 261.589556] ff_layout_free_mirror+0x78/0xc0 [nfs_layout_flexfiles]
[ 261.589575] ff_layout_free_layoutreturn+0x64/0x110 [nfs_layout_flexfiles]
[ 261.589594] pnfs_roc_release+0x7e/0x140 [nfsv4]
[ 261.589830] nfs4_free_closedata+0x6c/0x80 [nfsv4]
[ 261.590013] rpc_free_task+0x36/0x60 [sunrpc]
[ 261.590209] nfs4_do_close+0x269/0x330 [nfsv4]
[ 261.590398] __put_nfs_open_context+0xcb/0x150 [nfs]
[ 261.590546] nfs_file_release+0x39/0x60 [nfs]
[ 261.590700] __fput+0xdc/0x2a0
[ 261.590713] __x64_sys_close+0x3e/0x70
[ 261.590723] do_syscall_64+0x7b/0x160
[ 261.590736] ? clear_bhb_loop+0x45/0xa0
[ 261.590744] ? clear_bhb_loop+0x45/0xa0
[ 261.590769] ? clear_bhb_loop+0x45/0xa0
[ 261.590777] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 261.590789] RIP: 0033:0x7fd671f2ebf8
[ 261.590796] Code: 01 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 65 6b 2a 00 8b 00 85 c0 75 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 40 c3 0f 1f 80 00 00 00 00 53 89 fb 48 83 ec
[ 261.590803] RSP: 002b:00007ffd9826ea88 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 261.590812] RAX: ffffffffffffffda RBX: 0000558223012120 RCX: 00007fd671f2ebf8
[ 261.590819] RDX: 0000000000100000 RSI: 0000000000000000 RDI: 0000000000000001
[ 261.590826] RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000000
[ 261.590834] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000000001
[ 261.590843] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fd67232d000
[ 261.590858] </TASK>
[ 261.590864] ---[ end trace 0000000000000000 ]---
After this last patch is applied, in nfs_close_local_fh() you have:
/* tell nfs_uuid_put() to wait for us */
RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
spin_unlock(&nfs_uuid->lock);
rcu_read_unlock();
ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
nfs_to_nfsd_file_put_local(ro_nf);
nfs_to_nfsd_file_put_local(rw_nf);
rcu_read_lock();
if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid)) {
rcu_read_unlock();
return;
}
/* Remove nfl from nfs_uuid->files list and signal nfs_uuid_put()
* that we are done.
*/
spin_lock(&nfs_uuid->lock);
list_del_init(&nfl->list);
wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
spin_unlock(&nfs_uuid->lock);
rcu_read_unlock();
}
this is bogus right?:
RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
...
if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid))
not sure what you were trying to do, maybe stale debugging? [but you didn't test so... ;) ]
Thanks,
Mike
On Wed, Apr 30, 2025 at 02:01:16PM +1000, NeilBrown wrote:
> nfs_uuid_put() and nfs_close_local_fh() can race if a "struct
> nfs_file_localio" is released at the same time that nfsd calls
> nfs_localio_invalidate_clients().
>
> It is important that neither of these functions completes after the
> other has started looking at a given nfs_file_localio and before it
> finishes.
>
> If nfs_uuid_put() exits while nfs_close_local_fh() is closing ro_file
> and rw_file it could return to __nfd_file_cache_purge() while some files
> are still referenced so the purge may not succeed.
>
> If nfs_close_local_fh() exits while nfsd_uuid_put() is still closing the
> files then the "struct nfs_file_localio" could be freed while
> nfsd_uuid_put() is still looking at it. This side is currently handled
> by copying the pointers out of ro_file and rw_file before deleting from
> the list in nfsd_uuid. We need to preserve this while ensuring that
> nfsd_uuid_put() does wait for nfs_close_local_fh().
>
> This patch use nfl->uuid and nfl->list to provide the required
> interlock.
>
> nfs_uuid_put() removes the nfs_file_localio from the list, then drops
> locks and puts the two files, then reclaims the spinlock and sets
> ->nfs_uuid to NULL.
>
> nfs_close_local_fh() operates in the reverse order, setting ->nfs_uuid
> to NULL, then closing the files, then unlinking from the list.
>
> If nfs_uuid_put() finds that ->nfs_uuid is already NULL, it waits for
> the nfs_file_localio to be removed from the list. If
> nfs_close_local_fh() find that it has already been unlinked it waits for
> ->nfs_uuid to become NULL. This ensure that one of the two tries to
> close the files, but they each waits for the other.
>
> As nfs_uuid_put() is making the list empty, change from a
> list_for_each_safe loop to a while that always takes the first entry.
> This makes the intent more clear.
> Also don't move the list to a temporary local list as this would defeat
> the guarantees required for the interlock.
>
> Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
> fs/nfs_common/nfslocalio.c | 83 +++++++++++++++++++++++++++-----------
> 1 file changed, 59 insertions(+), 24 deletions(-)
>
> diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
> index abf1591a3b7f..0aaf0abeb110 100644
> --- a/fs/nfs_common/nfslocalio.c
> +++ b/fs/nfs_common/nfslocalio.c
> @@ -151,8 +151,7 @@ EXPORT_SYMBOL_GPL(nfs_localio_enable_client);
> */
> static bool nfs_uuid_put(nfs_uuid_t *nfs_uuid)
> {
> - LIST_HEAD(local_files);
> - struct nfs_file_localio *nfl, *tmp;
> + struct nfs_file_localio *nfl;
>
> spin_lock(&nfs_uuid->lock);
> if (unlikely(!rcu_access_pointer(nfs_uuid->net))) {
> @@ -166,35 +165,47 @@ static bool nfs_uuid_put(nfs_uuid_t *nfs_uuid)
> nfs_uuid->dom = NULL;
> }
>
> - list_splice_init(&nfs_uuid->files, &local_files);
> - spin_unlock(&nfs_uuid->lock);
> -
> /* Walk list of files and ensure their last references dropped */
> - list_for_each_entry_safe(nfl, tmp, &local_files, list) {
> +
> + while ((nfl = list_first_entry_or_null(&nfs_uuid->files,
> + struct nfs_file_localio,
> + list)) != NULL) {
> struct nfsd_file __rcu *ro_nf;
> struct nfsd_file __rcu *rw_nf;
>
> + /* If nfs_uuid is already NULL, nfs_close_local_fh is
> + * closing and we must wait, else we unlink and close.
> + */
> + if (nfl->nfs_uuid == NULL) {
> + /* nfs_close_local_fh() is doing the
> + * close and we must wait. until it unlinks
> + */
> + wait_var_event_spinlock(nfl,
> + list_first_entry_or_null(
> + &nfs_uuid->files,
> + struct nfs_file_localio,
> + list) != nfl,
> + &nfs_uuid->lock);
> + continue;
> + }
> +
> ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
> rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
>
> - spin_lock(&nfs_uuid->lock);
> /* Remove nfl from nfs_uuid->files list */
> list_del_init(&nfl->list);
> spin_unlock(&nfs_uuid->lock);
> + nfs_to_nfsd_file_put_local(ro_nf);
> + nfs_to_nfsd_file_put_local(rw_nf);
> + cond_resched();
> + spin_lock(&nfs_uuid->lock);
> /* Now we can allow racing nfs_close_local_fh() to
> * skip the locking.
> */
> RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> -
> - nfs_to_nfsd_file_put_local(ro_nf);
> - nfs_to_nfsd_file_put_local(rw_nf);
> -
> - cond_resched();
> + wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
> }
>
> - spin_lock(&nfs_uuid->lock);
> - BUG_ON(!list_empty(&nfs_uuid->files));
> -
> /* Remove client from nn->local_clients */
> if (nfs_uuid->list_lock) {
> spin_lock(nfs_uuid->list_lock);
> @@ -302,22 +313,46 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
> return;
> }
>
> + spin_lock(&nfs_uuid->lock);
> + if (!rcu_access_pointer(nfl->nfs_uuid)) {
> + /* nfs_uuid_put has finished here */
> + spin_unlock(&nfs_uuid->lock);
> + rcu_read_unlock();
> + return;
> + }
> + if (list_empty(&nfs_uuid->files)) {
> + /* nfs_uuid_put() has started closing files, wait for it
> + * to finished
> + */
> + spin_unlock(&nfs_uuid->lock);
> + rcu_read_unlock();
> + wait_var_event(&nfl->nfs_uuid,
> + rcu_access_pointer(nfl->nfs_uuid) == NULL);
> + return;
> + }
> + /* tell nfs_uuid_put() to wait for us */
> + RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> + spin_unlock(&nfs_uuid->lock);
> + rcu_read_unlock();
> +
> ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
> rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
> + nfs_to_nfsd_file_put_local(ro_nf);
> + nfs_to_nfsd_file_put_local(rw_nf);
>
> + rcu_read_lock();
> + if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid)) {
> + rcu_read_unlock();
> + return;
> + }
> + /* Remove nfl from nfs_uuid->files list and signal nfs_uuid_put()
> + * that we are done.
> + */
> spin_lock(&nfs_uuid->lock);
> - /* Remove nfl from nfs_uuid->files list */
> list_del_init(&nfl->list);
> + wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
> spin_unlock(&nfs_uuid->lock);
> rcu_read_unlock();
> - /* Now we can allow racing nfs_close_local_fh() to
> - * skip the locking.
> - */
> - RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> -
> - nfs_to_nfsd_file_put_local(ro_nf);
> - nfs_to_nfsd_file_put_local(rw_nf);
> - return;
> }
> EXPORT_SYMBOL_GPL(nfs_close_local_fh);
>
> --
> 2.49.0
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()
2025-04-30 17:34 ` Mike Snitzer
@ 2025-04-30 20:20 ` Mike Snitzer
2025-04-30 22:16 ` NeilBrown
2025-04-30 22:12 ` NeilBrown
1 sibling, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2025-04-30 20:20 UTC (permalink / raw)
To: NeilBrown
Cc: Trond Myklebust, Anna Schumaker, Pali Rohár, Vincent Mailhol,
Chuck Lever, Jeff Layton, linux-nfs
FYI, If I remove that WARN_ON I then get this when tearing down NFSD
within a container:
[ 295.192357] percpu ref (nfsd_net_free [nfsd]) <= 0 (0) after switching to atomic
On Wed, Apr 30, 2025 at 01:34:03PM -0400, Mike Snitzer wrote:
> Hi Neil,
>
> With this change a simple write to a file (using pNFS flexfiles)
> triggers this patch's nfs_close_local_fh WARN_ON:
>
> [ 261.589009] ------------[ cut here ]------------
> [ 261.589016] WARNING: CPU: 2 PID: 7220 at fs/nfs_common/nfslocalio.c:344 nfs_close_local_fh+0x1dd/0x1f0 [nfs_localio]
> [ 261.589045] Modules linked in: tls nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfsidmap nfsd auth_rpcgss nfs_acl nft_nat nft_ct nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 veth bridge stp llc nfs lockd grace nfs_localio sunrpc netfs nf_tables nfnetlink overlay rfkill vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vfat fat intel_rapl_msr intel_rapl_common kvm_intel kvm nd_pmem dax_pmem nd_e820 pktcdvd libnvdimm irqbypass ppdev crct10dif_pclmul crc32_pclmul vmw_balloon i2c_piix4 ghash_clmulni_intel vmw_vmci pcspkr joydev i2c_smbus rapl parport_pc parport xfs sr_mod sd_mod cdrom sg ata_generic ata_piix mptspi crc32c_intel libata scsi_transport_spi serio_raw mptscsih vmxnet3 mptbase dm_mod fuse [last unloaded: sunrpc]
> [ 261.589403] CPU: 2 UID: 0 PID: 7220 Comm: dd Kdump: loaded Tainted: G W ------- --- 6.12.24.0.hs.62.snitm+ #15
> [ 261.589414] Tainted: [W]=WARN
> [ 261.589417] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
> [ 261.589423] RIP: 0010:nfs_close_local_fh+0x1dd/0x1f0 [nfs_localio]
> [ 261.589440] Code: e2 ba 02 00 00 00 48 89 ee 4c 89 e7 e8 9c 9f cb e1 48 8b 43 20 48 85 c0 75 e2 48 89 ee 4c 89 e7 e8 28 a7 cb e1 e9 6f ff ff ff <0f> 0b e8 bc 36 d0 e1 e9 63 ff ff ff e8 02 86 8a e2 66 90 90 90 90
> [ 261.589447] RSP: 0018:ffffb0fac4d5bc98 EFLAGS: 00010282
> [ 261.589455] RAX: 0000000000000000 RBX: ffff94354d5f0270 RCX: 0000000000000002
> [ 261.589461] RDX: ffff943544085040 RSI: ffff9435455559e8 RDI: ffff943544085040
> [ 261.589466] RBP: ffff943544199e80 R08: 58132c4e3594ffff R09: fffffffae10a4a38
> [ 261.589472] R10: 0000000000000001 R11: 000000000000000f R12: ffff94354baae380
> [ 261.589477] R13: ffff94354baae3c0 R14: ffff943546916d08 R15: ffff943546916cf0
> [ 261.589499] FS: 00007fd6724ff580(0000) GS:ffff943773b00000(0000) knlGS:0000000000000000
> [ 261.589512] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 261.589518] CR2: 00007fdade5900a8 CR3: 000000010f3a8002 CR4: 00000000001726f0
> [ 261.589539] Call Trace:
> [ 261.589545] <TASK>
> [ 261.589556] ff_layout_free_mirror+0x78/0xc0 [nfs_layout_flexfiles]
> [ 261.589575] ff_layout_free_layoutreturn+0x64/0x110 [nfs_layout_flexfiles]
> [ 261.589594] pnfs_roc_release+0x7e/0x140 [nfsv4]
> [ 261.589830] nfs4_free_closedata+0x6c/0x80 [nfsv4]
> [ 261.590013] rpc_free_task+0x36/0x60 [sunrpc]
> [ 261.590209] nfs4_do_close+0x269/0x330 [nfsv4]
> [ 261.590398] __put_nfs_open_context+0xcb/0x150 [nfs]
> [ 261.590546] nfs_file_release+0x39/0x60 [nfs]
> [ 261.590700] __fput+0xdc/0x2a0
> [ 261.590713] __x64_sys_close+0x3e/0x70
> [ 261.590723] do_syscall_64+0x7b/0x160
> [ 261.590736] ? clear_bhb_loop+0x45/0xa0
> [ 261.590744] ? clear_bhb_loop+0x45/0xa0
> [ 261.590769] ? clear_bhb_loop+0x45/0xa0
> [ 261.590777] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 261.590789] RIP: 0033:0x7fd671f2ebf8
> [ 261.590796] Code: 01 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 65 6b 2a 00 8b 00 85 c0 75 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 40 c3 0f 1f 80 00 00 00 00 53 89 fb 48 83 ec
> [ 261.590803] RSP: 002b:00007ffd9826ea88 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
> [ 261.590812] RAX: ffffffffffffffda RBX: 0000558223012120 RCX: 00007fd671f2ebf8
> [ 261.590819] RDX: 0000000000100000 RSI: 0000000000000000 RDI: 0000000000000001
> [ 261.590826] RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000000
> [ 261.590834] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000000001
> [ 261.590843] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fd67232d000
> [ 261.590858] </TASK>
> [ 261.590864] ---[ end trace 0000000000000000 ]---
>
> After this last patch is applied, in nfs_close_local_fh() you have:
>
> /* tell nfs_uuid_put() to wait for us */
> RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> spin_unlock(&nfs_uuid->lock);
> rcu_read_unlock();
>
> ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
> rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
> nfs_to_nfsd_file_put_local(ro_nf);
> nfs_to_nfsd_file_put_local(rw_nf);
>
> rcu_read_lock();
> if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid)) {
> rcu_read_unlock();
> return;
> }
> /* Remove nfl from nfs_uuid->files list and signal nfs_uuid_put()
> * that we are done.
> */
> spin_lock(&nfs_uuid->lock);
> list_del_init(&nfl->list);
> wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
> spin_unlock(&nfs_uuid->lock);
> rcu_read_unlock();
> }
>
> this is bogus right?:
>
> RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> ...
> if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid))
>
> not sure what you were trying to do, maybe stale debugging? [but you didn't test so... ;) ]
>
> Thanks,
> Mike
>
>
> On Wed, Apr 30, 2025 at 02:01:16PM +1000, NeilBrown wrote:
> > nfs_uuid_put() and nfs_close_local_fh() can race if a "struct
> > nfs_file_localio" is released at the same time that nfsd calls
> > nfs_localio_invalidate_clients().
> >
> > It is important that neither of these functions completes after the
> > other has started looking at a given nfs_file_localio and before it
> > finishes.
> >
> > If nfs_uuid_put() exits while nfs_close_local_fh() is closing ro_file
> > and rw_file it could return to __nfd_file_cache_purge() while some files
> > are still referenced so the purge may not succeed.
> >
> > If nfs_close_local_fh() exits while nfsd_uuid_put() is still closing the
> > files then the "struct nfs_file_localio" could be freed while
> > nfsd_uuid_put() is still looking at it. This side is currently handled
> > by copying the pointers out of ro_file and rw_file before deleting from
> > the list in nfsd_uuid. We need to preserve this while ensuring that
> > nfsd_uuid_put() does wait for nfs_close_local_fh().
> >
> > This patch use nfl->uuid and nfl->list to provide the required
> > interlock.
> >
> > nfs_uuid_put() removes the nfs_file_localio from the list, then drops
> > locks and puts the two files, then reclaims the spinlock and sets
> > ->nfs_uuid to NULL.
> >
> > nfs_close_local_fh() operates in the reverse order, setting ->nfs_uuid
> > to NULL, then closing the files, then unlinking from the list.
> >
> > If nfs_uuid_put() finds that ->nfs_uuid is already NULL, it waits for
> > the nfs_file_localio to be removed from the list. If
> > nfs_close_local_fh() find that it has already been unlinked it waits for
> > ->nfs_uuid to become NULL. This ensure that one of the two tries to
> > close the files, but they each waits for the other.
> >
> > As nfs_uuid_put() is making the list empty, change from a
> > list_for_each_safe loop to a while that always takes the first entry.
> > This makes the intent more clear.
> > Also don't move the list to a temporary local list as this would defeat
> > the guarantees required for the interlock.
> >
> > Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
> > Signed-off-by: NeilBrown <neil@brown.name>
> > ---
> > fs/nfs_common/nfslocalio.c | 83 +++++++++++++++++++++++++++-----------
> > 1 file changed, 59 insertions(+), 24 deletions(-)
> >
> > diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
> > index abf1591a3b7f..0aaf0abeb110 100644
> > --- a/fs/nfs_common/nfslocalio.c
> > +++ b/fs/nfs_common/nfslocalio.c
> > @@ -151,8 +151,7 @@ EXPORT_SYMBOL_GPL(nfs_localio_enable_client);
> > */
> > static bool nfs_uuid_put(nfs_uuid_t *nfs_uuid)
> > {
> > - LIST_HEAD(local_files);
> > - struct nfs_file_localio *nfl, *tmp;
> > + struct nfs_file_localio *nfl;
> >
> > spin_lock(&nfs_uuid->lock);
> > if (unlikely(!rcu_access_pointer(nfs_uuid->net))) {
> > @@ -166,35 +165,47 @@ static bool nfs_uuid_put(nfs_uuid_t *nfs_uuid)
> > nfs_uuid->dom = NULL;
> > }
> >
> > - list_splice_init(&nfs_uuid->files, &local_files);
> > - spin_unlock(&nfs_uuid->lock);
> > -
> > /* Walk list of files and ensure their last references dropped */
> > - list_for_each_entry_safe(nfl, tmp, &local_files, list) {
> > +
> > + while ((nfl = list_first_entry_or_null(&nfs_uuid->files,
> > + struct nfs_file_localio,
> > + list)) != NULL) {
> > struct nfsd_file __rcu *ro_nf;
> > struct nfsd_file __rcu *rw_nf;
> >
> > + /* If nfs_uuid is already NULL, nfs_close_local_fh is
> > + * closing and we must wait, else we unlink and close.
> > + */
> > + if (nfl->nfs_uuid == NULL) {
> > + /* nfs_close_local_fh() is doing the
> > + * close and we must wait. until it unlinks
> > + */
> > + wait_var_event_spinlock(nfl,
> > + list_first_entry_or_null(
> > + &nfs_uuid->files,
> > + struct nfs_file_localio,
> > + list) != nfl,
> > + &nfs_uuid->lock);
> > + continue;
> > + }
> > +
> > ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
> > rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
> >
> > - spin_lock(&nfs_uuid->lock);
> > /* Remove nfl from nfs_uuid->files list */
> > list_del_init(&nfl->list);
> > spin_unlock(&nfs_uuid->lock);
> > + nfs_to_nfsd_file_put_local(ro_nf);
> > + nfs_to_nfsd_file_put_local(rw_nf);
> > + cond_resched();
> > + spin_lock(&nfs_uuid->lock);
> > /* Now we can allow racing nfs_close_local_fh() to
> > * skip the locking.
> > */
> > RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> > -
> > - nfs_to_nfsd_file_put_local(ro_nf);
> > - nfs_to_nfsd_file_put_local(rw_nf);
> > -
> > - cond_resched();
> > + wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
> > }
> >
> > - spin_lock(&nfs_uuid->lock);
> > - BUG_ON(!list_empty(&nfs_uuid->files));
> > -
> > /* Remove client from nn->local_clients */
> > if (nfs_uuid->list_lock) {
> > spin_lock(nfs_uuid->list_lock);
> > @@ -302,22 +313,46 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
> > return;
> > }
> >
> > + spin_lock(&nfs_uuid->lock);
> > + if (!rcu_access_pointer(nfl->nfs_uuid)) {
> > + /* nfs_uuid_put has finished here */
> > + spin_unlock(&nfs_uuid->lock);
> > + rcu_read_unlock();
> > + return;
> > + }
> > + if (list_empty(&nfs_uuid->files)) {
> > + /* nfs_uuid_put() has started closing files, wait for it
> > + * to finished
> > + */
> > + spin_unlock(&nfs_uuid->lock);
> > + rcu_read_unlock();
> > + wait_var_event(&nfl->nfs_uuid,
> > + rcu_access_pointer(nfl->nfs_uuid) == NULL);
> > + return;
> > + }
> > + /* tell nfs_uuid_put() to wait for us */
> > + RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> > + spin_unlock(&nfs_uuid->lock);
> > + rcu_read_unlock();
> > +
> > ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
> > rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
> > + nfs_to_nfsd_file_put_local(ro_nf);
> > + nfs_to_nfsd_file_put_local(rw_nf);
> >
> > + rcu_read_lock();
> > + if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid)) {
> > + rcu_read_unlock();
> > + return;
> > + }
> > + /* Remove nfl from nfs_uuid->files list and signal nfs_uuid_put()
> > + * that we are done.
> > + */
> > spin_lock(&nfs_uuid->lock);
> > - /* Remove nfl from nfs_uuid->files list */
> > list_del_init(&nfl->list);
> > + wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
> > spin_unlock(&nfs_uuid->lock);
> > rcu_read_unlock();
> > - /* Now we can allow racing nfs_close_local_fh() to
> > - * skip the locking.
> > - */
> > - RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> > -
> > - nfs_to_nfsd_file_put_local(ro_nf);
> > - nfs_to_nfsd_file_put_local(rw_nf);
> > - return;
> > }
> > EXPORT_SYMBOL_GPL(nfs_close_local_fh);
> >
> > --
> > 2.49.0
> >
> >
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()
2025-04-30 17:34 ` Mike Snitzer
2025-04-30 20:20 ` Mike Snitzer
@ 2025-04-30 22:12 ` NeilBrown
1 sibling, 0 replies; 13+ messages in thread
From: NeilBrown @ 2025-04-30 22:12 UTC (permalink / raw)
To: Mike Snitzer
Cc: Trond Myklebust, Anna Schumaker, Pali Rohár, Vincent Mailhol,
Chuck Lever, Jeff Layton, linux-nfs
On Thu, 01 May 2025, Mike Snitzer wrote:
> Hi Neil,
>
> With this change a simple write to a file (using pNFS flexfiles)
> triggers this patch's nfs_close_local_fh WARN_ON:
>
> [ 261.589009] ------------[ cut here ]------------
> [ 261.589016] WARNING: CPU: 2 PID: 7220 at fs/nfs_common/nfslocalio.c:344 nfs_close_local_fh+0x1dd/0x1f0 [nfs_localio]
> [ 261.589045] Modules linked in: tls nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfsidmap nfsd auth_rpcgss nfs_acl nft_nat nft_ct nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 veth bridge stp llc nfs lockd grace nfs_localio sunrpc netfs nf_tables nfnetlink overlay rfkill vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vfat fat intel_rapl_msr intel_rapl_common kvm_intel kvm nd_pmem dax_pmem nd_e820 pktcdvd libnvdimm irqbypass ppdev crct10dif_pclmul crc32_pclmul vmw_balloon i2c_piix4 ghash_clmulni_intel vmw_vmci pcspkr joydev i2c_smbus rapl parport_pc parport xfs sr_mod sd_mod cdrom sg ata_generic ata_piix mptspi crc32c_intel libata scsi_transport_spi serio_raw mptscsih vmxnet3 mptbase dm_mod fuse [last unloaded: sunrpc]
> [ 261.589403] CPU: 2 UID: 0 PID: 7220 Comm: dd Kdump: loaded Tainted: G W ------- --- 6.12.24.0.hs.62.snitm+ #15
> [ 261.589414] Tainted: [W]=WARN
> [ 261.589417] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
> [ 261.589423] RIP: 0010:nfs_close_local_fh+0x1dd/0x1f0 [nfs_localio]
> [ 261.589440] Code: e2 ba 02 00 00 00 48 89 ee 4c 89 e7 e8 9c 9f cb e1 48 8b 43 20 48 85 c0 75 e2 48 89 ee 4c 89 e7 e8 28 a7 cb e1 e9 6f ff ff ff <0f> 0b e8 bc 36 d0 e1 e9 63 ff ff ff e8 02 86 8a e2 66 90 90 90 90
> [ 261.589447] RSP: 0018:ffffb0fac4d5bc98 EFLAGS: 00010282
> [ 261.589455] RAX: 0000000000000000 RBX: ffff94354d5f0270 RCX: 0000000000000002
> [ 261.589461] RDX: ffff943544085040 RSI: ffff9435455559e8 RDI: ffff943544085040
> [ 261.589466] RBP: ffff943544199e80 R08: 58132c4e3594ffff R09: fffffffae10a4a38
> [ 261.589472] R10: 0000000000000001 R11: 000000000000000f R12: ffff94354baae380
> [ 261.589477] R13: ffff94354baae3c0 R14: ffff943546916d08 R15: ffff943546916cf0
> [ 261.589499] FS: 00007fd6724ff580(0000) GS:ffff943773b00000(0000) knlGS:0000000000000000
> [ 261.589512] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 261.589518] CR2: 00007fdade5900a8 CR3: 000000010f3a8002 CR4: 00000000001726f0
> [ 261.589539] Call Trace:
> [ 261.589545] <TASK>
> [ 261.589556] ff_layout_free_mirror+0x78/0xc0 [nfs_layout_flexfiles]
> [ 261.589575] ff_layout_free_layoutreturn+0x64/0x110 [nfs_layout_flexfiles]
> [ 261.589594] pnfs_roc_release+0x7e/0x140 [nfsv4]
> [ 261.589830] nfs4_free_closedata+0x6c/0x80 [nfsv4]
> [ 261.590013] rpc_free_task+0x36/0x60 [sunrpc]
> [ 261.590209] nfs4_do_close+0x269/0x330 [nfsv4]
> [ 261.590398] __put_nfs_open_context+0xcb/0x150 [nfs]
> [ 261.590546] nfs_file_release+0x39/0x60 [nfs]
> [ 261.590700] __fput+0xdc/0x2a0
> [ 261.590713] __x64_sys_close+0x3e/0x70
> [ 261.590723] do_syscall_64+0x7b/0x160
> [ 261.590736] ? clear_bhb_loop+0x45/0xa0
> [ 261.590744] ? clear_bhb_loop+0x45/0xa0
> [ 261.590769] ? clear_bhb_loop+0x45/0xa0
> [ 261.590777] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 261.590789] RIP: 0033:0x7fd671f2ebf8
> [ 261.590796] Code: 01 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 65 6b 2a 00 8b 00 85 c0 75 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 40 c3 0f 1f 80 00 00 00 00 53 89 fb 48 83 ec
> [ 261.590803] RSP: 002b:00007ffd9826ea88 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
> [ 261.590812] RAX: ffffffffffffffda RBX: 0000558223012120 RCX: 00007fd671f2ebf8
> [ 261.590819] RDX: 0000000000100000 RSI: 0000000000000000 RDI: 0000000000000001
> [ 261.590826] RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000000
> [ 261.590834] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000000001
> [ 261.590843] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fd67232d000
> [ 261.590858] </TASK>
> [ 261.590864] ---[ end trace 0000000000000000 ]---
>
> After this last patch is applied, in nfs_close_local_fh() you have:
>
> /* tell nfs_uuid_put() to wait for us */
> RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> spin_unlock(&nfs_uuid->lock);
> rcu_read_unlock();
>
> ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
> rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
> nfs_to_nfsd_file_put_local(ro_nf);
> nfs_to_nfsd_file_put_local(rw_nf);
>
> rcu_read_lock();
> if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid)) {
> rcu_read_unlock();
> return;
> }
> /* Remove nfl from nfs_uuid->files list and signal nfs_uuid_put()
> * that we are done.
> */
> spin_lock(&nfs_uuid->lock);
> list_del_init(&nfl->list);
> wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
> spin_unlock(&nfs_uuid->lock);
> rcu_read_unlock();
> }
>
> this is bogus right?:
>
> RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> ...
> if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid))
>
> not sure what you were trying to do, maybe stale debugging? [but you didn't test so... ;) ]
Hi Mike,
thanks for highlighting that. Yes, clearly bogus.
This code went through several iterations until I felt it was the right
shape.
I added that WARN_ON at one point because I had dropped rcu_read_lock
and reclaimed it, and that was (I thought) all that was protecting
nfs_uuid. But that is certainly not needed now, even if it was earlier
in the development.
Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()
2025-04-30 20:20 ` Mike Snitzer
@ 2025-04-30 22:16 ` NeilBrown
0 siblings, 0 replies; 13+ messages in thread
From: NeilBrown @ 2025-04-30 22:16 UTC (permalink / raw)
To: Mike Snitzer
Cc: Trond Myklebust, Anna Schumaker, Pali Rohár, Vincent Mailhol,
Chuck Lever, Jeff Layton, linux-nfs
On Thu, 01 May 2025, Mike Snitzer wrote:
> FYI, If I remove that WARN_ON I then get this when tearing down NFSD
> within a container:
>
> [ 295.192357] percpu ref (nfsd_net_free [nfsd]) <= 0 (0) after switching to atomic
Thanks. I'll look into that.
I do hope to make time for some testing next week, but having promised
patches and as time was dragging on (it hasn't been my highest priority)
I thought it would be good to post what I had once I was happy with the
overall approach.
Thanks for testing!
NeilBrown
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio
2025-04-30 4:01 ` [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio NeilBrown
@ 2025-04-30 22:21 ` Mike Snitzer
0 siblings, 0 replies; 13+ messages in thread
From: Mike Snitzer @ 2025-04-30 22:21 UTC (permalink / raw)
To: NeilBrown
Cc: Trond Myklebust, Anna Schumaker, Pali Rohár, Vincent Mailhol,
Chuck Lever, Jeff Layton, linux-nfs
On Wed, Apr 30, 2025 at 02:01:11PM +1000, NeilBrown wrote:
> Rather than using nfs_uuid.lock to protect installing
> a new ro_file or rw_file, change to use cmpxchg().
> Removing the file already uses xchg() so this improves symmetry
> and also makes the code a little simpler.
>
> Also remove the optimisation of not taking the lock, and not removing
> the nfs_file_localio from the linked list, when both ->ro_file and
> ->rw_file are already NULL. Given that ->nfs_uuid was not NULL, it is
> extremely unlikely that neither ->ro_file or ->rw_file is NULL so
> this optimisation can be of little value and it complicates
> understanding of the code - why can the list_del_init() be skipped?
>
> Finally, move the assignment of NULL to ->nfs_uuid until after
> the last action on the nfs_file_localio (the list_del_init). As soon as
> this is NULL a racing nfs_close_local_fh() can bypass all the locking
> and go on to free the nfs_file_localio, so we must be certain to be
> finished with it first.
>
> Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
> fs/nfs/localio.c | 11 +++--------
> fs/nfs_common/nfslocalio.c | 36 ++++++++++++++++--------------------
> 2 files changed, 19 insertions(+), 28 deletions(-)
>
> diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
> index 5c21caeae075..3674dd86f095 100644
> --- a/fs/nfs/localio.c
> +++ b/fs/nfs/localio.c
> @@ -279,14 +279,9 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
> if (IS_ERR(new))
> return NULL;
> /* try to swap in the pointer */
> - spin_lock(&clp->cl_uuid.lock);
> - nf = rcu_dereference_protected(*pnf, 1);
> - if (!nf) {
> - nf = new;
> - new = NULL;
> - rcu_assign_pointer(*pnf, nf);
> - }
> - spin_unlock(&clp->cl_uuid.lock);
> + nf = unrcu_pointer(cmpxchg(pnf, NULL, RCU_INITIALIZER(new)));
> + if (!nf)
> + swap(nf, new);
> rcu_read_lock();
> }
> nf = nfs_local_file_get(nf);
> diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
> index 6a0bdea6d644..d72eecb85ea9 100644
> --- a/fs/nfs_common/nfslocalio.c
> +++ b/fs/nfs_common/nfslocalio.c
> @@ -285,28 +285,24 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
> return;
> }
>
> - ro_nf = rcu_access_pointer(nfl->ro_file);
> - rw_nf = rcu_access_pointer(nfl->rw_file);
> - if (ro_nf || rw_nf) {
> - spin_lock(&nfs_uuid->lock);
> - if (ro_nf)
> - ro_nf = rcu_dereference_protected(xchg(&nfl->ro_file, NULL), 1);
> - if (rw_nf)
> - rw_nf = rcu_dereference_protected(xchg(&nfl->rw_file, NULL), 1);
> -
> - /* Remove nfl from nfs_uuid->files list */
> - RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> - list_del_init(&nfl->list);
> - spin_unlock(&nfs_uuid->lock);
> - rcu_read_unlock();
> + ro_nf = unrcu_pointer(xchg(&nfl->ro_file, NULL));
> + rw_nf = unrcu_pointer(xchg(&nfl->rw_file, NULL));
>
> - if (ro_nf)
> - nfs_to_nfsd_file_put_local(ro_nf);
> - if (rw_nf)
> - nfs_to_nfsd_file_put_local(rw_nf);
> - return;
> - }
> + spin_lock(&nfs_uuid->lock);
> + /* Remove nfl from nfs_uuid->files list */
> + list_del_init(&nfl->list);
> + spin_unlock(&nfs_uuid->lock);
> rcu_read_unlock();
> + /* Now we can allow racing nfs_close_local_fh() to
> + * skip the locking.
> + */
> + RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> +
> + if (ro_nf)
> + nfs_to_nfsd_file_put_local(ro_nf);
> + if (rw_nf)
> + nfs_to_nfsd_file_put_local(rw_nf);
> + return;
> }
This return at the end of the function isn't needed.
> EXPORT_SYMBOL_GPL(nfs_close_local_fh);
>
> --
> 2.49.0
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio
2025-05-09 0:46 [PATCH 0/6 v2] nfs_localio: fixes for races and errors from older compilers NeilBrown
@ 2025-05-09 0:46 ` NeilBrown
0 siblings, 0 replies; 13+ messages in thread
From: NeilBrown @ 2025-05-09 0:46 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, Mike Snitzer, Pali Rohár,
Vincent Mailhol
Cc: Chuck Lever, Jeff Layton, linux-nfs
Rather than using nfs_uuid.lock to protect installing
a new ro_file or rw_file, change to use cmpxchg().
Removing the file already uses xchg() so this improves symmetry
and also makes the code a little simpler.
Also remove the optimisation of not taking the lock, and not removing
the nfs_file_localio from the linked list, when both ->ro_file and
->rw_file are already NULL. Given that ->nfs_uuid was not NULL, it is
extremely unlikely that neither ->ro_file or ->rw_file is NULL so
this optimisation can be of little value and it complicates
understanding of the code - why can the list_del_init() be skipped?
Finally, move the assignment of NULL to ->nfs_uuid until after
the last action on the nfs_file_localio (the list_del_init). As soon as
this is NULL a racing nfs_close_local_fh() can bypass all the locking
and go on to free the nfs_file_localio, so we must be certain to be
finished with it first.
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/nfs/localio.c | 11 +++--------
fs/nfs_common/nfslocalio.c | 39 +++++++++++++++++---------------------
2 files changed, 20 insertions(+), 30 deletions(-)
diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index 5c21caeae075..3674dd86f095 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -279,14 +279,9 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
if (IS_ERR(new))
return NULL;
/* try to swap in the pointer */
- spin_lock(&clp->cl_uuid.lock);
- nf = rcu_dereference_protected(*pnf, 1);
- if (!nf) {
- nf = new;
- new = NULL;
- rcu_assign_pointer(*pnf, nf);
- }
- spin_unlock(&clp->cl_uuid.lock);
+ nf = unrcu_pointer(cmpxchg(pnf, NULL, RCU_INITIALIZER(new)));
+ if (!nf)
+ swap(nf, new);
rcu_read_lock();
}
nf = nfs_local_file_get(nf);
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index 6a0bdea6d644..bdf251332b6b 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -273,8 +273,8 @@ EXPORT_SYMBOL_GPL(nfs_open_local_fh);
void nfs_close_local_fh(struct nfs_file_localio *nfl)
{
- struct nfsd_file *ro_nf = NULL;
- struct nfsd_file *rw_nf = NULL;
+ struct nfsd_file *ro_nf;
+ struct nfsd_file *rw_nf;
nfs_uuid_t *nfs_uuid;
rcu_read_lock();
@@ -285,28 +285,23 @@ void nfs_close_local_fh(struct nfs_file_localio *nfl)
return;
}
- ro_nf = rcu_access_pointer(nfl->ro_file);
- rw_nf = rcu_access_pointer(nfl->rw_file);
- if (ro_nf || rw_nf) {
- spin_lock(&nfs_uuid->lock);
- if (ro_nf)
- ro_nf = rcu_dereference_protected(xchg(&nfl->ro_file, NULL), 1);
- if (rw_nf)
- rw_nf = rcu_dereference_protected(xchg(&nfl->rw_file, NULL), 1);
-
- /* Remove nfl from nfs_uuid->files list */
- RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
- list_del_init(&nfl->list);
- spin_unlock(&nfs_uuid->lock);
- rcu_read_unlock();
+ ro_nf = unrcu_pointer(xchg(&nfl->ro_file, NULL));
+ rw_nf = unrcu_pointer(xchg(&nfl->rw_file, NULL));
- if (ro_nf)
- nfs_to_nfsd_file_put_local(ro_nf);
- if (rw_nf)
- nfs_to_nfsd_file_put_local(rw_nf);
- return;
- }
+ spin_lock(&nfs_uuid->lock);
+ /* Remove nfl from nfs_uuid->files list */
+ list_del_init(&nfl->list);
+ spin_unlock(&nfs_uuid->lock);
rcu_read_unlock();
+ /* Now we can allow racing nfs_close_local_fh() to
+ * skip the locking.
+ */
+ RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
+
+ if (ro_nf)
+ nfs_to_nfsd_file_put_local(ro_nf);
+ if (rw_nf)
+ nfs_to_nfsd_file_put_local(rw_nf);
}
EXPORT_SYMBOL_GPL(nfs_close_local_fh);
--
2.49.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-05-09 0:49 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-30 4:01 [PATCH 0/6] nfs_localio: fixes for races and errors from older compilers NeilBrown
2025-04-30 4:01 ` [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio NeilBrown
2025-04-30 22:21 ` Mike Snitzer
2025-04-30 4:01 ` [PATCH 2/6] nfs_localio: always hold nfsd net ref with nfsd_file ref NeilBrown
2025-04-30 4:01 ` [PATCH 3/6] nfs_localio: simplify interface to nfsd for getting nfsd_file NeilBrown
2025-04-30 4:01 ` [PATCH 4/6] nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu struct NeilBrown
2025-04-30 4:01 ` [PATCH 5/6] nfs_localio: duplicate nfs_close_local_fh() NeilBrown
2025-04-30 4:01 ` [PATCH 6/6] nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh() NeilBrown
2025-04-30 17:34 ` Mike Snitzer
2025-04-30 20:20 ` Mike Snitzer
2025-04-30 22:16 ` NeilBrown
2025-04-30 22:12 ` NeilBrown
-- strict thread matches above, loose matches on Subject: below --
2025-05-09 0:46 [PATCH 0/6 v2] nfs_localio: fixes for races and errors from older compilers NeilBrown
2025-05-09 0:46 ` [PATCH 1/6] nfs_localio: use cmpxchg() to install new nfs_file_localio NeilBrown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox