Linux NFS development
 help / color / mirror / Atom feed
* [PATCH 0/8] nfsd: fixes for locally-triggerable bugs
@ 2026-06-01 17:31 Jeff Layton
  2026-06-01 17:31 ` [PATCH 1/8] nfsd: defer vfree of compound ops to fix rpc_status UAF Jeff Layton
                   ` (7 more replies)
  0 siblings, 8 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 17:31 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer
  Cc: Chris Mason, linux-nfs, linux-kernel, Trond Myklebust,
	Jeff Layton

These are bugs that Claude classified as locally-triggerable. A couple
can be triggered by an unprivileged user, but the rest require admin
access.

The last 3 patches fix one bug. I originally had a more targeted fix
that kres generated, but I think it's better to simplify the filecache
disposal mechanism to get rid of the bug rather than add more
complexity.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Chris Mason (3):
      nfsd: hold rcu across localio cmpxchg retry
      nfs/localio: fix ref leak on nfs_uuid_add_file failure
      nfsd: guard nfsd_serv deref in nfsd_file_net_dispose

Jeff Layton (5):
      nfsd: defer vfree of compound ops to fix rpc_status UAF
      nfsd: widen nfsd_genl_rqstp address fields to sockaddr_storage
      nfsd: fix refcount leak in nfsd_file_lru_add on insertion failure
      nfsd: fix fcache_disposal UAF by inlining dispose state into nfsd_net
      nfsd: hold net namespace reference in nfsd_file

 fs/nfs_common/nfslocalio.c |  14 +++++-
 fs/nfsd/filecache.c        | 120 +++++++++++++++++----------------------------
 fs/nfsd/filecache.h        |   2 +-
 fs/nfsd/localio.c          |  12 +++--
 fs/nfsd/netns.h            |   3 +-
 fs/nfsd/nfs4xdr.c          |   2 +-
 fs/nfsd/nfsctl.c           |  12 ++---
 include/linux/nfslocalio.h |   9 +---
 8 files changed, 80 insertions(+), 94 deletions(-)
---
base-commit: d7203affbe85baad683cef946f661c5541966d97
change-id: 20260601-nfsd-testing-e3509d5e035e

Best regards,
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/8] nfsd: defer vfree of compound ops to fix rpc_status UAF
  2026-06-01 17:31 [PATCH 0/8] nfsd: fixes for locally-triggerable bugs Jeff Layton
@ 2026-06-01 17:31 ` Jeff Layton
  2026-06-01 17:31 ` [PATCH 2/8] nfsd: hold rcu across localio cmpxchg retry Jeff Layton
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 17:31 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer
  Cc: Chris Mason, linux-nfs, linux-kernel, Trond Myklebust,
	Jeff Layton

The rpc_status netlink dumpit walks every in-flight svc_rqst under
rcu_read_lock and, for NFSv4 requests, reads opnums out of
args->ops[]. But args->ops is a separate vmalloc buffer freed
synchronously by vfree() in nfsd4_release_compoundargs() at the end
of every compound. The dumpit's rcu_read_lock pins the svc_rqst
struct itself (freed via kfree_rcu), but nothing defers the vfree
of the ops buffer across the RCU grace period. A concurrent compound
completion can therefore free the buffer while the dumpit is reading
it — a use-after-free on vmalloc memory.

The trailing seqcount recheck (smp_load_acquire of rq_status_counter)
cannot undo a load that already retired against freed memory.

Fix by replacing vfree(args->ops) with kvfree_rcu_mightsleep(), which
defers the free until after an RCU grace period. This makes the
existing rcu_read_lock in the dumpit sufficient to protect the read.
The tradeoff is that completed compound ops buffers (up to
200 * sizeof(struct nfsd4_op)) persist in memory slightly longer,
across one grace period, before being reclaimed.

Fixes: bd9d6a3efa97 ("NFSD: add rpc_status netlink support")
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfs4xdr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 487a1f62ce15..90272241dacc 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -6686,7 +6686,7 @@ void nfsd4_release_compoundargs(struct svc_rqst *rqstp)
 	struct nfsd4_compoundargs *args = rqstp->rq_argp;
 
 	if (args->ops != args->iops) {
-		vfree(args->ops);
+		kvfree_rcu_mightsleep(args->ops);
 		args->ops = args->iops;
 	}
 	while (args->to_free) {

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/8] nfsd: hold rcu across localio cmpxchg retry
  2026-06-01 17:31 [PATCH 0/8] nfsd: fixes for locally-triggerable bugs Jeff Layton
  2026-06-01 17:31 ` [PATCH 1/8] nfsd: defer vfree of compound ops to fix rpc_status UAF Jeff Layton
@ 2026-06-01 17:31 ` Jeff Layton
  2026-06-01 17:31 ` [PATCH 3/8] nfs/localio: fix ref leak on nfs_uuid_add_file failure Jeff Layton
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 17:31 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer
  Cc: Chris Mason, linux-nfs, linux-kernel, Trond Myklebust,
	Jeff Layton

From: Chris Mason <clm@meta.com>

nfsd_file objects are freed via call_rcu (filecache.c:296), and
nfsd_file_slab is created without SLAB_TYPESAFE_BY_RCU
(KMEM_CACHE(nfsd_file, 0) at filecache.c:789), so the slab page
backing a freed nfsd_file becomes freely reclaimable once the RCU
grace period elapses.

The again: retry block in nfsd_open_local_fh() loads a pointer with
cmpxchg and then calls nfsd_file_get(new) (which is
refcount_inc_not_zero) without holding rcu_read_lock. The sole caller
nfs_open_local_fh() drops rcu_read_lock before invoking this helper,
so no outer reader-side critical section covers the load.

    CPU 0 (nfsd_open_local_fh)        CPU 1 (nfsd_file_put_local)
    -----                             -----
    new = cmpxchg(pnf, NULL, ...)
                                      nf = xchg(pnf, NULL)
                                      nfsd_file_put(nf)
                                        last ref -> call_rcu()
                                      /* grace period elapses;
                                         slab page recycled */
    nfsd_file_get(new)
      refcount_inc_not_zero(&new->nf_ref)
      /* operates on recycled memory */

A non-zero word at the nf_ref offset of the recycled object makes the
refcount bump appear to succeed, and the caller then dereferences
new->nf_net and new->nf_file out of freed memory.

Fix by taking rcu_read_lock() immediately before the cmpxchg and
releasing it on all three exits of the if (new) block: the goto-again
retry, the lost-race cleanup path, and the install-succeeded path.
nfsd_file_put() and nfsd_net_put() stay outside the RCU section so
they remain free to block.

Fixes: e6f7e1487ab5 ("nfs_localio: simplify interface to nfsd for getting nfsd_file")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Chris Mason <clm@meta.com>
---
 fs/nfsd/localio.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
index be710d809a3b..c3eb0557b3e1 100644
--- a/fs/nfsd/localio.c
+++ b/fs/nfsd/localio.c
@@ -97,11 +97,15 @@ nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
 		}
 		nfsd_file_get(localio);
 	again:
+		rcu_read_lock();
 		new = unrcu_pointer(cmpxchg(pnf, NULL, RCU_INITIALIZER(localio)));
 		if (new) {
 			/* Some other thread installed an nfsd_file */
-			if (nfsd_file_get(new) == NULL)
+			if (nfsd_file_get(new) == NULL) {
+				rcu_read_unlock();
 				goto again;
+			}
+			rcu_read_unlock();
 			/*
 			 * Drop the ref we were going to install (both file and
 			 * net) and the one we were going to return (only file).
@@ -110,6 +114,8 @@ nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
 			nfsd_net_put(net);
 			nfsd_file_put(localio);
 			localio = new;
+		} else {
+			rcu_read_unlock();
 		}
 	} else
 		nfsd_net_put(net);

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/8] nfs/localio: fix ref leak on nfs_uuid_add_file failure
  2026-06-01 17:31 [PATCH 0/8] nfsd: fixes for locally-triggerable bugs Jeff Layton
  2026-06-01 17:31 ` [PATCH 1/8] nfsd: defer vfree of compound ops to fix rpc_status UAF Jeff Layton
  2026-06-01 17:31 ` [PATCH 2/8] nfsd: hold rcu across localio cmpxchg retry Jeff Layton
@ 2026-06-01 17:31 ` Jeff Layton
  2026-06-01 17:31 ` [PATCH 4/8] nfsd: guard nfsd_serv deref in nfsd_file_net_dispose Jeff Layton
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 17:31 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer
  Cc: Chris Mason, linux-nfs, linux-kernel, Trond Myklebust,
	Jeff Layton

From: Chris Mason <clm@meta.com>

When nfs_uuid_add_file() races with nfs_uuid_put() tearing down
uuid->net, it returns -ENXIO without publishing nfl->nfs_uuid via
rcu_assign_pointer().  nfs_open_local_fh() then enters its error
branch and only releases the slot pair plus its own entry-time net
ref, while the close path is a no-op:

    nfs_close_local_fh()
      nfs_uuid = rcu_dereference(nfl->nfs_uuid);
      if (!nfs_uuid) { rcu_read_unlock(); return; }  /* always */

nfsd_open_local_fh() returns localio holding a caller-owned +1
nfsd_file reference (from nfsd_file_get() after
nfsd_file_acquire_local()) and an entry-time nfsd_net reference
(from its first nfsd_net_try_get()) embedded as nf->nf_net.  Both
are leaked on the failure path, pinning one nfsd_file (and the
underlying struct file, dentry, inode) and one nfsd_net_ref per
occurrence, which blocks nfsd_net and netns teardown.

Fix by releasing the caller-owned file ref and its net ref through
the existing helper, using a stack-local RCU pointer so the helper
can xchg it out, then returning -ENXIO so callers do not
dereference a localio whose slot has been cleared:

    struct nfsd_file __rcu *tmp = RCU_INITIALIZER(localio);

    nfs_to_nfsd_file_put_local(pnf);
    nfs_to_nfsd_file_put_local(&tmp);
    localio = ERR_PTR(-ENXIO);

The trailing nfs_to_nfsd_net_put(net) continues to release the
outer net ref, so all three nfsd_net_try_get() increments are
balanced on the error branch.

Fixes: fdd015de7679 ("NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh()")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Chris Mason <clm@meta.com>
---
 fs/nfs_common/nfslocalio.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index dd715cdb6c04..a3c1c5c2764a 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -292,8 +292,20 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
 	localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt, cred,
 					     nfs_fh, pnf, fmode);
 	if (!IS_ERR(localio) && nfs_uuid_add_file(uuid, nfl) < 0) {
-		/* Delete the cached file when racing with nfs_uuid_put() */
+		/*
+		 * Delete the cached file when racing with nfs_uuid_put().
+		 * Since nfl->nfs_uuid was never published via
+		 * rcu_assign_pointer(), nfs_close_local_fh() will early-return
+		 * and cannot clean up after us.  Drop the slot pair, then drop
+		 * the caller-owned nfsd_file ref (+1) and the entry-time
+		 * nfsd_net ref carried via nf->nf_net, and return -ENXIO so
+		 * the caller never dereferences the now-cleared localio.
+		 */
+		struct nfsd_file __rcu *tmp = RCU_INITIALIZER(localio);
+
 		nfs_to_nfsd_file_put_local(pnf);
+		nfs_to_nfsd_file_put_local(&tmp);
+		localio = ERR_PTR(-ENXIO);
 	}
 	nfs_to_nfsd_net_put(net);
 

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/8] nfsd: guard nfsd_serv deref in nfsd_file_net_dispose
  2026-06-01 17:31 [PATCH 0/8] nfsd: fixes for locally-triggerable bugs Jeff Layton
                   ` (2 preceding siblings ...)
  2026-06-01 17:31 ` [PATCH 3/8] nfs/localio: fix ref leak on nfs_uuid_add_file failure Jeff Layton
@ 2026-06-01 17:31 ` Jeff Layton
  2026-06-01 17:31 ` [PATCH 5/8] nfsd: widen nfsd_genl_rqstp address fields to sockaddr_storage Jeff Layton
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 17:31 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer
  Cc: Chris Mason, linux-nfs, linux-kernel, Trond Myklebust,
	Jeff Layton

From: Chris Mason <clm@meta.com>

nfsd_file_net_dispose() is the consumer side of l->freeme: the nfsd
service thread loop calls it to drain entries that the filecache
garbage collector and shrinker append via
nfsd_file_dispose_list_delayed().  During per-net teardown,
nn->nfsd_serv is cleared before the filecache laundrette is shut
down, so the service thread can still run a dispose pass that finds
more than eight entries on l->freeme and dereferences a NULL
svc_serv:

    nfsd service thread loop
      nfsd_file_net_dispose(nn)
        if (!list_empty(&l->freeme)) {
            ...
            svc_wake_up(nn->nfsd_serv);   /* nn->nfsd_serv == NULL */
        }

The sibling helper nfsd_file_dispose_list_delayed() already documents
this ordering and caches nn->nfsd_serv into a local before testing it
for NULL.  nfsd_file_net_dispose() was introduced with the same raw
svc_wake_up(nn->nfsd_serv) call and never picked up the guard.

Fix by loading nn->nfsd_serv into a local svc_serv pointer and only
calling svc_wake_up() when it is non-NULL, matching the pattern in
nfsd_file_dispose_list_delayed().

Fixes: ffb402596147 ("nfsd: Don't leave work of closing files to a work queue")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Chris Mason <clm@meta.com>
---
 fs/nfsd/filecache.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 2f0d4de779af..1e2e1f89216e 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -474,11 +474,20 @@ void nfsd_file_net_dispose(struct nfsd_net *nn)
 		for (i = 0; i < 8 && !list_empty(&l->freeme); i++)
 			list_move(l->freeme.next, &dispose);
 		spin_unlock(&l->lock);
-		if (!list_empty(&l->freeme))
-			/* Wake up another thread to share the work
+		if (!list_empty(&l->freeme)) {
+			/*
+			 * Wake up another thread to share the work
 			 * *before* doing any actual disposing.
+			 *
+			 * The filecache laundrette is shut down after
+			 * the nn->nfsd_serv pointer is cleared, but
+			 * before the svc_serv is freed.
 			 */
-			svc_wake_up(nn->nfsd_serv);
+			struct svc_serv *serv = nn->nfsd_serv;
+
+			if (serv)
+				svc_wake_up(serv);
+		}
 		nfsd_file_dispose_list(&dispose);
 	}
 }

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/8] nfsd: widen nfsd_genl_rqstp address fields to sockaddr_storage
  2026-06-01 17:31 [PATCH 0/8] nfsd: fixes for locally-triggerable bugs Jeff Layton
                   ` (3 preceding siblings ...)
  2026-06-01 17:31 ` [PATCH 4/8] nfsd: guard nfsd_serv deref in nfsd_file_net_dispose Jeff Layton
@ 2026-06-01 17:31 ` Jeff Layton
  2026-06-01 17:31 ` [PATCH 6/8] nfsd: fix refcount leak in nfsd_file_lru_add on insertion failure Jeff Layton
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 17:31 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer
  Cc: Chris Mason, linux-nfs, linux-kernel, Trond Myklebust,
	Jeff Layton

struct nfsd_genl_rqstp declares rq_daddr and rq_saddr as plain
"struct sockaddr" (16 bytes). When an IPv6 NFS client is connected,
nfsd_genl_rpc_status_compose_msg() casts these fields to
"struct sockaddr_in6 *" (28 bytes) and reads sin6_addr at offset 8..24,
which extends 8 bytes past the end of the 16-byte sockaddr field into
the adjacent rq_flags member. The 16-byte nla_put_in6_addr then ships 8
bytes of truncated IPv6 address followed by 8 bytes of rq_flags to
userspace via the NFSD_A_RPC_STATUS_SADDR6/DADDR6 netlink attributes.

This is reachable by any unprivileged process in the network namespace
because NFSD_CMD_RPC_STATUS_GET uses GENL_CMD_CAP_DUMP without
GENL_ADMIN_PERM.

Fix by widening rq_daddr and rq_saddr to struct sockaddr_storage so the
IPv6 casts operate within bounds, copying sizeof(struct sockaddr_storage)
bytes in the memcpy calls so the full address is captured, and
zero-initializing the genl_rqstp stack variable to prevent leaking
uninitialized tail bytes through netlink.

Fixes: bd9d6a3efa97 ("NFSD: add rpc_status netlink support")
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/nfsctl.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 92f65ca6f667..6fee49a7787f 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1414,8 +1414,8 @@ static int create_proc_exports_entry(void)
 unsigned int nfsd_net_id;
 
 struct nfsd_genl_rqstp {
-	struct sockaddr		rq_daddr;
-	struct sockaddr		rq_saddr;
+	struct sockaddr_storage	rq_daddr;
+	struct sockaddr_storage	rq_saddr;
 	unsigned long		rq_flags;
 	ktime_t			rq_stime;
 	__be32			rq_xid;
@@ -1450,7 +1450,7 @@ static int nfsd_genl_rpc_status_compose_msg(struct sk_buff *skb,
 			NFSD_A_RPC_STATUS_PAD))
 		return -ENOBUFS;
 
-	switch (genl_rqstp->rq_saddr.sa_family) {
+	switch (genl_rqstp->rq_saddr.ss_family) {
 	case AF_INET: {
 		const struct sockaddr_in *s_in, *d_in;
 
@@ -1527,7 +1527,7 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
 		list_for_each_entry_rcu(rqstp,
 				&nn->nfsd_serv->sv_pools[i].sp_all_threads,
 				rq_all) {
-			struct nfsd_genl_rqstp genl_rqstp;
+			struct nfsd_genl_rqstp genl_rqstp = {};
 			unsigned int status_counter;
 
 			if (rqstp_index++ < cb->args[1]) /* already consumed */
@@ -1551,9 +1551,9 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
 			genl_rqstp.rq_stime = rqstp->rq_stime;
 			genl_rqstp.rq_opcnt = 0;
 			memcpy(&genl_rqstp.rq_daddr, svc_daddr(rqstp),
-			       sizeof(struct sockaddr));
+			       sizeof(struct sockaddr_storage));
 			memcpy(&genl_rqstp.rq_saddr, svc_addr(rqstp),
-			       sizeof(struct sockaddr));
+			       sizeof(struct sockaddr_storage));
 
 #ifdef CONFIG_NFSD_V4
 			if (rqstp->rq_vers == NFS4_VERSION &&

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/8] nfsd: fix refcount leak in nfsd_file_lru_add on insertion failure
  2026-06-01 17:31 [PATCH 0/8] nfsd: fixes for locally-triggerable bugs Jeff Layton
                   ` (4 preceding siblings ...)
  2026-06-01 17:31 ` [PATCH 5/8] nfsd: widen nfsd_genl_rqstp address fields to sockaddr_storage Jeff Layton
@ 2026-06-01 17:31 ` Jeff Layton
  2026-06-01 17:31 ` [PATCH 7/8] nfsd: fix fcache_disposal UAF by inlining dispose state into nfsd_net Jeff Layton
  2026-06-01 17:31 ` [PATCH 8/8] nfsd: hold net namespace reference in nfsd_file Jeff Layton
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 17:31 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer
  Cc: Chris Mason, linux-nfs, linux-kernel, Trond Myklebust,
	Jeff Layton

nfsd_file_lru_add() unconditionally increments nf_ref before attempting
to insert the nfsd_file into the LRU via list_lru_add_obj(). If the
insertion fails (the item is already linked), the incremented reference
is never released, permanently inflating the refcount.

The LRU shrinker callback (nfsd_file_lru_cb) uses refcount_dec_if_one()
to reclaim entries, which requires nf_ref == 1. An inflated refcount
therefore blocks eviction of the affected file cache entry for the
lifetime of the nfsd instance.

While this failure path is currently unreachable -- the sole caller in
nfsd_file_do_acquire() operates on freshly-allocated objects that cannot
already be on the LRU -- it represents a latent bug that would become
exploitable if a future change adds another call site or alters the
PENDING protocol.

Fix this by:
 - Adding a compensating refcount_dec() on the failure path. Bare
   refcount_dec (rather than nfsd_file_put) is correct here because
   the caller in nfsd_file_do_acquire still holds its own construction
   reference, so the count goes from 2 back to 1 without risk of
   reaching zero.
 - Changing WARN_ON(1) to WARN_ON_ONCE(1) to prevent log flooding if
   this path is ever hit repeatedly.
 - Returning early on failure to skip the unnecessary call to
   nfsd_file_schedule_laundrette(), since no entry was added to the LRU.

Fixes: 56221b42d717 ("nfsd: filecache: don't repeatedly add/remove files on the lru list")
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 1e2e1f89216e..d5b917e40d62 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -330,8 +330,11 @@ static void nfsd_file_lru_add(struct nfsd_file *nf)
 	refcount_inc(&nf->nf_ref);
 	if (list_lru_add_obj(&nfsd_file_lru, &nf->nf_lru))
 		trace_nfsd_file_lru_add(nf);
-	else
-		WARN_ON(1);
+	else {
+		refcount_dec(&nf->nf_ref);
+		WARN_ON_ONCE(1);
+		return;
+	}
 	nfsd_file_schedule_laundrette();
 }
 

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/8] nfsd: fix fcache_disposal UAF by inlining dispose state into nfsd_net
  2026-06-01 17:31 [PATCH 0/8] nfsd: fixes for locally-triggerable bugs Jeff Layton
                   ` (5 preceding siblings ...)
  2026-06-01 17:31 ` [PATCH 6/8] nfsd: fix refcount leak in nfsd_file_lru_add on insertion failure Jeff Layton
@ 2026-06-01 17:31 ` Jeff Layton
  2026-06-01 17:31 ` [PATCH 8/8] nfsd: hold net namespace reference in nfsd_file Jeff Layton
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 17:31 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer
  Cc: Chris Mason, linux-nfs, linux-kernel, Trond Myklebust,
	Jeff Layton

nfsd_file_dispose_list_delayed() defers fput() to nfsd service threads
via a per-net freeme queue, preventing the shrinker and GC worker from
bearing the cost of closing files (see ffb402596147).  However, the
queue lives in a separately-allocated struct nfsd_fcache_disposal that
is freed by nfsd_free_fcache_disposal_net() during per-net teardown.
The global shrinker, laundrette, and fsnotify callbacks can still be
inside nfsd_file_dispose_list_delayed() dereferencing that pointer,
causing a use-after-free.

Inline the spinlock and freeme list directly into struct nfsd_net (as
fcache_dispose_lock and fcache_dispose_list), eliminating the separately
allocated struct nfsd_fcache_disposal entirely.  These fields now have
the same lifetime as the net namespace itself, so there is no dangling
pointer to chase.

nfsd_file_cache_start_net() now just initializes the inline fields and
cannot fail due to allocation.  nfsd_file_cache_shutdown_net() drains
the inline list directly instead of freeing a separate struct.  The
alloc/free helpers are removed.

Fixes: 1463b38e7cf3 ("NFSD: simplify per-net file cache management")
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c | 90 +++++++++++++----------------------------------------
 fs/nfsd/netns.h     |  3 +-
 2 files changed, 24 insertions(+), 69 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index d5b917e40d62..03f01a0beced 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -62,11 +62,6 @@ static DEFINE_PER_CPU(unsigned long, nfsd_file_releases);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_total_age);
 static DEFINE_PER_CPU(unsigned long, nfsd_file_evictions);
 
-struct nfsd_fcache_disposal {
-	spinlock_t lock;
-	struct list_head freeme;
-};
-
 static struct kmem_cache		*nfsd_file_slab;
 static struct kmem_cache		*nfsd_file_mark_slab;
 static struct list_lru			nfsd_file_lru;
@@ -425,31 +420,26 @@ nfsd_file_dispose_list(struct list_head *dispose)
 }
 
 /**
- * nfsd_file_dispose_list_delayed - move list of dead files to net's freeme list
+ * nfsd_file_dispose_list_delayed - queue dead files for disposal by nfsd threads
  * @dispose: list of nfsd_files to be disposed
  *
- * Transfers each file to the "freeme" list for its nfsd_net, to eventually
- * be disposed of by the per-net garbage collector.
+ * Transfers each file to the per-net freeme list in its nfsd_net and wakes
+ * an nfsd thread to do the actual close.  This keeps the cost of fput()
+ * in the nfsd threads rather than in the shrinker or GC worker.
  */
 static void
 nfsd_file_dispose_list_delayed(struct list_head *dispose)
 {
-	while(!list_empty(dispose)) {
+	while (!list_empty(dispose)) {
 		struct nfsd_file *nf = list_first_entry(dispose,
 						struct nfsd_file, nf_gc);
 		struct nfsd_net *nn = net_generic(nf->nf_net, nfsd_net_id);
-		struct nfsd_fcache_disposal *l = nn->fcache_disposal;
 		struct svc_serv *serv;
 
-		spin_lock(&l->lock);
-		list_move_tail(&nf->nf_gc, &l->freeme);
-		spin_unlock(&l->lock);
+		spin_lock(&nn->fcache_dispose_lock);
+		list_move_tail(&nf->nf_gc, &nn->fcache_dispose_list);
+		spin_unlock(&nn->fcache_dispose_lock);
 
-		/*
-		 * The filecache laundrette is shut down after the
-		 * nn->nfsd_serv pointer is cleared, but before the
-		 * svc_serv is freed.
-		 */
 		serv = nn->nfsd_serv;
 		if (serv)
 			svc_wake_up(serv);
@@ -467,25 +457,15 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
  */
 void nfsd_file_net_dispose(struct nfsd_net *nn)
 {
-	struct nfsd_fcache_disposal *l = nn->fcache_disposal;
-
-	if (!list_empty(&l->freeme)) {
+	if (!list_empty(&nn->fcache_dispose_list)) {
 		LIST_HEAD(dispose);
 		int i;
 
-		spin_lock(&l->lock);
-		for (i = 0; i < 8 && !list_empty(&l->freeme); i++)
-			list_move(l->freeme.next, &dispose);
-		spin_unlock(&l->lock);
-		if (!list_empty(&l->freeme)) {
-			/*
-			 * Wake up another thread to share the work
-			 * *before* doing any actual disposing.
-			 *
-			 * The filecache laundrette is shut down after
-			 * the nn->nfsd_serv pointer is cleared, but
-			 * before the svc_serv is freed.
-			 */
+		spin_lock(&nn->fcache_dispose_lock);
+		for (i = 0; i < 8 && !list_empty(&nn->fcache_dispose_list); i++)
+			list_move(nn->fcache_dispose_list.next, &dispose);
+		spin_unlock(&nn->fcache_dispose_lock);
+		if (!list_empty(&nn->fcache_dispose_list)) {
 			struct svc_serv *serv = nn->nfsd_serv;
 
 			if (serv)
@@ -701,11 +681,11 @@ nfsd_file_queue_for_close(struct inode *inode, struct list_head *dispose)
 }
 
 /**
- * nfsd_file_close_inode - attempt a delayed close of a nfsd_file
+ * nfsd_file_close_inode - attempt a deferred close of a nfsd_file
  * @inode: inode of the file to attempt to remove
  *
  * Close out any open nfsd_files that can be reaped for @inode. The
- * actual freeing is deferred to the dispose_list_delayed infrastructure.
+ * actual freeing is deferred to the nfsd service threads.
  *
  * This is used by the fsnotify callbacks and setlease notifier.
  */
@@ -990,42 +970,14 @@ __nfsd_file_cache_purge(struct net *net)
 	nfsd_file_dispose_list(&dispose);
 }
 
-static struct nfsd_fcache_disposal *
-nfsd_alloc_fcache_disposal(void)
-{
-	struct nfsd_fcache_disposal *l;
-
-	l = kmalloc_obj(*l);
-	if (!l)
-		return NULL;
-	spin_lock_init(&l->lock);
-	INIT_LIST_HEAD(&l->freeme);
-	return l;
-}
-
-static void
-nfsd_free_fcache_disposal(struct nfsd_fcache_disposal *l)
-{
-	nfsd_file_dispose_list(&l->freeme);
-	kfree(l);
-}
-
-static void
-nfsd_free_fcache_disposal_net(struct net *net)
-{
-	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
-	struct nfsd_fcache_disposal *l = nn->fcache_disposal;
-
-	nfsd_free_fcache_disposal(l);
-}
-
 int
 nfsd_file_cache_start_net(struct net *net)
 {
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
-	nn->fcache_disposal = nfsd_alloc_fcache_disposal();
-	return nn->fcache_disposal ? 0 : -ENOMEM;
+	spin_lock_init(&nn->fcache_dispose_lock);
+	INIT_LIST_HEAD(&nn->fcache_dispose_list);
+	return 0;
 }
 
 /**
@@ -1044,8 +996,10 @@ nfsd_file_cache_purge(struct net *net)
 void
 nfsd_file_cache_shutdown_net(struct net *net)
 {
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+
 	nfsd_file_cache_purge(net);
-	nfsd_free_fcache_disposal_net(net);
+	nfsd_file_dispose_list(&nn->fcache_dispose_list);
 }
 
 void
diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index f6b8b340bf8e..5c33c96da28e 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -216,7 +216,8 @@ struct nfsd_net {
 	/* utsname taken from the process that starts the server */
 	char			nfsd_name[UNX_MAXNODENAME+1];
 
-	struct nfsd_fcache_disposal *fcache_disposal;
+	spinlock_t		fcache_dispose_lock;
+	struct list_head	fcache_dispose_list;
 
 	siphash_key_t		siphash_key;
 

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 8/8] nfsd: hold net namespace reference in nfsd_file
  2026-06-01 17:31 [PATCH 0/8] nfsd: fixes for locally-triggerable bugs Jeff Layton
                   ` (6 preceding siblings ...)
  2026-06-01 17:31 ` [PATCH 7/8] nfsd: fix fcache_disposal UAF by inlining dispose state into nfsd_net Jeff Layton
@ 2026-06-01 17:31 ` Jeff Layton
  2026-06-01 17:50   ` Al Viro
  7 siblings, 1 reply; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 17:31 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer
  Cc: Chris Mason, linux-nfs, linux-kernel, Trond Myklebust,
	Jeff Layton

Take a net-namespace reference in nfsd_file_alloc() (get_net) and
release it in nfsd_file_free() (put_net), so that nf_net is always
valid for files that the GC or shrinker has isolated from the hash
table and LRU -- which __nfsd_file_cache_purge() cannot see.

Without this, nf_net can dangle for in-flight files whose net namespace
is torn down concurrently, causing a use-after-free when
nfsd_file_dispose_list_delayed() calls net_generic(nf->nf_net, ...).

Because nfsd_file_free() now calls put_net(nf->nf_net), the old
nfsd_file_put_local() pattern of returning nf->nf_net after
nfsd_file_put() is unsafe -- put_net() could theoretically drop the
last net namespace reference, leaving the returned pointer stale.
Fix this by moving the nfsd_net_put() call into nfsd_file_put_local()
itself, before the nfsd_file_put() that may trigger nfsd_file_free().
The function now returns void and the caller no longer needs to handle
the net reference.

Fixes: 43fd953fa7e2 ("nfsd: simplify the delayed disposal list code")
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/filecache.c        | 22 ++++++++++++++--------
 fs/nfsd/filecache.h        |  2 +-
 fs/nfsd/localio.c          |  4 ++--
 include/linux/nfslocalio.h |  9 ++-------
 4 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 03f01a0beced..6f75df344091 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -221,7 +221,7 @@ nfsd_file_alloc(struct net *net, struct inode *inode, unsigned char need,
 	nf->nf_birthtime = ktime_get();
 	nf->nf_file = NULL;
 	nf->nf_cred = get_current_cred();
-	nf->nf_net = net;
+	nf->nf_net = get_net(net);
 	nf->nf_flags = want_gc ?
 		BIT(NFSD_FILE_HASHED) | BIT(NFSD_FILE_PENDING) | BIT(NFSD_FILE_GC) :
 		BIT(NFSD_FILE_HASHED) | BIT(NFSD_FILE_PENDING);
@@ -295,6 +295,8 @@ nfsd_file_free(struct nfsd_file *nf)
 	if (WARN_ON_ONCE(!list_empty(&nf->nf_lru)))
 		return;
 
+	put_net(nf->nf_net);
+
 	call_rcu(&nf->nf_rcu, nfsd_file_slab_free);
 }
 
@@ -375,24 +377,28 @@ nfsd_file_put(struct nfsd_file *nf)
 }
 
 /**
- * nfsd_file_put_local - put nfsd_file reference and arm nfsd_net_put in caller
+ * nfsd_file_put_local - put nfsd_file reference and release nfsd_net ref
  * @pnf: nfsd_file of which to put the reference
  *
- * First save the associated net to return to caller, then put
- * the reference of the nfsd_file.
+ * Drops both the nfsd_file reference and the associated nfsd_net
+ * reference.  The nfsd_net ref is released before the file ref so
+ * that put_net() inside nfsd_file_free() cannot drop the last net
+ * namespace reference while the caller still needs it.
  */
-struct net *
+void
 nfsd_file_put_local(struct nfsd_file __rcu **pnf)
 {
 	struct nfsd_file *nf;
-	struct net *net = NULL;
 
 	nf = unrcu_pointer(xchg(pnf, NULL));
 	if (nf) {
-		net = nf->nf_net;
+		struct net *net = nf->nf_net;
+
+		rcu_read_lock();
+		nfsd_net_put(net);
+		rcu_read_unlock();
 		nfsd_file_put(nf);
 	}
-	return net;
 }
 
 /**
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 683b6437cacc..88e397176c48 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -66,7 +66,7 @@ void nfsd_file_cache_shutdown(void);
 int nfsd_file_cache_start_net(struct net *net);
 void nfsd_file_cache_shutdown_net(struct net *net);
 void nfsd_file_put(struct nfsd_file *nf);
-struct net *nfsd_file_put_local(struct nfsd_file __rcu **nf);
+void nfsd_file_put_local(struct nfsd_file __rcu **nf);
 struct nfsd_file *nfsd_file_get(struct nfsd_file *nf);
 struct file *nfsd_file_file(struct nfsd_file *nf);
 void nfsd_file_close_inode_sync(struct inode *inode);
diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
index c3eb0557b3e1..e3295bae75a4 100644
--- a/fs/nfsd/localio.c
+++ b/fs/nfsd/localio.c
@@ -40,8 +40,8 @@
  * avoid all the NFS overhead with reads, writes and commits.
  *
  * On successful return, returned nfsd_file will have its nf_net member
- * set. Caller (NFS client) is responsible for calling nfsd_net_put and
- * nfsd_file_put (via nfs_to_nfsd_file_put_local).
+ * set. Caller (NFS client) is responsible for calling nfsd_file_put
+ * (via nfs_to_nfsd_file_put_local), which also releases the nfsd_net ref.
  */
 static struct nfsd_file *
 nfsd_open_local_fh(struct net *net, struct auth_domain *dom,
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index 3d91043254e6..7267a69092d1 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -62,7 +62,7 @@ struct nfsd_localio_operations {
 						const struct nfs_fh *,
 						struct nfsd_file __rcu **pnf,
 						const fmode_t);
-	struct net *(*nfsd_file_put_local)(struct nfsd_file __rcu **);
+	void (*nfsd_file_put_local)(struct nfsd_file __rcu **);
 	struct file *(*nfsd_file_file)(struct nfsd_file *);
 	void (*nfsd_file_dio_alignment)(struct nfsd_file *,
 					u32 *, u32 *, u32 *);
@@ -96,12 +96,7 @@ static inline void nfs_to_nfsd_file_put_local(struct nfsd_file __rcu **localio)
 	 * must prevent nfsd shutdown from completing as nfs_close_local_fh()
 	 * does by blocking the nfs_uuid from being finally put.
 	 */
-	struct net *net;
-
-	net = nfs_to->nfsd_file_put_local(localio);
-
-	if (net)
-		nfs_to_nfsd_net_put(net);
+	nfs_to->nfsd_file_put_local(localio);
 }
 
 #else   /* CONFIG_NFS_LOCALIO */

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 8/8] nfsd: hold net namespace reference in nfsd_file
  2026-06-01 17:31 ` [PATCH 8/8] nfsd: hold net namespace reference in nfsd_file Jeff Layton
@ 2026-06-01 17:50   ` Al Viro
  2026-06-01 18:18     ` Jeff Layton
  2026-06-01 18:33     ` Jeff Layton
  0 siblings, 2 replies; 12+ messages in thread
From: Al Viro @ 2026-06-01 17:50 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer, Chris Mason, linux-nfs, linux-kernel,
	Trond Myklebust

On Mon, Jun 01, 2026 at 01:31:11PM -0400, Jeff Layton wrote:
> Take a net-namespace reference in nfsd_file_alloc() (get_net) and
> release it in nfsd_file_free() (put_net), so that nf_net is always
> valid for files that the GC or shrinker has isolated from the hash
> table and LRU -- which __nfsd_file_cache_purge() cannot see.
> 
> Without this, nf_net can dangle for in-flight files whose net namespace
> is torn down concurrently, causing a use-after-free when
> nfsd_file_dispose_list_delayed() calls net_generic(nf->nf_net, ...).
> 
> Because nfsd_file_free() now calls put_net(nf->nf_net), the old
> nfsd_file_put_local() pattern of returning nf->nf_net after
> nfsd_file_put() is unsafe -- put_net() could theoretically drop the
> last net namespace reference, leaving the returned pointer stale.
> Fix this by moving the nfsd_net_put() call into nfsd_file_put_local()
> itself, before the nfsd_file_put() that may trigger nfsd_file_free().
> The function now returns void and the caller no longer needs to handle
> the net reference.

That means that each nfsd_file_alloc()/nfsd_file_free() is now touching
the same cacheline on kernels with netns enabled.  Scalability implications
might be interesting...

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 8/8] nfsd: hold net namespace reference in nfsd_file
  2026-06-01 17:50   ` Al Viro
@ 2026-06-01 18:18     ` Jeff Layton
  2026-06-01 18:33     ` Jeff Layton
  1 sibling, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 18:18 UTC (permalink / raw)
  To: Al Viro
  Cc: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer, Chris Mason, linux-nfs, linux-kernel,
	Trond Myklebust

On Mon, 2026-06-01 at 18:50 +0100, Al Viro wrote:
> That means that each nfsd_file_alloc()/nfsd_file_free() is now touching
> the same cacheline on kernels with netns enabled.  Scalability implications
> might be interesting...

That's a very good point.

There are some alternatives. Let me play around with them and see if I
can find something better.

Thanks for the review,
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 8/8] nfsd: hold net namespace reference in nfsd_file
  2026-06-01 17:50   ` Al Viro
  2026-06-01 18:18     ` Jeff Layton
@ 2026-06-01 18:33     ` Jeff Layton
  1 sibling, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2026-06-01 18:33 UTC (permalink / raw)
  To: Al Viro
  Cc: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Lorenzo Bianconi, Anna Schumaker, Trond Myklebust, Anna Schumaker,
	Mike Snitzer, Chris Mason, linux-nfs, linux-kernel,
	Trond Myklebust

On Mon, 2026-06-01 at 18:50 +0100, Al Viro wrote:
> On Mon, Jun 01, 2026 at 01:31:11PM -0400, Jeff Layton wrote:
> > Take a net-namespace reference in nfsd_file_alloc() (get_net) and
> > release it in nfsd_file_free() (put_net), so that nf_net is always
> > valid for files that the GC or shrinker has isolated from the hash
> > table and LRU -- which __nfsd_file_cache_purge() cannot see.
> > 
> > Without this, nf_net can dangle for in-flight files whose net namespace
> > is torn down concurrently, causing a use-after-free when
> > nfsd_file_dispose_list_delayed() calls net_generic(nf->nf_net, ...).
> > 
> > Because nfsd_file_free() now calls put_net(nf->nf_net), the old
> > nfsd_file_put_local() pattern of returning nf->nf_net after
> > nfsd_file_put() is unsafe -- put_net() could theoretically drop the
> > last net namespace reference, leaving the returned pointer stale.
> > Fix this by moving the nfsd_net_put() call into nfsd_file_put_local()
> > itself, before the nfsd_file_put() that may trigger nfsd_file_free().
> > The function now returns void and the caller no longer needs to handle
> > the net reference.
> 
> That means that each nfsd_file_alloc()/nfsd_file_free() is now touching
> the same cacheline on kernels with netns enabled.  Scalability implications
> might be interesting...

That's definitely a valid concern. I looked at a couple of alternatives
but they turned out to be pretty nasty.

One thing we can do is only take a net reference for GC'ed files, since
we don't need a reference for the others. That would cut down some of
the get/put_net activity, but it's still likely to be substantial. I'll
plan to implement that for v2.

I'll keep thinking about other ways to do this, but I think we're stuck
taking net references for at least some of these for now.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-06-01 18:33 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-01 17:31 [PATCH 0/8] nfsd: fixes for locally-triggerable bugs Jeff Layton
2026-06-01 17:31 ` [PATCH 1/8] nfsd: defer vfree of compound ops to fix rpc_status UAF Jeff Layton
2026-06-01 17:31 ` [PATCH 2/8] nfsd: hold rcu across localio cmpxchg retry Jeff Layton
2026-06-01 17:31 ` [PATCH 3/8] nfs/localio: fix ref leak on nfs_uuid_add_file failure Jeff Layton
2026-06-01 17:31 ` [PATCH 4/8] nfsd: guard nfsd_serv deref in nfsd_file_net_dispose Jeff Layton
2026-06-01 17:31 ` [PATCH 5/8] nfsd: widen nfsd_genl_rqstp address fields to sockaddr_storage Jeff Layton
2026-06-01 17:31 ` [PATCH 6/8] nfsd: fix refcount leak in nfsd_file_lru_add on insertion failure Jeff Layton
2026-06-01 17:31 ` [PATCH 7/8] nfsd: fix fcache_disposal UAF by inlining dispose state into nfsd_net Jeff Layton
2026-06-01 17:31 ` [PATCH 8/8] nfsd: hold net namespace reference in nfsd_file Jeff Layton
2026-06-01 17:50   ` Al Viro
2026-06-01 18:18     ` Jeff Layton
2026-06-01 18:33     ` Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox